Compare commits

...

658 Commits

Author SHA1 Message Date
Linus Torvalds
62fb9874f5 Linux 5.13 2021-06-27 15:21:11 -07:00
Linus Torvalds
b4b27b9eed Revert "signal: Allow tasks to cache one sigqueue struct"
This reverts commits 4bad58ebc8 (and
399f8dd9a8, which tried to fix it).

I do not believe these are correct, and I'm about to release 5.13, so am
reverting them out of an abundance of caution.

The locking is odd, and appears broken.

On the allocation side (in __sigqueue_alloc()), the locking is somewhat
straightforward: it depends on sighand->siglock.  Since one caller
doesn't hold that lock, it further then tests 'sigqueue_flags' to avoid
the case with no locks held.

On the freeing side (in sigqueue_cache_or_free()), there is no locking
at all, and the logic instead depends on 'current' being a single
thread, and not able to race with itself.

To make things more exciting, there's also the data race between freeing
a signal and allocating one, which is handled by using WRITE_ONCE() and
READ_ONCE(), and being mutually exclusive wrt the initial state (ie
freeing will only free if the old state was NULL, while allocating will
obviously only use the value if it was non-NULL, so only one or the
other will actually act on the value).

However, while the free->alloc paths do seem mutually exclusive thanks
to just the data value dependency, it's not clear what the memory
ordering constraints are on it.  Could writes from the previous
allocation possibly be delayed and seen by the new allocation later,
causing logical inconsistencies?

So it's all very exciting and unusual.

And in particular, it seems that the freeing side is incorrect in
depending on "current" being single-threaded.  Yes, 'current' is a
single thread, but in the presense of asynchronous events even a single
thread can have data races.

And such asynchronous events can and do happen, with interrupts causing
signals to be flushed and thus free'd (for example - sending a
SIGCONT/SIGSTOP can happen from interrupt context, and can flush
previously queued process control signals).

So regardless of all the other questions about the memory ordering and
locking for this new cached allocation, the sigqueue_cache_or_free()
assumptions seem to be fundamentally incorrect.

It may be that people will show me the errors of my ways, and tell me
why this is all safe after all.  We can reinstate it if so.  But my
current belief is that the WRITE_ONCE() that sets the cached entry needs
to be a smp_store_release(), and the READ_ONCE() that finds a cached
entry needs to be a smp_load_acquire() to handle memory ordering
correctly.

And the sequence in sigqueue_cache_or_free() would need to either use a
lock or at least be interrupt-safe some way (perhaps by using something
like the percpu 'cmpxchg': it doesn't need to be SMP-safe, but like the
percpu operations it needs to be interrupt-safe).

Fixes: 399f8dd9a8 ("signal: Prevent sigqueue caching after task got released")
Fixes: 4bad58ebc8 ("signal: Allow tasks to cache one sigqueue struct")
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Christian Brauner <christian.brauner@ubuntu.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-06-27 13:32:54 -07:00
Linus Torvalds
625acffd7a Merge tag 's390-5.13-5' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Pull s390 fixes from Vasily Gorbik:

 - Fix a couple of late pt_regs flags handling findings of conversion to
   generic entry.

 - Fix potential register clobbering in stack switch helper.

 - Fix thread/group masks for offline cpus.

 - Fix cleanup of mdev resources when remove callback is invoked in
   vfio-ap code.

* tag 's390-5.13-5' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
  s390/stack: fix possible register corruption with stack switch helper
  s390/topology: clear thread/group maps for offline cpus
  s390/vfio-ap: clean up mdev resources when remove callback invoked
  s390: clear pt_regs::flags on irq entry
  s390: fix system call restart with multiple signals
2021-06-26 09:50:10 -07:00
Linus Torvalds
b7050b2424 Merge tag 'pinctrl-v5.13-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl
Pull pin control fixes from Linus Walleij:
 "Two last-minute fixes:

   - Put an fwnode in the errorpath in the SGPIO driver

   - Fix the number of GPIO lines per bank in the STM32 driver"

* tag 'pinctrl-v5.13-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl:
  pinctrl: stm32: fix the reported number of GPIO lines per bank
  pinctrl: microchip-sgpio: Put fwnode in error case during ->probe()
2021-06-25 19:06:24 -07:00
Linus Torvalds
e2f527b58e Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI fixes from James Bottomley:
 "Two small fixes, both in upper layer drivers (scsi disk and cdrom).

  The sd one is fixing a commit changing revalidation that came from the
  block tree a while ago (5.10) and the sr one adds handling of a
  condition we didn't previously handle for manually removed media"

* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
  scsi: sd: Call sd_revalidate_disk() for ioctl(BLKRRPART)
  scsi: sr: Return appropriate error code when disk is ejected
2021-06-25 15:59:14 -07:00
Linus Torvalds
7ce32ac6fb Merge branch 'akpm' (patches from Andrew)
Merge misc fixes from Andrew Morton:
 "24 patches, based on 4a09d388f2.

  Subsystems affected by this patch series: mm (thp, vmalloc, hugetlb,
  memory-failure, and pagealloc), nilfs2, kthread, MAINTAINERS, and
  mailmap"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (24 commits)
  mailmap: add Marek's other e-mail address and identity without diacritics
  MAINTAINERS: fix Marek's identity again
  mm/page_alloc: do bulk array bounds check after checking populated elements
  mm/page_alloc: __alloc_pages_bulk(): do bounds check before accessing array
  mm/hwpoison: do not lock page again when me_huge_page() successfully recovers
  mm,hwpoison: return -EHWPOISON to denote that the page has already been poisoned
  mm/memory-failure: use a mutex to avoid memory_failure() races
  mm, futex: fix shared futex pgoff on shmem huge page
  kthread: prevent deadlock when kthread_mod_delayed_work() races with kthread_cancel_delayed_work_sync()
  kthread_worker: split code for canceling the delayed work timer
  mm/vmalloc: unbreak kasan vmalloc support
  KVM: s390: prepare for hugepage vmalloc
  mm/vmalloc: add vmalloc_no_huge
  nilfs2: fix memory leak in nilfs_sysfs_delete_device_group
  mm/thp: another PVMW_SYNC fix in page_vma_mapped_walk()
  mm/thp: fix page_vma_mapped_walk() if THP mapped by ptes
  mm: page_vma_mapped_walk(): get vma_address_end() earlier
  mm: page_vma_mapped_walk(): use goto instead of while (1)
  mm: page_vma_mapped_walk(): add a level of indentation
  mm: page_vma_mapped_walk(): crossing page table boundary
  ...
2021-06-25 11:05:03 -07:00
Gleb Fotengauer-Malinovskiy
808e9df477 userfaultfd: uapi: fix UFFDIO_CONTINUE ioctl request definition
This ioctl request reads from uffdio_continue structure written by
userspace which justifies _IOC_WRITE flag.  It also writes back to that
structure which justifies _IOC_READ flag.

See NOTEs in include/uapi/asm-generic/ioctl.h for more information.

Fixes: f619147104 ("userfaultfd: add UFFDIO_CONTINUE ioctl")
Signed-off-by: Gleb Fotengauer-Malinovskiy <glebfm@altlinux.org>
Acked-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Axel Rasmussen <axelrasmussen@google.com>
Reviewed-by: Dmitry V. Levin <ldv@altlinux.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-06-25 10:53:26 -07:00
Linus Torvalds
55fcd4493d Merge branch 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux
Pull i2c fixes from Wolfram Sang:
 "Three more driver bugfixes and an annotation fix for the core"

* 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
  i2c: robotfuzz-osif: fix control-request directions
  i2c: dev: Add __user annotation
  i2c: cp2615: check for allocation failure in cp2615_i2c_recv()
  i2c: i801: Ensure that SMBHSTSTS_INUSE_STS is cleared when leaving i801_access
2021-06-25 10:44:03 -07:00
Linus Torvalds
7764c62f98 Merge tag 'devprop-5.13-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull device properties framework fix from Rafael Wysocki:
 "Fix a NULL pointer dereference introduced by a recent commit and
  occurring when device_remove_software_node() is used with a device
  that has never been registered (Heikki Krogerus)"

* tag 'devprop-5.13-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  software node: Handle software node injection to an existing device properly
2021-06-25 10:30:28 -07:00
Linus Torvalds
b960e01474 Merge tag 'for-linus-5.13b-rc8-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip
Pull xen fix from Juergen Gross:
 "A fix for a regression introduced in 5.12: when migrating an irq
  related to a Xen user event to another cpu, a race might result
  in a WARN() triggering"

* tag 'for-linus-5.13b-rc8-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
  xen/events: reset active flag for lateeoi events later
2021-06-25 10:19:01 -07:00
Linus Torvalds
616a99dd14 Merge tag 'for-linus-urgent' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull kvm fixes from Paolo Bonzini:
 "A selftests fix for ARM, and the fix for page reference count
  underflow. This is a very small fix that was provided by Nick Piggin
  and tested by myself"

* tag 'for-linus-urgent' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: do not allow mapping valid but non-reference-counted pages
  KVM: selftests: Fix mapping length truncation in m{,un}map()
2021-06-25 10:15:35 -07:00
Linus Torvalds
94ca94bbbb Merge tag 'x86_urgent_for_v5.13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Borislav Petkov:
 "Two more urgent FPU fixes:

   - prevent unprivileged userspace from reinitializing supervisor
     states

   - prepare init_fpstate, which is the buffer used when initializing
     FPU state, properly in case the skip-writing-state-components
     XSAVE* variants are used"

* tag 'x86_urgent_for_v5.13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/fpu: Make init_fpstate correct with optimized XSAVE
  x86/fpu: Preserve supervisor states in sanitize_restored_user_xstate()
2021-06-25 10:00:25 -07:00
Linus Torvalds
edf54d9d0a Merge tag 'ceph-for-5.13-rc8' of https://github.com/ceph/ceph-client
Pull ceph fixes from Ilya Dryomov:
 "Two regression fixes from the merge window: one in the auth code
  affecting old clusters and one in the filesystem for proper
  propagation of MDS request errors.

  Also included a locking fix for async creates, marked for stable"

* tag 'ceph-for-5.13-rc8' of https://github.com/ceph/ceph-client:
  libceph: set global_id as soon as we get an auth ticket
  libceph: don't pass result into ac->ops->handle_reply()
  ceph: fix error handling in ceph_atomic_open and ceph_lookup
  ceph: must hold snap_rwsem when filling inode for async create
2021-06-25 09:50:30 -07:00
Linus Torvalds
9e736cf7d6 Merge tag 'netfs-fixes-20210621' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs
Pull netfs fixes from David Howells:
 "This contains patches to fix netfs_write_begin() and afs_write_end()
  in the following ways:

  (1) In netfs_write_begin(), extract the decision about whether to skip
      a page out to its own helper and have that clear around the region
      to be written, but not clear that region. This requires the
      filesystem to patch it up afterwards if the hole doesn't get
      completely filled.

  (2) Use offset_in_thp() in (1) rather than manually calculating the
      offset into the page.

  (3) Due to (1), afs_write_end() now needs to handle short data write
      into the page by generic_perform_write(). I've adopted an
      analogous approach to ceph of just returning 0 in this case and
      letting the caller go round again.

  It also adds a note that (in the future) the len parameter may extend
  beyond the page allocated. This is because the page allocation is
  deferred to write_begin() and that gets to decide what size of THP to
  allocate."

Jeff Layton points out:
 "The netfs fix in particular fixes a data corruption bug in cephfs"

* tag 'netfs-fixes-20210621' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs:
  netfs: fix test for whether we can skip read when writing beyond EOF
  afs: Fix afs_write_end() to handle short writes
2021-06-25 09:41:29 -07:00
Linus Torvalds
c13e302133 Merge tag 'gpio-fixes-for-v5.13' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux
Pull gpio fixes from Bartosz Golaszewski:

 - fix wake-up interrupt support on gpio-mxc

 - zero the padding bytes in a structure passed to user-space in the
   GPIO character device

 - require HAS_IOPORT_MAP in two drivers that need it to fix a Kbuild
   issue

* tag 'gpio-fixes-for-v5.13' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux:
  gpio: AMD8111 and TQMX86 require HAS_IOPORT_MAP
  gpiolib: cdev: zero padding during conversion to gpioline_info_changed
  gpio: mxc: Fix disabled interrupt wake-up support
2021-06-25 09:32:57 -07:00
Linus Torvalds
e41fc7c8e2 Merge tag 'sound-5.13-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
 "Two small changes have been cherry-picked as a last material for 5.13:
  a coverage after UMN revert action and a stale MAINTAINERS entry fix"

* tag 'sound-5.13-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
  MAINTAINERS: remove Timur Tabi from Freescale SOC sound drivers
  ASoC: rt5645: Avoid upgrading static warnings to errors
2021-06-25 09:20:22 -07:00
Johannes Berg
c6414e1a2b gpio: AMD8111 and TQMX86 require HAS_IOPORT_MAP
Both of these drivers use ioport_map(), so they need to
depend on HAS_IOPORT_MAP. Otherwise, they cannot be built
even with COMPILE_TEST on architectures without an ioport
implementation, such as ARCH=um.

Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>
2021-06-25 12:13:53 +02:00
Marek Behún
72a461adbe mailmap: add Marek's other e-mail address and identity without diacritics
Some of my commits were sent with identities
  Marek Behun <marek.behun@nic.cz>
  Marek Behún <marek.behun@nic.cz>
while the correct one is
  Marek Behún <kabel@kernel.org>

Put this into mailmap so that git shortlog prints all my commits under
one identity.

Link: https://lkml.kernel.org/r/20210616113624.19351-2-kabel@kernel.org
Signed-off-by: Marek Behún <kabel@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-06-24 19:40:54 -07:00
Marek Behún
ee924d3ddd MAINTAINERS: fix Marek's identity again
Fix my name to use diacritics, since MAINTAINERS supports it.

Fix my e-mail address in MAINTAINERS' marvell10g PHY driver description,
I accidentally put my other e-mail address here.

Link: https://lkml.kernel.org/r/20210616113624.19351-1-kabel@kernel.org
Signed-off-by: Marek Behún <kabel@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-06-24 19:40:54 -07:00
Mel Gorman
b3b64ebd38 mm/page_alloc: do bulk array bounds check after checking populated elements
Dan Carpenter reported the following

  The patch 0f87d9d30f: "mm/page_alloc: add an array-based interface
  to the bulk page allocator" from Apr 29, 2021, leads to the following
  static checker warning:

        mm/page_alloc.c:5338 __alloc_pages_bulk()
        warn: potentially one past the end of array 'page_array[nr_populated]'

The problem can occur if an array is passed in that is fully populated.
That potentially ends up allocating a single page and storing it past
the end of the array.  This patch returns 0 if the array is fully
populated.

Link: https://lkml.kernel.org/r/20210618125102.GU30378@techsingularity.net
Fixes: 0f87d9d30f ("mm/page_alloc: add an array-based interface to the bulk page allocator")
Signed-off-by: Mel Gorman <mgorman@techsinguliarity.net>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Cc: Jesper Dangaard Brouer <brouer@redhat.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-06-24 19:40:54 -07:00
Rasmus Villemoes
b08e50dd64 mm/page_alloc: __alloc_pages_bulk(): do bounds check before accessing array
In the event that somebody would call this with an already fully
populated page_array, the last loop iteration would do an access beyond
the end of page_array.

It's of course extremely unlikely that would ever be done, but this
triggers my internal static analyzer.  Also, if it really is not
supposed to be invoked this way (i.e., with no NULL entries in
page_array), the nr_populated<nr_pages check could simply be removed
instead.

Link: https://lkml.kernel.org/r/20210507064504.1712559-1-linux@rasmusvillemoes.dk
Fixes: 0f87d9d30f ("mm/page_alloc: add an array-based interface to the bulk page allocator")
Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Acked-by: Mel Gorman <mgorman@techsingularity.net>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-06-24 19:40:54 -07:00
Naoya Horiguchi
ea6d063010 mm/hwpoison: do not lock page again when me_huge_page() successfully recovers
Currently me_huge_page() temporary unlocks page to perform some actions
then locks it again later.  My testcase (which calls hard-offline on
some tail page in a hugetlb, then accesses the address of the hugetlb
range) showed that page allocation code detects this page lock on buddy
page and printed out "BUG: Bad page state" message.

check_new_page_bad() does not consider a page with __PG_HWPOISON as bad
page, so this flag works as kind of filter, but this filtering doesn't
work in this case because the "bad page" is not the actual hwpoisoned
page.  So stop locking page again.  Actions to be taken depend on the
page type of the error, so page unlocking should be done in ->action()
callbacks.  So let's make it assumed and change all existing callbacks
that way.

Link: https://lkml.kernel.org/r/20210609072029.74645-1-nao.horiguchi@gmail.com
Fixes: commit 78bb920344 ("mm: hwpoison: dissolve in-use hugepage in unrecoverable memory error")
Signed-off-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-06-24 19:40:54 -07:00
Aili Yao
47af12bae1 mm,hwpoison: return -EHWPOISON to denote that the page has already been poisoned
When memory_failure() is called with MF_ACTION_REQUIRED on the page that
has already been hwpoisoned, memory_failure() could fail to send SIGBUS
to the affected process, which results in infinite loop of MCEs.

Currently memory_failure() returns 0 if it's called for already
hwpoisoned page, then the caller, kill_me_maybe(), could return without
sending SIGBUS to current process.  An action required MCE is raised
when the current process accesses to the broken memory, so no SIGBUS
means that the current process continues to run and access to the error
page again soon, so running into MCE loop.

This issue can arise for example in the following scenarios:

 - Two or more threads access to the poisoned page concurrently. If
   local MCE is enabled, MCE handler independently handles the MCE
   events. So there's a race among MCE events, and the second or latter
   threads fall into the situation in question.

 - If there was a precedent memory error event and memory_failure() for
   the event failed to unmap the error page for some reason, the
   subsequent memory access to the error page triggers the MCE loop
   situation.

To fix the issue, make memory_failure() return an error code when the
error page has already been hwpoisoned.  This allows memory error
handler to control how it sends signals to userspace.  And make sure
that any process touching a hwpoisoned page should get a SIGBUS even in
"already hwpoisoned" path of memory_failure() as is done in page fault
path.

Link: https://lkml.kernel.org/r/20210521030156.2612074-3-nao.horiguchi@gmail.com
Signed-off-by: Aili Yao <yaoaili@kingsoft.com>
Signed-off-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Borislav Petkov <bp@suse.de>
Cc: David Hildenbrand <david@redhat.com>
Cc: Jue Wang <juew@google.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-06-24 19:40:54 -07:00
Tony Luck
171936ddaf mm/memory-failure: use a mutex to avoid memory_failure() races
Patch series "mm,hwpoison: fix sending SIGBUS for Action Required MCE", v5.

I wrote this patchset to materialize what I think is the current
allowable solution mentioned by the previous discussion [1].  I simply
borrowed Tony's mutex patch and Aili's return code patch, then I queued
another one to find error virtual address in the best effort manner.  I
know that this is not a perfect solution, but should work for some
typical case.

[1]: https://lore.kernel.org/linux-mm/20210331192540.2141052f@alex-virtual-machine/

This patch (of 2):

There can be races when multiple CPUs consume poison from the same page.
The first into memory_failure() atomically sets the HWPoison page flag
and begins hunting for tasks that map this page.  Eventually it
invalidates those mappings and may send a SIGBUS to the affected tasks.

But while all that work is going on, other CPUs see a "success" return
code from memory_failure() and so they believe the error has been
handled and continue executing.

Fix by wrapping most of the internal parts of memory_failure() in a
mutex.

[akpm@linux-foundation.org: make mf_mutex local to memory_failure()]

Link: https://lkml.kernel.org/r/20210521030156.2612074-1-nao.horiguchi@gmail.com
Link: https://lkml.kernel.org/r/20210521030156.2612074-2-nao.horiguchi@gmail.com
Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
Reviewed-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Cc: Aili Yao <yaoaili@kingsoft.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: David Hildenbrand <david@redhat.com>
Cc: Jue Wang <juew@google.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-06-24 19:40:54 -07:00
Hugh Dickins
fe19bd3dae mm, futex: fix shared futex pgoff on shmem huge page
If more than one futex is placed on a shmem huge page, it can happen
that waking the second wakes the first instead, and leaves the second
waiting: the key's shared.pgoff is wrong.

When 3.11 commit 13d60f4b6a ("futex: Take hugepages into account when
generating futex_key"), the only shared huge pages came from hugetlbfs,
and the code added to deal with its exceptional page->index was put into
hugetlb source.  Then that was missed when 4.8 added shmem huge pages.

page_to_pgoff() is what others use for this nowadays: except that, as
currently written, it gives the right answer on hugetlbfs head, but
nonsense on hugetlbfs tails.  Fix that by calling hugetlbfs-specific
hugetlb_basepage_index() on PageHuge tails as well as on head.

Yes, it's unconventional to declare hugetlb_basepage_index() there in
pagemap.h, rather than in hugetlb.h; but I do not expect anything but
page_to_pgoff() ever to need it.

[akpm@linux-foundation.org: give hugetlb_basepage_index() prototype the correct scope]

Link: https://lkml.kernel.org/r/b17d946b-d09-326e-b42a-52884c36df32@google.com
Fixes: 800d8c63b2 ("shmem: add huge pages support")
Reported-by: Neel Natu <neelnatu@google.com>
Signed-off-by: Hugh Dickins <hughd@google.com>
Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Zhang Yi <wetpzy@gmail.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Darren Hart <dvhart@infradead.org>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-06-24 19:40:54 -07:00
Petr Mladek
5fa54346ca kthread: prevent deadlock when kthread_mod_delayed_work() races with kthread_cancel_delayed_work_sync()
The system might hang with the following backtrace:

	schedule+0x80/0x100
	schedule_timeout+0x48/0x138
	wait_for_common+0xa4/0x134
	wait_for_completion+0x1c/0x2c
	kthread_flush_work+0x114/0x1cc
	kthread_cancel_work_sync.llvm.16514401384283632983+0xe8/0x144
	kthread_cancel_delayed_work_sync+0x18/0x2c
	xxxx_pm_notify+0xb0/0xd8
	blocking_notifier_call_chain_robust+0x80/0x194
	pm_notifier_call_chain_robust+0x28/0x4c
	suspend_prepare+0x40/0x260
	enter_state+0x80/0x3f4
	pm_suspend+0x60/0xdc
	state_store+0x108/0x144
	kobj_attr_store+0x38/0x88
	sysfs_kf_write+0x64/0xc0
	kernfs_fop_write_iter+0x108/0x1d0
	vfs_write+0x2f4/0x368
	ksys_write+0x7c/0xec

It is caused by the following race between kthread_mod_delayed_work()
and kthread_cancel_delayed_work_sync():

CPU0				CPU1

Context: Thread A		Context: Thread B

kthread_mod_delayed_work()
  spin_lock()
  __kthread_cancel_work()
     spin_unlock()
     del_timer_sync()
				kthread_cancel_delayed_work_sync()
				  spin_lock()
				  __kthread_cancel_work()
				    spin_unlock()
				    del_timer_sync()
				    spin_lock()

				  work->canceling++
				  spin_unlock
     spin_lock()
   queue_delayed_work()
     // dwork is put into the worker->delayed_work_list

   spin_unlock()

				  kthread_flush_work()
     // flush_work is put at the tail of the dwork

				    wait_for_completion()

Context: IRQ

  kthread_delayed_work_timer_fn()
    spin_lock()
    list_del_init(&work->node);
    spin_unlock()

BANG: flush_work is not longer linked and will never get proceed.

The problem is that kthread_mod_delayed_work() checks work->canceling
flag before canceling the timer.

A simple solution is to (re)check work->canceling after
__kthread_cancel_work().  But then it is not clear what should be
returned when __kthread_cancel_work() removed the work from the queue
(list) and it can't queue it again with the new @delay.

The return value might be used for reference counting.  The caller has
to know whether a new work has been queued or an existing one was
replaced.

The proper solution is that kthread_mod_delayed_work() will remove the
work from the queue (list) _only_ when work->canceling is not set.  The
flag must be checked after the timer is stopped and the remaining
operations can be done under worker->lock.

Note that kthread_mod_delayed_work() could remove the timer and then
bail out.  It is fine.  The other canceling caller needs to cancel the
timer as well.  The important thing is that the queue (list)
manipulation is done atomically under worker->lock.

Link: https://lkml.kernel.org/r/20210610133051.15337-3-pmladek@suse.com
Fixes: 9a6b06c8d9 ("kthread: allow to modify delayed kthread work")
Signed-off-by: Petr Mladek <pmladek@suse.com>
Reported-by: Martin Liu <liumartin@google.com>
Cc: <jenhaochen@google.com>
Cc: Minchan Kim <minchan@google.com>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-06-24 19:40:54 -07:00
Petr Mladek
34b3d53447 kthread_worker: split code for canceling the delayed work timer
Patch series "kthread_worker: Fix race between kthread_mod_delayed_work()
and kthread_cancel_delayed_work_sync()".

This patchset fixes the race between kthread_mod_delayed_work() and
kthread_cancel_delayed_work_sync() including proper return value
handling.

This patch (of 2):

Simple code refactoring as a preparation step for fixing a race between
kthread_mod_delayed_work() and kthread_cancel_delayed_work_sync().

It does not modify the existing behavior.

Link: https://lkml.kernel.org/r/20210610133051.15337-2-pmladek@suse.com
Signed-off-by: Petr Mladek <pmladek@suse.com>
Cc: <jenhaochen@google.com>
Cc: Martin Liu <liumartin@google.com>
Cc: Minchan Kim <minchan@google.com>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-06-24 19:40:54 -07:00
Daniel Axtens
7ca3027b72 mm/vmalloc: unbreak kasan vmalloc support
In commit 121e6f3258 ("mm/vmalloc: hugepage vmalloc mappings"),
__vmalloc_node_range was changed such that __get_vm_area_node was no
longer called with the requested/real size of the vmalloc allocation,
but rather with a rounded-up size.

This means that __get_vm_area_node called kasan_unpoision_vmalloc() with
a rounded up size rather than the real size.  This led to it allowing
access to too much memory and so missing vmalloc OOBs and failing the
kasan kunit tests.

Pass the real size and the desired shift into __get_vm_area_node.  This
allows it to round up the size for the underlying allocators while still
unpoisioning the correct quantity of shadow memory.

Adjust the other call-sites to pass in PAGE_SHIFT for the shift value.

Link: https://lkml.kernel.org/r/20210617081330.98629-1-dja@axtens.net
Link: https://bugzilla.kernel.org/show_bug.cgi?id=213335
Fixes: 121e6f3258 ("mm/vmalloc: hugepage vmalloc mappings")
Signed-off-by: Daniel Axtens <dja@axtens.net>
Tested-by: David Gow <davidgow@google.com>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Reviewed-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
Tested-by: Andrey Konovalov <andreyknvl@gmail.com>
Acked-by: Andrey Konovalov <andreyknvl@gmail.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-06-24 19:40:54 -07:00
Claudio Imbrenda
185cca24e9 KVM: s390: prepare for hugepage vmalloc
The Create Secure Configuration Ultravisor Call does not support using
large pages for the virtual memory area.  This is a hardware limitation.

This patch replaces the vzalloc call with an almost equivalent call to
the newly introduced vmalloc_no_huge function, which guarantees that
only small pages will be used for the backing.

The new call will not clear the allocated memory, but that has never
been an actual requirement.

Link: https://lkml.kernel.org/r/20210614132357.10202-3-imbrenda@linux.ibm.com
Fixes: 121e6f3258 ("mm/vmalloc: hugepage vmalloc mappings")
Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Reviewed-by: Janosch Frank <frankja@linux.ibm.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Acked-by: Nicholas Piggin <npiggin@gmail.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Uladzislau Rezki (Sony) <urezki@gmail.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-06-24 19:40:54 -07:00
Claudio Imbrenda
15a64f5a88 mm/vmalloc: add vmalloc_no_huge
Patch series "mm: add vmalloc_no_huge and use it", v4.

Add vmalloc_no_huge() and export it, so modules can allocate memory with
small pages.

Use the newly added vmalloc_no_huge() in KVM on s390 to get around a
hardware limitation.

This patch (of 2):

Commit 121e6f3258 ("mm/vmalloc: hugepage vmalloc mappings") added
support for hugepage vmalloc mappings, it also added the flag
VM_NO_HUGE_VMAP for __vmalloc_node_range to request the allocation to be
performed with 0-order non-huge pages.

This flag is not accessible when calling vmalloc, the only option is to
call directly __vmalloc_node_range, which is not exported.

This means that a module can't vmalloc memory with small pages.

Case in point: KVM on s390x needs to vmalloc a large area, and it needs
to be mapped with non-huge pages, because of a hardware limitation.

This patch adds the function vmalloc_no_huge, which works like vmalloc,
but it is guaranteed to always back the mapping using small pages.  This
new function is exported, therefore it is usable by modules.

[akpm@linux-foundation.org: whitespace fixes, per Christoph]

Link: https://lkml.kernel.org/r/20210614132357.10202-1-imbrenda@linux.ibm.com
Link: https://lkml.kernel.org/r/20210614132357.10202-2-imbrenda@linux.ibm.com
Fixes: 121e6f3258 ("mm/vmalloc: hugepage vmalloc mappings")
Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Reviewed-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
Acked-by: Nicholas Piggin <npiggin@gmail.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Acked-by: David Rientjes <rientjes@google.com>
Cc: Uladzislau Rezki (Sony) <urezki@gmail.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-06-24 19:40:53 -07:00
Pavel Skripkin
8fd0c1b064 nilfs2: fix memory leak in nilfs_sysfs_delete_device_group
My local syzbot instance hit memory leak in nilfs2.  The problem was in
missing kobject_put() in nilfs_sysfs_delete_device_group().

kobject_del() does not call kobject_cleanup() for passed kobject and it
leads to leaking duped kobject name if kobject_put() was not called.

Fail log:

  BUG: memory leak
  unreferenced object 0xffff8880596171e0 (size 8):
  comm "syz-executor379", pid 8381, jiffies 4294980258 (age 21.100s)
  hex dump (first 8 bytes):
    6c 6f 6f 70 30 00 00 00                          loop0...
  backtrace:
     kstrdup+0x36/0x70 mm/util.c:60
     kstrdup_const+0x53/0x80 mm/util.c:83
     kvasprintf_const+0x108/0x190 lib/kasprintf.c:48
     kobject_set_name_vargs+0x56/0x150 lib/kobject.c:289
     kobject_add_varg lib/kobject.c:384 [inline]
     kobject_init_and_add+0xc9/0x160 lib/kobject.c:473
     nilfs_sysfs_create_device_group+0x150/0x800 fs/nilfs2/sysfs.c:999
     init_nilfs+0xe26/0x12b0 fs/nilfs2/the_nilfs.c:637

Link: https://lkml.kernel.org/r/20210612140559.20022-1-paskripkin@gmail.com
Fixes: da7141fb78 ("nilfs2: add /sys/fs/nilfs2/<device> group")
Signed-off-by: Pavel Skripkin <paskripkin@gmail.com>
Acked-by: Ryusuke Konishi <konishi.ryusuke@gmail.com>
Cc: Michael L. Semon <mlsemon35@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-06-24 19:40:53 -07:00
Hugh Dickins
a7a69d8ba8 mm/thp: another PVMW_SYNC fix in page_vma_mapped_walk()
Aha! Shouldn't that quick scan over pte_none()s make sure that it holds
ptlock in the PVMW_SYNC case? That too might have been responsible for
BUGs or WARNs in split_huge_page_to_list() or its unmap_page(), though
I've never seen any.

Link: https://lkml.kernel.org/r/1bdf384c-8137-a149-2a1e-475a4791c3c@google.com
Link: https://lore.kernel.org/linux-mm/20210412180659.B9E3.409509F4@e16-tech.com/
Fixes: ace71a19ce ("mm: introduce page_vma_mapped_walk()")
Signed-off-by: Hugh Dickins <hughd@google.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Tested-by: Wang Yugui <wangyugui@e16-tech.com>
Cc: Alistair Popple <apopple@nvidia.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Peter Xu <peterx@redhat.com>
Cc: Ralph Campbell <rcampbell@nvidia.com>
Cc: Will Deacon <will@kernel.org>
Cc: Yang Shi <shy828301@gmail.com>
Cc: Zi Yan <ziy@nvidia.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-06-24 19:40:53 -07:00
Hugh Dickins
a9a7504d9b mm/thp: fix page_vma_mapped_walk() if THP mapped by ptes
Running certain tests with a DEBUG_VM kernel would crash within hours,
on the total_mapcount BUG() in split_huge_page_to_list(), while trying
to free up some memory by punching a hole in a shmem huge page: split's
try_to_unmap() was unable to find all the mappings of the page (which,
on a !DEBUG_VM kernel, would then keep the huge page pinned in memory).

Crash dumps showed two tail pages of a shmem huge page remained mapped
by pte: ptes in a non-huge-aligned vma of a gVisor process, at the end
of a long unmapped range; and no page table had yet been allocated for
the head of the huge page to be mapped into.

Although designed to handle these odd misaligned huge-page-mapped-by-pte
cases, page_vma_mapped_walk() falls short by returning false prematurely
when !pmd_present or !pud_present or !p4d_present or !pgd_present: there
are cases when a huge page may span the boundary, with ptes present in
the next.

Restructure page_vma_mapped_walk() as a loop to continue in these cases,
while keeping its layout much as before.  Add a step_forward() helper to
advance pvmw->address across those boundaries: originally I tried to use
mm's standard p?d_addr_end() macros, but hit the same crash 512 times
less often: because of the way redundant levels are folded together, but
folded differently in different configurations, it was just too
difficult to use them correctly; and step_forward() is simpler anyway.

Link: https://lkml.kernel.org/r/fedb8632-1798-de42-f39e-873551d5bc81@google.com
Fixes: ace71a19ce ("mm: introduce page_vma_mapped_walk()")
Signed-off-by: Hugh Dickins <hughd@google.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Alistair Popple <apopple@nvidia.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Peter Xu <peterx@redhat.com>
Cc: Ralph Campbell <rcampbell@nvidia.com>
Cc: Wang Yugui <wangyugui@e16-tech.com>
Cc: Will Deacon <will@kernel.org>
Cc: Yang Shi <shy828301@gmail.com>
Cc: Zi Yan <ziy@nvidia.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-06-24 19:40:53 -07:00
Hugh Dickins
a765c417d8 mm: page_vma_mapped_walk(): get vma_address_end() earlier
page_vma_mapped_walk() cleanup: get THP's vma_address_end() at the
start, rather than later at next_pte.

It's a little unnecessary overhead on the first call, but makes for a
simpler loop in the following commit.

Link: https://lkml.kernel.org/r/4542b34d-862f-7cb4-bb22-e0df6ce830a2@google.com
Signed-off-by: Hugh Dickins <hughd@google.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Alistair Popple <apopple@nvidia.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Peter Xu <peterx@redhat.com>
Cc: Ralph Campbell <rcampbell@nvidia.com>
Cc: Wang Yugui <wangyugui@e16-tech.com>
Cc: Will Deacon <will@kernel.org>
Cc: Yang Shi <shy828301@gmail.com>
Cc: Zi Yan <ziy@nvidia.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-06-24 19:40:53 -07:00
Hugh Dickins
474466301d mm: page_vma_mapped_walk(): use goto instead of while (1)
page_vma_mapped_walk() cleanup: add a label this_pte, matching next_pte,
and use "goto this_pte", in place of the "while (1)" loop at the end.

Link: https://lkml.kernel.org/r/a52b234a-851-3616-2525-f42736e8934@google.com
Signed-off-by: Hugh Dickins <hughd@google.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Alistair Popple <apopple@nvidia.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Peter Xu <peterx@redhat.com>
Cc: Ralph Campbell <rcampbell@nvidia.com>
Cc: Wang Yugui <wangyugui@e16-tech.com>
Cc: Will Deacon <will@kernel.org>
Cc: Yang Shi <shy828301@gmail.com>
Cc: Zi Yan <ziy@nvidia.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-06-24 19:40:53 -07:00
Hugh Dickins
b3807a91ac mm: page_vma_mapped_walk(): add a level of indentation
page_vma_mapped_walk() cleanup: add a level of indentation to much of
the body, making no functional change in this commit, but reducing the
later diff when this is all converted to a loop.

[hughd@google.com: : page_vma_mapped_walk(): add a level of indentation fix]
  Link: https://lkml.kernel.org/r/7f817555-3ce1-c785-e438-87d8efdcaf26@google.com

Link: https://lkml.kernel.org/r/efde211-f3e2-fe54-977-ef481419e7f3@google.com
Signed-off-by: Hugh Dickins <hughd@google.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Alistair Popple <apopple@nvidia.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Peter Xu <peterx@redhat.com>
Cc: Ralph Campbell <rcampbell@nvidia.com>
Cc: Wang Yugui <wangyugui@e16-tech.com>
Cc: Will Deacon <will@kernel.org>
Cc: Yang Shi <shy828301@gmail.com>
Cc: Zi Yan <ziy@nvidia.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-06-24 19:40:53 -07:00
Hugh Dickins
4482824874 mm: page_vma_mapped_walk(): crossing page table boundary
page_vma_mapped_walk() cleanup: adjust the test for crossing page table
boundary - I believe pvmw->address is always page-aligned, but nothing
else here assumed that; and remember to reset pvmw->pte to NULL after
unmapping the page table, though I never saw any bug from that.

Link: https://lkml.kernel.org/r/799b3f9c-2a9e-dfef-5d89-26e9f76fd97@google.com
Signed-off-by: Hugh Dickins <hughd@google.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Alistair Popple <apopple@nvidia.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Peter Xu <peterx@redhat.com>
Cc: Ralph Campbell <rcampbell@nvidia.com>
Cc: Wang Yugui <wangyugui@e16-tech.com>
Cc: Will Deacon <will@kernel.org>
Cc: Yang Shi <shy828301@gmail.com>
Cc: Zi Yan <ziy@nvidia.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-06-24 19:40:53 -07:00
Hugh Dickins
e2e1d4076c mm: page_vma_mapped_walk(): prettify PVMW_MIGRATION block
page_vma_mapped_walk() cleanup: rearrange the !pmd_present() block to
follow the same "return not_found, return not_found, return true"
pattern as the block above it (note: returning not_found there is never
premature, since existence or prior existence of huge pmd guarantees
good alignment).

Link: https://lkml.kernel.org/r/378c8650-1488-2edf-9647-32a53cf2e21@google.com
Signed-off-by: Hugh Dickins <hughd@google.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Cc: Alistair Popple <apopple@nvidia.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Ralph Campbell <rcampbell@nvidia.com>
Cc: Wang Yugui <wangyugui@e16-tech.com>
Cc: Will Deacon <will@kernel.org>
Cc: Yang Shi <shy828301@gmail.com>
Cc: Zi Yan <ziy@nvidia.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-06-24 19:40:53 -07:00
Hugh Dickins
3306d3119c mm: page_vma_mapped_walk(): use pmde for *pvmw->pmd
page_vma_mapped_walk() cleanup: re-evaluate pmde after taking lock, then
use it in subsequent tests, instead of repeatedly dereferencing pointer.

Link: https://lkml.kernel.org/r/53fbc9d-891e-46b2-cb4b-468c3b19238e@google.com
Signed-off-by: Hugh Dickins <hughd@google.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Cc: Alistair Popple <apopple@nvidia.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Ralph Campbell <rcampbell@nvidia.com>
Cc: Wang Yugui <wangyugui@e16-tech.com>
Cc: Will Deacon <will@kernel.org>
Cc: Yang Shi <shy828301@gmail.com>
Cc: Zi Yan <ziy@nvidia.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-06-24 19:40:53 -07:00
Hugh Dickins
6d0fd59876 mm: page_vma_mapped_walk(): settle PageHuge on entry
page_vma_mapped_walk() cleanup: get the hugetlbfs PageHuge case out of
the way at the start, so no need to worry about it later.

Link: https://lkml.kernel.org/r/e31a483c-6d73-a6bb-26c5-43c3b880a2@google.com
Signed-off-by: Hugh Dickins <hughd@google.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Cc: Alistair Popple <apopple@nvidia.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Ralph Campbell <rcampbell@nvidia.com>
Cc: Wang Yugui <wangyugui@e16-tech.com>
Cc: Will Deacon <will@kernel.org>
Cc: Yang Shi <shy828301@gmail.com>
Cc: Zi Yan <ziy@nvidia.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-06-24 19:40:53 -07:00
Hugh Dickins
f003c03bd2 mm: page_vma_mapped_walk(): use page for pvmw->page
Patch series "mm: page_vma_mapped_walk() cleanup and THP fixes".

I've marked all of these for stable: many are merely cleanups, but I
think they are much better before the main fix than after.

This patch (of 11):

page_vma_mapped_walk() cleanup: sometimes the local copy of pvwm->page
was used, sometimes pvmw->page itself: use the local copy "page"
throughout.

Link: https://lkml.kernel.org/r/589b358c-febc-c88e-d4c2-7834b37fa7bf@google.com
Link: https://lkml.kernel.org/r/88e67645-f467-c279-bf5e-af4b5c6b13eb@google.com
Signed-off-by: Hugh Dickins <hughd@google.com>
Reviewed-by: Alistair Popple <apopple@nvidia.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Cc: Yang Shi <shy828301@gmail.com>
Cc: Wang Yugui <wangyugui@e16-tech.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Ralph Campbell <rcampbell@nvidia.com>
Cc: Zi Yan <ziy@nvidia.com>
Cc: Will Deacon <will@kernel.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-06-24 19:40:53 -07:00
Linus Torvalds
44db63d1ad Merge tag 'drm-fixes-2021-06-25' of git://anongit.freedesktop.org/drm/drm
Pull drm fixes from Dave Airlie:
 "This is a bit bigger than I'd like at this stage, and I guess last
  week was extra quiet, but it's mostly one fix across three drivers to
  wait for buffer move pinning to complete.

  There was one locking change that got reverted so it's just noise.

  Otherwise the amdgpu/nouveau changes are for known regressions, and
  otherwise it's just misc changes in kmb/atmel/vc4 drivers.

  Summary:

  core:
   - auth locking change + brown paper bag revert

  radeon/nouveau/amdgpu/ttm:
   - wait for BO to be pinned after moving it (same fix in three
     drivers)

  amdgpu:
   - Revert GFX9/10 doorbell fixes, we just end up trading one bug for
     another
   - Potential memory corruption fix in framebuffer handling

  nouveau:
   - fix regression checking dma addresses

  kmb:
   - error return fix

  atmel-hlcdc:
   - fix kernel warnings at boot
   - enable async flips

  vc4:
   - fix CPU hang due to power management"

* tag 'drm-fixes-2021-06-25' of git://anongit.freedesktop.org/drm/drm:
  drm/nouveau: fix dma_address check for CPU/GPU sync
  drm/kmb: Fix error return code in kmb_hw_init()
  drm/amdgpu: wait for moving fence after pinning
  drm/radeon: wait for moving fence after pinning
  drm/nouveau: wait for moving fence after pinning v2
  Revert "drm: add a locked version of drm_is_current_master"
  Revert "drm/amdgpu/gfx9: fix the doorbell missing when in CGPG issue."
  Revert "drm/amdgpu/gfx10: enlarge CP_MEC_DOORBELL_RANGE_UPPER to cover full doorbell."
  drm/amdgpu: Call drm_framebuffer_init last for framebuffer init
  drm: add a locked version of drm_is_current_master
  drm/atmel-hlcdc: Allow async page flips
  drm/panel: ld9040: reference spi_device_id table
  drm: atmel_hlcdc: Enable the crtc vblank prior to crtc usage.
  drm/vc4: hdmi: Make sure the controller is powered in detect
  drm/vc4: hdmi: Move the HSM clock enable to runtime_pm
2021-06-24 13:27:07 -07:00
Johan Hovold
4ca070ef0d i2c: robotfuzz-osif: fix control-request directions
The direction of the pipe argument must match the request-type direction
bit or control requests may fail depending on the host-controller-driver
implementation.

Control transfers without a data stage are treated as OUT requests by
the USB stack and should be using usb_sndctrlpipe(). Failing to do so
will now trigger a warning.

Fix the OSIFI2C_SET_BIT_RATE and OSIFI2C_STOP requests which erroneously
used the osif_usb_read() helper and set the IN direction bit.

Reported-by: syzbot+9d7dadd15b8819d73f41@syzkaller.appspotmail.com
Fixes: 83e53a8f12 ("i2c: Add bus driver for for OSIF USB i2c device.")
Cc: stable@vger.kernel.org      # 3.14
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Wolfram Sang <wsa@kernel.org>
2021-06-24 22:08:00 +02:00
Dave Airlie
5e0e7a4076 Merge tag 'drm-misc-fixes-2021-06-24' of git://anongit.freedesktop.org/drm/drm-misc into drm-fixes
A DMA address check for nouveau, an error code return fix for kmb, fixes
to wait for a moving fence after pinning the BO for amdgpu, nouveau and
radeon, a crtc and async page flip fix for atmel-hlcdc and a cpu hang
fix for vc4.

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Maxime Ripard <maxime@cerno.tech>
Link: https://patchwork.freedesktop.org/patch/msgid/20210624190353.wyizoil3wqrrxz5d@gilmour
2021-06-25 06:05:13 +10:00
Andreas Hecht
3265a7e6b4 i2c: dev: Add __user annotation
Fix Sparse warnings:
drivers/i2c/i2c-dev.c:546:19: warning: incorrect type in assignment (different address spaces)
drivers/i2c/i2c-dev.c:549:53: warning: incorrect type in argument 2 (different address spaces)

compat_ptr() returns a pointer tagged __user which gets assigned to a
pointer missing the __user annotation. The same pointer is passed to
copy_from_user() as an argument where it is expected to have the __user
annotation. Fix both by adding the __user annotation to the pointer.

Fixes: 7d5cb45655 ("i2c compat ioctls: move to ->compat_ioctl()")
Signed-off-by: Andreas Hecht <andreas.e.hecht@gmail.com>
Signed-off-by: Wolfram Sang <wsa@kernel.org>
2021-06-24 21:47:43 +02:00
Ilya Dryomov
03af4c7bad libceph: set global_id as soon as we get an auth ticket
Commit 61ca49a910 ("libceph: don't set global_id until we get an
auth ticket") delayed the setting of global_id too much.  It is set
only after all tickets are received, but in pre-nautilus clusters an
auth ticket and the service tickets are obtained in separate steps
(for a total of three MAuth replies).  When the service tickets are
requested, global_id is used to build an authorizer; if global_id is
still 0 we never get them and fail to establish the session.

Moving the setting of global_id into protocol implementations.  This
way global_id can be set exactly when an auth ticket is received, not
sooner nor later.

Fixes: 61ca49a910 ("libceph: don't set global_id until we get an auth ticket")
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
2021-06-24 21:03:17 +02:00
Ilya Dryomov
3c0d089432 libceph: don't pass result into ac->ops->handle_reply()
There is no result to pass in msgr2 case because authentication
failures are reported through auth_bad_method frame and in MAuth
case an error is returned immediately.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
2021-06-24 21:03:16 +02:00
Linus Torvalds
4a09d388f2 Merge tag 'mmc-v5.13-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc
Pull MMC fix from Ulf Hansson:
 "Use memcpy_to/fromio for dram-access-quirk in the meson-gx host
  driver"

* tag 'mmc-v5.13-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc:
  mmc: meson-gx: use memcpy_to/fromio for dram-access-quirk
2021-06-24 10:53:05 -07:00
Linus Torvalds
7749b0337b Merge tag 'core-urgent-2021-06-24' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull sigqueue cache fix from Ingo Molnar:
 "Fix a memory leak in the recently introduced sigqueue cache"

* tag 'core-urgent-2021-06-24' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  signal: Prevent sigqueue caching after task got released
2021-06-24 09:06:19 -07:00
Linus Torvalds
666751701b Merge tag 'sched-urgent-2021-06-24' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler fix from Ingo Molnar:
 "A last minute cgroup bandwidth scheduling fix for a recently
  introduced logic fail which triggered a kernel warning by LTP's
  cfs_bandwidth01 test"

* tag 'sched-urgent-2021-06-24' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  sched/fair: Ensure that the CFS parent is added after unthrottling
2021-06-24 08:58:23 -07:00
Linus Torvalds
df50110004 Merge tag 'perf-urgent-2021-06-24' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 perf fix from Ingo Molnar:
 "An LBR buffer fix for code that probably only worked accidentally"

* tag 'perf-urgent-2021-06-24' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf/x86/intel/lbr: Zero the xstate buffer on allocation
2021-06-24 08:55:12 -07:00
Nicholas Piggin
f8be156be1 KVM: do not allow mapping valid but non-reference-counted pages
It's possible to create a region which maps valid but non-refcounted
pages (e.g., tail pages of non-compound higher order allocations). These
host pages can then be returned by gfn_to_page, gfn_to_pfn, etc., family
of APIs, which take a reference to the page, which takes it from 0 to 1.
When the reference is dropped, this will free the page incorrectly.

Fix this by only taking a reference on valid pages if it was non-zero,
which indicates it is participating in normal refcounting (and can be
released with put_page).

This addresses CVE-2021-22543.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Tested-by: Paolo Bonzini <pbonzini@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-06-24 11:55:11 -04:00
Linus Torvalds
c0e457851f Merge tag 'objtool-urgent-2021-06-24' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull objtool fixes from Ingo Molnar:
 "Address a number of objtool warnings that got reported.

  No change in behavior intended, but code generation might be impacted
  by commit 1f008d46f1 ("x86: Always inline task_size_max()")"

* tag 'objtool-urgent-2021-06-24' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  locking/lockdep: Improve noinstr vs errors
  x86: Always inline task_size_max()
  x86/xen: Fix noinstr fail in exc_xen_unknown_trap()
  x86/xen: Fix noinstr fail in xen_pv_evtchn_do_upcall()
  x86/entry: Fix noinstr fail in __do_fast_syscall_32()
  objtool/x86: Ignore __x86_indirect_alt_* symbols
2021-06-24 08:47:33 -07:00
Christian König
d330099115 drm/nouveau: fix dma_address check for CPU/GPU sync
AGP for example doesn't have a dma_address array.

Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210614110517.1624-1-christian.koenig@amd.com
2021-06-24 15:40:44 +02:00
Juergen Gross
3de218ff39 xen/events: reset active flag for lateeoi events later
In order to avoid a race condition for user events when changing
cpu affinity reset the active flag only when EOI-ing the event.

This is working fine as all user events are lateeoi events. Note that
lateeoi_ack_mask_dynirq() is not modified as there is no explicit call
to xen_irq_lateeoi() expected later.

Cc: stable@vger.kernel.org
Reported-by: Julien Grall <julien@xen.org>
Fixes: b6622798bc ("xen/events: avoid handling the same event on two cpus at the same time")
Tested-by: Julien Grall <julien@xen.org>
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Boris Ostrovsky <boris.ostrvsky@oracle.com>
Link: https://lore.kernel.org/r/20210623130913.9405-1-jgross@suse.com
Signed-off-by: Juergen Gross <jgross@suse.com>
2021-06-24 12:52:36 +02:00
Timur Tabi
5c6d4f9726 MAINTAINERS: remove Timur Tabi from Freescale SOC sound drivers
I haven't touched these drivers in seven years, and none of the
patches sent to me these days affect code that I wrote.  The
other maintainers are doing a very good job without me.

Signed-off-by: Timur Tabi <timur@kernel.org>
Reviewed-by: Fabio Estevam <festevam@gmail.com>
Link: https://lore.kernel.org/r/20210620160135.28651-1-timur@kernel.org
Signed-off-by: Mark Brown <broonie@kernel.org>
(cherry picked from commit 50b1ce617d)
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2021-06-24 12:23:45 +02:00
Mark Brown
10043bb6af ASoC: rt5645: Avoid upgrading static warnings to errors
One of the fixes reverted as part of the UMN fallout was actually fine,
however rather than undoing the revert the process that handled all this
stuff resulted in a patch which attempted to add extra error checks
instead.  Unfortunately this new change wasn't really based on a good
understanding of the subsystem APIs and bypassed the usual patch flow
without ensuring it was reviewed by people with subsystem knowledge and
was merged as a fix rather than during the merge window.

The effect of the new fix is to upgrade what were previously warnings on
static data in the code to hard errors on that data.  If this actually
happens then it would break existing systems, if it doesn't happen then
the change has no effect so this was not a safe change to apply as a fix
to the release candidates.  Since the new code has not been tested and
doesn't in practice improve error handling revert it instead, and also
drop the original revert since the original fix was fine.  This takes
the driver back to what it was in -rc1.

Fixes: 5e70b8e22b ("ASoC: rt5645: add error checking to rt5645_probe function")
Fixes: 1e0ce84215 ("Revert "ASoC: rt5645: fix a NULL pointer dereference")
Signed-off-by: Mark Brown <broonie@kernel.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Phillip Potter <phil@philpotter.co.uk>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Link: https://lore.kernel.org/r/20210608160713.21040-1-broonie@kernel.org
Signed-off-by: Mark Brown <broonie@kernel.org>
(cherry picked from commit 916cccb507)
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2021-06-24 12:22:27 +02:00
Zenghui Yu
309505dd56 KVM: selftests: Fix mapping length truncation in m{,un}map()
max_mem_slots is now declared as uint32_t. The result of (0x200000 * 32767)
is unexpectedly truncated to be 0xffe00000, whilst we actually need to
allocate about, 63GB. Cast max_mem_slots to size_t in both mmap() and
munmap() to fix the length truncation.

We'll otherwise see the failure on arm64 thanks to the access_ok() checking
in __kvm_set_memory_region(), as the unmapped VA happen to go beyond the
task's allowed address space.

 # ./set_memory_region_test
Allowed number of memory slots: 32767
Adding slots 0..32766, each memory region with 2048K size
==== Test Assertion Failure ====
  set_memory_region_test.c:391: ret == 0
  pid=94861 tid=94861 errno=22 - Invalid argument
     1	0x00000000004015a7: test_add_max_memory_regions at set_memory_region_test.c:389
     2	 (inlined by) main at set_memory_region_test.c:426
     3	0x0000ffffb8e67bdf: ?? ??:0
     4	0x00000000004016db: _start at :?
  KVM_SET_USER_MEMORY_REGION IOCTL failed,
  rc: -1 errno: 22 slot: 2615

Fixes: 3bf0fcd754 ("KVM: selftests: Speed up set_memory_region_test")
Signed-off-by: Zenghui Yu <yuzenghui@huawei.com>
Message-Id: <20210624070931.565-1-yuzenghui@huawei.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-06-24 04:04:38 -04:00
Dave Airlie
efea0c12a4 Merge tag 'amd-drm-fixes-5.13-2021-06-21' of https://gitlab.freedesktop.org/agd5f/linux into drm-fixes
amd-drm-fixes-5.13-2021-06-21:

amdgpu:
- Revert GFX9, 10 doorbell fixes, we just
  end up trading one bug for another
- Potential memory corruption fix in framebuffer handling

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210621214132.4004-1-alexander.deucher@amd.com
2021-06-24 17:06:12 +10:00
Thomas Gleixner
7f049fbdd5 perf/x86/intel/lbr: Zero the xstate buffer on allocation
XRSTORS requires a valid xstate buffer to work correctly. XSAVES does not
guarantee to write a fully valid buffer according to the SDM:

  "XSAVES does not write to any parts of the XSAVE header other than the
   XSTATE_BV and XCOMP_BV fields."

XRSTORS triggers a #GP:

  "If bytes 63:16 of the XSAVE header are not all zero."

It's dubious at best how this can work at all when the buffer is not zeroed
before use.

Allocate the buffers with __GFP_ZERO to prevent XRSTORS failure.

Fixes: ce711ea3ca ("perf/x86/intel/lbr: Support XSAVES/XRSTORS for LBR context switch")
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/87wnr0wo2z.ffs@nanos.tec.linutronix.de
2021-06-24 08:49:03 +02:00
Linus Torvalds
7426cedc7d Merge tag 'spi-fix-v5.13-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi
Pull spi fixes from Mark Brown:
 "A couple of small, driver specific fixes that arrived in the past few
  weeks"

* tag 'spi-fix-v5.13-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi:
  spi: spi-nxp-fspi: move the register operation after the clock enable
  spi: tegra20-slink: Ensure SPI controller reset is deasserted
2021-06-23 11:29:15 -07:00
Heikki Krogerus
5dca69e26f software node: Handle software node injection to an existing device properly
The function software_node_notify() - the function that creates
and removes the symlinks between the node and the device - was
called unconditionally in device_add_software_node() and
device_remove_software_node(), but it needs to be called in
those functions only in the special case where the node is
added to a device that has already been registered.

This fixes NULL pointer dereference that happens if
device_remove_software_node() is used with device that was
never registered.

Fixes: b622b24519 ("software node: Allow node addition to already existing device")
Reported-and-tested-by: Dominik Brodowski <linux@dominikbrodowski.net>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-06-23 19:34:58 +02:00
Linus Torvalds
7266f2030e Merge tag 'pm-5.13-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management fix from Rafael Wysocki:
 "Revert a recent PCI power management commit that causes initialization
  issues to appear on some systems"

* tag 'pm-5.13-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  Revert "PCI: PM: Do not read power state in pci_enable_device_flags()"
2021-06-23 09:40:55 -07:00
Linus Torvalds
8fd2ed1c01 Merge branch 'stable/for-linus-5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/swiotlb
Pull swiotlb fix from Konrad Rzeszutek Wilk:
 "A fix for the regression for the DMA operations where the offset was
  ignored and corruptions would appear.

  Going forward there will be a cleanups to make the offset and
  alignment logic more clearer and better test-cases to help with this"

* 'stable/for-linus-5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/swiotlb:
  swiotlb: manipulate orig_addr when tlb_addr has offset
2021-06-23 09:04:07 -07:00
Christoph Hellwig
d1b7f92035 scsi: sd: Call sd_revalidate_disk() for ioctl(BLKRRPART)
While the disk state has nothing to do with partitions, BLKRRPART is used
to force a full revalidate after things like a disk format for historical
reasons. Restore that behavior.

Link: https://lore.kernel.org/r/20210617115504.1732350-1-hch@lst.de
Fixes: 471bd0af54 ("sd: use bdev_check_media_change")
Reported-by: Xiang Chen <chenxiang66@hisilicon.com>
Tested-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-06-22 22:02:20 -04:00
Mimi Zohar
0c18f29aae module: limit enabling module.sig_enforce
Irrespective as to whether CONFIG_MODULE_SIG is configured, specifying
"module.sig_enforce=1" on the boot command line sets "sig_enforce".
Only allow "sig_enforce" to be set when CONFIG_MODULE_SIG is configured.

This patch makes the presence of /sys/module/module/parameters/sig_enforce
dependent on CONFIG_MODULE_SIG=y.

Fixes: fda784e50a ("module: export module signature enforcement status")
Reported-by: Nayna Jain <nayna@linux.ibm.com>
Tested-by: Mimi Zohar <zohar@linux.ibm.com>
Tested-by: Jessica Yu <jeyu@kernel.org>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
Signed-off-by: Jessica Yu <jeyu@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-06-22 11:13:19 -07:00
Zhen Lei
6fd8f323b3 drm/kmb: Fix error return code in kmb_hw_init()
When the call to platform_get_irq() to obtain the IRQ of the lcd fails, the
returned error code should be propagated. However, we currently do not
explicitly assign this error code to 'ret'. As a result, 0 was incorrectly
returned.

Fixes: 7f7b96a8a0 ("drm/kmb: Add support for KeemBay Display")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Signed-off-by: Anitha Chrisanthus <anitha.chrisanthus@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210513134639.6541-1-thunder.leizhen@huawei.com
2021-06-22 10:33:49 -07:00
Rafael J. Wysocki
4d6035f9bf Revert "PCI: PM: Do not read power state in pci_enable_device_flags()"
Revert commit 4514d991d9 ("PCI: PM: Do not read power state in
pci_enable_device_flags()") that is reported to cause PCI device
initialization issues on some systems.

BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=213481
Link: https://lore.kernel.org/linux-acpi/YNDoGICcg0V8HhpQ@eldamar.lan
Reported-by: Michael <phyre@rogers.com>
Reported-by: Salvatore Bonaccorso <carnil@debian.org>
Fixes: 4514d991d9 ("PCI: PM: Do not read power state in pci_enable_device_flags()")
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-06-22 17:35:18 +02:00
Thomas Gleixner
399f8dd9a8 signal: Prevent sigqueue caching after task got released
syzbot reported a memory leak related to sigqueue caching.

The assumption that a task cannot cache a sigqueue after the signal handler
has been dropped and exit_task_sigqueue_cache() has been invoked turns out
to be wrong.

Such a task can still invoke release_task(other_task), which cleans up the
signals of 'other_task' and ends up in sigqueue_cache_or_free(), which in
turn will cache the signal because task->sigqueue_cache is NULL. That's
obviously bogus because nothing will free the cached signal of that task
anymore, so the cached item is leaked.

This happens when e.g. the last non-leader thread exits and reaps the
zombie leader.

Prevent this by setting tsk::sigqueue_cache to an error pointer value in
exit_task_sigqueue_cache() which forces any subsequent invocation of
sigqueue_cache_or_free() from that task to hand the sigqueue back to the
kmemcache.

Add comments to all relevant places.

Fixes: 4bad58ebc8 ("signal: Allow tasks to cache one sigqueue struct")
Reported-by: syzbot+0bac5fec63d4f399ba98@syzkaller.appspotmail.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
Link: https://lore.kernel.org/r/878s32g6j5.ffs@nanos.tec.linutronix.de
2021-06-22 15:55:41 +02:00
Christian König
8ddf5b9bb4 drm/amdgpu: wait for moving fence after pinning
We actually need to wait for the moving fence after pinning
the BO to make sure that the pin is completed.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
References: https://lore.kernel.org/dri-devel/20210621151758.2347474-1-daniel.vetter@ffwll.ch/
CC: stable@kernel.org
Link: https://patchwork.freedesktop.org/patch/msgid/20210622114506.106349-3-christian.koenig@amd.com
2021-06-22 15:29:03 +02:00
Christian König
4b41726aae drm/radeon: wait for moving fence after pinning
We actually need to wait for the moving fence after pinning
the BO to make sure that the pin is completed.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
References: https://lore.kernel.org/dri-devel/20210621151758.2347474-1-daniel.vetter@ffwll.ch/
CC: stable@kernel.org
Link: https://patchwork.freedesktop.org/patch/msgid/20210622114506.106349-2-christian.koenig@amd.com
2021-06-22 15:29:03 +02:00
Christian König
17b11f7179 drm/nouveau: wait for moving fence after pinning v2
We actually need to wait for the moving fence after pinning
the BO to make sure that the pin is completed.

v2: grab the lock while waiting

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
References: https://lore.kernel.org/dri-devel/20210621151758.2347474-1-daniel.vetter@ffwll.ch/
CC: stable@kernel.org
Link: https://patchwork.freedesktop.org/patch/msgid/20210622114506.106349-1-christian.koenig@amd.com
2021-06-22 15:29:03 +02:00
Rik van Riel
fdaba61ef8 sched/fair: Ensure that the CFS parent is added after unthrottling
Ensure that a CFS parent will be in the list whenever one of its children is also
in the list.

A warning on rq->tmp_alone_branch != &rq->leaf_cfs_rq_list has been
reported while running LTP test cfs_bandwidth01.

Odin Ugedal found the root cause:

	$ tree /sys/fs/cgroup/ltp/ -d --charset=ascii
	/sys/fs/cgroup/ltp/
	|-- drain
	`-- test-6851
	    `-- level2
		|-- level3a
		|   |-- worker1
		|   `-- worker2
		`-- level3b
		    `-- worker3

Timeline (ish):
- worker3 gets throttled
- level3b is decayed, since it has no more load
- level2 get throttled
- worker3 get unthrottled
- level2 get unthrottled
  - worker3 is added to list
  - level3b is not added to list, since nr_running==0 and is decayed

 [ Vincent Guittot: Rebased and updated to fix for the reported warning. ]

Fixes: a7b359fc6a ("sched/fair: Correctly insert cfs_rq's to list on unthrottle")
Reported-by: Sachin Sant <sachinp@linux.vnet.ibm.com>
Suggested-by: Vincent Guittot <vincent.guittot@linaro.org>
Signed-off-by: Rik van Riel <riel@surriel.com>
Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Tested-by: Sachin Sant <sachinp@linux.vnet.ibm.com>
Acked-by: Odin Ugedal <odin@uged.al>
Link: https://lore.kernel.org/r/20210621174330.11258-1-vincent.guittot@linaro.org
2021-06-22 14:06:57 +02:00
Peter Zijlstra
49faa77759 locking/lockdep: Improve noinstr vs errors
Better handle the failure paths.

  vmlinux.o: warning: objtool: debug_locks_off()+0x23: call to console_verbose() leaves .noinstr.text section
  vmlinux.o: warning: objtool: debug_locks_off()+0x19: call to __kasan_check_write() leaves .noinstr.text section

  debug_locks_off+0x19/0x40:
  instrument_atomic_write at include/linux/instrumented.h:86
  (inlined by) __debug_locks_off at include/linux/debug_locks.h:17
  (inlined by) debug_locks_off at lib/debug_locks.c:41

Fixes: 6eebad1ad3 ("lockdep: __always_inline more for noinstr")
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20210621120120.784404944@infradead.org
2021-06-22 13:56:43 +02:00
Peter Zijlstra
1f008d46f1 x86: Always inline task_size_max()
Fix:

  vmlinux.o: warning: objtool: handle_bug()+0x10: call to task_size_max() leaves .noinstr.text section

When #UD isn't a BUG, we shouldn't violate noinstr (we'll still
probably die, but that's another story).

Fixes: 025768a966 ("x86/cpu: Use alternative to generate the TASK_SIZE_MAX constant")
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20210621120120.682468274@infradead.org
2021-06-22 13:56:43 +02:00
Peter Zijlstra
4c9c26f1e6 x86/xen: Fix noinstr fail in exc_xen_unknown_trap()
Fix:

  vmlinux.o: warning: objtool: exc_xen_unknown_trap()+0x7: call to printk() leaves .noinstr.text section

Fixes: 2e92493637 ("x86/xen: avoid warning in Xen pv guest with CONFIG_AMD_MEM_ENCRYPT enabled")
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20210621120120.606560778@infradead.org
2021-06-22 13:56:42 +02:00
Peter Zijlstra
84e60065df x86/xen: Fix noinstr fail in xen_pv_evtchn_do_upcall()
Fix:

  vmlinux.o: warning: objtool: xen_pv_evtchn_do_upcall()+0x23: call to irq_enter_rcu() leaves .noinstr.text section

Fixes: 359f01d181 ("x86/entry: Use run_sysvec_on_irqstack_cond() for XEN upcall")
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20210621120120.532960208@infradead.org
2021-06-22 13:56:42 +02:00
Peter Zijlstra
240001d4e3 x86/entry: Fix noinstr fail in __do_fast_syscall_32()
Fix:

  vmlinux.o: warning: objtool: __do_fast_syscall_32()+0xf5: call to trace_hardirqs_off() leaves .noinstr.text section

Fixes: 5d5675df79 ("x86/entry: Fix entry/exit mismatch on failed fast 32-bit syscalls")
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20210621120120.467898710@infradead.org
2021-06-22 13:56:42 +02:00
Jeff Layton
7a971e2c07 ceph: fix error handling in ceph_atomic_open and ceph_lookup
Commit aa60cfc3f7 broke the error handling in these functions such
that they don't handle non-ENOENT errors from ceph_mdsc_do_request
properly.

Move the checking of -ENOENT out of ceph_handle_snapdir and into the
callers, and if we get a different error, return it immediately.

Fixes: aa60cfc3f7 ("ceph: don't use d_add in ceph_handle_snapdir")
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2021-06-22 13:53:28 +02:00
Jeff Layton
27171ae6a0 ceph: must hold snap_rwsem when filling inode for async create
...and add a lockdep assertion for it to ceph_fill_inode().

Cc: stable@vger.kernel.org # v5.7+
Fixes: 9a8d03ca2e ("ceph: attempt to do async create when possible")
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2021-06-22 13:53:28 +02:00
Thomas Gleixner
f9dfb5e390 x86/fpu: Make init_fpstate correct with optimized XSAVE
The XSAVE init code initializes all enabled and supported components with
XRSTOR(S) to init state. Then it XSAVEs the state of the components back
into init_fpstate which is used in several places to fill in the init state
of components.

This works correctly with XSAVE, but not with XSAVEOPT and XSAVES because
those use the init optimization and skip writing state of components which
are in init state. So init_fpstate.xsave still contains all zeroes after
this operation.

There are two ways to solve that:

   1) Use XSAVE unconditionally, but that requires to reshuffle the buffer when
      XSAVES is enabled because XSAVES uses compacted format.

   2) Save the components which are known to have a non-zero init state by other
      means.

Looking deeper, #2 is the right thing to do because all components the
kernel supports have all-zeroes init state except the legacy features (FP,
SSE). Those cannot be hard coded because the states are not identical on all
CPUs, but they can be saved with FXSAVE which avoids all conditionals.

Use FXSAVE to save the legacy FP/SSE components in init_fpstate along with
a BUILD_BUG_ON() which reminds developers to validate that a newly added
component has all zeroes init state. As a bonus remove the now unused
copy_xregs_to_kernel_booting() crutch.

The XSAVE and reshuffle method can still be implemented in the unlikely
case that components are added which have a non-zero init state and no
other means to save them. For now, FXSAVE is just simple and good enough.

  [ bp: Fix a typo or two in the text. ]

Fixes: 6bad06b768 ("x86, xsave: Use xsaveopt in context-switch path when supported")
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20210618143444.587311343@linutronix.de
2021-06-22 11:06:21 +02:00
Thomas Gleixner
9301982c42 x86/fpu: Preserve supervisor states in sanitize_restored_user_xstate()
sanitize_restored_user_xstate() preserves the supervisor states only
when the fx_only argument is zero, which allows unprivileged user space
to put supervisor states back into init state.

Preserve them unconditionally.

 [ bp: Fix a typo or two in the text. ]

Fixes: 5d6b6a6f9b ("x86/fpu/xstate: Update sanitize_restored_xstate() for supervisor xstates")
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20210618143444.438635017@linutronix.de
2021-06-22 10:51:23 +02:00
Daniel Vetter
f54b3ca7ea Revert "drm: add a locked version of drm_is_current_master"
This reverts commit 1815d9c86e.

Unfortunately this inverts the locking hierarchy, so back to the
drawing board. Full lockdep splat below:

======================================================
WARNING: possible circular locking dependency detected
5.13.0-rc7-CI-CI_DRM_10254+ #1 Not tainted
------------------------------------------------------
kms_frontbuffer/1087 is trying to acquire lock:
ffff88810dcd01a8 (&dev->master_mutex){+.+.}-{3:3}, at: drm_is_current_master+0x1b/0x40
but task is already holding lock:
ffff88810dcd0488 (&dev->mode_config.mutex){+.+.}-{3:3}, at: drm_mode_getconnector+0x1c6/0x4a0
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #2 (&dev->mode_config.mutex){+.+.}-{3:3}:
       __mutex_lock+0xab/0x970
       drm_client_modeset_probe+0x22e/0xca0
       __drm_fb_helper_initial_config_and_unlock+0x42/0x540
       intel_fbdev_initial_config+0xf/0x20 [i915]
       async_run_entry_fn+0x28/0x130
       process_one_work+0x26d/0x5c0
       worker_thread+0x37/0x380
       kthread+0x144/0x170
       ret_from_fork+0x1f/0x30
-> #1 (&client->modeset_mutex){+.+.}-{3:3}:
       __mutex_lock+0xab/0x970
       drm_client_modeset_commit_locked+0x1c/0x180
       drm_client_modeset_commit+0x1c/0x40
       __drm_fb_helper_restore_fbdev_mode_unlocked+0x88/0xb0
       drm_fb_helper_set_par+0x34/0x40
       intel_fbdev_set_par+0x11/0x40 [i915]
       fbcon_init+0x270/0x4f0
       visual_init+0xc6/0x130
       do_bind_con_driver+0x1e5/0x2d0
       do_take_over_console+0x10e/0x180
       do_fbcon_takeover+0x53/0xb0
       register_framebuffer+0x22d/0x310
       __drm_fb_helper_initial_config_and_unlock+0x36c/0x540
       intel_fbdev_initial_config+0xf/0x20 [i915]
       async_run_entry_fn+0x28/0x130
       process_one_work+0x26d/0x5c0
       worker_thread+0x37/0x380
       kthread+0x144/0x170
       ret_from_fork+0x1f/0x30
-> #0 (&dev->master_mutex){+.+.}-{3:3}:
       __lock_acquire+0x151e/0x2590
       lock_acquire+0xd1/0x3d0
       __mutex_lock+0xab/0x970
       drm_is_current_master+0x1b/0x40
       drm_mode_getconnector+0x37e/0x4a0
       drm_ioctl_kernel+0xa8/0xf0
       drm_ioctl+0x1e8/0x390
       __x64_sys_ioctl+0x6a/0xa0
       do_syscall_64+0x39/0xb0
       entry_SYSCALL_64_after_hwframe+0x44/0xae
other info that might help us debug this:
Chain exists of: &dev->master_mutex --> &client->modeset_mutex --> &dev->mode_config.mutex
 Possible unsafe locking scenario:
       CPU0                    CPU1
       ----                    ----
  lock(&dev->mode_config.mutex);
                               lock(&client->modeset_mutex);
                               lock(&dev->mode_config.mutex);
  lock(&dev->master_mutex);
*** DEADLOCK ***
1 lock held by kms_frontbuffer/1087:
 #0: ffff88810dcd0488 (&dev->mode_config.mutex){+.+.}-{3:3}, at: drm_mode_getconnector+0x1c6/0x4a0
stack backtrace:
CPU: 7 PID: 1087 Comm: kms_frontbuffer Not tainted 5.13.0-rc7-CI-CI_DRM_10254+ #1
Hardware name: Intel Corporation Ice Lake Client Platform/IceLake U DDR4 SODIMM PD RVP TLC, BIOS ICLSFWR1.R00.3234.A01.1906141750 06/14/2019
Call Trace:
 dump_stack+0x7f/0xad
 check_noncircular+0x12e/0x150
 __lock_acquire+0x151e/0x2590
 lock_acquire+0xd1/0x3d0
 __mutex_lock+0xab/0x970
 drm_is_current_master+0x1b/0x40
 drm_mode_getconnector+0x37e/0x4a0
 drm_ioctl_kernel+0xa8/0xf0
 drm_ioctl+0x1e8/0x390
 __x64_sys_ioctl+0x6a/0xa0
 do_syscall_64+0x39/0xb0
 entry_SYSCALL_64_after_hwframe+0x44/0xae

Note that this broke the intel-gfx CI pretty much across the board
because it has to reboot machines after it hits a lockdep splat.

Testcase: igt/debugfs_test/read_all_entries
Acked-by: Petri Latvala <petri.latvala@intel.com>
Fixes: 1815d9c86e ("drm: add a locked version of drm_is_current_master")
Cc: Desmond Cheong Zhi Xi <desmondcheongzx@gmail.com>
Cc: Emil Velikov <emil.l.velikov@gmail.com>
Cc: stable@vger.kernel.org
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Maxime Ripard <mripard@kernel.org>
Cc: Thomas Zimmermann <tzimmermann@suse.de>
Cc: David Airlie <airlied@linux.ie>
Cc: Daniel Vetter <daniel@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20210622075409.2673805-1-daniel.vetter@ffwll.ch
2021-06-22 10:41:55 +02:00
Gabriel Knezek
cb8f63b8cb gpiolib: cdev: zero padding during conversion to gpioline_info_changed
When userspace requests a GPIO v1 line info changed event,
lineinfo_watch_read() populates and returns the gpioline_info_changed
structure. It contains 5 words of padding at the end which are not
initialized before being returned to userspace.

Zero the structure in gpio_v2_line_info_change_to_v1() before populating
its contents.

Fixes: aad955842d ("gpiolib: cdev: support GPIO_V2_GET_LINEINFO_IOCTL and GPIO_V2_GET_LINEINFO_WATCH_IOCTL")
Signed-off-by: Gabriel Knezek <gabeknez@linux.microsoft.com>
Reviewed-by: Kent Gibson <warthog618@gmail.com>
Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>
2021-06-22 09:54:13 +02:00
Yifan Zhang
ee5468b9f1 Revert "drm/amdgpu/gfx9: fix the doorbell missing when in CGPG issue."
This reverts commit 4cbbe34807.

Reason for revert: side effect of enlarging CP_MEC_DOORBELL_RANGE may
cause some APUs fail to enter gfxoff in certain user cases.

Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2021-06-21 17:22:52 -04:00
Yifan Zhang
baacf52a47 Revert "drm/amdgpu/gfx10: enlarge CP_MEC_DOORBELL_RANGE_UPPER to cover full doorbell."
This reverts commit 1c0b0efd14.

Reason for revert: Side effect of enlarging CP_MEC_DOORBELL_RANGE may
cause some APUs fail to enter gfxoff in certain user cases.

Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2021-06-21 17:22:06 -04:00
Michel Dänzer
4c6a23188e drm/amdgpu: Call drm_framebuffer_init last for framebuffer init
Once drm_framebuffer_init has returned 0, the framebuffer is hooked up
to the reference counting machinery and can no longer be destroyed with
a simple kfree. Therefore, it must be called last.

If drm_framebuffer_init returns 0 but its caller then returns non-0,
there will likely be memory corruption fireworks down the road.
The following lead me to this fix:

[   12.891228] kernel BUG at lib/list_debug.c:25!
[...]
[   12.891263] RIP: 0010:__list_add_valid+0x4b/0x70
[...]
[   12.891324] Call Trace:
[   12.891330]  drm_framebuffer_init+0xb5/0x100 [drm]
[   12.891378]  amdgpu_display_gem_fb_verify_and_init+0x47/0x120 [amdgpu]
[   12.891592]  ? amdgpu_display_user_framebuffer_create+0x10d/0x1f0 [amdgpu]
[   12.891794]  amdgpu_display_user_framebuffer_create+0x126/0x1f0 [amdgpu]
[   12.891995]  drm_internal_framebuffer_create+0x378/0x3f0 [drm]
[   12.892036]  ? drm_internal_framebuffer_create+0x3f0/0x3f0 [drm]
[   12.892075]  drm_mode_addfb2+0x34/0xd0 [drm]
[   12.892115]  ? drm_internal_framebuffer_create+0x3f0/0x3f0 [drm]
[   12.892153]  drm_ioctl_kernel+0xe2/0x150 [drm]
[   12.892193]  drm_ioctl+0x3da/0x460 [drm]
[   12.892232]  ? drm_internal_framebuffer_create+0x3f0/0x3f0 [drm]
[   12.892274]  amdgpu_drm_ioctl+0x43/0x80 [amdgpu]
[   12.892475]  __se_sys_ioctl+0x72/0xc0
[   12.892483]  do_syscall_64+0x33/0x40
[   12.892491]  entry_SYSCALL_64_after_hwframe+0x44/0xae

Fixes: f258907fdd "drm/amdgpu: Verify bo size can fit framebuffer size on init."
Signed-off-by: Michel Dänzer <mdaenzer@redhat.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-21 17:21:49 -04:00
Jeff Layton
827a746f40 netfs: fix test for whether we can skip read when writing beyond EOF
It's not sufficient to skip reading when the pos is beyond the EOF.
There may be data at the head of the page that we need to fill in
before the write.

Add a new helper function that corrects and clarifies the logic of
when we can skip reads, and have it only zero out the part of the page
that won't have data copied in for the write.

Finally, don't set the page Uptodate after zeroing. It's not up to date
since the write data won't have been copied in yet.

[DH made the following changes:

 - Prefixed the new function with "netfs_".

 - Don't call zero_user_segments() for a full-page write.

 - Altered the beyond-last-page check to avoid a DIV instruction and got
   rid of then-redundant zero-length file check.
]

Fixes: e1b1240c1f ("netfs: Add write_begin helper")
Reported-by: Andrew W Elble <aweits@rit.edu>
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>
cc: ceph-devel@vger.kernel.org
Link: https://lore.kernel.org/r/20210613233345.113565-1-jlayton@kernel.org/
Link: https://lore.kernel.org/r/162367683365.460125.4467036947364047314.stgit@warthog.procyon.org.uk/ # v1
Link: https://lore.kernel.org/r/162391826758.1173366.11794946719301590013.stgit@warthog.procyon.org.uk/ # v2
2021-06-21 21:24:07 +01:00
David Howells
66e9c6a86b afs: Fix afs_write_end() to handle short writes
Fix afs_write_end() to correctly handle a short copy into the intended
write region of the page.  Two things are necessary:

 (1) If the page is not up to date, then we should just return 0
     (ie. indicating a zero-length copy).  The loop in
     generic_perform_write() will go around again, possibly breaking up the
     iterator into discrete chunks[1].

     This is analogous to commit b9de313cf0
     for ceph.

 (2) The page should not have been set uptodate if it wasn't completely set
     up by netfs_write_begin() (this will be fixed in the next patch), so
     we need to set uptodate here in such a case.

Also remove the assertion that was checking that the page was set uptodate
since it's now set uptodate if it wasn't already a few lines above.  The
assertion was from when uptodate was set elsewhere.

Changes:
v3: Remove the handling of len exceeding the end of the page.

Fixes: 3003bbd069 ("afs: Use the netfs_write_begin() helper")
Reported-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>
cc: Al Viro <viro@zeniv.linux.org.uk>
cc: linux-afs@lists.infradead.org
Link: https://lore.kernel.org/r/YMwVp268KTzTf8cN@zeniv-ca.linux.org.uk/ [1]
Link: https://lore.kernel.org/r/162367682522.460125.5652091227576721609.stgit@warthog.procyon.org.uk/ # v1
Link: https://lore.kernel.org/r/162391825688.1173366.3437507255136307904.stgit@warthog.procyon.org.uk/ # v2
2021-06-21 21:23:36 +01:00
Loic Poulain
3093e6cca3 gpio: mxc: Fix disabled interrupt wake-up support
A disabled/masked interrupt marked as wakeup source must be re-enable
and unmasked in order to be able to wake-up the host. That can be done
by flaging the irqchip with IRQCHIP_ENABLE_WAKEUP_ON_SUSPEND.

Note: It 'sometimes' works without that change, but only thanks to the
lazy generic interrupt disabling (keeping interrupt unmasked).

Reported-by: Michal Koziel <michal.koziel@emlogic.no>
Signed-off-by: Loic Poulain <loic.poulain@linaro.org>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>
2021-06-21 20:43:52 +02:00
Linus Torvalds
a96bfed64c Merge tag 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm
Pull ARM fix from Russell King:

 - fix gcc 10 compiler regression with cpu_init()

* tag 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm:
  ARM: 9081/1: fix gcc-10 thumb2-kernel regression
2021-06-21 09:49:48 -07:00
Desmond Cheong Zhi Xi
1815d9c86e drm: add a locked version of drm_is_current_master
While checking the master status of the DRM file in
drm_is_current_master(), the device's master mutex should be
held. Without the mutex, the pointer fpriv->master may be freed
concurrently by another process calling drm_setmaster_ioctl(). This
could lead to use-after-free errors when the pointer is subsequently
dereferenced in drm_lease_owner().

The callers of drm_is_current_master() from drm_auth.c hold the
device's master mutex, but external callers do not. Hence, we implement
drm_is_current_master_locked() to be used within drm_auth.c, and
modify drm_is_current_master() to grab the device's master mutex
before checking the master status.

Reported-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Desmond Cheong Zhi Xi <desmondcheongzx@gmail.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Cc: stable@vger.kernel.org
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20210620110327.4964-2-desmondcheongzx@gmail.com
2021-06-21 17:42:28 +02:00
Peter Zijlstra
31197d3a0f objtool/x86: Ignore __x86_indirect_alt_* symbols
Because the __x86_indirect_alt* symbols are just that, objtool will
try and validate them as regular symbols, instead of the alternative
replacements that they are.

This goes sideways for FRAME_POINTER=y builds; which generate a fair
amount of warnings.

Fixes: 9bc0bb5072 ("objtool/x86: Rewrite retpoline thunk calls")
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/YNCgxwLBiK9wclYJ@hirez.programming.kicks-ass.net
2021-06-21 17:26:57 +02:00
Bumyong Lee
5f89468e2f swiotlb: manipulate orig_addr when tlb_addr has offset
in case of driver wants to sync part of ranges with offset,
swiotlb_tbl_sync_single() copies from orig_addr base to tlb_addr with
offset and ends up with data mismatch.

It was removed from
"swiotlb: don't modify orig_addr in swiotlb_tbl_sync_single",
but said logic has to be added back in.

From Linus's email:
"That commit which the removed the offset calculation entirely, because the old

        (unsigned long)tlb_addr & (IO_TLB_SIZE - 1)

was wrong, but instead of removing it, I think it should have just
fixed it to be

        (tlb_addr - mem->start) & (IO_TLB_SIZE - 1);

instead. That way the slot offset always matches the slot index calculation."

(Unfortunatly that broke NVMe).

The use-case that drivers are hitting is as follow:

1. Get dma_addr_t from dma_map_single()

dma_addr_t tlb_addr = dma_map_single(dev, vaddr, vsize, DMA_TO_DEVICE);

    |<---------------vsize------------->|
    +-----------------------------------+
    |                                   | original buffer
    +-----------------------------------+
  vaddr

 swiotlb_align_offset
     |<----->|<---------------vsize------------->|
     +-------+-----------------------------------+
     |       |                                   | swiotlb buffer
     +-------+-----------------------------------+
          tlb_addr

2. Do something
3. Sync dma_addr_t through dma_sync_single_for_device(..)

dma_sync_single_for_device(dev, tlb_addr + offset, size, DMA_TO_DEVICE);

  Error case.
    Copy data to original buffer but it is from base addr (instead of
  base addr + offset) in original buffer:

 swiotlb_align_offset
     |<----->|<- offset ->|<- size ->|
     +-------+-----------------------------------+
     |       |            |##########|           | swiotlb buffer
     +-------+-----------------------------------+
          tlb_addr

    |<- size ->|
    +-----------------------------------+
    |##########|                        | original buffer
    +-----------------------------------+
  vaddr

The fix is to copy the data to the original buffer and take into
account the offset, like so:

 swiotlb_align_offset
     |<----->|<- offset ->|<- size ->|
     +-------+-----------------------------------+
     |       |            |##########|           | swiotlb buffer
     +-------+-----------------------------------+
          tlb_addr

    |<- offset ->|<- size ->|
    +-----------------------------------+
    |            |##########|           | original buffer
    +-----------------------------------+
  vaddr

[One fix which was Linus's that made more sense to as it created a
symmetry would break NVMe. The reason for that is the:
 unsigned int offset = (tlb_addr - mem->start) & (IO_TLB_SIZE - 1);

would come up with the proper offset, but it would lose the
alignment (which this patch contains).]

Fixes: 16fc3cef33 ("swiotlb: don't modify orig_addr in swiotlb_tbl_sync_single")
Signed-off-by: Bumyong Lee <bumyong.lee@samsung.com>
Signed-off-by: Chanho Park <chanho61.park@samsung.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reported-by: Dominique MARTINET <dominique.martinet@atmark-techno.com>
Reported-by: Horia Geantă <horia.geanta@nxp.com>
Tested-by: Horia Geantă <horia.geanta@nxp.com>
CC: stable@vger.kernel.org
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2021-06-21 08:59:02 -04:00
Heiko Carstens
67147e96a3 s390/stack: fix possible register corruption with stack switch helper
The CALL_ON_STACK macro is used to call a C function from inline
assembly, and therefore must consider the C ABI, which says that only
registers 6-13, and 15 are non-volatile (restored by the called
function).

The inline assembly incorrectly marks all registers used to pass
parameters to the called function as read-only input operands, instead
of operands that are read and written to. This might result in
register corruption depending on usage, compiler, and compile options.

Fix this by marking all operands used to pass parameters as read/write
operands. To keep the code simple even register 6, if used, is marked
as read-write operand.

Fixes: ff340d2472 ("s390: add stack switch helper")
Cc: <stable@kernel.org> # 4.20
Reviewed-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2021-06-21 11:19:18 +02:00
Sven Schnelle
9e3d62d55b s390/topology: clear thread/group maps for offline cpus
The current code doesn't clear the thread/group maps for offline
CPUs. This may cause kernel crashes like the one bewlow in common
code that assumes if a CPU has sibblings it is online.

Unable to handle kernel pointer dereference in virtual kernel address space

Call Trace:
 [<000000013a4b8c3c>] blk_mq_map_swqueue+0x10c/0x388
([<000000013a4b8bcc>] blk_mq_map_swqueue+0x9c/0x388)
 [<000000013a4b9300>] blk_mq_init_allocated_queue+0x448/0x478
 [<000000013a4b9416>] blk_mq_init_queue+0x4e/0x90
 [<000003ff8019d3e6>] loop_add+0x106/0x278 [loop]
 [<000003ff801b8148>] loop_init+0x148/0x1000 [loop]
 [<0000000139de4924>] do_one_initcall+0x3c/0x1e0
 [<0000000139ef449a>] do_init_module+0x6a/0x2a0
 [<0000000139ef61bc>] __do_sys_finit_module+0xa4/0xc0
 [<0000000139de9e6e>] do_syscall+0x7e/0xd0
 [<000000013a8e0aec>] __do_syscall+0xbc/0x110
 [<000000013a8ee2e8>] system_call+0x78/0xa0

Fixes: 52aeda7acc ("s390/topology: remove offline CPUs from CPU topology masks")
Cc: <stable@kernel.org> # 5.7+
Reported-by: Marius Hillenbrand <mhillen@linux.ibm.com>
Signed-off-by: Sven Schnelle <svens@linux.ibm.com>
Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2021-06-21 11:19:18 +02:00
Tony Krowiak
8c0795d2a0 s390/vfio-ap: clean up mdev resources when remove callback invoked
The mdev remove callback for the vfio_ap device driver bails out with
-EBUSY if the mdev is in use by a KVM guest (i.e., the KVM pointer in the
struct ap_matrix_mdev is not NULL). The intended purpose was
to prevent the mdev from being removed while in use. There are two
problems with this scenario:

1. Returning a non-zero return code from the remove callback does not
   prevent the removal of the mdev.

2. The KVM pointer in the struct ap_matrix_mdev will always be NULL because
   the remove callback will not get invoked until the mdev fd is closed.
   When the mdev fd is closed, the mdev release callback is invoked and
   clears the KVM pointer from the struct ap_matrix_mdev.

Let's go ahead and remove the check for KVM in the remove callback and
allow the cleanup of mdev resources to proceed.

Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20210609224634.575156-2-akrowiak@linux.ibm.com
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2021-06-21 11:19:18 +02:00
Sven Schnelle
ca1f4d702d s390: clear pt_regs::flags on irq entry
The current irq entry code doesn't initialize pt_regs::flags. On exit to
user mode arch_do_signal_or_restart() tests whether PIF_SYSCALL is set,
which might yield wrong results.

Fix this by clearing pt_regs::flags in the entry.S irq handler
code.

Reported-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Sven Schnelle <svens@linux.ibm.com>
Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
Fixes: 56e62a7370 ("s390: convert to generic entry")
Cc: <stable@vger.kernel.org> # 5.12
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2021-06-21 11:19:18 +02:00
Sven Schnelle
fc66127dc3 s390: fix system call restart with multiple signals
glibc complained with "The futex facility returned an unexpected error
code.". It turned out that the futex syscall returned -ERESTARTSYS because
a signal is pending. arch_do_signal_or_restart() restored the syscall
parameters (nameley regs->gprs[2]) and set PIF_SYSCALL_RESTART. When
another signal is made pending later in the exit loop
arch_do_signal_or_restart() is called again. This function clears
PIF_SYSCALL_RESTART and checks the return code which is set in
regs->gprs[2]. However, regs->gprs[2] was restored in the previous run
and no longer contains -ERESTARTSYS, so PIF_SYSCALL_RESTART isn't set
again and the syscall is skipped.

Fix this by not clearing PIF_SYSCALL_RESTART - it is already cleared in
__do_syscall() when the syscall is restarted.

Reported-by: Bjoern Walk <bwalk@linux.ibm.com>
Signed-off-by: Sven Schnelle <svens@linux.ibm.com>
Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
Fixes: 56e62a7370 ("s390: convert to generic entry")
Cc: <stable@vger.kernel.org> # 5.12
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2021-06-21 11:19:18 +02:00
Linus Torvalds
13311e7425 Linux 5.13-rc7 2021-06-20 15:03:15 -07:00
Dan Carpenter
2269583753 i2c: cp2615: check for allocation failure in cp2615_i2c_recv()
We need to add a check for if the kzalloc() fails.

Fixes: 4a7695429e ("i2c: cp2615: add i2c driver for Silicon Labs' CP2615 Digital Audio Bridge")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Bence Csókás <bence98@sch.bme.hu>
Signed-off-by: Wolfram Sang <wsa@kernel.org>
2021-06-20 23:13:34 +02:00
Heiner Kallweit
065b6211a8 i2c: i801: Ensure that SMBHSTSTS_INUSE_STS is cleared when leaving i801_access
As explained in [0] currently we may leave SMBHSTSTS_INUSE_STS set,
thus potentially breaking ACPI/BIOS usage of the SMBUS device.

Seems patch [0] needs a little bit more of review effort, therefore
I'd suggest to apply a part of it as quick win. Just clearing
SMBHSTSTS_INUSE_STS when leaving i801_access() should fix the
referenced issue and leaves more time for discussing a more
sophisticated locking handling.

[0] https://www.spinics.net/lists/linux-i2c/msg51558.html

Fixes: 01590f361e ("i2c: i801: Instantiate SPD EEPROMs automatically")
Suggested-by: Hector Martin <marcan@marcan.st>
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Reviewed-by: Hector Martin <marcan@marcan.st>
Reviewed-by: Jean Delvare <jdelvare@suse.de>
Tested-by: Jean Delvare <jdelvare@suse.de>
Signed-off-by: Wolfram Sang <wsa@kernel.org>
2021-06-20 22:58:58 +02:00
Linus Torvalds
cba5e97280 Merge tag 'sched_urgent_for_v5.13_rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler fix from Borislav Petkov:
 "A single fix to restore fairness between control groups with equal
  priority"

* tag 'sched_urgent_for_v5.13_rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  sched/fair: Correctly insert cfs_rq's to list on unthrottle
2021-06-20 09:44:52 -07:00
Linus Torvalds
9df7f15ee9 Merge tag 'irq_urgent_for_v5.13_rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull irq fix from Borislav Petkov:
 "A single fix for GICv3 to not take an interrupt in an NMI context"

* tag 'irq_urgent_for_v5.13_rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  irqchip/gic-v3: Workaround inconsistent PMR setting on NMI entry
2021-06-20 09:38:14 -07:00
Linus Torvalds
8363e795eb Merge tag 'x86_urgent_for_v5.13_rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Borislav Petkov:
 "A first set of urgent fixes to the FPU/XSTATE handling mess^W code.
  (There's a lot more in the pipe):

   - Prevent corruption of the XSTATE buffer in signal handling by
     validating what is being copied from userspace first.

   - Invalidate other task's preserved FPU registers on XRSTOR failure
     (#PF) because latter can still modify some of them.

   - Restore the proper PKRU value in case userspace modified it

   - Reset FPU state when signal restoring fails

  Other:

   - Map EFI boot services data memory as encrypted in a SEV guest so
     that the guest can access it and actually boot properly

   - Two SGX correctness fixes: proper resources freeing and a NUMA fix"

* tag 'x86_urgent_for_v5.13_rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/mm: Avoid truncating memblocks for SGX memory
  x86/sgx: Add missing xa_destroy() when virtual EPC is destroyed
  x86/fpu: Reset state for all signal restore failures
  x86/pkru: Write hardware init value to PKRU when xstate is init
  x86/process: Check PF_KTHREAD and not current->mm for kernel threads
  x86/fpu: Invalidate FPU state after a failed XRSTOR from a user buffer
  x86/fpu: Prevent state corruption in __fpu__restore_sig()
  x86/ioremap: Map EFI-reserved memory as encrypted for SEV
2021-06-20 09:09:58 -07:00
Linus Torvalds
b84a7c286c Merge tag 'powerpc-5.13-6' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc fixes from Michael Ellerman:
 "Fix initrd corruption caused by our recent change to use relative jump
  labels.

  Fix a crash using perf record on systems without a hardware PMU
  backend.

  Rework our 64-bit signal handling slighty to make it more closely
  match the old behaviour, after the recent change to use unsafe user
  accessors.

  Thanks to Anastasia Kovaleva, Athira Rajeev, Christophe Leroy, Daniel
  Axtens, Greg Kurz, and Roman Bolshakov"

* tag 'powerpc-5.13-6' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
  powerpc/perf: Fix crash in perf_instruction_pointer() when ppmu is not set
  powerpc: Fix initrd corruption with relative jump labels
  powerpc/signal64: Copy siginfo before changing regs->nip
  powerpc/mem: Add back missing header to fix 'no previous prototype' error
2021-06-19 16:50:23 -07:00
Linus Torvalds
913ec3c22e Merge tag 'perf-tools-fixes-for-v5.13-2021-06-19' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux
Pull perf tools fixes from Arnaldo Carvalho de Melo:

 - Fix refcount usage when processing PERF_RECORD_KSYMBOL.

 - 'perf stat' metric group fixes.

 - Fix 'perf test' non-bash issue with stat bpf counters.

 - Update unistd, in.h and socket.h with the kernel sources, silencing
   perf build warnings.

* tag 'perf-tools-fixes-for-v5.13-2021-06-19' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux:
  tools headers UAPI: Sync linux/in.h copy with the kernel sources
  tools headers UAPI: Sync asm-generic/unistd.h with the kernel original
  perf beauty: Update copy of linux/socket.h with the kernel sources
  perf test: Fix non-bash issue with stat bpf counters
  perf machine: Fix refcount usage when processing PERF_RECORD_KSYMBOL
  perf metricgroup: Return error code from metricgroup__add_metric_sys_event_iter()
  perf metricgroup: Fix find_evsel_group() event selector
2021-06-19 14:50:43 -07:00
Dan Sneddon
e541845ae0 drm/atmel-hlcdc: Allow async page flips
The driver is capable of doing async page flips so we need to tell the
core to allow them.

Signed-off-by: Dan Sneddon <dan.sneddon@microchip.com>
Tested-by: Ludovic Desroches <ludovic.desroches@microchip.com>
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20210330151721.6616-1-dan.sneddon@microchip.com
2021-06-19 23:06:56 +02:00
Krzysztof Kozlowski
af42167f53 drm/panel: ld9040: reference spi_device_id table
Reference the spi_device_id table to silence W=1 warning:

  drivers/gpu/drm/panel/panel-samsung-ld9040.c:377:35:
    warning: ‘ld9040_ids’ defined but not used [-Wunused-const-variable=]

This also would be needed for matching the driver if booted without
CONFIG_OF (although it's not necessarily real case).

Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com>
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20210526123002.12913-1-krzysztof.kozlowski@canonical.com
2021-06-19 22:30:23 +02:00
Dan Sneddon
e484028bf3 drm: atmel_hlcdc: Enable the crtc vblank prior to crtc usage.
'commit eec44d44a3 ("drm/atmel: Use drm_atomic_helper_commit")'
removed the home-grown handling of atomic commits and exposed an issue
in the crtc atomic commit handling where vblank is expected to be
enabled but hasn't yet, causing kernel warnings during boot.  This patch
cleans up the crtc vblank handling thus removing the warning on boot.

Fixes: eec44d44a3 ("drm/atmel: Use drm_atomic_helper_commit")

Signed-off-by: Dan Sneddon <dan.sneddon@microchip.com>
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20210602160846.5013-1-dan.sneddon@microchip.com
2021-06-19 22:23:03 +02:00
Linus Torvalds
d9403d307d Merge tag 'riscv-for-linus-5.13-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux
Pull RISC-V fixes from Palmer Dabbelt:

 - A build fix to always build modules with the 'medany' code model, as
   the module loader doesn't support 'medlow'.

 - A Kconfig warning fix for the SiFive errata.

 - A pair of fixes that for regressions to the recent memory layout
   changes.

 - A fix for the FU740 device tree.

* tag 'riscv-for-linus-5.13-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
  riscv: dts: fu740: fix cache-controller interrupts
  riscv: Ensure BPF_JIT_REGION_START aligned with PMD size
  riscv: kasan: Fix MODULES_VADDR evaluation due to local variables' name
  riscv: sifive: fix Kconfig errata warning
  riscv32: Use medany C model for modules
2021-06-19 08:45:34 -07:00
Linus Torvalds
e14c779ade Merge tag 's390-5.13-4' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Pull s390 fixes from Vasily Gorbik:

 - Fix zcrypt ioctl hang due to AP queue msg counter dropping below 0
   when pending requests are purged.

 - Two fixes for the machine check handler in the entry code.

* tag 's390-5.13-4' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
  s390/ap: Fix hanging ioctl caused by wrong msg counter
  s390/mcck: fix invalid KVM guest condition check
  s390/mcck: fix calculation of SIE critical section size
2021-06-19 08:39:13 -07:00
Arnaldo Carvalho de Melo
1792a59eab tools headers UAPI: Sync linux/in.h copy with the kernel sources
To pick the changes in:

  3218274773 ("icmp: don't send out ICMP messages with a source address of 0.0.0.0")

That don't result in any change in tooling, as INADDR_ are not used to
generate id->string tables used by 'perf trace'.

This addresses this build warning:

  Warning: Kernel ABI header at 'tools/include/uapi/linux/in.h' differs from latest version at 'include/uapi/linux/in.h'
  diff -u tools/include/uapi/linux/in.h include/uapi/linux/in.h

Cc: David S. Miller <davem@davemloft.net>
Cc: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-06-19 10:15:22 -03:00
Arnaldo Carvalho de Melo
17d27fc314 tools headers UAPI: Sync asm-generic/unistd.h with the kernel original
To pick the changes in:

  8b1462b67f ("quota: finish disable quotactl_path syscall")

Those headers are used in some arches to generate the syscall table used
in 'perf trace' to translate syscall numbers into strings.

This addresses this perf build warning:

  Warning: Kernel ABI header at 'tools/include/uapi/asm-generic/unistd.h' differs from latest version at 'include/uapi/asm-generic/unistd.h'
  diff -u tools/include/uapi/asm-generic/unistd.h include/uapi/asm-generic/unistd.h

Cc: Jan Kara <jack@suse.cz>
Cc: Marcin Juszkiewicz <marcin@juszkiewicz.com.pl>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-06-19 10:12:30 -03:00
Arnaldo Carvalho de Melo
ef83f9efe8 perf beauty: Update copy of linux/socket.h with the kernel sources
To pick the changes in:

  ea6932d70e ("net: make get_net_ns return error if NET_NS is disabled")

That don't result in any changes in the tables generated from that
header.

This silences this perf build warning:

  Warning: Kernel ABI header at 'tools/perf/trace/beauty/include/linux/socket.h' differs from latest version at 'include/linux/socket.h'
  diff -u tools/perf/trace/beauty/include/linux/socket.h include/linux/socket.h

Cc: Changbin Du <changbin.du@intel.com>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-06-19 10:09:08 -03:00
Ian Rogers
482698c2f8 perf test: Fix non-bash issue with stat bpf counters
$(( .. )) is a bash feature but the test's interpreter is !/bin/sh,
switch the code to use expr.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: bpf@vger.kernel.org
Link: http://lore.kernel.org/lkml/20210617184216.2075588-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-06-19 10:06:46 -03:00
Riccardo Mancini
c087e9480c perf machine: Fix refcount usage when processing PERF_RECORD_KSYMBOL
ASan reported a memory leak of BPF-related ksymbols map and dso. The
leak is caused by refount never reaching 0, due to missing __put calls
in the function machine__process_ksymbol_register.

Once the dso is inserted in the map, dso__put() should be called
(map__new2() increases the refcount to 2).

The same thing applies for the map when it's inserted into maps
(maps__insert() increases the refcount to 2).

  $ sudo ./perf record -- sleep 5
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.025 MB perf.data (8 samples) ]

  =================================================================
  ==297735==ERROR: LeakSanitizer: detected memory leaks

  Direct leak of 6992 byte(s) in 19 object(s) allocated from:
      #0 0x4f43c7 in calloc (/home/user/linux/tools/perf/perf+0x4f43c7)
      #1 0x8e4e53 in map__new2 /home/user/linux/tools/perf/util/map.c:216:20
      #2 0x8cf68c in machine__process_ksymbol_register /home/user/linux/tools/perf/util/machine.c:778:10
      [...]

  Indirect leak of 8702 byte(s) in 19 object(s) allocated from:
      #0 0x4f43c7 in calloc (/home/user/linux/tools/perf/perf+0x4f43c7)
      #1 0x8728d7 in dso__new_id /home/user/linux/tools/perf/util/dso.c:1256:20
      #2 0x872015 in dso__new /home/user/linux/tools/perf/util/dso.c:1295:9
      #3 0x8cf623 in machine__process_ksymbol_register /home/user/linux/tools/perf/util/machine.c:774:21
      [...]

  Indirect leak of 1520 byte(s) in 19 object(s) allocated from:
      #0 0x4f43c7 in calloc (/home/user/linux/tools/perf/perf+0x4f43c7)
      #1 0x87b3da in symbol__new /home/user/linux/tools/perf/util/symbol.c:269:23
      #2 0x888954 in map__process_kallsym_symbol /home/user/linux/tools/perf/util/symbol.c:710:8
      [...]

  Indirect leak of 1406 byte(s) in 19 object(s) allocated from:
      #0 0x4f43c7 in calloc (/home/user/linux/tools/perf/perf+0x4f43c7)
      #1 0x87b3da in symbol__new /home/user/linux/tools/perf/util/symbol.c:269:23
      #2 0x8cfbd8 in machine__process_ksymbol_register /home/user/linux/tools/perf/util/machine.c:803:8
      [...]

Signed-off-by: Riccardo Mancini <rickyman7@gmail.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Tommi Rantala <tommi.t.rantala@nokia.com>
Link: http://lore.kernel.org/lkml/20210612173751.188582-1-rickyman7@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-06-19 10:06:46 -03:00
John Garry
fe7a98b9d9 perf metricgroup: Return error code from metricgroup__add_metric_sys_event_iter()
The error code is not set at all in the sys event iter function.

This may lead to an uninitialized value of "ret" in
metricgroup__add_metric() when no CPU metric is added.

Fix by properly setting the error code.

It is not necessary to init "ret" to 0 in metricgroup__add_metric(), as
if we have no CPU or sys event metric matching, then "has_match" should
be 0 and "ret" is set to -EINVAL.

However gcc cannot detect that it may not have been set after the
map_for_each_metric() loop for CPU metrics, which is strange.

Fixes: be335ec28e ("perf metricgroup: Support adding metrics for system PMUs")
Signed-off-by: John Garry <john.garry@huawei.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/1623335580-187317-3-git-send-email-john.garry@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-06-19 10:06:46 -03:00
John Garry
fc96ec4d5d perf metricgroup: Fix find_evsel_group() event selector
The following command segfaults on my x86 broadwell:

  $ ./perf stat  -M frontend_bound,retiring,backend_bound,bad_speculation sleep 1
  WARNING: grouped events cpus do not match, disabling group:
    anon group { raw 0x10e }
    anon group { raw 0x10e }
  perf: util/evsel.c:1596: get_group_fd: Assertion `!(!leader->core.fd)' failed.
  Aborted (core dumped)

The issue shows itself as a use-after-free in evlist__check_cpu_maps(),
whereby the leader of an event selector (evsel) has been deleted (yet we
still attempt to verify for an evsel).

Fundamentally the problem comes from metricgroup__setup_events() ->
find_evsel_group(), and has developed from the previous fix attempt in
commit 9c880c24cb ("perf metricgroup: Fix for metrics containing
duration_time").

The problem now is that the logic in checking if an evsel is in the same
group is subtly broken for the "cycles" event. For the "cycles" event,
the pmu_name is NULL; however the logic in find_evsel_group() may set an
event matched against "cycles" as used, when it should not be.

This leads to a condition where an evsel is set, yet its leader is not.

Fix the check for evsel pmu_name by not matching evsels when either has a
NULL pmu_name.

There is still a pre-existing metric issue whereby the ordering of the
metrics may break the 'stat' function, as discussed at:
https://lore.kernel.org/lkml/49c6fccb-b716-1bf0-18a6-cace1cdb66b9@huawei.com/

Fixes: 9c880c24cb ("perf metricgroup: Fix for metrics containing duration_time")
Signed-off-by: John Garry <john.garry@huawei.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> # On a Thinkpad T450S
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/1623335580-187317-2-git-send-email-john.garry@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-06-19 10:06:46 -03:00
David Abdurachmanov
7ede12b01b riscv: dts: fu740: fix cache-controller interrupts
The order of interrupt numbers is incorrect.

The order for FU740 is: DirError, DataError, DataFail, DirFail

From SiFive FU740-C000 Manual:
19 - L2 Cache DirError
20 - L2 Cache DirFail
21 - L2 Cache DataError
22 - L2 Cache DataFail

Signed-off-by: David Abdurachmanov <david.abdurachmanov@sifive.com>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-06-19 00:11:53 -07:00
Jisheng Zhang
3a02764c37 riscv: Ensure BPF_JIT_REGION_START aligned with PMD size
Andreas reported commit fc8504765e ("riscv: bpf: Avoid breaking W^X")
breaks booting with one kind of defconfig, I reproduced a kernel panic
with the defconfig:

[    0.138553] Unable to handle kernel paging request at virtual address ffffffff81201220
[    0.139159] Oops [#1]
[    0.139303] Modules linked in:
[    0.139601] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.13.0-rc5-default+ #1
[    0.139934] Hardware name: riscv-virtio,qemu (DT)
[    0.140193] epc : __memset+0xc4/0xfc
[    0.140416]  ra : skb_flow_dissector_init+0x1e/0x82
[    0.140609] epc : ffffffff8029806c ra : ffffffff8033be78 sp : ffffffe001647da0
[    0.140878]  gp : ffffffff81134b08 tp : ffffffe001654380 t0 : ffffffff81201158
[    0.141156]  t1 : 0000000000000002 t2 : 0000000000000154 s0 : ffffffe001647dd0
[    0.141424]  s1 : ffffffff80a43250 a0 : ffffffff81201220 a1 : 0000000000000000
[    0.141654]  a2 : 000000000000003c a3 : ffffffff81201258 a4 : 0000000000000064
[    0.141893]  a5 : ffffffff8029806c a6 : 0000000000000040 a7 : ffffffffffffffff
[    0.142126]  s2 : ffffffff81201220 s3 : 0000000000000009 s4 : ffffffff81135088
[    0.142353]  s5 : ffffffff81135038 s6 : ffffffff8080ce80 s7 : ffffffff80800438
[    0.142584]  s8 : ffffffff80bc6578 s9 : 0000000000000008 s10: ffffffff806000ac
[    0.142810]  s11: 0000000000000000 t3 : fffffffffffffffc t4 : 0000000000000000
[    0.143042]  t5 : 0000000000000155 t6 : 00000000000003ff
[    0.143220] status: 0000000000000120 badaddr: ffffffff81201220 cause: 000000000000000f
[    0.143560] [<ffffffff8029806c>] __memset+0xc4/0xfc
[    0.143859] [<ffffffff8061e984>] init_default_flow_dissectors+0x22/0x60
[    0.144092] [<ffffffff800010fc>] do_one_initcall+0x3e/0x168
[    0.144278] [<ffffffff80600df0>] kernel_init_freeable+0x1c8/0x224
[    0.144479] [<ffffffff804868a8>] kernel_init+0x12/0x110
[    0.144658] [<ffffffff800022de>] ret_from_exception+0x0/0xc
[    0.145124] ---[ end trace f1e9643daa46d591 ]---

After some investigation, I think I found the root cause: commit
2bfc6cd81b ("move kernel mapping outside of linear mapping") moves
BPF JIT region after the kernel:

| #define BPF_JIT_REGION_START	PFN_ALIGN((unsigned long)&_end)

The &_end is unlikely aligned with PMD size, so the front bpf jit
region sits with part of kernel .data section in one PMD size mapping.
But kernel is mapped in PMD SIZE, when bpf_jit_binary_lock_ro() is
called to make the first bpf jit prog ROX, we will make part of kernel
.data section RO too, so when we write to, for example memset the
.data section, MMU will trigger a store page fault.

To fix the issue, we need to ensure the BPF JIT region is PMD size
aligned. This patch acchieve this goal by restoring the BPF JIT region
to original position, I.E the 128MB before kernel .text section. The
modification to kasan_init.c is inspired by Alexandre.

Fixes: fc8504765e ("riscv: bpf: Avoid breaking W^X")
Reported-by: Andreas Schwab <schwab@linux-m68k.org>
Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-06-18 21:10:05 -07:00
Jisheng Zhang
314b781706 riscv: kasan: Fix MODULES_VADDR evaluation due to local variables' name
commit 2bfc6cd81b ("riscv: Move kernel mapping outside of linear
mapping") makes use of MODULES_VADDR to populate kernel, BPF, modules
mapping. Currently, MODULES_VADDR is defined as below for RV64:

| #define MODULES_VADDR   (PFN_ALIGN((unsigned long)&_end) - SZ_2G)

But kasan_init() has two local variables which are also named as _start,
_end, so MODULES_VADDR is evaluated with the local variable _end
rather than the global "_end" as we expected. Fix this issue by
renaming the two local variables.

Fixes: 2bfc6cd81b ("riscv: Move kernel mapping outside of linear mapping")
Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-06-18 21:09:56 -07:00
ManYi Li
7dd753ca59 scsi: sr: Return appropriate error code when disk is ejected
Handle a reported media event code of 3. This indicates that the media has
been removed from the drive and user intervention is required to proceed.
Return DISK_EVENT_EJECT_REQUEST in that case.

Link: https://lore.kernel.org/r/20210611094402.23884-1-limanyi@uniontech.com
Signed-off-by: ManYi Li <limanyi@uniontech.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-06-18 22:15:47 -04:00
Linus Torvalds
9ed13a17e3 Merge tag 'net-5.13-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Jakub Kicinski:
 "Networking fixes for 5.13-rc7, including fixes from wireless, bpf,
  bluetooth, netfilter and can.

  Current release - regressions:

   - mlxsw: spectrum_qdisc: Pass handle, not band number to find_class()
     to fix modifying offloaded qdiscs

   - lantiq: net: fix duplicated skb in rx descriptor ring

   - rtnetlink: fix regression in bridge VLAN configuration, empty info
     is not an error, bot-generated "fix" was not needed

   - libbpf: s/rx/tx/ typo on umem->rx_ring_setup_done to fix umem
     creation

  Current release - new code bugs:

   - ethtool: fix NULL pointer dereference during module EEPROM dump via
     the new netlink API

   - mlx5e: don't update netdev RQs with PTP-RQ, the special purpose
     queue should not be visible to the stack

   - mlx5e: select special PTP queue only for SKBTX_HW_TSTAMP skbs

   - mlx5e: verify dev is present in get devlink port ndo, avoid a panic

  Previous releases - regressions:

   - neighbour: allow NUD_NOARP entries to be force GCed

   - further fixes for fallout from reorg of WiFi locking (staging:
     rtl8723bs, mac80211, cfg80211)

   - skbuff: fix incorrect msg_zerocopy copy notifications

   - mac80211: fix NULL ptr deref for injected rate info

   - Revert "net/mlx5: Arm only EQs with EQEs" it may cause missed IRQs

  Previous releases - always broken:

   - bpf: more speculative execution fixes

   - netfilter: nft_fib_ipv6: skip ipv6 packets from any to link-local

   - udp: fix race between close() and udp_abort() resulting in a panic

   - fix out of bounds when parsing TCP options before packets are
     validated (in netfilter: synproxy, tc: sch_cake and mptcp)

   - mptcp: improve operation under memory pressure, add missing
     wake-ups

   - mptcp: fix double-lock/soft lookup in subflow_error_report()

   - bridge: fix races (null pointer deref and UAF) in vlan tunnel
     egress

   - ena: fix DMA mapping function issues in XDP

   - rds: fix memory leak in rds_recvmsg

  Misc:

   - vrf: allow larger MTUs

   - icmp: don't send out ICMP messages with a source address of 0.0.0.0

   - cdc_ncm: switch to eth%d interface naming"

* tag 'net-5.13-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (139 commits)
  net: ethernet: fix potential use-after-free in ec_bhf_remove
  selftests/net: Add icmp.sh for testing ICMP dummy address responses
  icmp: don't send out ICMP messages with a source address of 0.0.0.0
  net: ll_temac: Avoid ndo_start_xmit returning NETDEV_TX_BUSY
  net: ll_temac: Fix TX BD buffer overwrite
  net: ll_temac: Add memory-barriers for TX BD access
  net: ll_temac: Make sure to free skb when it is completely used
  MAINTAINERS: add Guvenc as SMC maintainer
  bnxt_en: Call bnxt_ethtool_free() in bnxt_init_one() error path
  bnxt_en: Fix TQM fastpath ring backing store computation
  bnxt_en: Rediscover PHY capabilities after firmware reset
  cxgb4: fix wrong shift.
  mac80211: handle various extensible elements correctly
  mac80211: reset profile_periodicity/ema_ap
  cfg80211: avoid double free of PMSR request
  cfg80211: make certificate generation more robust
  mac80211: minstrel_ht: fix sample time check
  net: qed: Fix memcpy() overflow of qed_dcbx_params()
  net: cdc_eem: fix tx fixup skb leak
  net: hamradio: fix memory leak in mkiss_close
  ...
2021-06-18 18:55:29 -07:00
Linus Torvalds
6fab154a33 Merge tag 'for-5.13-rc6-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux
Pull btrfs fix from David Sterba:
 "One more fix, for a space accounting bug in zoned mode. It happens
  when a block group is switched back rw->ro and unusable bytes (due to
  zoned constraints) are subtracted twice.

  It has user visible effects so I consider it important enough for late
  -rc inclusion and backport to stable"

* tag 'for-5.13-rc6-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
  btrfs: zoned: fix negative space_info->bytes_readonly
2021-06-18 16:39:03 -07:00
Linus Torvalds
728a748b3f Merge tag 'pci-v5.13-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci
Pull PCI fixes from Bjorn Helgaas:

 - Clear 64-bit flag for host bridge windows below 4GB to fix a resource
   allocation regression added in -rc1 (Punit Agrawal)

 - Fix tegra194 MCFG quirk build regressions added in -rc1 (Jon Hunter)

 - Avoid secondary bus resets on TI KeyStone C667X devices (Antti
   Järvinen)

 - Avoid secondary bus resets on some NVIDIA GPUs (Shanker Donthineni)

 - Work around FLR erratum on Huawei Intelligent NIC VF (Chiqijun)

 - Avoid broken ATS on AMD Navi14 GPU (Evan Quan)

 - Trust Broadcom BCM57414 NIC to isolate functions even though it
   doesn't advertise ACS support (Sriharsha Basavapatna)

 - Work around AMD RS690 BIOSes that don't configure DMA above 4GB
   (Mikel Rychliski)

 - Fix panic during PIO transfer on Aardvark controller (Pali Rohár)

* tag 'pci-v5.13-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
  PCI: aardvark: Fix kernel panic during PIO transfer
  PCI: Add AMD RS690 quirk to enable 64-bit DMA
  PCI: Add ACS quirk for Broadcom BCM57414 NIC
  PCI: Mark AMD Navi14 GPU ATS as broken
  PCI: Work around Huawei Intelligent NIC VF FLR erratum
  PCI: Mark some NVIDIA GPUs to avoid bus reset
  PCI: Mark TI C667X to avoid bus reset
  PCI: tegra194: Fix MCFG quirk build regressions
  PCI: of: Clear 64-bit flag for non-prefetchable memory below 4GB
2021-06-18 13:54:11 -07:00
Matthew Wilcox (Oracle)
9620ad86d0 afs: Re-enable freezing once a page fault is interrupted
If a task is killed during a page fault, it does not currently call
sb_end_pagefault(), which means that the filesystem cannot be frozen
at any time thereafter.  This may be reported by lockdep like this:

====================================
WARNING: fsstress/10757 still has locks held!
5.13.0-rc4-build4+ #91 Not tainted
------------------------------------
1 lock held by fsstress/10757:
 #0: ffff888104eac530
 (
sb_pagefaults

as filesystem freezing is modelled as a lock.

Fix this by removing all the direct returns from within the function,
and using 'ret' to indicate whether we were interrupted or successful.

Fixes: 1cf7a1518a ("afs: Implement shared-writeable mmap")
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: David Howells <dhowells@redhat.com>
cc: linux-afs@lists.infradead.org
Link: https://lore.kernel.org/r/20210616154900.1958373-1-willy@infradead.org/
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-06-18 13:49:07 -07:00
Pavel Skripkin
9cca0c2d70 net: ethernet: fix potential use-after-free in ec_bhf_remove
static void ec_bhf_remove(struct pci_dev *dev)
{
...
	struct ec_bhf_priv *priv = netdev_priv(net_dev);

	unregister_netdev(net_dev);
	free_netdev(net_dev);

	pci_iounmap(dev, priv->dma_io);
	pci_iounmap(dev, priv->io);
...
}

priv is netdev private data, but it is used
after free_netdev(). It can cause use-after-free when accessing priv
pointer. So, fix it by moving free_netdev() after pci_iounmap()
calls.

Fixes: 6af55ff52b ("Driver for Beckhoff CX5020 EtherCAT master module.")
Signed-off-by: Pavel Skripkin <paskripkin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-18 13:01:17 -07:00
David S. Miller
0d1dc9e1f4 Merge tag 'mac80211-for-net-2021-06-18' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211
Johannes Berg says:

====================
A couple of straggler fixes:
 * a minstrel HT sample check fix
 * peer measurement could double-free on races
 * certificate file generation at build time could
   sometimes hang
 * some parameters weren't reset between connections
   in mac80211
 * some extensible elements were treated as non-
   extensible, possibly causuing bad connections
   (or failures) if the AP adds data
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-18 12:22:55 -07:00
Toke Høiland-Jørgensen
7e9838b791 selftests/net: Add icmp.sh for testing ICMP dummy address responses
This adds a new icmp.sh selftest for testing that the kernel will respond
correctly with an ICMP unreachable message with the dummy (192.0.0.8)
source address when there are no IPv4 addresses configured to use as source
addresses.

Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-18 12:13:24 -07:00
Toke Høiland-Jørgensen
3218274773 icmp: don't send out ICMP messages with a source address of 0.0.0.0
When constructing ICMP response messages, the kernel will try to pick a
suitable source address for the outgoing packet. However, if no IPv4
addresses are configured on the system at all, this will fail and we end up
producing an ICMP message with a source address of 0.0.0.0. This can happen
on a box routing IPv4 traffic via v6 nexthops, for instance.

Since 0.0.0.0 is not generally routable on the internet, there's a good
chance that such ICMP messages will never make it back to the sender of the
original packet that the ICMP message was sent in response to. This, in
turn, can create connectivity and PMTUd problems for senders. Fortunately,
RFC7600 reserves a dummy address to be used as a source for ICMP
messages (192.0.0.8/32), so let's teach the kernel to substitute that
address as a last resort if the regular source address selection procedure
fails.

Below is a quick example reproducing this issue with network namespaces:

ip netns add ns0
ip l add type veth peer netns ns0
ip l set dev veth0 up
ip a add 10.0.0.1/24 dev veth0
ip a add fc00:dead:cafe:42::1/64 dev veth0
ip r add 10.1.0.0/24 via inet6 fc00:dead:cafe:42::2
ip -n ns0 l set dev veth0 up
ip -n ns0 a add fc00:dead:cafe:42::2/64 dev veth0
ip -n ns0 r add 10.0.0.0/24 via inet6 fc00:dead:cafe:42::1
ip netns exec ns0 sysctl -w net.ipv4.icmp_ratelimit=0
ip netns exec ns0 sysctl -w net.ipv4.ip_forward=1
tcpdump -tpni veth0 -c 2 icmp &
ping -w 1 10.1.0.1 > /dev/null
tcpdump: verbose output suppressed, use -v[v]... for full protocol decode
listening on veth0, link-type EN10MB (Ethernet), snapshot length 262144 bytes
IP 10.0.0.1 > 10.1.0.1: ICMP echo request, id 29, seq 1, length 64
IP 0.0.0.0 > 10.0.0.1: ICMP net 10.1.0.1 unreachable, length 92
2 packets captured
2 packets received by filter
0 packets dropped by kernel

With this patch the above capture changes to:
IP 10.0.0.1 > 10.1.0.1: ICMP echo request, id 31127, seq 1, length 64
IP 192.0.0.8 > 10.0.0.1: ICMP net 10.1.0.1 unreachable, length 92

Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Reported-by: Juliusz Chroboczek <jch@irif.fr>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-18 12:13:24 -07:00
Esben Haabendal
f639634119 net: ll_temac: Avoid ndo_start_xmit returning NETDEV_TX_BUSY
As documented in Documentation/networking/driver.rst, the ndo_start_xmit
method must not return NETDEV_TX_BUSY under any normal circumstances, and
as recommended, we simply stop the tx queue in advance, when there is a
risk that the next xmit would cause a NETDEV_TX_BUSY return.

Signed-off-by: Esben Haabendal <esben@geanix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-18 12:11:51 -07:00
Esben Haabendal
c364df2489 net: ll_temac: Fix TX BD buffer overwrite
Just as the initial check, we need to ensure num_frag+1 buffers available,
as that is the number of buffers we are going to use.

This fixes a buffer overflow, which might be seen during heavy network
load. Complete lockup of TEMAC was reproducible within about 10 minutes of
a particular load.

Fixes: 84823ff80f ("net: ll_temac: Fix race condition causing TX hang")
Cc: stable@vger.kernel.org # v5.4+
Signed-off-by: Esben Haabendal <esben@geanix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-18 12:11:51 -07:00
Esben Haabendal
28d9fab458 net: ll_temac: Add memory-barriers for TX BD access
Add a couple of memory-barriers to ensure correct ordering of read/write
access to TX BDs.

In xmit_done, we should ensure that reading the additional BD fields are
only done after STS_CTRL_APP0_CMPLT bit is set.

When xmit_done marks the BD as free by setting APP0=0, we need to ensure
that the other BD fields are reset first, so we avoid racing with the xmit
path, which writes to the same fields.

Finally, making sure to read APP0 of next BD after the current BD, ensures
that we see all available buffers.

Signed-off-by: Esben Haabendal <esben@geanix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-18 12:11:51 -07:00
Esben Haabendal
6aa32217a9 net: ll_temac: Make sure to free skb when it is completely used
With the skb pointer piggy-backed on the TX BD, we have a simple and
efficient way to free the skb buffer when the frame has been transmitted.
But in order to avoid freeing the skb while there are still fragments from
the skb in use, we need to piggy-back on the TX BD of the skb, not the
first.

Without this, we are doing use-after-free on the DMA side, when the first
BD of a multi TX BD packet is seen as completed in xmit_done, and the
remaining BDs are still being processed.

Cc: stable@vger.kernel.org # v5.4+
Signed-off-by: Esben Haabendal <esben@geanix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-18 12:11:51 -07:00
Karsten Graul
35036d69b9 MAINTAINERS: add Guvenc as SMC maintainer
Add Guvenc as maintainer for Shared Memory Communications (SMC)
Sockets.

Cc: Julian Wiedmann <jwi@linux.ibm.com>
Acked-by: Guvenc Gulce <guvenc@linux.ibm.com>
Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-18 12:05:52 -07:00
David S. Miller
b6a258c10e Merge branch 'bnxt_en-fixes'
Michael Chan says:

====================
bnxt_en: Bug fixes

This patchset includes 3 small bug fixes to reinitialize PHY capabilities
after firmware reset, setup the chip's internal TQM fastpath ring
backing memory properly for RoCE traffic, and to free ethtool related
memory if driver probe fails.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-18 12:00:27 -07:00
Somnath Kotur
03400aaa69 bnxt_en: Call bnxt_ethtool_free() in bnxt_init_one() error path
bnxt_ethtool_init() may have allocated some memory and we need to
call bnxt_ethtool_free() to properly unwind if bnxt_init_one()
fails.

Fixes: 7c38091814 ("bnxt_en: Refactor bnxt_init_one() and turn on TPA support on 57500 chips.")
Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-18 12:00:27 -07:00
Rukhsana Ansari
c12e1643d2 bnxt_en: Fix TQM fastpath ring backing store computation
TQM fastpath ring needs to be sized to store both the requester
and responder side of RoCE QPs in TQM for supporting bi-directional
tests.  Fix bnxt_alloc_ctx_mem() to multiply the RoCE QPs by a factor of
2 when computing the number of entries for TQM fastpath ring.  This
fixes an RX pipeline stall issue when running bi-directional max
RoCE QP tests.

Fixes: c7dd7ab4b2 ("bnxt_en: Improve TQM ring context memory sizing formulas.")
Signed-off-by: Rukhsana Ansari <rukhsana.ansari@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-18 12:00:27 -07:00
Michael Chan
0afd6a4e80 bnxt_en: Rediscover PHY capabilities after firmware reset
There is a missing bnxt_probe_phy() call in bnxt_fw_init_one() to
rediscover the PHY capabilities after a firmware reset.  This can cause
some PHY related functionalities to fail after a firmware reset.  For
example, in multi-host, the ability for any host to configure the PHY
settings may be lost after a firmware reset.

Fixes: ec5d31e3c1 ("bnxt_en: Handle firmware reset status during IF_UP.")
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-18 12:00:27 -07:00
Pavel Machek
39eb028183 cxgb4: fix wrong shift.
While fixing coverity warning, commit dd2c796773 introduced typo in
shift value. Fix that.

Signed-off-by: Pavel Machek (CIP) <pavel@denx.de>
Fixes: dd2c796773 ("cxgb4: Fix unintentional sign extension issues")
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-18 11:56:33 -07:00
Linus Torvalds
b1edae0d5f Merge tag 'arc-5.13-rc7-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc
Pull ARC fixes from Vineet Gupta:

 - ARCv2 userspace ABI not populating a few registers

 - Unbork CONFIG_HARDENED_USERCOPY for ARC

* tag 'arc-5.13-rc7-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc:
  ARC: fix CONFIG_HARDENED_USERCOPY
  ARCv2: save ABI registers across signal handling
2021-06-18 11:09:23 -07:00
Linus Torvalds
89fec74203 Merge tag 'trace-v5.13-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
Pull tracing fixes from Steven Rostedt:

 - Have recordmcount check for valid st_shndx otherwise some archs may
   have invalid references for the mcount location.

 - Two fixes done for mapping pids to task names. Traces were not
   showing the names of tasks when they should have.

 - Fix to trace_clock_global() to prevent it from going backwards

* tag 'trace-v5.13-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
  tracing: Do no increment trace_clock_global() by one
  tracing: Do not stop recording comms if the trace file is being read
  tracing: Do not stop recording cmdlines when tracing is off
  recordmcount: Correct st_shndx handling
2021-06-18 10:57:09 -07:00
Linus Torvalds
0f4022a490 Merge tag 'printk-for-5.13-fixup' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux
Pull printk fixup from Petr Mladek:
 "Fix misplaced EXPORT_SYMBOL(vsprintf)"

* tag 'printk-for-5.13-fixup' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux:
  printk: Move EXPORT_SYMBOL() closer to vprintk definition
2021-06-18 10:50:41 -07:00
Linus Torvalds
944293bcee Merge tag 'pm-5.13-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management fix from Rafael Wysocki:
 "Remove recently added frequency invariance support from the CPPC
  cpufreq driver, because it has turned out to be problematic and it
  cannot be fixed properly on time for 5.13 (Viresh Kumar)"

* tag 'pm-5.13-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  Revert "cpufreq: CPPC: Add support for frequency invariance"
2021-06-18 10:42:36 -07:00
Linus Torvalds
e2c8f8e57b Merge tag 'usb-5.13-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
Pull USB fixes from Greg KH:
 "Here are three small USB fixes for reported problems for 5.13-rc7.
  They include:

   - disable autosuspend for a cypress USB hub

   - fix the battery charger detection for the chipidea driver

   - fix a kernel panic in the dwc3 driver due to a previous change in
     5.13-rc1.

  All have been in linux-next with no reported problems"

* tag 'usb-5.13-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
  usb: core: hub: Disable autosuspend for Cypress CY7C65632
  usb: chipidea: imx: Fix Battery Charger 1.2 CDP detection
  usb: dwc3: core: fix kernel panic when do reboot
2021-06-18 10:39:32 -07:00
Fan Du
28e5e44aa3 x86/mm: Avoid truncating memblocks for SGX memory
tl;dr:

Several SGX users reported seeing the following message on NUMA systems:

  sgx: [Firmware Bug]: Unable to map EPC section to online node. Fallback to the NUMA node 0.

This turned out to be the memblock code mistakenly throwing away SGX
memory.

=== Full Changelog ===

The 'max_pfn' variable represents the highest known RAM address.  It can
be used, for instance, to quickly determine for which physical addresses
there is mem_map[] space allocated.  The numa_meminfo code makes an
effort to throw out ("trim") all memory blocks which are above 'max_pfn'.

SGX memory is not considered RAM (it is marked as "Reserved" in the
e820) and is not taken into account by max_pfn. Despite this, SGX memory
areas have NUMA affinity and are enumerated in the ACPI SRAT table. The
existing SGX code uses the numa_meminfo mechanism to look up the NUMA
affinity for its memory areas.

In cases where SGX memory was above max_pfn (usually just the one EPC
section in the last highest NUMA node), the numa_memblock is truncated
at 'max_pfn', which is below the SGX memory.  When the SGX code tries to
look up the affinity of this memory, it fails and produces an error message:

  sgx: [Firmware Bug]: Unable to map EPC section to online node. Fallback to the NUMA node 0.

and assigns the memory to NUMA node 0.

Instead of silently truncating the memory block at 'max_pfn' and
dropping the SGX memory, add the truncated portion to
'numa_reserved_meminfo'.  This allows the SGX code to later determine
the NUMA affinity of its 'Reserved' area.

Before, numa_meminfo looked like this (from 'crash'):

  blk = { start =          0x0, end = 0x2080000000, nid = 0x0 }
        { start = 0x2080000000, end = 0x4000000000, nid = 0x1 }

numa_reserved_meminfo is empty.

With this, numa_meminfo looks like this:

  blk = { start =          0x0, end = 0x2080000000, nid = 0x0 }
        { start = 0x2080000000, end = 0x4000000000, nid = 0x1 }

and numa_reserved_meminfo has an entry for node 1's SGX memory:

  blk =  { start = 0x4000000000, end = 0x4080000000, nid = 0x1 }

 [ daveh: completely rewrote/reworked changelog ]

Fixes: 5d30f92e76 ("x86/NUMA: Provide a range-to-target_node lookup facility")
Reported-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: Fan Du <fan.du@intel.com>
Signed-off-by: Dave Hansen <dave.hansen@intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Dave Hansen <dave.hansen@intel.com>
Cc: <stable@vger.kernel.org>
Link: https://lkml.kernel.org/r/20210617194657.0A99CB22@viggo.jf.intel.com
2021-06-18 19:37:01 +02:00
Linus Torvalds
c3bf96eaa4 Merge tag 'drm-fixes-2021-06-18' of git://anongit.freedesktop.org/drm/drm
Pull drm fixes from Dave Airlie:
 "Not much happening in fixes land this week only one PR for two amdgpu
  powergating fixes was waiting for me, maybe something will show up
  over the weekend, maybe not.

  amdgpu:

   - GFX9 and 10 powergating fixes"

* tag 'drm-fixes-2021-06-18' of git://anongit.freedesktop.org/drm/drm:
  drm/amdgpu/gfx10: enlarge CP_MEC_DOORBELL_RANGE_UPPER to cover full doorbell.
  drm/amdgpu/gfx9: fix the doorbell missing when in CGPG issue.
2021-06-18 10:36:18 -07:00
Pali Rohár
f18139966d PCI: aardvark: Fix kernel panic during PIO transfer
Trying to start a new PIO transfer by writing value 0 in PIO_START register
when previous transfer has not yet completed (which is indicated by value 1
in PIO_START) causes an External Abort on CPU, which results in kernel
panic:

    SError Interrupt on CPU0, code 0xbf000002 -- SError
    Kernel panic - not syncing: Asynchronous SError Interrupt

To prevent kernel panic, it is required to reject a new PIO transfer when
previous one has not finished yet.

If previous PIO transfer is not finished yet, the kernel may issue a new
PIO request only if the previous PIO transfer timed out.

In the past the root cause of this issue was incorrectly identified (as it
often happens during link retraining or after link down event) and special
hack was implemented in Trusted Firmware to catch all SError events in EL3,
to ignore errors with code 0xbf000002 and not forwarding any other errors
to kernel and instead throw panic from EL3 Trusted Firmware handler.

Links to discussion and patches about this issue:
https://git.trustedfirmware.org/TF-A/trusted-firmware-a.git/commit/?id=3c7dcdac5c50
https://lore.kernel.org/linux-pci/20190316161243.29517-1-repk@triplefau.lt/
https://lore.kernel.org/linux-pci/971be151d24312cc533989a64bd454b4@www.loen.fr/
https://review.trustedfirmware.org/c/TF-A/trusted-firmware-a/+/1541

But the real cause was the fact that during link retraining or after link
down event the PIO transfer may take longer time, up to the 1.44s until it
times out. This increased probability that a new PIO transfer would be
issued by kernel while previous one has not finished yet.

After applying this change into the kernel, it is possible to revert the
mentioned TF-A hack and SError events do not have to be caught in TF-A EL3.

Link: https://lore.kernel.org/r/20210608203655.31228-1-pali@kernel.org
Signed-off-by: Pali Rohár <pali@kernel.org>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Marek Behún <kabel@kernel.org>
Cc: stable@vger.kernel.org # 7fbcb5da81 ("PCI: aardvark: Don't rely on jiffies while holding spinlock")
2021-06-18 10:32:35 -05:00
Mikel Rychliski
cacf994a91 PCI: Add AMD RS690 quirk to enable 64-bit DMA
Although the AMD RS690 chipset has 64-bit DMA support, BIOS implementations
sometimes fail to configure the memory limit registers correctly.

The Acer F690GVM mainboard uses this chipset and a Marvell 88E8056 NIC. The
sky2 driver programs the NIC to use 64-bit DMA, which will not work:

  sky2 0000:02:00.0: error interrupt status=0x8
  sky2 0000:02:00.0 eth0: tx timeout
  sky2 0000:02:00.0 eth0: transmit ring 0 .. 22 report=0 done=0

Other drivers required by this mainboard either don't support 64-bit DMA,
or have it disabled using driver specific quirks. For example, the ahci
driver has quirks to enable or disable 64-bit DMA depending on the BIOS
version (see ahci_sb600_enable_64bit() in ahci.c). This ahci quirk matches
against the SB600 SATA controller, but the real issue is almost certainly
with the RS690 PCI host that it was commonly attached to.

To avoid this issue in all drivers with 64-bit DMA support, fix the
configuration of the PCI host. If the kernel is aware of physical memory
above 4GB, but the BIOS never configured the PCI host with this
information, update the registers with our values.

[bhelgaas: drop PCI_DEVICE_ID_ATI_RS690 definition]
Link: https://lore.kernel.org/r/20210611214823.4898-1-mikel@mikelr.com
Signed-off-by: Mikel Rychliski <mikel@mikelr.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2021-06-18 10:32:35 -05:00
Sriharsha Basavapatna
db2f77e2bd PCI: Add ACS quirk for Broadcom BCM57414 NIC
The Broadcom BCM57414 NIC may be a multi-function device.  While it does
not advertise an ACS capability, peer-to-peer transactions are not possible
between the individual functions, so it is safe to treat them as fully
isolated.

Add an ACS quirk for this device so the functions can be in independent
IOMMU groups and attached individually to userspace applications using
VFIO.

[bhelgaas: commit log]
Link: https://lore.kernel.org/r/1621645997-16251-1-git-send-email-michael.chan@broadcom.com
Signed-off-by: Sriharsha Basavapatna <sriharsha.basavapatna@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: stable@vger.kernel.org
2021-06-18 10:32:35 -05:00
Evan Quan
e8946a53e2 PCI: Mark AMD Navi14 GPU ATS as broken
Observed unexpected GPU hang during runpm stress test on 0x7341 rev 0x00.
Further debugging shows broken ATS is related.

Disable ATS on this part.  Similar issues on other devices:

  a2da5d8cc0 ("PCI: Mark AMD Raven iGPU ATS as broken in some platforms")
  45beb31d3a ("PCI: Mark AMD Navi10 GPU rev 0x00 ATS as broken")
  5e89cd303e ("PCI: Mark AMD Navi14 GPU rev 0xc5 ATS as broken")

Suggested-by: Alex Deucher <alexander.deucher@amd.com>
Link: https://lore.kernel.org/r/20210602021255.939090-1-evan.quan@amd.com
Signed-off-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Krzysztof Wilczyński <kw@linux.com>
Cc: stable@vger.kernel.org
2021-06-18 10:32:35 -05:00
Chiqijun
ce00322c23 PCI: Work around Huawei Intelligent NIC VF FLR erratum
pcie_flr() starts a Function Level Reset (FLR), waits 100ms (the maximum
time allowed for FLR completion by PCIe r5.0, sec 6.6.2), and waits for the
FLR to complete.  It assumes the FLR is complete when a config read returns
valid data.

When we do an FLR on several Huawei Intelligent NIC VFs at the same time,
firmware on the NIC processes them serially.  The VF may respond to config
reads before the firmware has completed its reset processing.  If we bind a
driver to the VF (e.g., by assigning the VF to a virtual machine) in the
interval between the successful config read and completion of the firmware
reset processing, the NIC VF driver may fail to load.

Prevent this driver failure by waiting for the NIC firmware to complete its
reset processing.  Not all NIC firmware supports this feature.

[bhelgaas: commit log]
Link: https://support.huawei.com/enterprise/en/doc/EDOC1100063073/87950645/vm-oss-occasionally-fail-to-load-the-in200-driver-when-the-vf-performs-flr
Link: https://lore.kernel.org/r/20210414132301.1793-1-chiqijun@huawei.com
Signed-off-by: Chiqijun <chiqijun@huawei.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
Cc: stable@vger.kernel.org
2021-06-18 10:32:35 -05:00
Shanker Donthineni
4c207e7121 PCI: Mark some NVIDIA GPUs to avoid bus reset
Some NVIDIA GPU devices do not work with SBR.  Triggering SBR leaves the
device inoperable for the current system boot. It requires a system
hard-reboot to get the GPU device back to normal operating condition
post-SBR. For the affected devices, enable NO_BUS_RESET quirk to avoid the
issue.

This issue will be fixed in the next generation of hardware.

Link: https://lore.kernel.org/r/20210608054857.18963-8-ameynarkhede03@gmail.com
Signed-off-by: Shanker Donthineni <sdonthineni@nvidia.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Sinan Kaya <okaya@kernel.org>
Cc: stable@vger.kernel.org
2021-06-18 10:32:35 -05:00
Antti Järvinen
b5cf198e74 PCI: Mark TI C667X to avoid bus reset
Some TI KeyStone C667X devices do not support bus/hot reset.  The PCIESS
automatically disables LTSSM when Secondary Bus Reset is received and
device stops working.  Prevent bus reset for these devices.  With this
change, the device can be assigned to VMs with VFIO, but it will leak state
between VMs.

Reference: https://e2e.ti.com/support/processors/f/791/t/954382
Link: https://lore.kernel.org/r/20210315102606.17153-1-antti.jarvinen@gmail.com
Signed-off-by: Antti Järvinen <antti.jarvinen@gmail.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Kishon Vijay Abraham I <kishon@ti.com>
Cc: stable@vger.kernel.org
2021-06-18 10:32:35 -05:00
Jon Hunter
a512360f45 PCI: tegra194: Fix MCFG quirk build regressions
7f10074474 ("PCI: tegra: Add Tegra194 MCFG quirks for ECAM errata")
caused a few build regressions:

  - 7f10074474 removed the Makefile rule for CONFIG_PCIE_TEGRA194, so
    pcie-tegra.c can no longer be built as a module.  Restore that rule.

  - 7f10074474 added "#ifdef CONFIG_PCIE_TEGRA194" around the native
    driver, but that's only set when the driver is built-in (for a module,
    CONFIG_PCIE_TEGRA194_MODULE is defined).

    The ACPI quirk is completely independent of the rest of the native
    driver, so move the quirk to its own file and remove the #ifdef in the
    native driver.

  - 7f10074474 added symbols that are always defined but used only when
    CONFIG_PCIEASPM, which causes warnings when CONFIG_PCIEASPM is not set:

      drivers/pci/controller/dwc/pcie-tegra194.c:259:18: warning: ‘event_cntr_data_offset’ defined but not used [-Wunused-const-variable=]
      drivers/pci/controller/dwc/pcie-tegra194.c:250:18: warning: ‘event_cntr_ctrl_offset’ defined but not used [-Wunused-const-variable=]
      drivers/pci/controller/dwc/pcie-tegra194.c:243:27: warning: ‘pcie_gen_freq’ defined but not used [-Wunused-const-variable=]

Fixes: 7f10074474 ("PCI: tegra: Add Tegra194 MCFG quirks for ECAM errata")
Link: https://lore.kernel.org/r/20210610064134.336781-1-jonathanh@nvidia.com
Signed-off-by: Jon Hunter <jonathanh@nvidia.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Thierry Reding <treding@nvidia.com>
2021-06-18 10:32:34 -05:00
Punit Agrawal
3bd6b8271e PCI: of: Clear 64-bit flag for non-prefetchable memory below 4GB
Alexandru and Qu reported this resource allocation failure on ROCKPro64 v2
and ROCK Pi 4B, both based on the RK3399:

  pci_bus 0000:00: root bus resource [mem 0xfa000000-0xfbdfffff 64bit]
  pci 0000:00:00.0: PCI bridge to [bus 01]
  pci 0000:00:00.0: BAR 14: no space for [mem size 0x00100000]
  pci 0000:01:00.0: reg 0x10: [mem 0x00000000-0x00003fff 64bit]

"BAR 14" is the PCI bridge's 32-bit non-prefetchable window, and our PCI
allocation code isn't smart enough to allocate it in a host bridge window
marked as 64-bit, even though this should work fine.

A DT host bridge description includes the windows from the CPU address
space to the PCI bus space.  On a few architectures (microblaze, powerpc,
sparc), the DT may also describe PCI devices themselves, including their
BARs.

Before 9d57e61bf7 ("of/pci: Add IORESOURCE_MEM_64 to resource flags for
64-bit memory addresses"), of_bus_pci_get_flags() ignored the fact that
some DT addresses described 64-bit windows and BARs.  That was a problem
because the virtio virtual NIC has a 32-bit BAR and a 64-bit BAR, and the
driver couldn't distinguish them.

9d57e61bf7 set IORESOURCE_MEM_64 for those 64-bit DT ranges, which fixed
the virtio driver.  But it also set IORESOURCE_MEM_64 for host bridge
windows, which exposed the fact that the PCI allocator isn't smart enough
to put 32-bit resources in those 64-bit windows.

Clear IORESOURCE_MEM_64 from host bridge windows since we don't need that
information.

Suggested-by: Bjorn Helgaas <bhelgaas@google.com>
Fixes: 9d57e61bf7 ("of/pci: Add IORESOURCE_MEM_64 to resource flags for 64-bit memory addresses")
Link: https://lore.kernel.org/r/20210614230457.752811-1-punitagrawal@gmail.com
Reported-at: https://lore.kernel.org/lkml/7a1e2ebc-f7d8-8431-d844-41a9c36a8911@arm.com/
Reported-at: https://lore.kernel.org/lkml/YMyTUv7Jsd89PGci@m4/T/#u
Reported-by: Alexandru Elisei <alexandru.elisei@arm.com>
Reported-by: Qu Wenruo <wqu@suse.com>
Tested-by: Alexandru Elisei <alexandru.elisei@arm.com>
Tested-by: Domenico Andreoli <domenico.andreoli@linux.com>
Signed-off-by: Punit Agrawal <punitagrawal@gmail.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Rob Herring <robh@kernel.org>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
2021-06-18 10:31:37 -05:00
Steven Rostedt (VMware)
89529d8b8f tracing: Do no increment trace_clock_global() by one
The trace_clock_global() tries to make sure the events between CPUs is
somewhat in order. A global value is used and updated by the latest read
of a clock. If one CPU is ahead by a little, and is read by another CPU, a
lock is taken, and if the timestamp of the other CPU is behind, it will
simply use the other CPUs timestamp.

The lock is also only taken with a "trylock" due to tracing, and strange
recursions can happen. The lock is not taken at all in NMI context.

In the case where the lock is not able to be taken, the non synced
timestamp is returned. But it will not be less than the saved global
timestamp.

The problem arises because when the time goes "backwards" the time
returned is the saved timestamp plus 1. If the lock is not taken, and the
plus one to the timestamp is returned, there's a small race that can cause
the time to go backwards!

	CPU0				CPU1
	----				----
				trace_clock_global() {
				    ts = clock() [ 1000 ]
				    trylock(clock_lock) [ success ]
				    global_ts = ts; [ 1000 ]

				    <interrupted by NMI>
 trace_clock_global() {
    ts = clock() [ 999 ]
    if (ts < global_ts)
	ts = global_ts + 1 [ 1001 ]

    trylock(clock_lock) [ fail ]

    return ts [ 1001]
 }
				    unlock(clock_lock);
				    return ts; [ 1000 ]
				}

 trace_clock_global() {
    ts = clock() [ 1000 ]
    if (ts < global_ts) [ false 1000 == 1000 ]

    trylock(clock_lock) [ success ]
    global_ts = ts; [ 1000 ]
    unlock(clock_lock)

    return ts; [ 1000 ]
 }

The above case shows to reads of trace_clock_global() on the same CPU, but
the second read returns one less than the first read. That is, time when
backwards, and this is not what is allowed by trace_clock_global().

This was triggered by heavy tracing and the ring buffer checker that tests
for the clock going backwards:

 Ring buffer clock went backwards: 20613921464 -> 20613921463
 ------------[ cut here ]------------
 WARNING: CPU: 2 PID: 0 at kernel/trace/ring_buffer.c:3412 check_buffer+0x1b9/0x1c0
 Modules linked in:
 [..]
 [CPU: 2]TIME DOES NOT MATCH expected:20620711698 actual:20620711697 delta:6790234 before:20613921463 after:20613921463
   [20613915818] PAGE TIME STAMP
   [20613915818] delta:0
   [20613915819] delta:1
   [20613916035] delta:216
   [20613916465] delta:430
   [20613916575] delta:110
   [20613916749] delta:174
   [20613917248] delta:499
   [20613917333] delta:85
   [20613917775] delta:442
   [20613917921] delta:146
   [20613918321] delta:400
   [20613918568] delta:247
   [20613918768] delta:200
   [20613919306] delta:538
   [20613919353] delta:47
   [20613919980] delta:627
   [20613920296] delta:316
   [20613920571] delta:275
   [20613920862] delta:291
   [20613921152] delta:290
   [20613921464] delta:312
   [20613921464] delta:0 TIME EXTEND
   [20613921464] delta:0

This happened more than once, and always for an off by one result. It also
started happening after commit aafe104aa9 was added.

Cc: stable@vger.kernel.org
Fixes: aafe104aa9 ("tracing: Restructure trace_clock_global() to never block")
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-06-18 09:10:00 -04:00
Steven Rostedt (VMware)
4fdd595e4f tracing: Do not stop recording comms if the trace file is being read
A while ago, when the "trace" file was opened, tracing was stopped, and
code was added to stop recording the comms to saved_cmdlines, for mapping
of the pids to the task name.

Code has been added that only records the comm if a trace event occurred,
and there's no reason to not trace it if the trace file is opened.

Cc: stable@vger.kernel.org
Fixes: 7ffbd48d5c ("tracing: Cache comms only after an event occurred")
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-06-18 09:10:00 -04:00
Steven Rostedt (VMware)
85550c83da tracing: Do not stop recording cmdlines when tracing is off
The saved_cmdlines is used to map pids to the task name, such that the
output of the tracing does not just show pids, but also gives a human
readable name for the task.

If the name is not mapped, the output looks like this:

    <...>-1316          [005] ...2   132.044039: ...

Instead of this:

    gnome-shell-1316    [005] ...2   132.044039: ...

The names are updated when tracing is running, but are skipped if tracing
is stopped. Unfortunately, this stops the recording of the names if the
top level tracer is stopped, and not if there's other tracers active.

The recording of a name only happens when a new event is written into a
ring buffer, so there is no need to test if tracing is on or not. If
tracing is off, then no event is written and no need to test if tracing is
off or not.

Remove the check, as it hides the names of tasks for events in the
instance buffers.

Cc: stable@vger.kernel.org
Fixes: 7ffbd48d5c ("tracing: Cache comms only after an event occurred")
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-06-18 09:10:00 -04:00
Peter Zijlstra
fb780761e7 recordmcount: Correct st_shndx handling
One should only use st_shndx when >SHN_UNDEF and <SHN_LORESERVE. When
SHN_XINDEX, then use .symtab_shndx. Otherwise use 0.

This handles the case: st_shndx >= SHN_LORESERVE && st_shndx != SHN_XINDEX.

Link: https://lore.kernel.org/lkml/20210607023839.26387-1-mark-pk.tsai@mediatek.com/
Link: https://lkml.kernel.org/r/20210616154126.2794-1-mark-pk.tsai@mediatek.com

Reported-by: Mark-PK Tsai <mark-pk.tsai@mediatek.com>
Tested-by: Mark-PK Tsai <mark-pk.tsai@mediatek.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
[handle endianness of sym->st_shndx]
Signed-off-by: Mark-PK Tsai <mark-pk.tsai@mediatek.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-06-18 09:09:17 -04:00
Fabien Dessenne
67e2996f72 pinctrl: stm32: fix the reported number of GPIO lines per bank
Each GPIO bank supports a variable number of lines which is usually 16, but
is less in some cases : this is specified by the last argument of the
"gpio-ranges" bank node property.
Report to the framework, the actual number of lines, so the libgpiod
gpioinfo command lists the actually existing GPIO lines.

Fixes: 1dc9d28915 ("pinctrl: stm32: add possibility to use gpio-ranges to declare bank range")
Signed-off-by: Fabien Dessenne <fabien.dessenne@foss.st.com>
Link: https://lore.kernel.org/r/20210617144629.2557693-1-fabien.dessenne@foss.st.com
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2021-06-18 14:56:54 +02:00
Johannes Berg
652e8363bb mac80211: handle various extensible elements correctly
Various elements are parsed with a requirement to have an
exact size, when really we should only check that they have
the minimum size that we need. Check only that and therefore
ignore any additional data that they might carry.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Link: https://lore.kernel.org/r/iwlwifi.20210618133832.cd101f8040a4.Iadf0e9b37b100c6c6e79c7b298cc657c2be9151a@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2021-06-18 13:25:49 +02:00
Johannes Berg
bbc6f03ff2 mac80211: reset profile_periodicity/ema_ap
Apparently we never clear these values, so they'll remain set
since the setting of them is conditional. Clear the values in
the relevant other cases.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Link: https://lore.kernel.org/r/iwlwifi.20210618133832.316e32d136a9.I2a12e51814258e1e1b526103894f4b9f19a91c8d@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2021-06-18 13:25:49 +02:00
Avraham Stern
0288e5e16a cfg80211: avoid double free of PMSR request
If cfg80211_pmsr_process_abort() moves all the PMSR requests that
need to be freed into a local list before aborting and freeing them.
As a result, it is possible that cfg80211_pmsr_complete() will run in
parallel and free the same PMSR request.

Fix it by freeing the request in cfg80211_pmsr_complete() only if it
is still in the original pmsr list.

Cc: stable@vger.kernel.org
Fixes: 9bb7e0f24e ("cfg80211: add peer measurement with FTM initiator API")
Signed-off-by: Avraham Stern <avraham.stern@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Link: https://lore.kernel.org/r/iwlwifi.20210618133832.1fbef57e269a.I00294bebdb0680b892f8d1d5c871fd9dbe785a5e@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2021-06-18 13:25:24 +02:00
Johannes Berg
b5642479b0 cfg80211: make certificate generation more robust
If all net/wireless/certs/*.hex files are deleted, the build
will hang at this point since the 'cat' command will have no
arguments. Do "echo | cat - ..." so that even if the "..."
part is empty, the whole thing won't hang.

Cc: stable@vger.kernel.org
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Link: https://lore.kernel.org/r/iwlwifi.20210618133832.c989056c3664.Ic3b77531d00b30b26dcd69c64e55ae2f60c3f31e@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2021-06-18 13:25:15 +02:00
Felix Fietkau
1236af327a mac80211: minstrel_ht: fix sample time check
We need to skip sampling if the next sample time is after jiffies, not before.
This patch fixes an issue where in some cases only very little sampling (or none
at all) is performed, leading to really bad data rates

Fixes: 80d55154b2 ("mac80211: minstrel_ht: significantly redesign the rate probing strategy")
Cc: stable@vger.kernel.org
Signed-off-by: Felix Fietkau <nbd@nbd.name>
Link: https://lore.kernel.org/r/20210617103854.61875-1-nbd@nbd.name
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2021-06-18 11:35:29 +02:00
Andy Shevchenko
76b7f8fae3 pinctrl: microchip-sgpio: Put fwnode in error case during ->probe()
device_for_each_child_node() bumps a reference counting of a returned variable.
We have to balance it whenever we return to the caller.

Fixes: 7e5ea974e6 ("pinctrl: pinctrl-microchip-sgpio: Add pinctrl driver for Microsemi Serial GPIO")
Cc: Lars Povlsen <lars.povlsen@microchip.com>
Signed-off-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Link: https://lore.kernel.org/r/20210606191940.29312-1-andy.shevchenko@gmail.com
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2021-06-18 11:17:47 +02:00
Athira Rajeev
60b7ed54a4 powerpc/perf: Fix crash in perf_instruction_pointer() when ppmu is not set
On systems without any specific PMU driver support registered, running
perf record causes Oops.

The relevant portion from call trace:

  BUG: Kernel NULL pointer dereference on read at 0x00000040
  Faulting instruction address: 0xc0021f0c
  Oops: Kernel access of bad area, sig: 11 [#1]
  BE PAGE_SIZE=4K PREEMPT CMPCPRO
  SAF3000 DIE NOTIFICATION
  CPU: 0 PID: 442 Comm: null_syscall Not tainted 5.13.0-rc6-s3k-dev-01645-g7649ee3d2957 #5164
  NIP:  c0021f0c LR: c00e8ad8 CTR: c00d8a5c
  NIP perf_instruction_pointer+0x10/0x60
  LR  perf_prepare_sample+0x344/0x674
  Call Trace:
    perf_prepare_sample+0x7c/0x674 (unreliable)
    perf_event_output_forward+0x3c/0x94
    __perf_event_overflow+0x74/0x14c
    perf_swevent_hrtimer+0xf8/0x170
    __hrtimer_run_queues.constprop.0+0x160/0x318
    hrtimer_interrupt+0x148/0x3b0
    timer_interrupt+0xc4/0x22c
    Decrementer_virt+0xb8/0xbc

During perf record session, perf_instruction_pointer() is called to
capture the sample IP. This function in core-book3s accesses
ppmu->flags. If a platform specific PMU driver is not registered, ppmu
is set to NULL and accessing its members results in a crash. Fix this
crash by checking if ppmu is set.

Fixes: 2ca13a4cc5 ("powerpc/perf: Use regs->nip when SIAR is zero")
Cc: stable@vger.kernel.org # v5.11+
Reported-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Tested-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/1623952506-1431-1-git-send-email-atrajeev@linux.vnet.ibm.com
2021-06-18 16:30:36 +10:00
Dave Airlie
c55338d34c Merge tag 'amd-drm-fixes-5.13-2021-06-16' of https://gitlab.freedesktop.org/agd5f/linux into drm-fixes
amd-drm-fixes-5.13-2021-06-16:

amdgpu:
- GFX9 and 10 powergating fixes

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210616204913.4368-1-alexander.deucher@amd.com
2021-06-18 11:15:04 +10:00
Linus Torvalds
fd0aa1a456 Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull kvm fixes from Paolo Bonzini:
 "Miscellaneous bugfixes.

  The main interesting one is a NULL pointer dereference reported by
  syzkaller ("KVM: x86: Immediately reset the MMU context when the SMM
  flag is cleared")"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: selftests: Fix kvm_check_cap() assertion
  KVM: x86/mmu: Calculate and check "full" mmu_role for nested MMU
  KVM: X86: Fix x86_emulator slab cache leak
  KVM: SVM: Call SEV Guest Decommission if ASID binding fails
  KVM: x86: Immediately reset the MMU context when the SMM flag is cleared
  KVM: x86: Fix fall-through warnings for Clang
  KVM: SVM: fix doc warnings
  KVM: selftests: Fix compiling errors when initializing the static structure
  kvm: LAPIC: Restore guard to prevent illegal APIC register access
2021-06-17 13:14:53 -07:00
Kees Cook
1c200f832e net: qed: Fix memcpy() overflow of qed_dcbx_params()
The source (&dcbx_info->operational.params) and dest
(&p_hwfn->p_dcbx_info->set.config.params) are both struct qed_dcbx_params
(560 bytes), not struct qed_dcbx_admin_params (564 bytes), which is used
as the memcpy() size.

However it seems that struct qed_dcbx_operational_params
(dcbx_info->operational)'s layout matches struct qed_dcbx_admin_params
(p_hwfn->p_dcbx_info->set.config)'s 4 byte difference (3 padding, 1 byte
for "valid").

On the assumption that the size is wrong (rather than the source structure
type), adjust the memcpy() size argument to be 4 bytes smaller and add
a BUILD_BUG_ON() to validate any changes to the structure sizes.

Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-17 12:14:51 -07:00
Linyu Yuan
c3b26fdf1b net: cdc_eem: fix tx fixup skb leak
when usbnet transmit a skb, eem fixup it in eem_tx_fixup(),
if skb_copy_expand() failed, it return NULL,
usbnet_start_xmit() will have no chance to free original skb.

fix it by free orginal skb in eem_tx_fixup() first,
then check skb clone status, if failed, return NULL to usbnet.

Fixes: 9f722c0978 ("usbnet: CDC EEM support (v5)")
Signed-off-by: Linyu Yuan <linyyuan@codeaurora.org>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-17 11:30:25 -07:00
David S. Miller
bc39f6792e Merge tag 'mlx5-fixes-2021-06-16' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux
Saeed Mahameed says:

====================
mlx5 fixes 2021-06-16

This series introduces some fixes to mlx5 driver.
Please pull and let me know if there is any problem.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-17 11:26:30 -07:00
Pavel Skripkin
7edcc68230 net: hamradio: fix memory leak in mkiss_close
My local syzbot instance hit memory leak in
mkiss_open()[1]. The problem was in missing
free_netdev() in mkiss_close().

In mkiss_open() netdevice is allocated and then
registered, but in mkiss_close() netdevice was
only unregistered, but not freed.

Fail log:

BUG: memory leak
unreferenced object 0xffff8880281ba000 (size 4096):
  comm "syz-executor.1", pid 11443, jiffies 4295046091 (age 17.660s)
  hex dump (first 32 bytes):
    61 78 30 00 00 00 00 00 00 00 00 00 00 00 00 00  ax0.............
    00 27 fa 2a 80 88 ff ff 00 00 00 00 00 00 00 00  .'.*............
  backtrace:
    [<ffffffff81a27201>] kvmalloc_node+0x61/0xf0
    [<ffffffff8706e7e8>] alloc_netdev_mqs+0x98/0xe80
    [<ffffffff84e64192>] mkiss_open+0xb2/0x6f0 [1]
    [<ffffffff842355db>] tty_ldisc_open+0x9b/0x110
    [<ffffffff84236488>] tty_set_ldisc+0x2e8/0x670
    [<ffffffff8421f7f3>] tty_ioctl+0xda3/0x1440
    [<ffffffff81c9f273>] __x64_sys_ioctl+0x193/0x200
    [<ffffffff8911263a>] do_syscall_64+0x3a/0xb0
    [<ffffffff89200068>] entry_SYSCALL_64_after_hwframe+0x44/0xae

BUG: memory leak
unreferenced object 0xffff8880141a9a00 (size 96):
  comm "syz-executor.1", pid 11443, jiffies 4295046091 (age 17.660s)
  hex dump (first 32 bytes):
    e8 a2 1b 28 80 88 ff ff e8 a2 1b 28 80 88 ff ff  ...(.......(....
    98 92 9c aa b0 40 02 00 00 00 00 00 00 00 00 00  .....@..........
  backtrace:
    [<ffffffff8709f68b>] __hw_addr_create_ex+0x5b/0x310
    [<ffffffff8709fb38>] __hw_addr_add_ex+0x1f8/0x2b0
    [<ffffffff870a0c7b>] dev_addr_init+0x10b/0x1f0
    [<ffffffff8706e88b>] alloc_netdev_mqs+0x13b/0xe80
    [<ffffffff84e64192>] mkiss_open+0xb2/0x6f0 [1]
    [<ffffffff842355db>] tty_ldisc_open+0x9b/0x110
    [<ffffffff84236488>] tty_set_ldisc+0x2e8/0x670
    [<ffffffff8421f7f3>] tty_ioctl+0xda3/0x1440
    [<ffffffff81c9f273>] __x64_sys_ioctl+0x193/0x200
    [<ffffffff8911263a>] do_syscall_64+0x3a/0xb0
    [<ffffffff89200068>] entry_SYSCALL_64_after_hwframe+0x44/0xae

BUG: memory leak
unreferenced object 0xffff8880219bfc00 (size 512):
  comm "syz-executor.1", pid 11443, jiffies 4295046091 (age 17.660s)
  hex dump (first 32 bytes):
    00 a0 1b 28 80 88 ff ff 80 8f b1 8d ff ff ff ff  ...(............
    80 8f b1 8d ff ff ff ff 00 00 00 00 00 00 00 00  ................
  backtrace:
    [<ffffffff81a27201>] kvmalloc_node+0x61/0xf0
    [<ffffffff8706eec7>] alloc_netdev_mqs+0x777/0xe80
    [<ffffffff84e64192>] mkiss_open+0xb2/0x6f0 [1]
    [<ffffffff842355db>] tty_ldisc_open+0x9b/0x110
    [<ffffffff84236488>] tty_set_ldisc+0x2e8/0x670
    [<ffffffff8421f7f3>] tty_ioctl+0xda3/0x1440
    [<ffffffff81c9f273>] __x64_sys_ioctl+0x193/0x200
    [<ffffffff8911263a>] do_syscall_64+0x3a/0xb0
    [<ffffffff89200068>] entry_SYSCALL_64_after_hwframe+0x44/0xae

BUG: memory leak
unreferenced object 0xffff888029b2b200 (size 256):
  comm "syz-executor.1", pid 11443, jiffies 4295046091 (age 17.660s)
  hex dump (first 32 bytes):
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  backtrace:
    [<ffffffff81a27201>] kvmalloc_node+0x61/0xf0
    [<ffffffff8706f062>] alloc_netdev_mqs+0x912/0xe80
    [<ffffffff84e64192>] mkiss_open+0xb2/0x6f0 [1]
    [<ffffffff842355db>] tty_ldisc_open+0x9b/0x110
    [<ffffffff84236488>] tty_set_ldisc+0x2e8/0x670
    [<ffffffff8421f7f3>] tty_ioctl+0xda3/0x1440
    [<ffffffff81c9f273>] __x64_sys_ioctl+0x193/0x200
    [<ffffffff8911263a>] do_syscall_64+0x3a/0xb0
    [<ffffffff89200068>] entry_SYSCALL_64_after_hwframe+0x44/0xae

Fixes: 815f62bf74 ("[PATCH] SMP rewrite of mkiss")
Signed-off-by: Pavel Skripkin <paskripkin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-17 11:24:54 -07:00
Christophe JAILLET
c19c8c0e66 be2net: Fix an error handling path in 'be_probe()'
If an error occurs after a 'pci_enable_pcie_error_reporting()' call, it
must be undone by a corresponding 'pci_disable_pcie_error_reporting()'
call, as already done in the remove function.

Fixes: d6b6d98778 ("be2net: use PCIe AER capability")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Acked-by: Somnath Kotur <somnath.kotur@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-17 11:24:06 -07:00
Fuad Tabba
d8ac05ea13 KVM: selftests: Fix kvm_check_cap() assertion
KVM_CHECK_EXTENSION ioctl can return any negative value on error,
and not necessarily -1. Change the assertion to reflect that.

Signed-off-by: Fuad Tabba <tabba@google.com>
Message-Id: <20210615150443.1183365-1-tabba@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-06-17 13:06:57 -04:00
Linus Torvalds
39519f6a56 Merge tag 'fixes_for_v5.13-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs
Pull quota and fanotify fixes from Jan Kara:
 "A fixup finishing disabling of quotactl_path() syscall (I've missed
  archs using different way to declare syscalls) and a fix of an fd leak
  in error handling path of fanotify"

* tag 'fixes_for_v5.13-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
  quota: finish disable quotactl_path syscall
  fanotify: fix copy_event_to_user() fid error clean up
2021-06-17 09:49:48 -07:00
Andrew Lunn
a7d8d1c7a7 usb: core: hub: Disable autosuspend for Cypress CY7C65632
The Cypress CY7C65632 appears to have an issue with auto suspend and
detecting devices, not too dissimilar to the SMSC 5534B hub. It is
easiest to reproduce by connecting multiple mass storage devices to
the hub at the same time. On a Lenovo Yoga, around 1 in 3 attempts
result in the devices not being detected. It is however possible to
make them appear using lsusb -v.

Disabling autosuspend for this hub resolves the issue.

Fixes: 1208f9e1d7 ("USB: hub: Fix the broken detection of USB3 device in SMSC hub")
Cc: stable@vger.kernel.org
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Link: https://lore.kernel.org/r/20210614155524.2228800-1-andrew@lunn.ch
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-06-17 15:34:21 +02:00
Thomas Gleixner
a13d0f8d11 Merge tag 'irqchip-fixes-5.13-2' of git://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms into irq/urgent
Pull irqchip fixes from Marc Zyngier:

- Fix GICv3 NMI handling where an IRQ could be mistakenly handled
  as a NMI, with disatrous effects

Link: https://lore.kernel.org/r/20210610171127.2404752-1-maz@kernel.org
2021-06-17 15:22:31 +02:00
Naohiro Aota
f9f28e5bd0 btrfs: zoned: fix negative space_info->bytes_readonly
Consider we have a using block group on zoned btrfs.

|<- ZU ->|<- used ->|<---free--->|
                     `- Alloc offset
ZU: Zone unusable

Marking the block group read-only will migrate the zone unusable bytes
to the read-only bytes. So, we will have this.

|<- RO ->|<- used ->|<--- RO --->|

RO: Read only

When marking it back to read-write, btrfs_dec_block_group_ro()
subtracts the above "RO" bytes from the
space_info->bytes_readonly. And, it moves the zone unusable bytes back
and again subtracts those bytes from the space_info->bytes_readonly,
leading to negative bytes_readonly.

This can be observed in the output as eg.:

  Data, single: total=512.00MiB, used=165.21MiB, zone_unusable=16.00EiB
  Data, single: total=536870912, used=173256704, zone_unusable=18446744073603186688

This commit fixes the issue by reordering the operations.

Link: https://github.com/naota/linux/issues/37
Reported-by: David Sterba <dsterba@suse.com>
Fixes: 169e0da91a ("btrfs: zoned: track unusable bytes for zones")
CC: stable@vger.kernel.org # 5.12+
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-06-17 11:12:14 +02:00
Aya Levin
0232fc2ddc net/mlx5: Reset mkey index on creation
Reset only the index part of the mkey and keep the variant part. On
devlink reload, driver recreates mkeys, so the mkey index may change.
Trying to preserve the variant part of the mkey, driver mistakenly
merged the mkey index with current value. In case of a devlink reload,
current value of index part is dirty, so the index may be corrupted.

Fixes: 54c62e13ad ("{IB,net}/mlx5: Setup mkey variant before mr create command invocation")
Signed-off-by: Aya Levin <ayal@nvidia.com>
Signed-off-by: Amir Tzin <amirtz@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-06-16 15:36:49 -07:00
Dmytro Linkin
a5ae8fc905 net/mlx5e: Don't create devices during unload flow
Running devlink reload command for port in switchdev mode cause
resources to corrupt: driver can't release allocated EQ and reclaim
memory pages, because "rdma" auxiliary device had add CQs which blocks
EQ from deletion.
Erroneous sequence happens during reload-down phase, and is following:

1. detach device - suspends auxiliary devices which support it, destroys
   others. During this step "eth-rep" and "rdma-rep" are destroyed,
   "eth" - suspended.
2. disable SRIOV - moves device to legacy mode; as part of disablement -
   rescans drivers. This step adds "rdma" auxiliary device.
3. destroy EQ table - <failure>.

Driver shouldn't create any device during unload flows. To handle that
implement MLX5_PRIV_FLAGS_DETACH flag, set it on device detach and unset
on device attach. If flag is set do no-op on drivers rescan.

Fixes: a925b5e309 ("net/mlx5: Register mlx5 devices to auxiliary virtual bus")
Signed-off-by: Dmytro Linkin <dlinkin@nvidia.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-06-16 15:36:47 -07:00
Alex Vesker
65fb7d109a net/mlx5: DR, Fix STEv1 incorrect L3 decapsulation padding
Decapsulation L3 on small inner packets which are less than
64 Bytes was done incorrectly. In small packets there is an
extra padding added in L2 which should not be included in L3
length. The issue was that after decapL3 the extra L2 padding
caused an update on the L3 length.

To avoid this issue the new header is pushed to the beginning
of the packet (offset 0) which should not cause a HW reparse
and update the L3 length.

Fixes: c349b4137c ("net/mlx5: DR, Add STEv1 modify header logic")
Reviewed-by: Erez Shitrit <erezsh@nvidia.com>
Reviewed-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Signed-off-by: Alex Vesker <valex@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-06-16 15:36:45 -07:00
Parav Pandit
c7d6c19b3b net/mlx5: SF_DEV, remove SF device on invalid state
When auxiliary bus autoprobe is disabled and SF is in ACTIVE state,
on SF port deletion it transitions from ACTIVE->ALLOCATED->INVALID.

When VHCA event handler queries the state, it is already transition
to INVALID state.

In this scenario, event handler missed to delete the SF device.

Fix it by deleting the SF when SF state is INVALID.

Fixes: 90d010b863 ("net/mlx5: SF, Add auxiliary device support")
Signed-off-by: Parav Pandit <parav@nvidia.com>
Reviewed-by: Vu Pham <vuhuong@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-06-16 15:36:42 -07:00
Parav Pandit
ca36fc4d77 net/mlx5: E-Switch, Allow setting GUID for host PF vport
E-switch should be able to set the GUID of host PF vport.
Currently it returns an error. This results in below error
when user attempts to configure MAC address of the PF of an
external controller.

$ devlink port function set pci/0000:03:00.0/196608 \
   hw_addr 00:00:00:11:22:33

mlx5_core 0000:03:00.0: mlx5_esw_set_vport_mac_locked:1876:(pid 6715):\
"Failed to set vport 0 node guid, err = -22.
RDMA_CM will not function properly for this VF."

Check for zero vport is no longer needed.

Fixes: 330077d14d ("net/mlx5: E-switch, Supporting setting devlink port function mac address")
Signed-off-by: Yuval Avnery <yuvalav@nvidia.com>
Signed-off-by: Parav Pandit <parav@nvidia.com>
Reviewed-by: Bodong Wang <bodong@nvidia.com>
Reviewed-by: Alaa Hleihel <alaa@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-06-16 15:36:40 -07:00
Parav Pandit
bbc8222dc4 net/mlx5: E-Switch, Read PF mac address
External controller PF's MAC address is not read from the device during
vport setup. Fail to read this results in showing all zeros to user
while the factory programmed MAC is a valid value.

$ devlink port show eth1 -jp
{
    "port": {
        "pci/0000:03:00.0/196608": {
            "type": "eth",
            "netdev": "eth1",
            "flavour": "pcipf",
            "controller": 1,
            "pfnum": 0,
            "splittable": false,
            "function": {
                "hw_addr": "00:00:00:00:00:00"
            }
        }
    }
}

Hence, read it when enabling a vport.

After the fix,

$ devlink port show eth1 -jp
{
    "port": {
        "pci/0000:03:00.0/196608": {
            "type": "eth",
            "netdev": "eth1",
            "flavour": "pcipf",
            "controller": 1,
            "pfnum": 0,
            "splittable": false,
            "function": {
                "hw_addr": "98:03:9b:a0:60:11"
            }
        }
    }
}

Fixes: f099fde16d ("net/mlx5: E-switch, Support querying port function mac address")
Signed-off-by: Bodong Wang <bodong@nvidia.com>
Signed-off-by: Parav Pandit <parav@nvidia.com>
Reviewed-by: Alaa Hleihel <alaa@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-06-16 15:36:38 -07:00
Leon Romanovsky
2058cc9c80 net/mlx5: Check that driver was probed prior attaching the device
The device can be requested to be attached despite being not probed.
This situation is possible if devlink reload races with module removal,
and the following kernel panic is an outcome of such race.

 mlx5_core 0000:00:09.0: firmware version: 4.7.9999
 mlx5_core 0000:00:09.0: 0.000 Gb/s available PCIe bandwidth (8.0 GT/s PCIe x255 link)
 BUG: unable to handle page fault for address: fffffffffffffff0
 #PF: supervisor read access in kernel mode
 #PF: error_code(0x0000) - not-present page
 PGD 3218067 P4D 3218067 PUD 321a067 PMD 0
 Oops: 0000 [#1] SMP KASAN NOPTI
 CPU: 7 PID: 250 Comm: devlink Not tainted 5.12.0-rc2+ #2836
 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
 RIP: 0010:mlx5_attach_device+0x80/0x280 [mlx5_core]
 Code: f8 48 c1 e8 03 42 80 3c 38 00 0f 85 80 01 00 00 48 8b 45 68 48 8d 78 f0 48 89 fe 48 c1 ee 03 42 80 3c 3e 00 0f 85 70 01 00 00 <48> 8b 40 f0 48 85 c0 74 0d 48 89 ef ff d0 85 c0 0f 85 84 05 0e 00
 RSP: 0018:ffff8880129675f0 EFLAGS: 00010246
 RAX: 0000000000000000 RBX: 0000000000000001 RCX: ffffffff827407f1
 RDX: 1ffff110011336cf RSI: 1ffffffffffffffe RDI: fffffffffffffff0
 RBP: ffff888008e0c000 R08: 0000000000000008 R09: ffffffffa0662ee7
 R10: fffffbfff40cc5dc R11: 0000000000000000 R12: ffff88800ea002e0
 R13: ffffed1001d459f7 R14: ffffffffa05ef4f8 R15: dffffc0000000000
 FS:  00007f51dfeaf740(0000) GS:ffff88806d5c0000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: fffffffffffffff0 CR3: 000000000bc82006 CR4: 0000000000370ea0
 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
 Call Trace:
  mlx5_load_one+0x117/0x1d0 [mlx5_core]
  devlink_reload+0x2d5/0x520
  ? devlink_remote_reload_actions_performed+0x30/0x30
  ? mutex_trylock+0x24b/0x2d0
  ? devlink_nl_cmd_reload+0x62b/0x1070
  devlink_nl_cmd_reload+0x66d/0x1070
  ? devlink_reload+0x520/0x520
  ? devlink_nl_pre_doit+0x64/0x4d0
  genl_family_rcv_msg_doit+0x1e9/0x2f0
  ? mutex_lock_io_nested+0x1130/0x1130
  ? genl_family_rcv_msg_attrs_parse.constprop.0+0x240/0x240
  ? security_capable+0x51/0x90
  genl_rcv_msg+0x27f/0x4a0
  ? genl_get_cmd+0x3c0/0x3c0
  ? lock_acquire+0x1a9/0x6d0
  ? devlink_reload+0x520/0x520
  ? lock_release+0x6c0/0x6c0
  netlink_rcv_skb+0x11d/0x340
  ? genl_get_cmd+0x3c0/0x3c0
  ? netlink_ack+0x9f0/0x9f0
  ? lock_release+0x1f9/0x6c0
  genl_rcv+0x24/0x40
  netlink_unicast+0x433/0x700
  ? netlink_attachskb+0x730/0x730
  ? _copy_from_iter_full+0x178/0x650
  ? __alloc_skb+0x113/0x2b0
  netlink_sendmsg+0x6f1/0xbd0
  ? netlink_unicast+0x700/0x700
  ? netlink_unicast+0x700/0x700
  sock_sendmsg+0xb0/0xe0
  __sys_sendto+0x193/0x240
  ? __x64_sys_getpeername+0xb0/0xb0
  ? copy_page_range+0x2300/0x2300
  ? __up_read+0x1a1/0x7b0
  ? do_user_addr_fault+0x219/0xdc0
  __x64_sys_sendto+0xdd/0x1b0
  ? syscall_enter_from_user_mode+0x1d/0x50
  do_syscall_64+0x2d/0x40
  entry_SYSCALL_64_after_hwframe+0x44/0xae
 RIP: 0033:0x7f51dffb514a
 Code: d8 64 89 02 48 c7 c0 ff ff ff ff eb b8 0f 1f 00 f3 0f 1e fa 41 89 ca 64 8b 04 25 18 00 00 00 85 c0 75 15 b8 2c 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 76 c3 0f 1f 44 00 00 55 48 83 ec 30 44 89 4c
 RSP: 002b:00007ffcaef22e78 EFLAGS: 00000246 ORIG_RAX: 000000000000002c
 RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00007f51dffb514a
 RDX: 0000000000000030 RSI: 000055750daf2440 RDI: 0000000000000003
 RBP: 000055750daf2410 R08: 00007f51e0081200 R09: 000000000000000c
 R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
 R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
 Modules linked in: mlx5_core(-) ptp pps_core ib_ipoib rdma_ucm rdma_cm iw_cm ib_cm ib_umad ib_uverbs ib_core [last unloaded: mlx5_ib]
 CR2: fffffffffffffff0
 ---[ end trace 7789831bfe74fa42 ]---

Fixes: a925b5e309 ("net/mlx5: Register mlx5 devices to auxiliary virtual bus")
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Reviewed-by: Parav Pandit <parav@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-06-16 15:36:35 -07:00
Leon Romanovsky
94a4b8414d net/mlx5: Fix error path for set HCA defaults
In the case of the failure to execute mlx5_core_set_hca_defaults(),
we used wrong goto label to execute error unwind flow.

Fixes: 5bef709d76 ("net/mlx5: Enable host PF HCA after eswitch is initialized")
Reviewed-by: Saeed Mahameed <saeedm@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Reviewed-by: Parav Pandit <parav@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-06-16 15:36:32 -07:00
Harald Freudenberger
e73a99f328 s390/ap: Fix hanging ioctl caused by wrong msg counter
When a AP queue is switched to soft offline, all pending
requests are purged out of the pending requests list and
'received' by the upper layer like zcrypt device drivers.
This is also done for requests which are already enqueued
into the firmware queue. A request in a firmware queue
may eventually produce an response message, but there is
no waiting process any more. However, the response was
counted with the queue_counter and as this counter was
reset to 0 with the offline switch, the pending response
caused the queue_counter to get negative. The next request
increased this counter to 0 (instead of 1) which caused
the ap code to assume there is nothing to receive and so
the response for this valid request was never tried to
fetch from the firmware queue.

This all caused a queue to not work properly after a
switch offline/online and in the end processes to hang
forever when trying to send a crypto request after an
queue offline/online switch cicle.

Fixed by a) making sure the counter does not drop below 0
and b) on a successful enqueue of a message has at least
a value of 1.

Additionally a warning is emitted, when a reply can't get
assigned to a waiting process. This may be normal operation
(process had timeout or has been killed) but may give a
hint that something unexpected happened (like this odd
behavior described above).

Signed-off-by: Harald Freudenberger <freude@linux.ibm.com>
Cc: stable@vger.kernel.org
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2021-06-16 23:32:02 +02:00
Yifan Zhang
1c0b0efd14 drm/amdgpu/gfx10: enlarge CP_MEC_DOORBELL_RANGE_UPPER to cover full doorbell.
If GC has entered CGPG, ringing doorbell > first page doesn't wakeup GC.
Enlarge CP_MEC_DOORBELL_RANGE_UPPER to workaround this issue.

Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2021-06-16 16:04:20 -04:00
Yifan Zhang
4cbbe34807 drm/amdgpu/gfx9: fix the doorbell missing when in CGPG issue.
If GC has entered CGPG, ringing doorbell > first page doesn't wakeup GC.
Enlarge CP_MEC_DOORBELL_RANGE_UPPER to workaround this issue.

Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2021-06-16 16:04:20 -04:00
Kees Cook
da5ac772cf r8169: Avoid memcpy() over-reading of ETH_SS_STATS
In preparation for FORTIFY_SOURCE performing compile-time and run-time
field bounds checking for memcpy(), memmove(), and memset(), avoid
intentionally reading across neighboring array fields.

The memcpy() is copying the entire structure, not just the first array.
Adjust the source argument so the compiler can do appropriate bounds
checking.

Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-16 13:02:07 -07:00
Kees Cook
224004fbb0 sh_eth: Avoid memcpy() over-reading of ETH_SS_STATS
In preparation for FORTIFY_SOURCE performing compile-time and run-time
field bounds checking for memcpy(), memmove(), and memset(), avoid
intentionally reading across neighboring array fields.

The memcpy() is copying the entire structure, not just the first array.
Adjust the source argument so the compiler can do appropriate bounds
checking.

Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-16 13:02:07 -07:00
Kees Cook
99718abdc0 r8152: Avoid memcpy() over-reading of ETH_SS_STATS
In preparation for FORTIFY_SOURCE performing compile-time and run-time
field bounds checking for memcpy(), memmove(), and memset(), avoid
intentionally reading across neighboring array fields.

The memcpy() is copying the entire structure, not just the first array.
Adjust the source argument so the compiler can do appropriate bounds
checking.

Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-16 13:02:06 -07:00
Andrea Righi
1b29df0e2e selftests: net: use bash to run udpgro_fwd test case
udpgro_fwd.sh contains many bash specific operators ("[[", "local -r"),
but it's using /bin/sh; in some distro /bin/sh is mapped to /bin/dash,
that doesn't support such operators.

Force the test to use /bin/bash explicitly and prevent false positive
test failures.

Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-16 12:56:10 -07:00
Eric Dumazet
a494bd642d net/af_unix: fix a data-race in unix_dgram_sendmsg / unix_release_sock
While unix_may_send(sk, osk) is called while osk is locked, it appears
unix_release_sock() can overwrite unix_peer() after this lock has been
released, making KCSAN unhappy.

Changing unix_release_sock() to access/change unix_peer()
before lock is released should fix this issue.

BUG: KCSAN: data-race in unix_dgram_sendmsg / unix_release_sock

write to 0xffff88810465a338 of 8 bytes by task 20852 on cpu 1:
 unix_release_sock+0x4ed/0x6e0 net/unix/af_unix.c:558
 unix_release+0x2f/0x50 net/unix/af_unix.c:859
 __sock_release net/socket.c:599 [inline]
 sock_close+0x6c/0x150 net/socket.c:1258
 __fput+0x25b/0x4e0 fs/file_table.c:280
 ____fput+0x11/0x20 fs/file_table.c:313
 task_work_run+0xae/0x130 kernel/task_work.c:164
 tracehook_notify_resume include/linux/tracehook.h:189 [inline]
 exit_to_user_mode_loop kernel/entry/common.c:175 [inline]
 exit_to_user_mode_prepare+0x156/0x190 kernel/entry/common.c:209
 __syscall_exit_to_user_mode_work kernel/entry/common.c:291 [inline]
 syscall_exit_to_user_mode+0x20/0x40 kernel/entry/common.c:302
 do_syscall_64+0x56/0x90 arch/x86/entry/common.c:57
 entry_SYSCALL_64_after_hwframe+0x44/0xae

read to 0xffff88810465a338 of 8 bytes by task 20888 on cpu 0:
 unix_may_send net/unix/af_unix.c:189 [inline]
 unix_dgram_sendmsg+0x923/0x1610 net/unix/af_unix.c:1712
 sock_sendmsg_nosec net/socket.c:654 [inline]
 sock_sendmsg net/socket.c:674 [inline]
 ____sys_sendmsg+0x360/0x4d0 net/socket.c:2350
 ___sys_sendmsg net/socket.c:2404 [inline]
 __sys_sendmmsg+0x315/0x4b0 net/socket.c:2490
 __do_sys_sendmmsg net/socket.c:2519 [inline]
 __se_sys_sendmmsg net/socket.c:2516 [inline]
 __x64_sys_sendmmsg+0x53/0x60 net/socket.c:2516
 do_syscall_64+0x4a/0x90 arch/x86/entry/common.c:47
 entry_SYSCALL_64_after_hwframe+0x44/0xae

value changed: 0xffff888167905400 -> 0x0000000000000000

Reported by Kernel Concurrency Sanitizer on:
CPU: 0 PID: 20888 Comm: syz-executor.0 Not tainted 5.13.0-rc5-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011

Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-16 12:51:55 -07:00
Andrea Righi
0fd158b89b selftests: net: veth: make test compatible with dash
veth.sh is a shell script that uses /bin/sh; some distro (Ubuntu for
example) use dash as /bin/sh and in this case the test reports the
following error:

 # ./veth.sh: 21: local: -r: bad variable name
 # ./veth.sh: 21: local: -r: bad variable name

This happens because dash doesn't support the option "-r" with local.

Moreover, in case of missing bpf object, the script is exiting -1, that
is an illegal number for dash:

 exit: Illegal number: -1

Change the script to be compatible both with bash and dash and prevent
the errors above.

Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-16 12:50:24 -07:00
David S. Miller
1d2ac2033d Merge branch 'net-packet-data-races'
Eric Dumazet says:

====================
net/packet: annotate data races

KCSAN sent two reports about data races in af_packet.
Nothing serious, but worth fixing.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-16 12:48:18 -07:00
Eric Dumazet
e032f7c9c7 net/packet: annotate accesses to po->ifindex
Like prior patch, we need to annotate lockless accesses to po->ifindex
For instance, packet_getname() is reading po->ifindex (twice) while
another thread is able to change po->ifindex.

KCSAN reported:

BUG: KCSAN: data-race in packet_do_bind / packet_getname

write to 0xffff888143ce3cbc of 4 bytes by task 25573 on cpu 1:
 packet_do_bind+0x420/0x7e0 net/packet/af_packet.c:3191
 packet_bind+0xc3/0xd0 net/packet/af_packet.c:3255
 __sys_bind+0x200/0x290 net/socket.c:1637
 __do_sys_bind net/socket.c:1648 [inline]
 __se_sys_bind net/socket.c:1646 [inline]
 __x64_sys_bind+0x3d/0x50 net/socket.c:1646
 do_syscall_64+0x4a/0x90 arch/x86/entry/common.c:47
 entry_SYSCALL_64_after_hwframe+0x44/0xae

read to 0xffff888143ce3cbc of 4 bytes by task 25578 on cpu 0:
 packet_getname+0x5b/0x1a0 net/packet/af_packet.c:3525
 __sys_getsockname+0x10e/0x1a0 net/socket.c:1887
 __do_sys_getsockname net/socket.c:1902 [inline]
 __se_sys_getsockname net/socket.c:1899 [inline]
 __x64_sys_getsockname+0x3e/0x50 net/socket.c:1899
 do_syscall_64+0x4a/0x90 arch/x86/entry/common.c:47
 entry_SYSCALL_64_after_hwframe+0x44/0xae

value changed: 0x00000000 -> 0x00000001

Reported by Kernel Concurrency Sanitizer on:
CPU: 0 PID: 25578 Comm: syz-executor.5 Not tainted 5.13.0-rc6-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-16 12:48:18 -07:00
Eric Dumazet
c7d2ef5dd4 net/packet: annotate accesses to po->bind
tpacket_snd(), packet_snd(), packet_getname() and packet_seq_show()
can read po->num without holding a lock. This means other threads
can change po->num at the same time.

KCSAN complained about this known fact [1]
Add READ_ONCE()/WRITE_ONCE() to address the issue.

[1] BUG: KCSAN: data-race in packet_do_bind / packet_sendmsg

write to 0xffff888131a0dcc0 of 2 bytes by task 24714 on cpu 0:
 packet_do_bind+0x3ab/0x7e0 net/packet/af_packet.c:3181
 packet_bind+0xc3/0xd0 net/packet/af_packet.c:3255
 __sys_bind+0x200/0x290 net/socket.c:1637
 __do_sys_bind net/socket.c:1648 [inline]
 __se_sys_bind net/socket.c:1646 [inline]
 __x64_sys_bind+0x3d/0x50 net/socket.c:1646
 do_syscall_64+0x4a/0x90 arch/x86/entry/common.c:47
 entry_SYSCALL_64_after_hwframe+0x44/0xae

read to 0xffff888131a0dcc0 of 2 bytes by task 24719 on cpu 1:
 packet_snd net/packet/af_packet.c:2899 [inline]
 packet_sendmsg+0x317/0x3570 net/packet/af_packet.c:3040
 sock_sendmsg_nosec net/socket.c:654 [inline]
 sock_sendmsg net/socket.c:674 [inline]
 ____sys_sendmsg+0x360/0x4d0 net/socket.c:2350
 ___sys_sendmsg net/socket.c:2404 [inline]
 __sys_sendmsg+0x1ed/0x270 net/socket.c:2433
 __do_sys_sendmsg net/socket.c:2442 [inline]
 __se_sys_sendmsg net/socket.c:2440 [inline]
 __x64_sys_sendmsg+0x42/0x50 net/socket.c:2440
 do_syscall_64+0x4a/0x90 arch/x86/entry/common.c:47
 entry_SYSCALL_64_after_hwframe+0x44/0xae

value changed: 0x0000 -> 0x1200

Reported by Kernel Concurrency Sanitizer on:
CPU: 1 PID: 24719 Comm: syz-executor.5 Not tainted 5.13.0-rc4-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-16 12:48:18 -07:00
David S. Miller
e82a35aead Merge tag 'linux-can-fixes-for-5.13-20210616' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can
Marc Kleine-Budde says:

====================
pull-request: can 2021-06-16

this is a pull request of 4 patches for net/master.

The first patch is by Oleksij Rempel and fixes a Use-after-Free found
by syzbot in the j1939 stack.

The next patch is by Tetsuo Handa and fixes hung task detected by
syzbot in the bcm, raw and isotp protocols.

Norbert Slusarek's patch fixes a infoleak in bcm's struct
bcm_msg_head.

Pavel Skripkin's patch fixes a memory leak in the mcba_usb driver.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-16 12:44:11 -07:00
Chengyang Fan
d8e2973029 net: ipv4: fix memory leak in ip_mc_add1_src
BUG: memory leak
unreferenced object 0xffff888101bc4c00 (size 32):
  comm "syz-executor527", pid 360, jiffies 4294807421 (age 19.329s)
  hex dump (first 32 bytes):
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
    01 00 00 00 00 00 00 00 ac 14 14 bb 00 00 02 00 ................
  backtrace:
    [<00000000f17c5244>] kmalloc include/linux/slab.h:558 [inline]
    [<00000000f17c5244>] kzalloc include/linux/slab.h:688 [inline]
    [<00000000f17c5244>] ip_mc_add1_src net/ipv4/igmp.c:1971 [inline]
    [<00000000f17c5244>] ip_mc_add_src+0x95f/0xdb0 net/ipv4/igmp.c:2095
    [<000000001cb99709>] ip_mc_source+0x84c/0xea0 net/ipv4/igmp.c:2416
    [<0000000052cf19ed>] do_ip_setsockopt net/ipv4/ip_sockglue.c:1294 [inline]
    [<0000000052cf19ed>] ip_setsockopt+0x114b/0x30c0 net/ipv4/ip_sockglue.c:1423
    [<00000000477edfbc>] raw_setsockopt+0x13d/0x170 net/ipv4/raw.c:857
    [<00000000e75ca9bb>] __sys_setsockopt+0x158/0x270 net/socket.c:2117
    [<00000000bdb993a8>] __do_sys_setsockopt net/socket.c:2128 [inline]
    [<00000000bdb993a8>] __se_sys_setsockopt net/socket.c:2125 [inline]
    [<00000000bdb993a8>] __x64_sys_setsockopt+0xba/0x150 net/socket.c:2125
    [<000000006a1ffdbd>] do_syscall_64+0x40/0x80 arch/x86/entry/common.c:47
    [<00000000b11467c4>] entry_SYSCALL_64_after_hwframe+0x44/0xae

In commit 24803f38a5 ("igmp: do not remove igmp souce list info when set
link down"), the ip_mc_clear_src() in ip_mc_destroy_dev() was removed,
because it was also called in igmpv3_clear_delrec().

Rough callgraph:

inetdev_destroy
-> ip_mc_destroy_dev
     -> igmpv3_clear_delrec
        -> ip_mc_clear_src
-> RCU_INIT_POINTER(dev->ip_ptr, NULL)

However, ip_mc_clear_src() called in igmpv3_clear_delrec() doesn't
release in_dev->mc_list->sources. And RCU_INIT_POINTER() assigns the
NULL to dev->ip_ptr. As a result, in_dev cannot be obtained through
inetdev_by_index() and then in_dev->mc_list->sources cannot be released
by ip_mc_del1_src() in the sock_close. Rough call sequence goes like:

sock_close
-> __sock_release
   -> inet_release
      -> ip_mc_drop_socket
         -> inetdev_by_index
         -> ip_mc_leave_src
            -> ip_mc_del_src
               -> ip_mc_del1_src

So we still need to call ip_mc_clear_src() in ip_mc_destroy_dev() to free
in_dev->mc_list->sources.

Fixes: 24803f38a5 ("igmp: do not remove igmp souce list info ...")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Chengyang Fan <cy.fan@huawei.com>
Acked-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-16 12:41:01 -07:00
David S. Miller
c0d982bf82 Merge branch 'fec-ptp-fixes'
Joakim Zhang says:

====================
net: fixes for fec ptp

Small fixes for fec ptp.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-16 12:39:21 -07:00
Joakim Zhang
d23765646e net: fec_ptp: fix issue caused by refactor the fec_devtype
Commit da722186f6 ("net: fec: set GPR bit on suspend by DT configuration.")
refactor the fec_devtype, need adjust ptp driver accordingly.

Fixes: da722186f6 ("net: fec: set GPR bit on suspend by DT configuration.")
Signed-off-by: Joakim Zhang <qiangqing.zhang@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-16 12:39:03 -07:00
Fugang Duan
cb3cefe3f3 net: fec_ptp: add clock rate zero check
Add clock rate zero check to fix coverity issue of "divide by 0".

Fixes: commit 85bd1798b2 ("net: fec: fix spin_lock dead lock")
Signed-off-by: Fugang Duan <fugang.duan@nxp.com>
Signed-off-by: Joakim Zhang <qiangqing.zhang@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-16 12:39:03 -07:00
Dongliang Mu
56b786d866 net: usb: fix possible use-after-free in smsc75xx_bind
The commit 46a8b29c63 ("net: usb: fix memory leak in smsc75xx_bind")
fails to clean up the work scheduled in smsc75xx_reset->
smsc75xx_set_multicast, which leads to use-after-free if the work is
scheduled to start after the deallocation. In addition, this patch
also removes a dangling pointer - dev->data[0].

This patch calls cancel_work_sync to cancel the scheduled work and set
the dangling pointer to NULL.

Fixes: 46a8b29c63 ("net: usb: fix memory leak in smsc75xx_bind")
Signed-off-by: Dongliang Mu <mudongliangabcd@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-16 12:36:09 -07:00
Joakim Zhang
8f269102ba net: stmmac: disable clocks in stmmac_remove_config_dt()
Platform drivers may call stmmac_probe_config_dt() to parse dt, could
call stmmac_remove_config_dt() in error handing after dt parsed, so need
disable clocks in stmmac_remove_config_dt().

Go through all platforms drivers which use stmmac_probe_config_dt(),
none of them disable clocks manually, so it's safe to disable them in
stmmac_remove_config_dt().

Fixes: commit d2ed0a7755 ("net: ethernet: stmmac: fix of-node and fixed-link-phydev leaks")
Signed-off-by: Joakim Zhang <qiangqing.zhang@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-16 12:20:58 -07:00
Linus Torvalds
70585216fe Merge branch 'akpm' (patches from Andrew)
Merge misc fixes from Andrew Morton:
 "18 patches.

  Subsystems affected by this patch series: mm (memory-failure, swap,
  slub, hugetlb, memory-failure, slub, thp, sparsemem), and coredump"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
  mm/sparse: fix check_usemap_section_nr warnings
  mm: thp: replace DEBUG_VM BUG with VM_WARN when unmap fails for split
  mm/thp: unmap_mapping_page() to fix THP truncate_cleanup_page()
  mm/thp: fix page_address_in_vma() on file THP tails
  mm/thp: fix vma_address() if virtual address below file offset
  mm/thp: try_to_unmap() use TTU_SYNC for safe splitting
  mm/thp: make is_huge_zero_pmd() safe and quicker
  mm/thp: fix __split_huge_pmd_locked() on shmem migration entry
  mm, thp: use head page in __migration_entry_wait()
  mm/slub.c: include swab.h
  crash_core, vmcoreinfo: append 'SECTION_SIZE_BITS' to vmcoreinfo
  mm/memory-failure: make sure wait for page writeback in memory_failure
  mm/hugetlb: expand restore_reserve_on_error functionality
  mm/slub: actually fix freelist pointer vs redzoning
  mm/slub: fix redzoning for small allocations
  mm/slub: clarify verification reporting
  mm/swap: fix pte_same_as_swp() not removing uffd-wp bit when compare
  mm,hwpoison: fix race with hugetlb page allocation
2021-06-16 09:40:28 -07:00
Miles Chen
ccbd6283a9 mm/sparse: fix check_usemap_section_nr warnings
I see a "virt_to_phys used for non-linear address" warning from
check_usemap_section_nr() on arm64 platforms.

In current implementation of NODE_DATA, if CONFIG_NEED_MULTIPLE_NODES=y,
pglist_data is dynamically allocated and assigned to node_data[].

For example, in arch/arm64/include/asm/mmzone.h:

  extern struct pglist_data *node_data[];
  #define NODE_DATA(nid)          (node_data[(nid)])

If CONFIG_NEED_MULTIPLE_NODES=n, pglist_data is defined as a global
variable named "contig_page_data".

For example, in include/linux/mmzone.h:

  extern struct pglist_data contig_page_data;
  #define NODE_DATA(nid)          (&contig_page_data)

If CONFIG_DEBUG_VIRTUAL is not enabled, __pa() can handle both
dynamically allocated linear addresses and symbol addresses.  However,
if (CONFIG_DEBUG_VIRTUAL=y && CONFIG_NEED_MULTIPLE_NODES=n) we can see
the "virt_to_phys used for non-linear address" warning because that
&contig_page_data is not a linear address on arm64.

Warning message:

  virt_to_phys used for non-linear address: (contig_page_data+0x0/0x1c00)
  WARNING: CPU: 0 PID: 0 at arch/arm64/mm/physaddr.c:15 __virt_to_phys+0x58/0x68
  Modules linked in:
  CPU: 0 PID: 0 Comm: swapper Tainted: G        W         5.13.0-rc1-00074-g1140ab592e2e #3
  Hardware name: linux,dummy-virt (DT)
  pstate: 600000c5 (nZCv daIF -PAN -UAO -TCO BTYPE=--)
  Call trace:
     __virt_to_phys+0x58/0x68
     check_usemap_section_nr+0x50/0xfc
     sparse_init_nid+0x1ac/0x28c
     sparse_init+0x1c4/0x1e0
     bootmem_init+0x60/0x90
     setup_arch+0x184/0x1f0
     start_kernel+0x78/0x488

To fix it, create a small function to handle both translation.

Link: https://lkml.kernel.org/r/1623058729-27264-1-git-send-email-miles.chen@mediatek.com
Signed-off-by: Miles Chen <miles.chen@mediatek.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Baoquan He <bhe@redhat.com>
Cc: Kazu <k-hagio-ab@nec.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-06-16 09:24:43 -07:00
Yang Shi
504e070dc0 mm: thp: replace DEBUG_VM BUG with VM_WARN when unmap fails for split
When debugging the bug reported by Wang Yugui [1], try_to_unmap() may
fail, but the first VM_BUG_ON_PAGE() just checks page_mapcount() however
it may miss the failure when head page is unmapped but other subpage is
mapped.  Then the second DEBUG_VM BUG() that check total mapcount would
catch it.  This may incur some confusion.

As this is not a fatal issue, so consolidate the two DEBUG_VM checks
into one VM_WARN_ON_ONCE_PAGE().

[1] https://lore.kernel.org/linux-mm/20210412180659.B9E3.409509F4@e16-tech.com/

Link: https://lkml.kernel.org/r/d0f0db68-98b8-ebfb-16dc-f29df24cf012@google.com
Signed-off-by: Yang Shi <shy828301@gmail.com>
Reviewed-by: Zi Yan <ziy@nvidia.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Hugh Dickins <hughd@google.com>
Cc: Alistair Popple <apopple@nvidia.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Jue Wang <juew@google.com>
Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org>
Cc: Miaohe Lin <linmiaohe@huawei.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Naoya Horiguchi <naoya.horiguchi@nec.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Peter Xu <peterx@redhat.com>
Cc: Ralph Campbell <rcampbell@nvidia.com>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Wang Yugui <wangyugui@e16-tech.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-06-16 09:24:42 -07:00
Hugh Dickins
22061a1ffa mm/thp: unmap_mapping_page() to fix THP truncate_cleanup_page()
There is a race between THP unmapping and truncation, when truncate sees
pmd_none() and skips the entry, after munmap's zap_huge_pmd() cleared
it, but before its page_remove_rmap() gets to decrement
compound_mapcount: generating false "BUG: Bad page cache" reports that
the page is still mapped when deleted.  This commit fixes that, but not
in the way I hoped.

The first attempt used try_to_unmap(page, TTU_SYNC|TTU_IGNORE_MLOCK)
instead of unmap_mapping_range() in truncate_cleanup_page(): it has
often been an annoyance that we usually call unmap_mapping_range() with
no pages locked, but there apply it to a single locked page.
try_to_unmap() looks more suitable for a single locked page.

However, try_to_unmap_one() contains a VM_BUG_ON_PAGE(!pvmw.pte,page):
it is used to insert THP migration entries, but not used to unmap THPs.
Copy zap_huge_pmd() and add THP handling now? Perhaps, but their TLB
needs are different, I'm too ignorant of the DAX cases, and couldn't
decide how far to go for anon+swap.  Set that aside.

The second attempt took a different tack: make no change in truncate.c,
but modify zap_huge_pmd() to insert an invalidated huge pmd instead of
clearing it initially, then pmd_clear() between page_remove_rmap() and
unlocking at the end.  Nice.  But powerpc blows that approach out of the
water, with its serialize_against_pte_lookup(), and interesting pgtable
usage.  It would need serious help to get working on powerpc (with a
minor optimization issue on s390 too).  Set that aside.

Just add an "if (page_mapped(page)) synchronize_rcu();" or other such
delay, after unmapping in truncate_cleanup_page()? Perhaps, but though
that's likely to reduce or eliminate the number of incidents, it would
give less assurance of whether we had identified the problem correctly.

This successful iteration introduces "unmap_mapping_page(page)" instead
of try_to_unmap(), and goes the usual unmap_mapping_range_tree() route,
with an addition to details.  Then zap_pmd_range() watches for this
case, and does spin_unlock(pmd_lock) if so - just like
page_vma_mapped_walk() now does in the PVMW_SYNC case.  Not pretty, but
safe.

Note that unmap_mapping_page() is doing a VM_BUG_ON(!PageLocked) to
assert its interface; but currently that's only used to make sure that
page->mapping is stable, and zap_pmd_range() doesn't care if the page is
locked or not.  Along these lines, in invalidate_inode_pages2_range()
move the initial unmap_mapping_range() out from under page lock, before
then calling unmap_mapping_page() under page lock if still mapped.

Link: https://lkml.kernel.org/r/a2a4a148-cdd8-942c-4ef8-51b77f643dbe@google.com
Fixes: fc127da085 ("truncate: handle file thp")
Signed-off-by: Hugh Dickins <hughd@google.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reviewed-by: Yang Shi <shy828301@gmail.com>
Cc: Alistair Popple <apopple@nvidia.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Jue Wang <juew@google.com>
Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org>
Cc: Miaohe Lin <linmiaohe@huawei.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Naoya Horiguchi <naoya.horiguchi@nec.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Peter Xu <peterx@redhat.com>
Cc: Ralph Campbell <rcampbell@nvidia.com>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Wang Yugui <wangyugui@e16-tech.com>
Cc: Zi Yan <ziy@nvidia.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-06-16 09:24:42 -07:00
Jue Wang
31657170de mm/thp: fix page_address_in_vma() on file THP tails
Anon THP tails were already supported, but memory-failure may need to
use page_address_in_vma() on file THP tails, which its page->mapping
check did not permit: fix it.

hughd adds: no current usage is known to hit the issue, but this does
fix a subtle trap in a general helper: best fixed in stable sooner than
later.

Link: https://lkml.kernel.org/r/a0d9b53-bf5d-8bab-ac5-759dc61819c1@google.com
Fixes: 800d8c63b2 ("shmem: add huge pages support")
Signed-off-by: Jue Wang <juew@google.com>
Signed-off-by: Hugh Dickins <hughd@google.com>
Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Yang Shi <shy828301@gmail.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Alistair Popple <apopple@nvidia.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Miaohe Lin <linmiaohe@huawei.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Naoya Horiguchi <naoya.horiguchi@nec.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Peter Xu <peterx@redhat.com>
Cc: Ralph Campbell <rcampbell@nvidia.com>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Wang Yugui <wangyugui@e16-tech.com>
Cc: Zi Yan <ziy@nvidia.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-06-16 09:24:42 -07:00
Hugh Dickins
494334e43c mm/thp: fix vma_address() if virtual address below file offset
Running certain tests with a DEBUG_VM kernel would crash within hours,
on the total_mapcount BUG() in split_huge_page_to_list(), while trying
to free up some memory by punching a hole in a shmem huge page: split's
try_to_unmap() was unable to find all the mappings of the page (which,
on a !DEBUG_VM kernel, would then keep the huge page pinned in memory).

When that BUG() was changed to a WARN(), it would later crash on the
VM_BUG_ON_VMA(end < vma->vm_start || start >= vma->vm_end, vma) in
mm/internal.h:vma_address(), used by rmap_walk_file() for
try_to_unmap().

vma_address() is usually correct, but there's a wraparound case when the
vm_start address is unusually low, but vm_pgoff not so low:
vma_address() chooses max(start, vma->vm_start), but that decides on the
wrong address, because start has become almost ULONG_MAX.

Rewrite vma_address() to be more careful about vm_pgoff; move the
VM_BUG_ON_VMA() out of it, returning -EFAULT for errors, so that it can
be safely used from page_mapped_in_vma() and page_address_in_vma() too.

Add vma_address_end() to apply similar care to end address calculation,
in page_vma_mapped_walk() and page_mkclean_one() and try_to_unmap_one();
though it raises a question of whether callers would do better to supply
pvmw->end to page_vma_mapped_walk() - I chose not, for a smaller patch.

An irritation is that their apparent generality breaks down on KSM
pages, which cannot be located by the page->index that page_to_pgoff()
uses: as commit 4b0ece6fa0 ("mm: migrate: fix remove_migration_pte()
for ksm pages") once discovered.  I dithered over the best thing to do
about that, and have ended up with a VM_BUG_ON_PAGE(PageKsm) in both
vma_address() and vma_address_end(); though the only place in danger of
using it on them was try_to_unmap_one().

Sidenote: vma_address() and vma_address_end() now use compound_nr() on a
head page, instead of thp_size(): to make the right calculation on a
hugetlbfs page, whether or not THPs are configured.  try_to_unmap() is
used on hugetlbfs pages, but perhaps the wrong calculation never
mattered.

Link: https://lkml.kernel.org/r/caf1c1a3-7cfb-7f8f-1beb-ba816e932825@google.com
Fixes: a8fa41ad2f ("mm, rmap: check all VMAs that PTE-mapped THP can be part of")
Signed-off-by: Hugh Dickins <hughd@google.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Alistair Popple <apopple@nvidia.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Jue Wang <juew@google.com>
Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org>
Cc: Miaohe Lin <linmiaohe@huawei.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Naoya Horiguchi <naoya.horiguchi@nec.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Peter Xu <peterx@redhat.com>
Cc: Ralph Campbell <rcampbell@nvidia.com>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Wang Yugui <wangyugui@e16-tech.com>
Cc: Yang Shi <shy828301@gmail.com>
Cc: Zi Yan <ziy@nvidia.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-06-16 09:24:42 -07:00
Hugh Dickins
732ed55823 mm/thp: try_to_unmap() use TTU_SYNC for safe splitting
Stressing huge tmpfs often crashed on unmap_page()'s VM_BUG_ON_PAGE
(!unmap_success): with dump_page() showing mapcount:1, but then its raw
struct page output showing _mapcount ffffffff i.e.  mapcount 0.

And even if that particular VM_BUG_ON_PAGE(!unmap_success) is removed,
it is immediately followed by a VM_BUG_ON_PAGE(compound_mapcount(head)),
and further down an IS_ENABLED(CONFIG_DEBUG_VM) total_mapcount BUG():
all indicative of some mapcount difficulty in development here perhaps.
But the !CONFIG_DEBUG_VM path handles the failures correctly and
silently.

I believe the problem is that once a racing unmap has cleared pte or
pmd, try_to_unmap_one() may skip taking the page table lock, and emerge
from try_to_unmap() before the racing task has reached decrementing
mapcount.

Instead of abandoning the unsafe VM_BUG_ON_PAGE(), and the ones that
follow, use PVMW_SYNC in try_to_unmap_one() in this case: adding
TTU_SYNC to the options, and passing that from unmap_page().

When CONFIG_DEBUG_VM, or for non-debug too? Consensus is to do the same
for both: the slight overhead added should rarely matter, except perhaps
if splitting sparsely-populated multiply-mapped shmem.  Once confident
that bugs are fixed, TTU_SYNC here can be removed, and the race
tolerated.

Link: https://lkml.kernel.org/r/c1e95853-8bcd-d8fd-55fa-e7f2488e78f@google.com
Fixes: fec89c109f ("thp: rewrite freeze_page()/unfreeze_page() with generic rmap walkers")
Signed-off-by: Hugh Dickins <hughd@google.com>
Cc: Alistair Popple <apopple@nvidia.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Jue Wang <juew@google.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org>
Cc: Miaohe Lin <linmiaohe@huawei.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Naoya Horiguchi <naoya.horiguchi@nec.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Peter Xu <peterx@redhat.com>
Cc: Ralph Campbell <rcampbell@nvidia.com>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Wang Yugui <wangyugui@e16-tech.com>
Cc: Yang Shi <shy828301@gmail.com>
Cc: Zi Yan <ziy@nvidia.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-06-16 09:24:42 -07:00
Hugh Dickins
3b77e8c8cd mm/thp: make is_huge_zero_pmd() safe and quicker
Most callers of is_huge_zero_pmd() supply a pmd already verified
present; but a few (notably zap_huge_pmd()) do not - it might be a pmd
migration entry, in which the pfn is encoded differently from a present
pmd: which might pass the is_huge_zero_pmd() test (though not on x86,
since L1TF forced us to protect against that); or perhaps even crash in
pmd_page() applied to a swap-like entry.

Make it safe by adding pmd_present() check into is_huge_zero_pmd()
itself; and make it quicker by saving huge_zero_pfn, so that
is_huge_zero_pmd() will not need to do that pmd_page() lookup each time.

__split_huge_pmd_locked() checked pmd_trans_huge() before: that worked,
but is unnecessary now that is_huge_zero_pmd() checks present.

Link: https://lkml.kernel.org/r/21ea9ca-a1f5-8b90-5e88-95fb1c49bbfa@google.com
Fixes: e71769ae52 ("mm: enable thp migration for shmem thp")
Signed-off-by: Hugh Dickins <hughd@google.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reviewed-by: Yang Shi <shy828301@gmail.com>
Cc: Alistair Popple <apopple@nvidia.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Jue Wang <juew@google.com>
Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org>
Cc: Miaohe Lin <linmiaohe@huawei.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Naoya Horiguchi <naoya.horiguchi@nec.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Peter Xu <peterx@redhat.com>
Cc: Ralph Campbell <rcampbell@nvidia.com>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Wang Yugui <wangyugui@e16-tech.com>
Cc: Zi Yan <ziy@nvidia.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-06-16 09:24:42 -07:00
Hugh Dickins
99fa8a4820 mm/thp: fix __split_huge_pmd_locked() on shmem migration entry
Patch series "mm/thp: fix THP splitting unmap BUGs and related", v10.

Here is v2 batch of long-standing THP bug fixes that I had not got
around to sending before, but prompted now by Wang Yugui's report
https://lore.kernel.org/linux-mm/20210412180659.B9E3.409509F4@e16-tech.com/

Wang Yugui has tested a rollup of these fixes applied to 5.10.39, and
they have done no harm, but have *not* fixed that issue: something more
is needed and I have no idea of what.

This patch (of 7):

Stressing huge tmpfs page migration racing hole punch often crashed on
the VM_BUG_ON(!pmd_present) in pmdp_huge_clear_flush(), with DEBUG_VM=y
kernel; or shortly afterwards, on a bad dereference in
__split_huge_pmd_locked() when DEBUG_VM=n.  They forgot to allow for pmd
migration entries in the non-anonymous case.

Full disclosure: those particular experiments were on a kernel with more
relaxed mmap_lock and i_mmap_rwsem locking, and were not repeated on the
vanilla kernel: it is conceivable that stricter locking happens to avoid
those cases, or makes them less likely; but __split_huge_pmd_locked()
already allowed for pmd migration entries when handling anonymous THPs,
so this commit brings the shmem and file THP handling into line.

And while there: use old_pmd rather than _pmd, as in the following
blocks; and make it clearer to the eye that the !vma_is_anonymous()
block is self-contained, making an early return after accounting for
unmapping.

Link: https://lkml.kernel.org/r/af88612-1473-2eaa-903-8d1a448b26@google.com
Link: https://lkml.kernel.org/r/dd221a99-efb3-cd1d-6256-7e646af29314@google.com
Fixes: e71769ae52 ("mm: enable thp migration for shmem thp")
Signed-off-by: Hugh Dickins <hughd@google.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Yang Shi <shy828301@gmail.com>
Cc: Wang Yugui <wangyugui@e16-tech.com>
Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org>
Cc: Naoya Horiguchi <naoya.horiguchi@nec.com>
Cc: Alistair Popple <apopple@nvidia.com>
Cc: Ralph Campbell <rcampbell@nvidia.com>
Cc: Zi Yan <ziy@nvidia.com>
Cc: Miaohe Lin <linmiaohe@huawei.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Jue Wang <juew@google.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-06-16 09:24:42 -07:00
Xu Yu
ffc90cbb29 mm, thp: use head page in __migration_entry_wait()
We notice that hung task happens in a corner but practical scenario when
CONFIG_PREEMPT_NONE is enabled, as follows.

Process 0                       Process 1                     Process 2..Inf
split_huge_page_to_list
    unmap_page
        split_huge_pmd_address
                                __migration_entry_wait(head)
                                                              __migration_entry_wait(tail)
    remap_page (roll back)
        remove_migration_ptes
            rmap_walk_anon
                cond_resched

Where __migration_entry_wait(tail) is occurred in kernel space, e.g.,
copy_to_user in fstat, which will immediately fault again without
rescheduling, and thus occupy the cpu fully.

When there are too many processes performing __migration_entry_wait on
tail page, remap_page will never be done after cond_resched.

This makes __migration_entry_wait operate on the compound head page,
thus waits for remap_page to complete, whether the THP is split
successfully or roll back.

Note that put_and_wait_on_page_locked helps to drop the page reference
acquired with get_page_unless_zero, as soon as the page is on the wait
queue, before actually waiting.  So splitting the THP is only prevented
for a brief interval.

Link: https://lkml.kernel.org/r/b9836c1dd522e903891760af9f0c86a2cce987eb.1623144009.git.xuyu@linux.alibaba.com
Fixes: ba98828088 ("thp: add option to setup migration entries during PMD split")
Suggested-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Gang Deng <gavin.dg@linux.alibaba.com>
Signed-off-by: Xu Yu <xuyu@linux.alibaba.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Acked-by: Hugh Dickins <hughd@google.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-06-16 09:24:42 -07:00
Andrew Morton
1b3865d016 mm/slub.c: include swab.h
Fixes build with CONFIG_SLAB_FREELIST_HARDENED=y.

Hopefully.  But it's the right thing to do anwyay.

Fixes: 1ad53d9fa3 ("slub: improve bit diffusion for freelist ptr obfuscation")
Link: https://bugzilla.kernel.org/show_bug.cgi?id=213417
Reported-by: <vannguye@cisco.com>
Acked-by: Kees Cook <keescook@chromium.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-06-16 09:24:42 -07:00
Pingfan Liu
4f5aecdff2 crash_core, vmcoreinfo: append 'SECTION_SIZE_BITS' to vmcoreinfo
As mentioned in kernel commit 1d50e5d0c5 ("crash_core, vmcoreinfo:
Append 'MAX_PHYSMEM_BITS' to vmcoreinfo"), SECTION_SIZE_BITS in the
formula:

    #define SECTIONS_SHIFT    (MAX_PHYSMEM_BITS - SECTION_SIZE_BITS)

Besides SECTIONS_SHIFT, SECTION_SIZE_BITS is also used to calculate
PAGES_PER_SECTION in makedumpfile just like kernel.

Unfortunately, this arch-dependent macro SECTION_SIZE_BITS changes, e.g.
recently in kernel commit f0b13ee232 ("arm64/sparsemem: reduce
SECTION_SIZE_BITS").  But user space wants a stable interface to get
this info.  Such info is impossible to be deduced from a crashdump
vmcore.  Hence append SECTION_SIZE_BITS to vmcoreinfo.

Link: https://lkml.kernel.org/r/20210608103359.84907-1-kernelfans@gmail.com
Link: http://lists.infradead.org/pipermail/kexec/2021-June/022676.html
Signed-off-by: Pingfan Liu <kernelfans@gmail.com>
Acked-by: Baoquan He <bhe@redhat.com>
Cc: Bhupesh Sharma <bhupesh.sharma@linaro.org>
Cc: Kazuhito Hagio <k-hagio@ab.jp.nec.com>
Cc: Dave Young <dyoung@redhat.com>
Cc: Boris Petkov <bp@alien8.de>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: James Morse <james.morse@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Dave Anderson <anderson@redhat.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-06-16 09:24:42 -07:00
yangerkun
e8675d291a mm/memory-failure: make sure wait for page writeback in memory_failure
Our syzkaller trigger the "BUG_ON(!list_empty(&inode->i_wb_list))" in
clear_inode:

  kernel BUG at fs/inode.c:519!
  Internal error: Oops - BUG: 0 [#1] SMP
  Modules linked in:
  Process syz-executor.0 (pid: 249, stack limit = 0x00000000a12409d7)
  CPU: 1 PID: 249 Comm: syz-executor.0 Not tainted 4.19.95
  Hardware name: linux,dummy-virt (DT)
  pstate: 80000005 (Nzcv daif -PAN -UAO)
  pc : clear_inode+0x280/0x2a8
  lr : clear_inode+0x280/0x2a8
  Call trace:
    clear_inode+0x280/0x2a8
    ext4_clear_inode+0x38/0xe8
    ext4_free_inode+0x130/0xc68
    ext4_evict_inode+0xb20/0xcb8
    evict+0x1a8/0x3c0
    iput+0x344/0x460
    do_unlinkat+0x260/0x410
    __arm64_sys_unlinkat+0x6c/0xc0
    el0_svc_common+0xdc/0x3b0
    el0_svc_handler+0xf8/0x160
    el0_svc+0x10/0x218
  Kernel panic - not syncing: Fatal exception

A crash dump of this problem show that someone called __munlock_pagevec
to clear page LRU without lock_page: do_mmap -> mmap_region -> do_munmap
-> munlock_vma_pages_range -> __munlock_pagevec.

As a result memory_failure will call identify_page_state without
wait_on_page_writeback.  And after truncate_error_page clear the mapping
of this page.  end_page_writeback won't call sb_clear_inode_writeback to
clear inode->i_wb_list.  That will trigger BUG_ON in clear_inode!

Fix it by checking PageWriteback too to help determine should we skip
wait_on_page_writeback.

Link: https://lkml.kernel.org/r/20210604084705.3729204-1-yangerkun@huawei.com
Fixes: 0bc1f8b068 ("hwpoison: fix the handling path of the victimized page frame that belong to non-LRU")
Signed-off-by: yangerkun <yangerkun@huawei.com>
Acked-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Theodore Ts'o <tytso@mit.edu>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Yu Kuai <yukuai3@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-06-16 09:24:42 -07:00
Mike Kravetz
846be08578 mm/hugetlb: expand restore_reserve_on_error functionality
The routine restore_reserve_on_error is called to restore reservation
information when an error occurs after page allocation.  The routine
alloc_huge_page modifies the mapping reserve map and potentially the
reserve count during allocation.  If code calling alloc_huge_page
encounters an error after allocation and needs to free the page, the
reservation information needs to be adjusted.

Currently, restore_reserve_on_error only takes action on pages for which
the reserve count was adjusted(HPageRestoreReserve flag).  There is
nothing wrong with these adjustments.  However, alloc_huge_page ALWAYS
modifies the reserve map during allocation even if the reserve count is
not adjusted.  This can cause issues as observed during development of
this patch [1].

One specific series of operations causing an issue is:

 - Create a shared hugetlb mapping
   Reservations for all pages created by default

 - Fault in a page in the mapping
   Reservation exists so reservation count is decremented

 - Punch a hole in the file/mapping at index previously faulted
   Reservation and any associated pages will be removed

 - Allocate a page to fill the hole
   No reservation entry, so reserve count unmodified
   Reservation entry added to map by alloc_huge_page

 - Error after allocation and before instantiating the page
   Reservation entry remains in map

 - Allocate a page to fill the hole
   Reservation entry exists, so decrement reservation count

This will cause a reservation count underflow as the reservation count
was decremented twice for the same index.

A user would observe a very large number for HugePages_Rsvd in
/proc/meminfo.  This would also likely cause subsequent allocations of
hugetlb pages to fail as it would 'appear' that all pages are reserved.

This sequence of operations is unlikely to happen, however they were
easily reproduced and observed using hacked up code as described in [1].

Address the issue by having the routine restore_reserve_on_error take
action on pages where HPageRestoreReserve is not set.  In this case, we
need to remove any reserve map entry created by alloc_huge_page.  A new
helper routine vma_del_reservation assists with this operation.

There are three callers of alloc_huge_page which do not currently call
restore_reserve_on error before freeing a page on error paths.  Add
those missing calls.

[1] https://lore.kernel.org/linux-mm/20210528005029.88088-1-almasrymina@google.com/

Link: https://lkml.kernel.org/r/20210607204510.22617-1-mike.kravetz@oracle.com
Fixes: 96b96a96dd ("mm/hugetlb: fix huge page reservation leak in private mapping error paths"
Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
Reviewed-by: Mina Almasry <almasrymina@google.com>
Cc: Axel Rasmussen <axelrasmussen@google.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Muchun Song <songmuchun@bytedance.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Naoya Horiguchi <naoya.horiguchi@nec.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-06-16 09:24:42 -07:00
Kees Cook
e41a49fadb mm/slub: actually fix freelist pointer vs redzoning
It turns out that SLUB redzoning ("slub_debug=Z") checks from
s->object_size rather than from s->inuse (which is normally bumped to
make room for the freelist pointer), so a cache created with an object
size less than 24 would have the freelist pointer written beyond
s->object_size, causing the redzone to be corrupted by the freelist
pointer.  This was very visible with "slub_debug=ZF":

  BUG test (Tainted: G    B            ): Right Redzone overwritten
  -----------------------------------------------------------------------------

  INFO: 0xffff957ead1c05de-0xffff957ead1c05df @offset=1502. First byte 0x1a instead of 0xbb
  INFO: Slab 0xffffef3950b47000 objects=170 used=170 fp=0x0000000000000000 flags=0x8000000000000200
  INFO: Object 0xffff957ead1c05d8 @offset=1496 fp=0xffff957ead1c0620

  Redzone  (____ptrval____): bb bb bb bb bb bb bb bb               ........
  Object   (____ptrval____): 00 00 00 00 00 f6 f4 a5               ........
  Redzone  (____ptrval____): 40 1d e8 1a aa                        @....
  Padding  (____ptrval____): 00 00 00 00 00 00 00 00               ........

Adjust the offset to stay within s->object_size.

(Note that no caches of in this size range are known to exist in the
kernel currently.)

Link: https://lkml.kernel.org/r/20210608183955.280836-4-keescook@chromium.org
Link: https://lore.kernel.org/linux-mm/20200807160627.GA1420741@elver.google.com/
Link: https://lore.kernel.org/lkml/0f7dd7b2-7496-5e2d-9488-2ec9f8e90441@suse.cz/Fixes: 89b83f282d (slub: avoid redzone when choosing freepointer location)
Link: https://lore.kernel.org/lkml/CANpmjNOwZ5VpKQn+SYWovTkFB4VsT-RPwyENBmaK0dLcpqStkA@mail.gmail.com
Signed-off-by: Kees Cook <keescook@chromium.org>
Reported-by: Marco Elver <elver@google.com>
Reported-by: "Lin, Zhenpeng" <zplin@psu.edu>
Tested-by: Marco Elver <elver@google.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Christoph Lameter <cl@linux.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Roman Gushchin <guro@fb.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-06-16 09:24:42 -07:00
Kees Cook
74c1d3e081 mm/slub: fix redzoning for small allocations
The redzone area for SLUB exists between s->object_size and s->inuse
(which is at least the word-aligned object_size).  If a cache were
created with an object_size smaller than sizeof(void *), the in-object
stored freelist pointer would overwrite the redzone (e.g.  with boot
param "slub_debug=ZF"):

  BUG test (Tainted: G    B            ): Right Redzone overwritten
  -----------------------------------------------------------------------------

  INFO: 0xffff957ead1c05de-0xffff957ead1c05df @offset=1502. First byte 0x1a instead of 0xbb
  INFO: Slab 0xffffef3950b47000 objects=170 used=170 fp=0x0000000000000000 flags=0x8000000000000200
  INFO: Object 0xffff957ead1c05d8 @offset=1496 fp=0xffff957ead1c0620

  Redzone  (____ptrval____): bb bb bb bb bb bb bb bb    ........
  Object   (____ptrval____): f6 f4 a5 40 1d e8          ...@..
  Redzone  (____ptrval____): 1a aa                      ..
  Padding  (____ptrval____): 00 00 00 00 00 00 00 00    ........

Store the freelist pointer out of line when object_size is smaller than
sizeof(void *) and redzoning is enabled.

Additionally remove the "smaller than sizeof(void *)" check under
CONFIG_DEBUG_VM in kmem_cache_sanity_check() as it is now redundant:
SLAB and SLOB both handle small sizes.

(Note that no caches within this size range are known to exist in the
kernel currently.)

Link: https://lkml.kernel.org/r/20210608183955.280836-3-keescook@chromium.org
Fixes: 81819f0fc8 ("SLUB core")
Signed-off-by: Kees Cook <keescook@chromium.org>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Christoph Lameter <cl@linux.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: "Lin, Zhenpeng" <zplin@psu.edu>
Cc: Marco Elver <elver@google.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Roman Gushchin <guro@fb.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-06-16 09:24:42 -07:00
Kees Cook
8669dbab2a mm/slub: clarify verification reporting
Patch series "Actually fix freelist pointer vs redzoning", v4.

This fixes redzoning vs the freelist pointer (both for middle-position
and very small caches).  Both are "theoretical" fixes, in that I see no
evidence of such small-sized caches actually be used in the kernel, but
that's no reason to let the bugs continue to exist, especially since
people doing local development keep tripping over it.  :)

This patch (of 3):

Instead of repeating "Redzone" and "Poison", clarify which sides of
those zones got tripped.  Additionally fix column alignment in the
trailer.

Before:

  BUG test (Tainted: G    B            ): Redzone overwritten
  ...
  Redzone (____ptrval____): bb bb bb bb bb bb bb bb      ........
  Object (____ptrval____): f6 f4 a5 40 1d e8            ...@..
  Redzone (____ptrval____): 1a aa                        ..
  Padding (____ptrval____): 00 00 00 00 00 00 00 00      ........

After:

  BUG test (Tainted: G    B            ): Right Redzone overwritten
  ...
  Redzone  (____ptrval____): bb bb bb bb bb bb bb bb      ........
  Object   (____ptrval____): f6 f4 a5 40 1d e8            ...@..
  Redzone  (____ptrval____): 1a aa                        ..
  Padding  (____ptrval____): 00 00 00 00 00 00 00 00      ........

The earlier commits that slowly resulted in the "Before" reporting were:

  d86bd1bece ("mm/slub: support left redzone")
  ffc79d2880 ("slub: use print_hex_dump")
  2492268472 ("SLUB: change error reporting format to follow lockdep loosely")

Link: https://lkml.kernel.org/r/20210608183955.280836-1-keescook@chromium.org
Link: https://lkml.kernel.org/r/20210608183955.280836-2-keescook@chromium.org
Link: https://lore.kernel.org/lkml/cfdb11d7-fb8e-e578-c939-f7f5fb69a6bd@suse.cz/
Signed-off-by: Kees Cook <keescook@chromium.org>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Marco Elver <elver@google.com>
Cc: "Lin, Zhenpeng" <zplin@psu.edu>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Roman Gushchin <guro@fb.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-06-16 09:24:42 -07:00
Peter Xu
099dd6878b mm/swap: fix pte_same_as_swp() not removing uffd-wp bit when compare
I found it by pure code review, that pte_same_as_swp() of unuse_vma()
didn't take uffd-wp bit into account when comparing ptes.
pte_same_as_swp() returning false negative could cause failure to
swapoff swap ptes that was wr-protected by userfaultfd.

Link: https://lkml.kernel.org/r/20210603180546.9083-1-peterx@redhat.com
Fixes: f45ec5ff16 ("userfaultfd: wp: support swap and page migration")
Signed-off-by: Peter Xu <peterx@redhat.com>
Acked-by: Hugh Dickins <hughd@google.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: <stable@vger.kernel.org>	[5.7+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-06-16 09:24:42 -07:00
Naoya Horiguchi
25182f05ff mm,hwpoison: fix race with hugetlb page allocation
When hugetlb page fault (under overcommitting situation) and
memory_failure() race, VM_BUG_ON_PAGE() is triggered by the following
race:

    CPU0:                           CPU1:

                                    gather_surplus_pages()
                                      page = alloc_surplus_huge_page()
    memory_failure_hugetlb()
      get_hwpoison_page(page)
        __get_hwpoison_page(page)
          get_page_unless_zero(page)
                                      zero = put_page_testzero(page)
                                      VM_BUG_ON_PAGE(!zero, page)
                                      enqueue_huge_page(h, page)
      put_page(page)

__get_hwpoison_page() only checks the page refcount before taking an
additional one for memory error handling, which is not enough because
there's a time window where compound pages have non-zero refcount during
hugetlb page initialization.

So make __get_hwpoison_page() check page status a bit more for hugetlb
pages with get_hwpoison_huge_page().  Checking hugetlb-specific flags
under hugetlb_lock makes sure that the hugetlb page is not transitive.
It's notable that another new function, HWPoisonHandlable(), is helpful
to prevent a race against other transitive page states (like a generic
compound page just before PageHuge becomes true).

Link: https://lkml.kernel.org/r/20210603233632.2964832-2-nao.horiguchi@gmail.com
Fixes: ead07f6a86 ("mm/memory-failure: introduce get_hwpoison_page() for consistent refcount handling")
Signed-off-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
Reported-by: Muchun Song <songmuchun@bytedance.com>
Acked-by: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: <stable@vger.kernel.org>	[5.12+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-06-16 09:24:42 -07:00
Linus Torvalds
6b00bc639f Merge tag 'dmaengine-fix-5.13' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine
Pull dmaengine fixes from Vinod Koul:
 "A bunch of driver fixes, notably:

   - More idxd fixes for driver unregister, error handling and bus
     assignment

   - HAS_IOMEM depends fix for few drivers

   - lock fix in pl330 driver

   - xilinx drivers fixes for initialize registers, missing dependencies
     and limiting descriptor IDs

   - mediatek descriptor management fixes"

* tag 'dmaengine-fix-5.13' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine:
  dmaengine: mediatek: use GFP_NOWAIT instead of GFP_ATOMIC in prep_dma
  dmaengine: mediatek: do not issue a new desc if one is still current
  dmaengine: mediatek: free the proper desc in desc_free handler
  dmaengine: ipu: fix doc warning in ipu_irq.c
  dmaengine: rcar-dmac: Fix PM reference leak in rcar_dmac_probe()
  dmaengine: idxd: Fix missing error code in idxd_cdev_open()
  dmaengine: stedma40: add missing iounmap() on error in d40_probe()
  dmaengine: SF_PDMA depends on HAS_IOMEM
  dmaengine: QCOM_HIDMA_MGMT depends on HAS_IOMEM
  dmaengine: ALTERA_MSGDMA depends on HAS_IOMEM
  dmaengine: idxd: Add missing cleanup for early error out in probe call
  dmaengine: xilinx: dpdma: Limit descriptor IDs to 16 bits
  dmaengine: xilinx: dpdma: Add missing dependencies to Kconfig
  dmaengine: stm32-mdma: fix PM reference leak in stm32_mdma_alloc_chan_resourc()
  dmaengine: zynqmp_dma: Fix PM reference leak in zynqmp_dma_alloc_chan_resourc()
  dmaengine: xilinx: dpdma: initialize registers before request_irq
  dmaengine: pl330: fix wrong usage of spinlock flags in dma_cyclc
  dmaengine: fsl-dpaa2-qdma: Fix error return code in two functions
  dmaengine: idxd: add missing dsa driver unregister
  dmaengine: idxd: add engine 'struct device' missing bus type assignment
2021-06-16 09:03:52 -07:00
Linus Torvalds
cc9aaa2b07 Merge tag 'clang-features-v5.13-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux
Pull clang LTO fix from Kees Cook:
 "It seems Clang has been scrubbing through the missing LTO IR flags for
  Clang 13, and the last of these 'only with LTO' flags is fixed now.

  I've asked that they please consider making these changes in a less
  'break all the Clang kernel builds' kind of way in the future. :P

  Summary:

   - The '-warn-stack-size' option under LTO has moved in Clang 13 (Tor
     Vic)"

* tag 'clang-features-v5.13-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
  Makefile: lto: Pass -warn-stack-size only on LLD < 13.0.0
2021-06-16 08:57:44 -07:00
Maxime Ripard
9984d6664c drm/vc4: hdmi: Make sure the controller is powered in detect
If the HPD GPIO is not available and drm_probe_ddc fails, we end up
reading the HDMI_HOTPLUG register, but the controller might be powered
off resulting in a CPU hang. Make sure we have the power domain and the
HSM clock powered during the detect cycle to prevent the hang from
happening.

Fixes: 4f6e3d66ac ("drm/vc4: Add runtime PM support to the HDMI encoder driver")
Signed-off-by: Maxime Ripard <maxime@cerno.tech>
Reviewed-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210525091059.234116-4-maxime@cerno.tech
2021-06-16 14:24:23 +02:00
Maxime Ripard
411efa18e4 drm/vc4: hdmi: Move the HSM clock enable to runtime_pm
In order to access the HDMI controller, we need to make sure the HSM
clock is enabled. If we were to access it with the clock disabled, the
CPU would completely hang, resulting in an hard crash.

Since we have different code path that would require it, let's move that
clock enable / disable to runtime_pm that will take care of the
reference counting for us.

Fixes: 4f6e3d66ac ("drm/vc4: Add runtime PM support to the HDMI encoder driver")
Signed-off-by: Maxime Ripard <maxime@cerno.tech>
Reviewed-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210525091059.234116-3-maxime@cerno.tech
2021-06-16 14:24:11 +02:00
Pavel Skripkin
91c0255717 can: mcba_usb: fix memory leak in mcba_usb
Syzbot reported memory leak in SocketCAN driver for Microchip CAN BUS
Analyzer Tool. The problem was in unfreed usb_coherent.

In mcba_usb_start() 20 coherent buffers are allocated and there is
nothing, that frees them:

1) In callback function the urb is resubmitted and that's all
2) In disconnect function urbs are simply killed, but URB_FREE_BUFFER
   is not set (see mcba_usb_start) and this flag cannot be used with
   coherent buffers.

Fail log:
| [ 1354.053291][ T8413] mcba_usb 1-1:0.0 can0: device disconnected
| [ 1367.059384][ T8420] kmemleak: 20 new suspected memory leaks (see /sys/kernel/debug/kmem)

So, all allocated buffers should be freed with usb_free_coherent()
explicitly

NOTE:
The same pattern for allocating and freeing coherent buffers
is used in drivers/net/can/usb/kvaser_usb/kvaser_usb_core.c

Fixes: 51f3baad7d ("can: mcba_usb: Add support for Microchip CAN BUS Analyzer")
Link: https://lore.kernel.org/r/20210609215833.30393-1-paskripkin@gmail.com
Cc: linux-stable <stable@vger.kernel.org>
Reported-and-tested-by: syzbot+57281c762a3922e14dfe@syzkaller.appspotmail.com
Signed-off-by: Pavel Skripkin <paskripkin@gmail.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2021-06-16 12:52:18 +02:00
Norbert Slusarek
5e87ddbe39 can: bcm: fix infoleak in struct bcm_msg_head
On 64-bit systems, struct bcm_msg_head has an added padding of 4 bytes between
struct members count and ival1. Even though all struct members are initialized,
the 4-byte hole will contain data from the kernel stack. This patch zeroes out
struct bcm_msg_head before usage, preventing infoleaks to userspace.

Fixes: ffd980f976 ("[CAN]: Add broadcast manager (bcm) protocol")
Link: https://lore.kernel.org/r/trinity-7c1b2e82-e34f-4885-8060-2cd7a13769ce-1623532166177@3c-app-gmx-bs52
Cc: linux-stable <stable@vger.kernel.org>
Signed-off-by: Norbert Slusarek <nslusarek@gmx.net>
Acked-by: Oliver Hartkopp <socketcan@hartkopp.net>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2021-06-16 12:52:18 +02:00
Tetsuo Handa
8d0caedb75 can: bcm/raw/isotp: use per module netdevice notifier
syzbot is reporting hung task at register_netdevice_notifier() [1] and
unregister_netdevice_notifier() [2], for cleanup_net() might perform
time consuming operations while CAN driver's raw/bcm/isotp modules are
calling {register,unregister}_netdevice_notifier() on each socket.

Change raw/bcm/isotp modules to call register_netdevice_notifier() from
module's __init function and call unregister_netdevice_notifier() from
module's __exit function, as with gw/j1939 modules are doing.

Link: https://syzkaller.appspot.com/bug?id=391b9498827788b3cc6830226d4ff5be87107c30 [1]
Link: https://syzkaller.appspot.com/bug?id=1724d278c83ca6e6df100a2e320c10d991cf2bce [2]
Link: https://lore.kernel.org/r/54a5f451-05ed-f977-8534-79e7aa2bcc8f@i-love.sakura.ne.jp
Cc: linux-stable <stable@vger.kernel.org>
Reported-by: syzbot <syzbot+355f8edb2ff45d5f95fa@syzkaller.appspotmail.com>
Reported-by: syzbot <syzbot+0f1827363a305f74996f@syzkaller.appspotmail.com>
Reviewed-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Tested-by: syzbot <syzbot+355f8edb2ff45d5f95fa@syzkaller.appspotmail.com>
Tested-by: Oliver Hartkopp <socketcan@hartkopp.net>
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2021-06-16 12:52:18 +02:00
Oleksij Rempel
2030043e61 can: j1939: fix Use-after-Free, hold skb ref while in use
This patch fixes a Use-after-Free found by the syzbot.

The problem is that a skb is taken from the per-session skb queue,
without incrementing the ref count. This leads to a Use-after-Free if
the skb is taken concurrently from the session queue due to a CTS.

Fixes: 9d71dd0c70 ("can: add support of SAE J1939 protocol")
Link: https://lore.kernel.org/r/20210521115720.7533-1-o.rempel@pengutronix.de
Cc: Hillf Danton <hdanton@sina.com>
Cc: linux-stable <stable@vger.kernel.org>
Reported-by: syzbot+220c1a29987a9a490903@syzkaller.appspotmail.com
Reported-by: syzbot+45199c1b73b4013525cf@syzkaller.appspotmail.com
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2021-06-16 12:52:18 +02:00
Punit Agrawal
6262e1b906 printk: Move EXPORT_SYMBOL() closer to vprintk definition
Commit 28e1745b9f ("printk: rename vprintk_func to vprintk") while
improving readability by removing vprintk indirection, inadvertently
placed the EXPORT_SYMBOL() for the newly renamed function at the end
of the file.

For reader sanity, and as is convention move the EXPORT_SYMBOL()
declaration just after the end of the function.

Fixes: 28e1745b9f ("printk: rename vprintk_func to vprintk")
Signed-off-by: Punit Agrawal <punitagrawal@gmail.com>
Acked-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Acked-by: Sergey Senozhatsky <senozhatsky@chromium.org>
Signed-off-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/20210614235635.887365-1-punitagrawal@gmail.com
2021-06-16 10:42:19 +02:00
Greg Kroah-Hartman
60ed39db6e Merge tag 'usb-v5.13-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/peter.chen/usb into usb-linus
Peter writes:

One bug fix for USB charger detection at imx7d and imx8m series SoCs

* tag 'usb-v5.13-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/peter.chen/usb:
  usb: chipidea: imx: Fix Battery Charger 1.2 CDP detection
2021-06-16 09:33:39 +02:00
Breno Lima
c6d580d96f usb: chipidea: imx: Fix Battery Charger 1.2 CDP detection
i.MX8MM cannot detect certain CDP USB HUBs. usbmisc_imx.c driver is not
following CDP timing requirements defined by USB BC 1.2 specification
and section 3.2.4 Detection Timing CDP.

During Primary Detection the i.MX device should turn on VDP_SRC and
IDM_SINK for a minimum of 40ms (TVDPSRC_ON). After a time of TVDPSRC_ON,
the i.MX is allowed to check the status of the D- line. Current
implementation is waiting between 1ms and 2ms, and certain BC 1.2
complaint USB HUBs cannot be detected. Increase delay to 40ms allowing
enough time for primary detection.

During secondary detection the i.MX is required to disable VDP_SRC and
IDM_SNK, and enable VDM_SRC and IDP_SINK for at least 40ms (TVDMSRC_ON).

Current implementation is not disabling VDP_SRC and IDM_SNK, introduce
disable sequence in imx7d_charger_secondary_detection() function.

VDM_SRC and IDP_SINK should be enabled for at least 40ms (TVDMSRC_ON).
Increase delay allowing enough time for detection.

Cc: <stable@vger.kernel.org>
Fixes: 746f316b75 ("usb: chipidea: introduce imx7d USB charger detection")
Signed-off-by: Breno Lima <breno.lima@nxp.com>
Signed-off-by: Jun Li <jun.li@nxp.com>
Link: https://lore.kernel.org/r/20210614175013.495808-1-breno.lima@nxp.com
Signed-off-by: Peter Chen <peter.chen@kernel.org>
2021-06-16 09:04:22 +08:00
David S. Miller
a4f0377db1 Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf
Daniel Borkmann says:

====================
pull-request: bpf 2021-06-15

The following pull-request contains BPF updates for your *net* tree.

We've added 5 non-merge commits during the last 11 day(s) which contain
a total of 10 files changed, 115 insertions(+), 16 deletions(-).

The main changes are:

1) Fix marking incorrect umem ring as done in libbpf's
   xsk_socket__create_shared() helper, from Kev Jackson.

2) Fix oob leakage under a spectre v1 type confusion
   attack, from Daniel Borkmann.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-15 15:26:07 -07:00
Aleksander Jan Bajkowski
7ea6cd16f1 lantiq: net: fix duplicated skb in rx descriptor ring
The previous commit didn't fix the bug properly. By mistake, it replaces
the pointer of the next skb in the descriptor ring instead of the current
one. As a result, the two descriptors are assigned the same SKB. The error
is seen during the iperf test when skb_put tries to insert a second packet
and exceeds the available buffer.

Fixes: c7718ee96d ("net: lantiq: fix memory corruption in RX ring ")
Signed-off-by: Aleksander Jan Bajkowski <olek2@wp.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-15 14:17:19 -07:00
Kristian Evensen
057d49334c qmi_wwan: Do not call netif_rx from rx_fixup
When the QMI_WWAN_FLAG_PASS_THROUGH is set, netif_rx() is called from
qmi_wwan_rx_fixup(). When the call to netif_rx() is successful (which is
most of the time), usbnet_skb_return() is called (from rx_process()).
usbnet_skb_return() will then call netif_rx() a second time for the same
skb.

Simplify the code and avoid the redundant netif_rx() call by changing
qmi_wwan_rx_fixup() to always return 1 when QMI_WWAN_FLAG_PASS_THROUGH
is set. We then leave it up to the existing infrastructure to call
netif_rx().

Suggested-by: Bjørn Mork <bjorn@mork.no>
Signed-off-by: Kristian Evensen <kristian.evensen@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-15 11:29:28 -07:00
Maciej Żenczykowski
c1a3d40673 net: cdc_ncm: switch to eth%d interface naming
This is meant to make the host side cdc_ncm interface consistently
named just like the older CDC protocols: cdc_ether & cdc_ecm
(and even rndis_host), which all use 'FLAG_ETHER | FLAG_POINTTOPOINT'.

include/linux/usb/usbnet.h:
  #define FLAG_ETHER	0x0020		/* maybe use "eth%d" names */
  #define FLAG_WLAN	0x0080		/* use "wlan%d" names */
  #define FLAG_WWAN	0x0400		/* use "wwan%d" names */
  #define FLAG_POINTTOPOINT 0x1000	/* possibly use "usb%d" names */

drivers/net/usb/usbnet.c @ line 1711:
  strcpy (net->name, "usb%d");
  ...
  // heuristic:  "usb%d" for links we know are two-host,
  // else "eth%d" when there's reasonable doubt.  userspace
  // can rename the link if it knows better.
  if ((dev->driver_info->flags & FLAG_ETHER) != 0 &&
      ((dev->driver_info->flags & FLAG_POINTTOPOINT) == 0 ||
       (net->dev_addr [0] & 0x02) == 0))
          strcpy (net->name, "eth%d");
  /* WLAN devices should always be named "wlan%d" */
  if ((dev->driver_info->flags & FLAG_WLAN) != 0)
          strcpy(net->name, "wlan%d");
  /* WWAN devices should always be named "wwan%d" */
  if ((dev->driver_info->flags & FLAG_WWAN) != 0)
          strcpy(net->name, "wwan%d");

So by using ETHER | POINTTOPOINT the interface naming is
either usb%d or eth%d based on the global uniqueness of the
mac address of the device.

Without this 2.5gbps ethernet dongles which all seem to use the cdc_ncm
driver end up being called usb%d instead of eth%d even though they're
definitely not two-host.  (All 1gbps & 5gbps ethernet usb dongles I've
tested don't hit this problem due to use of different drivers, primarily
r8152 and aqc111)

Fixes tag is based purely on git blame, and is really just here to make
sure this hits LTS branches newer than v4.5.

Cc: Lorenzo Colitti <lorenzo@google.com>
Fixes: 4d06dd537f ("cdc_ncm: do not call usbnet_link_change from cdc_ncm_bind")
Signed-off-by: Maciej Żenczykowski <maze@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-15 11:24:57 -07:00
Changbin Du
e34492dea6 net: inline function get_net_ns_by_fd if NET_NS is disabled
The function get_net_ns_by_fd() could be inlined when NET_NS is not
enabled.

Signed-off-by: Changbin Du <changbin.du@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-15 11:00:45 -07:00
Jakub Kicinski
475b92f932 ptp: improve max_adj check against unreasonable values
Scaled PPM conversion to PPB may (on 64bit systems) result
in a value larger than s32 can hold (freq/scaled_ppm is a long).
This means the kernel will not correctly reject unreasonably
high ->freq values (e.g. > 4294967295ppb, 281474976645 scaled PPM).

The conversion is equivalent to a division by ~66 (65.536),
so the value of ppb is always smaller than ppm, but not small
enough to assume narrowing the type from long -> s32 is okay.

Note that reasonable user space (e.g. ptp4l) will not use such
high values, anyway, 4289046510ppb ~= 4.3x, so the fix is
somewhat pedantic.

Fixes: d39a743511 ("ptp: validate the requested frequency adjustment.")
Fixes: d94ba80ebb ("ptp: Added a brand new class driver for ptp clocks.")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-15 10:59:46 -07:00
Linus Torvalds
94f0b2d4a1 proc: only require mm_struct for writing
Commit 591a22c14d ("proc: Track /proc/$pid/attr/ opener mm_struct") we
started using __mem_open() to track the mm_struct at open-time, so that
we could then check it for writes.

But that also ended up making the permission checks at open time much
stricter - and not just for writes, but for reads too.  And that in turn
caused a regression for at least Fedora 29, where NIC interfaces fail to
start when using NetworkManager.

Since only the write side wanted the mm_struct test, ignore any failures
by __mem_open() at open time, leaving reads unaffected.  The write()
time verification of the mm_struct pointer will then catch the failure
case because a NULL pointer will not match a valid 'current->mm'.

Link: https://lore.kernel.org/netdev/YMjTlp2FSJYvoyFa@unreal/
Fixes: 591a22c14d ("proc: Track /proc/$pid/attr/ opener mm_struct")
Reported-and-tested-by: Leon Romanovsky <leon@kernel.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Christian Brauner <christian.brauner@ubuntu.com>
Cc: Andrea Righi <andrea.righi@canonical.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-06-15 10:47:51 -07:00
Kai Huang
4692bc775d x86/sgx: Add missing xa_destroy() when virtual EPC is destroyed
xa_destroy() needs to be called to destroy a virtual EPC's page array
before calling kfree() to free the virtual EPC. Currently it is not
called so add the missing xa_destroy().

Fixes: 540745ddbc ("x86/sgx: Introduce virtual EPC for use by KVM guests")
Signed-off-by: Kai Huang <kai.huang@intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Dave Hansen <dave.hansen@intel.com>
Tested-by: Yang Zhong <yang.zhong@intel.com>
Link: https://lkml.kernel.org/r/20210615101639.291929-1-kai.huang@intel.com
2021-06-15 18:03:45 +02:00
Dan Carpenter
a33d62662d afs: Fix an IS_ERR() vs NULL check
The proc_symlink() function returns NULL on error, it doesn't return
error pointers.

Fixes: 5b86d4ff5d ("afs: Implement network namespacing")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David Howells <dhowells@redhat.com>
cc: linux-afs@lists.infradead.org
Link: https://lore.kernel.org/r/YLjMRKX40pTrJvgf@mwanda/
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-06-15 07:42:26 -07:00
Michael Ellerman
478036c4cd powerpc: Fix initrd corruption with relative jump labels
Commit b0b3b2c78e ("powerpc: Switch to relative jump labels") switched
us to using relative jump labels. That involves changing the code,
target and key members in struct jump_entry to be relative to the
address of the jump_entry, rather than absolute addresses.

We have two static inlines that create a struct jump_entry,
arch_static_branch() and arch_static_branch_jump(), as well as an asm
macro ARCH_STATIC_BRANCH, which is used by the pseries-only hypervisor
tracing code.

Unfortunately we missed updating the key to be a relative reference in
ARCH_STATIC_BRANCH.

That causes a pseries kernel to have a handful of jump_entry structs
with bad key values. Instead of being a relative reference they instead
hold the full address of the key.

However the code doesn't expect that, it still adds the key value to the
address of the jump_entry (see jump_entry_key()) expecting to get a
pointer to a key somewhere in kernel data.

The table of jump_entry structs sits in rodata, which comes after the
kernel text. In a typical build this will be somewhere around 15MB. The
address of the key will be somewhere in data, typically around 20MB.
Adding the two values together gets us a pointer somewhere around 45MB.

We then call static_key_set_entries() with that bad pointer and modify
some members of the struct static_key we think we are pointing at.

A pseries kernel is typically ~30MB in size, so writing to ~45MB won't
corrupt the kernel itself. However if we're booting with an initrd,
depending on the size and exact location of the initrd, we can corrupt
the initrd. Depending on how exactly we corrupt the initrd it can either
cause the system to not boot, or just corrupt one of the files in the
initrd.

The fix is simply to make the key value relative to the jump_entry
struct in the ARCH_STATIC_BRANCH macro.

Fixes: b0b3b2c78e ("powerpc: Switch to relative jump labels")
Reported-by: Anastasia Kovaleva <a.kovaleva@yadro.com>
Reported-by: Roman Bolshakov <r.bolshakov@yadro.com>
Reported-by: Greg Kurz <groug@kaod.org>
Reported-by: Daniel Axtens <dja@axtens.net>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Tested-by: Daniel Axtens <dja@axtens.net>
Tested-by: Greg Kurz <groug@kaod.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210614131440.312360-1-mpe@ellerman.id.au
2021-06-15 23:35:57 +10:00
Peter Chen
4bf584a03e usb: dwc3: core: fix kernel panic when do reboot
When do system reboot, it calls dwc3_shutdown and the whole debugfs
for dwc3 has removed first, when the gadget tries to do deinit, and
remove debugfs for its endpoints, it meets NULL pointer dereference
issue when call debugfs_lookup. Fix it by removing the whole dwc3
debugfs later than dwc3_drd_exit.

[ 2924.958838] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000002
....
[ 2925.030994] pstate: 60000005 (nZCv daif -PAN -UAO -TCO BTYPE=--)
[ 2925.037005] pc : inode_permission+0x2c/0x198
[ 2925.041281] lr : lookup_one_len_common+0xb0/0xf8
[ 2925.045903] sp : ffff80001276ba70
[ 2925.049218] x29: ffff80001276ba70 x28: ffff0000c01f0000 x27: 0000000000000000
[ 2925.056364] x26: ffff800011791e70 x25: 0000000000000008 x24: dead000000000100
[ 2925.063510] x23: dead000000000122 x22: 0000000000000000 x21: 0000000000000001
[ 2925.070652] x20: ffff8000122c6188 x19: 0000000000000000 x18: 0000000000000000
[ 2925.077797] x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000000004
[ 2925.084943] x14: ffffffffffffffff x13: 0000000000000000 x12: 0000000000000030
[ 2925.092087] x11: 0101010101010101 x10: 7f7f7f7f7f7f7f7f x9 : ffff8000102b2420
[ 2925.099232] x8 : 7f7f7f7f7f7f7f7f x7 : feff73746e2f6f64 x6 : 0000000000008080
[ 2925.106378] x5 : 61c8864680b583eb x4 : 209e6ec2d263dbb7 x3 : 000074756f307065
[ 2925.113523] x2 : 0000000000000001 x1 : 0000000000000000 x0 : ffff8000122c6188
[ 2925.120671] Call trace:
[ 2925.123119]  inode_permission+0x2c/0x198
[ 2925.127042]  lookup_one_len_common+0xb0/0xf8
[ 2925.131315]  lookup_one_len_unlocked+0x34/0xb0
[ 2925.135764]  lookup_positive_unlocked+0x14/0x50
[ 2925.140296]  debugfs_lookup+0x68/0xa0
[ 2925.143964]  dwc3_gadget_free_endpoints+0x84/0xb0
[ 2925.148675]  dwc3_gadget_exit+0x28/0x78
[ 2925.152518]  dwc3_drd_exit+0x100/0x1f8
[ 2925.156267]  dwc3_remove+0x11c/0x120
[ 2925.159851]  dwc3_shutdown+0x14/0x20
[ 2925.163432]  platform_shutdown+0x28/0x38
[ 2925.167360]  device_shutdown+0x15c/0x378
[ 2925.171291]  kernel_restart_prepare+0x3c/0x48
[ 2925.175650]  kernel_restart+0x1c/0x68
[ 2925.179316]  __do_sys_reboot+0x218/0x240
[ 2925.183247]  __arm64_sys_reboot+0x28/0x30
[ 2925.187262]  invoke_syscall+0x48/0x100
[ 2925.191017]  el0_svc_common.constprop.0+0x48/0xc8
[ 2925.195726]  do_el0_svc+0x28/0x88
[ 2925.199045]  el0_svc+0x20/0x30
[ 2925.202104]  el0_sync_handler+0xa8/0xb0
[ 2925.205942]  el0_sync+0x148/0x180
[ 2925.209270] Code: a9025bf5 2a0203f5 121f0056 370802b5 (79400660)
[ 2925.215372] ---[ end trace 124254d8e485a58b ]---
[ 2925.220012] Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b
[ 2925.227676] Kernel Offset: disabled
[ 2925.231164] CPU features: 0x00001001,20000846
[ 2925.235521] Memory Limit: none
[ 2925.238580] ---[ end Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b ]---

Fixes: 8d396bb0a5 ("usb: dwc3: debugfs: Add and remove endpoint dirs dynamically")
Cc: Jack Pham <jackp@codeaurora.org>
Tested-by: Jack Pham <jackp@codeaurora.org>
Signed-off-by: Peter Chen <peter.chen@kernel.org>
Link: https://lore.kernel.org/r/20210608105656.10795-1-peter.chen@kernel.org
(cherry picked from commit 2a04276781)
Link: https://lore.kernel.org/r/20210615080847.GA10432@jackp-linux.qualcomm.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-06-15 12:20:26 +02:00
Marcin Juszkiewicz
8b1462b67f quota: finish disable quotactl_path syscall
In commit 5b9fedb31e ("quota: Disable quotactl_path syscall") Jan Kara
disabled quotactl_path syscall on several architectures.

This commit disables it on all architectures using unified list of
system calls:

- arm64
- arc
- csky
- h8300
- hexagon
- nds32
- nios2
- openrisc
- riscv (32/64)

CC: Jan Kara <jack@suse.cz>
CC: Christian Brauner <christian.brauner@ubuntu.com>
CC: Sascha Hauer <s.hauer@pengutronix.de>
Link: https://lore.kernel.org/lkml/20210512153621.n5u43jsytbik4yze@wittgenstein
Link: https://lore.kernel.org/r/20210614153712.313707-1-marcin@juszkiewicz.com.pl
Fixes: 5b9fedb31e ("quota: Disable quotactl_path syscall")
Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
Signed-off-by: Marcin Juszkiewicz <marcin@juszkiewicz.com.pl>
Signed-off-by: Jan Kara <jack@suse.cz>
2021-06-15 11:22:45 +02:00
Tor Vic
0236526d76 Makefile: lto: Pass -warn-stack-size only on LLD < 13.0.0
Since LLVM commit fc018eb, the '-warn-stack-size' flag has been dropped
[1], leading to the following error message when building with Clang-13
and LLD-13:

    ld.lld: error: -plugin-opt=-: ld.lld: Unknown command line argument
    '-warn-stack-size=2048'.  Try: 'ld.lld --help'
    ld.lld: Did you mean '--asan-stack=2048'?

In the same way as with commit 2398ce8015 ("x86, lto: Pass
-stack-alignment only on LLD < 13.0.0") , make '-warn-stack-size'
conditional on LLD < 13.0.0.

[1] https://reviews.llvm.org/D103928

Fixes: 24845dcb17 ("Makefile: LTO: have linker check -Wframe-larger-than")
Cc: stable@vger.kernel.org
Link: https://github.com/ClangBuiltLinux/linux/issues/1377
Signed-off-by: Tor Vic <torvic9@mailbox.org>
Reviewed-by: Nathan Chancellor <nathan@kernel.org>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/7631bab7-a8ab-f884-ab54-f4198976125c@mailbox.org
2021-06-14 14:52:38 -07:00
Subash Abhinov Kasiviswanathan
2214fb5300 net: mhi_net: Update the transmit handler prototype
Update the function prototype of mhi_ndo_xmit to match
ndo_start_xmit. This otherwise leads to run time failures when
CFI is enabled in kernel.

Fixes: 3ffec6a14f ("net: Add mhi-net driver")
Signed-off-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-14 14:13:09 -07:00
Daniel Borkmann
973377ffe8 bpf, selftests: Adjust few selftest outcomes wrt unreachable code
In almost all cases from test_verifier that have been changed in here, we've
had an unreachable path with a load from a register which has an invalid
address on purpose. This was basically to make sure that we never walk this
path and to have the verifier complain if it would otherwise. Change it to
match on the right error for unprivileged given we now test these paths
under speculative execution.

There's one case where we match on exact # of insns_processed. Due to the
extra path, this will of course mismatch on unprivileged. Thus, restrict the
test->insn_processed check to privileged-only.

In one other case, we result in a 'pointer comparison prohibited' error. This
is similarly due to verifying an 'invalid' branch where we end up with a value
pointer on one side of the comparison.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
2021-06-14 23:06:38 +02:00
Daniel Borkmann
9183671af6 bpf: Fix leakage under speculation on mispredicted branches
The verifier only enumerates valid control-flow paths and skips paths that
are unreachable in the non-speculative domain. And so it can miss issues
under speculative execution on mispredicted branches.

For example, a type confusion has been demonstrated with the following
crafted program:

  // r0 = pointer to a map array entry
  // r6 = pointer to readable stack slot
  // r9 = scalar controlled by attacker
  1: r0 = *(u64 *)(r0) // cache miss
  2: if r0 != 0x0 goto line 4
  3: r6 = r9
  4: if r0 != 0x1 goto line 6
  5: r9 = *(u8 *)(r6)
  6: // leak r9

Since line 3 runs iff r0 == 0 and line 5 runs iff r0 == 1, the verifier
concludes that the pointer dereference on line 5 is safe. But: if the
attacker trains both the branches to fall-through, such that the following
is speculatively executed ...

  r6 = r9
  r9 = *(u8 *)(r6)
  // leak r9

... then the program will dereference an attacker-controlled value and could
leak its content under speculative execution via side-channel. This requires
to mistrain the branch predictor, which can be rather tricky, because the
branches are mutually exclusive. However such training can be done at
congruent addresses in user space using different branches that are not
mutually exclusive. That is, by training branches in user space ...

  A:  if r0 != 0x0 goto line C
  B:  ...
  C:  if r0 != 0x0 goto line D
  D:  ...

... such that addresses A and C collide to the same CPU branch prediction
entries in the PHT (pattern history table) as those of the BPF program's
lines 2 and 4, respectively. A non-privileged attacker could simply brute
force such collisions in the PHT until observing the attack succeeding.

Alternative methods to mistrain the branch predictor are also possible that
avoid brute forcing the collisions in the PHT. A reliable attack has been
demonstrated, for example, using the following crafted program:

  // r0 = pointer to a [control] map array entry
  // r7 = *(u64 *)(r0 + 0), training/attack phase
  // r8 = *(u64 *)(r0 + 8), oob address
  // [...]
  // r0 = pointer to a [data] map array entry
  1: if r7 == 0x3 goto line 3
  2: r8 = r0
  // crafted sequence of conditional jumps to separate the conditional
  // branch in line 193 from the current execution flow
  3: if r0 != 0x0 goto line 5
  4: if r0 == 0x0 goto exit
  5: if r0 != 0x0 goto line 7
  6: if r0 == 0x0 goto exit
  [...]
  187: if r0 != 0x0 goto line 189
  188: if r0 == 0x0 goto exit
  // load any slowly-loaded value (due to cache miss in phase 3) ...
  189: r3 = *(u64 *)(r0 + 0x1200)
  // ... and turn it into known zero for verifier, while preserving slowly-
  // loaded dependency when executing:
  190: r3 &= 1
  191: r3 &= 2
  // speculatively bypassed phase dependency
  192: r7 += r3
  193: if r7 == 0x3 goto exit
  194: r4 = *(u8 *)(r8 + 0)
  // leak r4

As can be seen, in training phase (phase != 0x3), the condition in line 1
turns into false and therefore r8 with the oob address is overridden with
the valid map value address, which in line 194 we can read out without
issues. However, in attack phase, line 2 is skipped, and due to the cache
miss in line 189 where the map value is (zeroed and later) added to the
phase register, the condition in line 193 takes the fall-through path due
to prior branch predictor training, where under speculation, it'll load the
byte at oob address r8 (unknown scalar type at that point) which could then
be leaked via side-channel.

One way to mitigate these is to 'branch off' an unreachable path, meaning,
the current verification path keeps following the is_branch_taken() path
and we push the other branch to the verification stack. Given this is
unreachable from the non-speculative domain, this branch's vstate is
explicitly marked as speculative. This is needed for two reasons: i) if
this path is solely seen from speculative execution, then we later on still
want the dead code elimination to kick in in order to sanitize these
instructions with jmp-1s, and ii) to ensure that paths walked in the
non-speculative domain are not pruned from earlier walks of paths walked in
the speculative domain. Additionally, for robustness, we mark the registers
which have been part of the conditional as unknown in the speculative path
given there should be no assumptions made on their content.

The fix in here mitigates type confusion attacks described earlier due to
i) all code paths in the BPF program being explored and ii) existing
verifier logic already ensuring that given memory access instruction
references one specific data structure.

An alternative to this fix that has also been looked at in this scope was to
mark aux->alu_state at the jump instruction with a BPF_JMP_TAKEN state as
well as direction encoding (always-goto, always-fallthrough, unknown), such
that mixing of different always-* directions themselves as well as mixing of
always-* with unknown directions would cause a program rejection by the
verifier, e.g. programs with constructs like 'if ([...]) { x = 0; } else
{ x = 1; }' with subsequent 'if (x == 1) { [...] }'. For unprivileged, this
would result in only single direction always-* taken paths, and unknown taken
paths being allowed, such that the former could be patched from a conditional
jump to an unconditional jump (ja). Compared to this approach here, it would
have two downsides: i) valid programs that otherwise are not performing any
pointer arithmetic, etc, would potentially be rejected/broken, and ii) we are
required to turn off path pruning for unprivileged, where both can be avoided
in this work through pushing the invalid branch to the verification stack.

The issue was originally discovered by Adam and Ofek, and later independently
discovered and reported as a result of Benedict and Piotr's research work.

Fixes: b2157399cc ("bpf: prevent out-of-bounds speculation")
Reported-by: Adam Morrison <mad@cs.tau.ac.il>
Reported-by: Ofek Kirzner <ofekkir@gmail.com>
Reported-by: Benedict Schlueter <benedict.schlueter@rub.de>
Reported-by: Piotr Krysiuk <piotras@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: John Fastabend <john.fastabend@gmail.com>
Reviewed-by: Benedict Schlueter <benedict.schlueter@rub.de>
Reviewed-by: Piotr Krysiuk <piotras@gmail.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
2021-06-14 23:06:10 +02:00
Daniel Borkmann
fe9a5ca7e3 bpf: Do not mark insn as seen under speculative path verification
... in such circumstances, we do not want to mark the instruction as seen given
the goal is still to jmp-1 rewrite/sanitize dead code, if it is not reachable
from the non-speculative path verification. We do however want to verify it for
safety regardless.

With the patch as-is all the insns that have been marked as seen before the
patch will also be marked as seen after the patch (just with a potentially
different non-zero count). An upcoming patch will also verify paths that are
unreachable in the non-speculative domain, hence this extension is needed.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: John Fastabend <john.fastabend@gmail.com>
Reviewed-by: Benedict Schlueter <benedict.schlueter@rub.de>
Reviewed-by: Piotr Krysiuk <piotras@gmail.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
2021-06-14 23:06:06 +02:00
Daniel Borkmann
d203b0fd86 bpf: Inherit expanded/patched seen count from old aux data
Instead of relying on current env->pass_cnt, use the seen count from the
old aux data in adjust_insn_aux_data(), and expand it to the new range of
patched instructions. This change is valid given we always expand 1:n
with n>=1, so what applies to the old/original instruction needs to apply
for the replacement as well.

Not relying on env->pass_cnt is a prerequisite for a later change where we
want to avoid marking an instruction seen when verified under speculative
execution path.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: John Fastabend <john.fastabend@gmail.com>
Reviewed-by: Benedict Schlueter <benedict.schlueter@rub.de>
Reviewed-by: Piotr Krysiuk <piotras@gmail.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
2021-06-14 23:06:00 +02:00
David S. Miller
45deacc731 Merge tag 'for-net-2021-06-14' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth
Luiz Augusto von Dentz says:

====================
bluetooth pull request for net:

 - Fix crash on SMP when debug is enabled
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-14 14:00:57 -07:00
Odin Ugedal
a7b359fc6a sched/fair: Correctly insert cfs_rq's to list on unthrottle
Fix an issue where fairness is decreased since cfs_rq's can end up not
being decayed properly. For two sibling control groups with the same
priority, this can often lead to a load ratio of 99/1 (!!).

This happens because when a cfs_rq is throttled, all the descendant
cfs_rq's will be removed from the leaf list. When they initial cfs_rq
is unthrottled, it will currently only re add descendant cfs_rq's if
they have one or more entities enqueued. This is not a perfect
heuristic.

Instead, we insert all cfs_rq's that contain one or more enqueued
entities, or it its load is not completely decayed.

Can often lead to situations like this for equally weighted control
groups:

  $ ps u -C stress
  USER         PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
  root       10009 88.8  0.0   3676   100 pts/1    R+   11:04   0:13 stress --cpu 1
  root       10023  3.0  0.0   3676   104 pts/1    R+   11:04   0:00 stress --cpu 1

Fixes: 31bc6aeaab ("sched/fair: Optimize update_blocked_averages()")
[vingo: !SMP build fix]
Signed-off-by: Odin Ugedal <odin@uged.al>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org>
Link: https://lore.kernel.org/r/20210612112815.61678-1-odin@uged.al
2021-06-14 22:58:47 +02:00
Luiz Augusto von Dentz
995fca15b7 Bluetooth: SMP: Fix crash when receiving new connection when debug is enabled
When receiving a new connection pchan->conn won't be initialized so the
code cannot use bt_dev_dbg as the pointer to hci_dev won't be
accessible.

Fixes: 2e1614f7d6 ("Bluetooth: SMP: Convert BT_ERR/BT_DBG to bt_dev_err/bt_dev_dbg")
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2021-06-14 22:16:27 +02:00
Pavel Skripkin
ad9d24c942 net: qrtr: fix OOB Read in qrtr_endpoint_post
Syzbot reported slab-out-of-bounds Read in
qrtr_endpoint_post. The problem was in wrong
_size_ type:

	if (len != ALIGN(size, 4) + hdrlen)
		goto err;

If size from qrtr_hdr is 4294967293 (0xfffffffd), the result of
ALIGN(size, 4) will be 0. In case of len == hdrlen and size == 4294967293
in header this check won't fail and

	skb_put_data(skb, data + hdrlen, size);

will read out of bound from data, which is hdrlen allocated block.

Fixes: 194ccc8829 ("net: qrtr: Support decoding incoming v2 packets")
Reported-and-tested-by: syzbot+1917d778024161609247@syzkaller.appspotmail.com
Signed-off-by: Pavel Skripkin <paskripkin@gmail.com>
Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-14 13:01:26 -07:00
David Ahern
b87b04f501 ipv4: Fix device used for dst_alloc with local routes
Oliver reported a use case where deleting a VRF device can hang
waiting for the refcnt to drop to 0. The root cause is that the dst
is allocated against the VRF device but cached on the loopback
device.

The use case (added to the selftests) has an implicit VRF crossing
due to the ordering of the FIB rules (lookup local is before the
l3mdev rule, but the problem occurs even if the FIB rules are
re-ordered with local after l3mdev because the VRF table does not
have a default route to terminate the lookup). The end result is
is that the FIB lookup returns the loopback device as the nexthop,
but the ingress device is in a VRF. The mismatch causes the dst
alloc against the VRF device but then cached on the loopback.

The fix is to bring the trick used for IPv6 (see ip6_rt_get_dev_rcu):
pick the dst alloc device based the fib lookup result but with checks
that the result has a nexthop device (e.g., not an unreachable or
prohibit entry).

Fixes: f5a0aab84b ("net: ipv4: dst for local input routes should use l3mdev if relevant")
Reported-by: Oliver Herms <oliver.peter.herms@gmail.com>
Signed-off-by: David Ahern <dsahern@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-14 12:30:53 -07:00
Pavel Skripkin
58af3d3d54 net: caif: fix memory leak in ldisc_open
Syzbot reported memory leak in tty_init_dev().
The problem was in unputted tty in ldisc_open()

static int ldisc_open(struct tty_struct *tty)
{
...
	ser->tty = tty_kref_get(tty);
...
	result = register_netdevice(dev);
	if (result) {
		rtnl_unlock();
		free_netdev(dev);
		return -ENODEV;
	}
...
}

Ser pointer is netdev private_data, so after free_netdev()
this pointer goes away with unputted tty reference. So, fix
it by adding tty_kref_put() before freeing netdev.

Reported-and-tested-by: syzbot+f303e045423e617d2cad@syzkaller.appspotmail.com
Signed-off-by: Pavel Skripkin <paskripkin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-14 12:28:16 -07:00
Rahul Lakkireddy
09427c1915 cxgb4: fix wrong ethtool n-tuple rule lookup
The TID returned during successful filter creation is relative to
the region in which the filter is created. Using it directly always
returns Hi Prio/Normal filter region's entry for the first couple of
entries, even though the rule is actually inserted in Hash region.
Fix by analyzing in which region the filter has been inserted and
save the absolute TID to be used for lookup later.

Fixes: db43b30cd8 ("cxgb4: add ethtool n-tuple filter deletion")
Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-14 12:17:57 -07:00
Christophe JAILLET
49a10c7b17 netxen_nic: Fix an error handling path in 'netxen_nic_probe()'
If an error occurs after a 'pci_enable_pcie_error_reporting()' call, it
must be undone by a corresponding 'pci_disable_pcie_error_reporting()'
call, as already done in the remove function.

Fixes: e87ad55393 ("netxen: support pci error handlers")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-14 12:17:01 -07:00
Christophe JAILLET
cb3376604a qlcnic: Fix an error handling path in 'qlcnic_probe()'
If an error occurs after a 'pci_enable_pcie_error_reporting()' call, it
must be undone by a corresponding 'pci_disable_pcie_error_reporting()'
call, as already done in the remove function.

Fixes: 451724c821 ("qlcnic: aer support")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-14 12:16:07 -07:00
Jakub Kicinski
e175aef902 ethtool: strset: fix message length calculation
Outer nest for ETHTOOL_A_STRSET_STRINGSETS is not accounted for.
This may result in ETHTOOL_MSG_STRSET_GET producing a warning like:

    calculated message payload length (684) not sufficient
    WARNING: CPU: 0 PID: 30967 at net/ethtool/netlink.c:369 ethnl_default_doit+0x87a/0xa20

and a splat.

As usually with such warnings three conditions must be met for the warning
to trigger:
 - there must be no skb size rounding up (e.g. reply_size of 684);
 - string set must be per-device (so that the header gets populated);
 - the device name must be at least 12 characters long.

all in all with current user space it looks like reading priv flags
is the only place this could potentially happen. Or with syzbot :)

Reported-by: syzbot+59aa77b92d06cd5a54f2@syzkaller.appspotmail.com
Fixes: 71921690f9 ("ethtool: provide string sets with STRSET_GET request")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-14 12:14:24 -07:00
Alex Elder
994c393bb6 net: qualcomm: rmnet: don't over-count statistics
The purpose of the loop using u64_stats_fetch_*_irq() is to ensure
statistics on a given CPU are collected atomically. If one of the
statistics values gets updated within the begin/retry window, the
loop will run again.

Currently the statistics totals are updated inside that window.
This means that if the loop ever retries, the statistics for the
CPU will be counted more than once.

Fix this by taking a snapshot of a CPU's statistics inside the
protected window, and then updating the counters with the snapshot
values after exiting the loop.

(Also add a newline at the end of this file...)

Fixes: 192c4b5d48 ("net: qualcomm: rmnet: Add support for 64 bit stats")
Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-14 12:13:38 -07:00
Tyson Moore
4f667b8e04 sch_cake: revise docs for RFC 8622 LE PHB support
Commit b8392808eb ("sch_cake: add RFC 8622 LE PHB support to CAKE
diffserv handling") added the LE mark to the Bulk tin. Update the
comments to reflect the change.

Signed-off-by: Tyson Moore <tyson@tyson.me>
Acked-by: Toke Høiland-Jørgensen <toke@toke.dk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-14 12:10:57 -07:00
Haibo Chen
f422316c8e spi: spi-nxp-fspi: move the register operation after the clock enable
Move the register operation after the clock enable, otherwise system
will stuck when this driver probe.

Fixes: 71d80563b0 ("spi: spi-nxp-fspi: fix fspi panic by unexpected interrupts")
Signed-off-by: Haibo Chen <haibo.chen@nxp.com>
Link: https://lore.kernel.org/r/1623317073-25158-1-git-send-email-haibo.chen@nxp.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2021-06-14 15:02:01 +01:00
Viresh Kumar
771fac5e26 Revert "cpufreq: CPPC: Add support for frequency invariance"
This reverts commit 4c38f2df71.

There are few races in the frequency invariance support for CPPC driver,
namely the driver doesn't stop the kthread_work and irq_work on policy
exit during suspend/resume or CPU hotplug.

A proper fix won't be possible for the 5.13-rc, as it requires a lot of
changes. Lets revert the patch instead for now.

Fixes: 4c38f2df71 ("cpufreq: CPPC: Add support for frequency invariance")
Reported-by: Qian Cai <quic_qiancai@quicinc.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-06-14 15:55:02 +02:00
Michael Ellerman
e41d6c3f4f powerpc/signal64: Copy siginfo before changing regs->nip
In commit 96d7a4e06f ("powerpc/signal64: Rewrite handle_rt_signal64()
to minimise uaccess switches") the 64-bit signal code was rearranged to
use user_write_access_begin/end().

As part of that change the call to copy_siginfo_to_user() was moved
later in the function, so that it could be done after the
user_write_access_end().

In particular it was moved after we modify regs->nip to point to the
signal trampoline. That means if copy_siginfo_to_user() fails we exit
handle_rt_signal64() with an error but with regs->nip modified, whereas
previously we would not modify regs->nip until the copy succeeded.

Returning an error from signal delivery but with regs->nip updated
leaves the process in a sort of half-delivered state. We do immediately
force a SEGV in signal_setup_done(), called from do_signal(), so the
process should never run in the half-delivered state.

However that SEGV is not delivered until we've gone around to
do_notify_resume() again, so it's possible some tracing could observe
the half-delivered state.

There are other cases where we fail signal delivery with regs partly
updated, eg. the write to newsp and SA_SIGINFO, but the latter at least
is very unlikely to fail as it reads back from the frame we just wrote
to.

Looking at other arches they seem to be more careful about leaving regs
unchanged until the copy operations have succeeded, and in general that
seems like good hygenie.

So although the current behaviour is not cleary buggy, it's also not
clearly correct. So move the call to copy_siginfo_to_user() up prior to
the modification of regs->nip, which is closer to the old behaviour, and
easier to reason about.

Fixes: 96d7a4e06f ("powerpc/signal64: Rewrite handle_rt_signal64() to minimise uaccess switches")
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210608134605.2783677-1-mpe@ellerman.id.au
2021-06-14 22:14:54 +10:00
Neil Armstrong
103a5348c2 mmc: meson-gx: use memcpy_to/fromio for dram-access-quirk
It has been reported that usage of memcpy() to/from an iomem mapping is invalid,
and a recent arm64 memcpy update [1] triggers a memory abort when dram-access-quirk
is used on the G12A/G12B platforms.

This adds a local sg_copy_to_buffer which makes usage of io versions of memcpy
when dram-access-quirk is enabled.

[1] 285133040e ("arm64: Import latest memcpy()/memmove() implementation")

Fixes: acdc8e71d9 ("mmc: meson-gx: add dram-access-quirk")
Reported-by: Marek Szyprowski <m.szyprowski@samsung.com>
Suggested-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Neil Armstrong <narmstrong@baylibre.com>
Tested-by: Marek Szyprowski <m.szyprowski@samsung.com>
Link: https://lore.kernel.org/r/20210609150230.9291-1-narmstrong@baylibre.com
Cc: stable@vger.kernel.org
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2021-06-14 14:02:33 +02:00
Matthew Bobrowski
f644bc449b fanotify: fix copy_event_to_user() fid error clean up
Ensure that clean up is performed on the allocated file descriptor and
struct file object in the event that an error is encountered while copying
fid info objects. Currently, we return directly to the caller when an error
is experienced in the fid info copying helper, which isn't ideal given that
the listener process could be left with a dangling file descriptor in their
fdtable.

Fixes: 5e469c830f ("fanotify: copy event fid info to user")
Fixes: 44d705b037 ("fanotify: report name info for FAN_DIR_MODIFY event")
Link: https://lore.kernel.org/linux-fsdevel/YMKv1U7tNPK955ho@google.com/T/#m15361cd6399dad4396aad650de25dbf6b312288e
Link: https://lore.kernel.org/r/1ef8ae9100101eb1a91763c516c2e9a3a3b112bd.1623376346.git.repnop@google.com
Signed-off-by: Matthew Bobrowski <repnop@google.com>
Signed-off-by: Jan Kara <jack@suse.cz>
2021-06-14 12:16:37 +02:00
Linus Torvalds
009c9aa5be Linux 5.13-rc6 2021-06-13 14:43:10 -07:00
Linus Torvalds
e4e453434a Merge tag 'perf-tools-fixes-for-v5.13-2021-06-13' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux
Pull perf tools fixes from Arnaldo Carvalho de Melo:

 - Correct buffer copying when peeking events

 - Sync cpufeatures/disabled-features.h header with the kernel sources

* tag 'perf-tools-fixes-for-v5.13-2021-06-13' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux:
  tools headers cpufeatures: Sync with the kernel sources
  perf session: Correct buffer copying when peeking events
2021-06-13 12:41:47 -07:00
Linus Torvalds
960f0716d8 Merge tag 'nfs-for-5.13-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs
Pull NFS client bugfixes from Trond Myklebust:
 "Highlights include:

  Stable fixes:

   - Fix use-after-free in nfs4_init_client()

  Bugfixes:

   - Fix deadlock between nfs4_evict_inode() and nfs4_opendata_get_inode()

   - Fix second deadlock in nfs4_evict_inode()

   - nfs4_proc_set_acl should not change the value of NFS_CAP_UIDGID_NOMAP

   - Fix setting of the NFS_CAP_SECURITY_LABEL capability"

* tag 'nfs-for-5.13-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
  NFSv4: Fix second deadlock in nfs4_evict_inode()
  NFSv4: Fix deadlock between nfs4_evict_inode() and nfs4_opendata_get_inode()
  NFS: FMODE_READ and friends are C macros, not enum types
  NFS: Fix a potential NULL dereference in nfs_get_client()
  NFS: Fix use-after-free in nfs4_init_client()
  NFS: Ensure the NFS_CAP_SECURITY_LABEL capability is set when appropriate
  NFSv4: nfs4_proc_set_acl needs to restore NFS_CAP_UIDGID_NOMAP on error.
2021-06-13 12:32:59 -07:00
Linus Torvalds
331a6edb30 Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI fixes from James Bottomley:
 "Four reasonably small fixes to the core for scsi host allocation
  failure paths.

  The root problem is that we're not freeing the memory allocated by
  dev_set_name(), which involves a rejig of may of the free on error
  paths to do put_device() instead of kfree which, in turn, has several
  other knock on ramifications and inspection turned up a few other
  lurking bugs"

* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
  scsi: core: Only put parent device if host state differs from SHOST_CREATED
  scsi: core: Put .shost_dev in failure path if host state changes to RUNNING
  scsi: core: Fix failure handling of scsi_add_host_with_dma()
  scsi: core: Fix error handling of scsi_host_alloc()
2021-06-13 12:25:33 -07:00
Randy Dunlap
01f5315dd7 riscv: sifive: fix Kconfig errata warning
The SOC_SIFIVE Kconfig entry unconditionally selects ERRATA_SIFIVE.
However, ERRATA_SIFIVE depends on RISCV_ERRATA_ALTERNATIVE, which is
not set, so SOC_SIFIVE should either depend on or select
RISCV_ERRATA_ALTERNATIVE. Use 'select' here to quieten the Kconfig
warning.

WARNING: unmet direct dependencies detected for ERRATA_SIFIVE
  Depends on [n]: RISCV_ERRATA_ALTERNATIVE [=n]
  Selected by [y]:
  - SOC_SIFIVE [=y]

Fixes: 1a0e5dbd37 ("riscv: sifive: Add SiFive alternative ports")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: linux-riscv@lists.infradead.org
Cc: Vincent Chen <vincent.chen@sifive.com>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-06-12 17:20:50 -07:00
Khem Raj
5d2388dbf8 riscv32: Use medany C model for modules
When CONFIG_CMODEL_MEDLOW is used it ends up generating riscv_hi20_rela
relocations in modules which are not resolved during runtime and
following errors would be seen

[    4.802714] virtio_input: target 00000000c1539090 can not be addressed by the 32-bit offset from PC = 39148b7b
[    4.854800] virtio_input: target 00000000c1539090 can not be addressed by the 32-bit offset from PC = 9774456d

Signed-off-by: Khem Raj <raj.khem@gmail.com>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-06-12 17:20:49 -07:00
Linus Torvalds
8ecfa36cd4 Merge tag 'riscv-for-linus-5.13-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux
Pull RISC-V fixes from Palmer Dabbelt:

 - A pair of XIP fixes: one to fix alternatives, and one to turn off the
   rest of the features that require code modification

 - A fix to a type that was causing some alternatives to break

 - A build fix for BUILTIN_DTB

* tag 'riscv-for-linus-5.13-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
  riscv: Fix BUILTIN_DTB for sifive and microchip soc
  riscv: alternative: fix typo in macro name
  riscv: code patching only works on !XIP_KERNEL
  riscv: xip: support runtime trap patching
2021-06-12 13:57:49 -07:00
Feng Tang
2e3025434a mm: relocate 'write_protect_seq' in struct mm_struct
0day robot reported a 9.2% regression for will-it-scale mmap1 test
case[1], caused by commit 57efa1fe59 ("mm/gup: prevent gup_fast from
racing with COW during fork").

Further debug shows the regression is due to that commit changes the
offset of hot fields 'mmap_lock' inside structure 'mm_struct', thus some
cache alignment changes.

From the perf data, the contention for 'mmap_lock' is very severe and
takes around 95% cpu cycles, and it is a rw_semaphore

        struct rw_semaphore {
                atomic_long_t count;	/* 8 bytes */
                atomic_long_t owner;	/* 8 bytes */
                struct optimistic_spin_queue osq; /* spinner MCS lock */
                ...

Before commit 57efa1fe59 adds the 'write_protect_seq', it happens to
have a very optimal cache alignment layout, as Linus explained:

 "and before the addition of the 'write_protect_seq' field, the
  mmap_sem was at offset 120 in 'struct mm_struct'.

  Which meant that count and owner were in two different cachelines,
  and then when you have contention and spend time in
  rwsem_down_write_slowpath(), this is probably *exactly* the kind
  of layout you want.

  Because first the rwsem_write_trylock() will do a cmpxchg on the
  first cacheline (for the optimistic fast-path), and then in the
  case of contention, rwsem_down_write_slowpath() will just access
  the second cacheline.

  Which is probably just optimal for a load that spends a lot of
  time contended - new waiters touch that first cacheline, and then
  they queue themselves up on the second cacheline."

After the commit, the rw_semaphore is at offset 128, which means the
'count' and 'owner' fields are now in the same cacheline, and causes
more cache bouncing.

Currently there are 3 "#ifdef CONFIG_XXX" before 'mmap_lock' which will
affect its offset:

  CONFIG_MMU
  CONFIG_MEMBARRIER
  CONFIG_HAVE_ARCH_COMPAT_MMAP_BASES

The layout above is on 64 bits system with 0day's default kernel config
(similar to RHEL-8.3's config), in which all these 3 options are 'y'.
And the layout can vary with different kernel configs.

Relayouting a structure is usually a double-edged sword, as sometimes it
can helps one case, but hurt other cases.  For this case, one solution
is, as the newly added 'write_protect_seq' is a 4 bytes long seqcount_t
(when CONFIG_DEBUG_LOCK_ALLOC=n), placing it into an existing 4 bytes
hole in 'mm_struct' will not change other fields' alignment, while
restoring the regression.

Link: https://lore.kernel.org/lkml/20210525031636.GB7744@xsang-OptiPlex-9020/ [1]
Reported-by: kernel test robot <oliver.sang@intel.com>
Signed-off-by: Feng Tang <feng.tang@intel.com>
Reviewed-by: John Hubbard <jhubbard@nvidia.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Cc: Peter Xu <peterx@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-06-12 13:28:50 -07:00
Changbin Du
ea6932d70e net: make get_net_ns return error if NET_NS is disabled
There is a panic in socket ioctl cmd SIOCGSKNS when NET_NS is not enabled.
The reason is that nsfs tries to access ns->ops but the proc_ns_operations
is not implemented in this case.

[7.670023] Unable to handle kernel NULL pointer dereference at virtual address 00000010
[7.670268] pgd = 32b54000
[7.670544] [00000010] *pgd=00000000
[7.671861] Internal error: Oops: 5 [#1] SMP ARM
[7.672315] Modules linked in:
[7.672918] CPU: 0 PID: 1 Comm: systemd Not tainted 5.13.0-rc3-00375-g6799d4f2da49 #16
[7.673309] Hardware name: Generic DT based system
[7.673642] PC is at nsfs_evict+0x24/0x30
[7.674486] LR is at clear_inode+0x20/0x9c

The same to tun SIOCGSKNS command.

To fix this problem, we make get_net_ns() return -EINVAL when NET_NS is
disabled. Meanwhile move it to right place net/core/net_namespace.c.

Signed-off-by: Changbin Du <changbin.du@gmail.com>
Fixes: c62cce2cae ("net: add an ioctl to get a socket network namespace")
Cc: Cong Wang <xiyou.wangcong@gmail.com>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: David Laight <David.Laight@ACULAB.COM>
Cc: Christian Brauner <christian.brauner@ubuntu.com>
Suggested-by: Jakub Kicinski <kuba@kernel.org>
Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-12 13:13:08 -07:00
Linus Torvalds
43cb5d49a9 Merge tag 'usb-5.13-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
Pull USB fixes from Greg KH:
 "Here are a number of tiny USB fixes for 5.13-rc6.

  There are more than I would normally like, but there's been a bunch of
  people banging on the gadget and dwc3 and typec code recently for I
  think an Android release, which has resulted in a number of small
  fixes. It's nice to see companies send fixes upstream for this type of
  work, a notable change from years ago.

  Anyway, fixes in here are:

   - usb-serial device id updates

   - usb-serial cp210x driver fixes for broken firmware versions

   - typec fixes for crazy charging devices and other reported problems

   - dwc3 fixes for reported problems found

   - gadget fixes for reported problems

   - tiny xhci fixes

   - other small fixes for reported issues.

   - revert of a problem fix found by linux-next testing

  All of these have passed 0-day and linux-next testing with no reported
  problems (the revert for the found linux-next build problem included)"

* tag 'usb-5.13-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (44 commits)
  Revert "usb: gadget: fsl: Re-enable driver for ARM SoCs"
  usb: typec: mux: Fix copy-paste mistake in typec_mux_match
  usb: typec: ucsi: Clear PPM capability data in ucsi_init() error path
  usb: gadget: fsl: Re-enable driver for ARM SoCs
  usb: typec: wcove: Use LE to CPU conversion when accessing msg->header
  USB: serial: cp210x: fix CP2102N-A01 modem control
  USB: serial: cp210x: fix alternate function for CP2102N QFN20
  usb: misc: brcmstb-usb-pinmap: check return value after calling platform_get_resource()
  usb: dwc3: ep0: fix NULL pointer exception
  usb: gadget: eem: fix wrong eem header operation
  usb: typec: intel_pmc_mux: Put ACPI device using acpi_dev_put()
  usb: typec: intel_pmc_mux: Add missed error check for devm_ioremap_resource()
  usb: typec: intel_pmc_mux: Put fwnode in error case during ->probe()
  usb: typec: tcpm: Do not finish VDM AMS for retrying Responses
  usb: fix various gadget panics on 10gbps cabling
  usb: fix various gadgets null ptr deref on 10gbps cabling.
  usb: pci-quirks: disable D3cold on xhci suspend for s2idle on AMD Renoir
  usb: f_ncm: only first packet of aggregate needs to start timer
  USB: f_ncm: ncm_bitrate (speed) is unsigned
  MAINTAINERS: usb: add entry for isp1760
  ...
2021-06-12 12:34:49 -07:00
Linus Torvalds
c46fe4aa82 Merge tag 'tty-5.13-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty
Pull serial driver fix from Greg KH:
 "A single 8250_exar serial driver fix for a reported problem with a
  change that happened in 5.13-rc1.

  It has been in linux-next with no reported problems"

* tag 'tty-5.13-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
  serial: 8250_exar: Avoid NULL pointer dereference at ->exit()
2021-06-12 12:27:05 -07:00
Linus Torvalds
0d50658834 Merge tag 'staging-5.13-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging
Pull staging driver fixes from Greg KH:
 "Two tiny staging driver fixes:

   - ralink-gdma driver authorship information fixed up

   - rtl8723bs driver fix for reported regression

  Both have been in linux-next for a while with no reported problems"

* tag 'staging-5.13-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
  staging: ralink-gdma: Remove incorrect author information
  staging: rtl8723bs: Fix uninitialized variables
2021-06-12 12:23:54 -07:00
Linus Torvalds
87a7f7368b Merge tag 'driver-core-5.13-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
Pull driver core fix from Greg KH:
 "A single debugfs fix for 5.13-rc6, fixing a bug in
  debugfs_read_file_str() that showed up in 5.13-rc1.

  It has been in linux-next for a full week with no
  reported problems"

* tag 'driver-core-5.13-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
  debugfs: Fix debugfs_read_file_str()
2021-06-12 12:18:49 -07:00
Linus Torvalds
1dfa2e77bb Merge tag 'char-misc-5.13-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc
Pull char/misc driver fixes from Greg KH:
 "Here are some small misc driver fixes for 5.13-rc6 that fix some
  reported problems:

   - Tiny phy driver fixes for reported issues

   - rtsx regression for when the device suspended

   - mhi driver fix for a use-after-free

  All of these have been in linux-next for a few days with no reported
  issues"

* tag 'char-misc-5.13-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
  misc: rtsx: separate aspm mode into MODE_REG and MODE_CFG
  bus: mhi: pci-generic: Fix hibernation
  bus: mhi: pci_generic: Fix possible use-after-free in mhi_pci_remove()
  bus: mhi: pci_generic: T99W175: update channel name from AT to DUN
  phy: Sparx5 Eth SerDes: check return value after calling platform_get_resource()
  phy: ralink: phy-mt7621-pci: drop 'of_match_ptr' to fix -Wunused-const-variable
  phy: ti: Fix an error code in wiz_probe()
  phy: phy-mtk-tphy: Fix some resource leaks in mtk_phy_init()
  phy: cadence: Sierra: Fix error return code in cdns_sierra_phy_probe()
  phy: usb: Fix misuse of IS_ENABLED
2021-06-12 12:13:55 -07:00
Linus Torvalds
141415d737 Merge tag 'pinctrl-v5.13-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl
Pull pin control fixes from Linus Walleij:

 - Fix some documentation warnings for Allwinner

 - Fix duplicated GPIO groups on Qualcomm SDX55

 - Fix a double enablement bug in the Ralink driver

 - Fix the Qualcomm SC8180x Kconfig so the driver can be selected.

* tag 'pinctrl-v5.13-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl:
  pinctrl: qcom: Make it possible to select SC8180x TLMM
  pinctrl: ralink: rt2880: avoid to error in calls is pin is already enabled
  pinctrl: qcom: Fix duplication in gpio_groups
  pinctrl: aspeed: Fix minor documentation error
2021-06-12 12:06:24 -07:00
Linus Torvalds
efc1fd601a Merge tag 'block-5.13-2021-06-12' of git://git.kernel.dk/linux-block
Pull block fixes from Jens Axboe:
 "A few fixes that should go into 5.13:

   - Fix a regression deadlock introduced in this release between open
     and remove of a bdev (Christoph)

   - Fix an async_xor md regression in this release (Xiao)

   - Fix bcache oversized read issue (Coly)"

* tag 'block-5.13-2021-06-12' of git://git.kernel.dk/linux-block:
  block: loop: fix deadlock between open and remove
  async_xor: check src_offs is not NULL before updating it
  bcache: avoid oversized read request in cache missing code path
  bcache: remove bcache device self-defined readahead
2021-06-12 11:59:58 -07:00
Linus Torvalds
b2568eeb96 Merge tag 'io_uring-5.13-2021-06-12' of git://git.kernel.dk/linux-block
Pull io_uring fixes from Jens Axboe:
 "Just an API change for the registration changes that went into this
  release. Better to get it sorted out now than before it's too late"

* tag 'io_uring-5.13-2021-06-12' of git://git.kernel.dk/linux-block:
  io_uring: add feature flag for rsrc tags
  io_uring: change registration/upd/rsrc tagging ABI
2021-06-12 11:53:20 -07:00
Linus Torvalds
99f925947a Merge tag 'sched-urgent-2021-06-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler fixes from Ingo Molnar:
 "Misc fixes:

   - Fix performance regression caused by lack of intended batching of
     RCU callbacks by over-eager NOHZ-full code.

   - Fix cgroups related corruption of load_avg and load_sum metrics.

   - Three fixes to fix blocked load, util_sum/runnable_sum and util_est
     tracking bugs"

* tag 'sched-urgent-2021-06-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  sched/fair: Fix util_est UTIL_AVG_UNCHANGED handling
  sched/pelt: Ensure that *_sum is always synced with *_avg
  tick/nohz: Only check for RCU deferred wakeup on user/guest entry when needed
  sched/fair: Make sure to update tg contrib for blocked load
  sched/fair: Keep load_avg and load_sum synced
2021-06-12 11:41:28 -07:00
Linus Torvalds
191aaf6cc4 Merge tag 'perf-urgent-2021-06-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf fixes from Ingo Molnar:
 "Misc fixes:

   - Fix the NMI watchdog on ancient Intel CPUs

   - Remove a misguided, NMI-unsafe KASAN callback from the NMI-safe
     irq_work path used by perf.

   - Fix uncore events on Ice Lake servers.

   - Someone booted maxcpus=1 on an SNB-EP, and the uncore driver
     emitted warnings and was probably buggy. Fix it.

   - KCSAN found a genuine data race in the core perf code. Somewhat
     ironically the bug was introduced through a recent race fix. :-/
     In our defense, the new race window was much more narrow. Fix it"

* tag 'perf-urgent-2021-06-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/nmi_watchdog: Fix old-style NMI watchdog regression on old Intel CPUs
  irq_work: Make irq_work_queue() NMI-safe again
  perf/x86/intel/uncore: Fix M2M event umask for Ice Lake server
  perf/x86/intel/uncore: Fix a kernel WARNING triggered by maxcpus=1
  perf: Fix data race between pin_count increment/decrement
2021-06-12 11:34:49 -07:00
Linus Torvalds
768895fb77 Merge tag 'objtool-urgent-2021-06-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull objtool fixes from Ingo Molnar:
 "Two objtool fixes:

   - fix a bug that corrupts the code by mistakenly rewriting
     conditional jumps

   - fix another bug generating an incorrect ELF symbol table
     during retpoline rewriting"

* tag 'objtool-urgent-2021-06-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  objtool: Only rewrite unconditional retpoline thunk calls
  objtool: Fix .symtab_shndx handling for elf_create_undef_symbol()
2021-06-12 11:10:28 -07:00
Alexandre Ghiti
0ddd7eaffa riscv: Fix BUILTIN_DTB for sifive and microchip soc
Fix BUILTIN_DTB config which resulted in a dtb that was actually not
built into the Linux image: in the same manner as Canaan soc does,
create an object file from the dtb file that will get linked into the
Linux image.

Signed-off-by: Alexandre Ghiti <alex@ghiti.fr>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-06-11 21:07:09 -07:00
Linus Torvalds
ad347abe4a Merge tag 'trace-v5.13-rc5-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
Pull tracing fixes from Steven Rostedt:

 - Fix the length check in the temp buffer filter

 - Fix build failure in bootconfig tools for "fallthrough" macro

 - Fix error return of bootconfig apply_xbc() routine

* tag 'trace-v5.13-rc5-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
  tracing: Correct the length check which causes memory corruption
  ftrace: Do not blindly read the ip address in ftrace_bug()
  tools/bootconfig: Fix a build error accroding to undefined fallthrough
  tools/bootconfig: Fix error return code in apply_xbc()
2021-06-11 17:05:03 -07:00
Linus Torvalds
548843c096 Merge tag 'clang-features-v5.13-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux
Pull clang LTO fix from Kees Cook:
 "Clang 13 fixed some IR behavior for LTO, but this broke work-arounds
  used in the kernel.

  Handle changes to needed LTO flags in Clang 13 (Tor Vic)"

* tag 'clang-features-v5.13-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
  x86, lto: Pass -stack-alignment only on LLD < 13.0.0
2021-06-11 16:29:53 -07:00
Linus Torvalds
e65b7914b2 Merge tag 'gpio-fixes-for-v5.13-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux
Pull gpio fix from Bartosz Golaszewski:
 "Fix a shift-out-of-bounds error in gpio-wcd934x"

* tag 'gpio-fixes-for-v5.13-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux:
  gpio: wcd934x: Fix shift-out-of-bounds error
2021-06-11 16:27:18 -07:00
Jisheng Zhang
1adb20f0d4 net: stmmac: dwmac1000: Fix extended MAC address registers definition
The register starts from 0x800 is the 16th MAC address register rather
than the first one.

Fixes: cffb13f4d6 ("stmmac: extend mac addr reg and fix perfect filering")
Signed-off-by: Jisheng Zhang <Jisheng.Zhang@synaptics.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-11 13:05:55 -07:00
Linus Torvalds
f21b807c3c Merge tag 'drm-fixes-2021-06-11' of git://anongit.freedesktop.org/drm/drm
Pull drm fixes from Dave Airlie:
 "Another week of fixes, nothing too crazy, but a few all over the
  place.

  Two locking fixes in the core/ttm area, a couple of small driver fixes
  (radeon, sun4i, mcde, vc4). Then msm and amdgpu have a set of fixes
  each, mostly for smaller things, though the msm has a DSI fix for a
  black screen.

  I haven't seen any intel fixes this week so they may have a few that
  may or may not wait for next week.

  drm:
   - auth locking fix

  ttm:
   - locking fix

  amdgpu:
   - Use kvzmalloc in amdgu_bo_create
   - Use drm_dbg_kms for reporting failure to get a GEM FB
   - Fix some register offsets for Sienna Cichlid
   - Fix fall-through warning

  radeon:
   - memcpy_to/from_io fixes

  msm:
   - NULL ptr deref fix
   - CP_PROTECT reg programming fix
   - incorrect register shift fix
   - DSI blank screen fix

  sun4i:
   - hdmi output probing fix

  mcde:
   - DSI pipeline calc fix

  vc4:
   - out of bounds fix"

* tag 'drm-fixes-2021-06-11' of git://anongit.freedesktop.org/drm/drm:
  drm/msm/dsi: Stash away calculated vco frequency on recalc
  drm: Lock pointer access in drm_master_release()
  drm/mcde: Fix off by 10^3 in calculation
  drm/msm/a6xx: avoid shadow NULL reference in failure path
  drm/msm/a6xx: fix incorrectly set uavflagprd_inv field for A650
  drm/msm/a6xx: update/fix CP_PROTECT initialization
  radeon: use memcpy_to/fromio for UVD fw upload
  drm/amd/pm: Fix fall-through warning for Clang
  drm/amdgpu: Fix incorrect register offsets for Sienna Cichlid
  drm/amdgpu: Use drm_dbg_kms for reporting failure to get a GEM FB
  drm/amdgpu: switch kzalloc to kvzalloc in amdgpu_bo_create
  drm/msm: Init mm_list before accessing it for use_vram path
  drm: Fix use-after-free read in drm_getunique()
  drm/vc4: fix vc4_atomic_commit_tail() logic
  drm/ttm: fix deref of bo->ttm without holding the lock v2
  drm/sun4i: dw-hdmi: Make HDMI PHY into a platform device
2021-06-11 12:33:38 -07:00
David S. Miller
f4cdcae03f Merge branch 'cxgb4-fixes'
Rahul Lakkireddy says:

====================
cxgb4: bug fixes for ethtool flash ops

This series of patches add bug fixes in ethtool flash operations.

Patch 1 fixes an endianness issue when writing boot image to flash
after the device ID has been updated.

Patch 2 fixes sleep in atomic when writing PHY firmware to flash.

Patch 3 fixes issue with PHY firmware image not getting written to
flash when chip is still running.
-====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-11 11:15:01 -07:00
Rahul Lakkireddy
6d297540f7 cxgb4: halt chip before flashing PHY firmware image
When using firmware-assisted PHY firmware image write to flash,
halt the chip before beginning the flash write operation to allow
the running firmware to store the image persistently. Otherwise,
the running firmware will only store the PHY image in local on-chip
RAM, which will be lost after next reset.

Fixes: 4ee339e1e9 ("cxgb4: add support to flash PHY image")
Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-11 11:15:00 -07:00
Rahul Lakkireddy
f046bd0ae1 cxgb4: fix sleep in atomic when flashing PHY firmware
Before writing new PHY firmware to on-chip memory, driver queries
firmware for current running PHY firmware version, which can result
in sleep waiting for reply. So, move spinlock closer to the actual
on-chip memory write operation, instead of taking it at the callers.

Fixes: 5fff701c83 ("cxgb4: always sync access when flashing PHY firmware")
Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-11 11:15:00 -07:00
Rahul Lakkireddy
42a2039753 cxgb4: fix endianness when flashing boot image
Boot images are copied to memory and updated with current underlying
device ID before flashing them to adapter. Ensure the updated images
are always flashed in Big Endian to allow the firmware to read the
new images during boot properly.

Fixes: 550883558f ("cxgb4: add support to flash boot image")
Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-11 11:15:00 -07:00
Christophe JAILLET
33e381448c alx: Fix an error handling path in 'alx_probe()'
If an error occurs after a 'pci_enable_pcie_error_reporting()' call, it
must be undone by a corresponding 'pci_disable_pcie_error_reporting()'
call, as already done in the remove function.

Fixes: ab69bde6b2 ("alx: add a simple AR816x/AR817x device driver")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-11 11:12:54 -07:00
Linus Torvalds
929d931f2b Merge tag 'devicetree-fixes-for-5.13-3' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux
Pull devicetree fix from Rob Herring:
 "A single fix for broken media/renesas,drif.yaml binding schema"

* tag 'devicetree-fixes-for-5.13-3' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux:
  media: dt-bindings: media: renesas,drif: Fix fck definition
2021-06-11 11:02:56 -07:00
Jens Axboe
85f3f17b5d Merge branch 'md-fixes' of https://git.kernel.org/pub/scm/linux/kernel/git/song/md into block-5.13
Pull MD related fix from Song.

* 'md-fixes' of https://git.kernel.org/pub/scm/linux/kernel/git/song/md:
  async_xor: check src_offs is not NULL before updating it
2021-06-11 11:56:08 -06:00
Linus Torvalds
d17bcc5ede Merge tag 'acpi-5.13-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull ACPI fixes from Rafael Wysocki:
 "These revert a problematic recent commit and fix a regression
  introduced during the 5.12 development cycle.

  Specifics:

   - Revert recent commit that attempted to fix the FACS table reference
     counting but introduced a problem with accessing the hardware
     signature after hibernation (Zhang Rui).

   - Fix regression in the _OSC handling that broke the loading of ACPI
     tables on some systems (Mika Westerberg)"

* tag 'acpi-5.13-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  ACPI: Pass the same capabilities to the _OSC regardless of the query flag
  Revert "ACPI: sleep: Put the FACS table after using it"
2021-06-11 10:53:43 -07:00
Christoph Hellwig
990e78116d block: loop: fix deadlock between open and remove
Commit c76f48eb5c ("block: take bd_mutex around delete_partitions in
del_gendisk") adds disk->part0->bd_mutex in del_gendisk(), this way
causes the following AB/BA deadlock between removing loop and opening
loop:

 1) loop_control_ioctl(LOOP_CTL_REMOVE)
     -> mutex_lock(&loop_ctl_mutex)
     -> del_gendisk
         -> mutex_lock(&disk->part0->bd_mutex)

 2) blkdev_get_by_dev
     -> mutex_lock(&disk->part0->bd_mutex)
     -> lo_open
         -> mutex_lock(&loop_ctl_mutex)

Add a new Lo_deleting state to remove the need for clearing
->private_data and thus holding loop_ctl_mutex in the ioctl
LOOP_CTL_REMOVE path.

Based on an analysis and earlier patch from
Ming Lei <ming.lei@redhat.com>.

Reported-by: Colin Ian King <colin.king@canonical.com>
Fixes: c76f48eb5c ("block: take bd_mutex around delete_partitions in del_gendisk")
Signed-off-by: Christoph Hellwig <hch@lst.de>
Tested-by: Colin Ian King <colin.king@canonical.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Link: https://lore.kernel.org/r/20210605140950.5800-1-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-06-11 11:50:54 -06:00
Linus Torvalds
fd2cd569a4 Merge tag 'sound-5.13-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
 "A bit more commits than expected at this time, but likely it's the
  last shot before the final.

  Many of changes are device-specific fix-ups for various ASoC drivers,
  while a few usual HD-audio quirks and a FireWire fix, as well as a
  couple of ALSA / ASoC core fixes.

  All look nice and small, and nothing to scare much"

* tag 'sound-5.13-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
  ALSA: seq: Fix race of snd_seq_timer_open()
  ALSA: hda/realtek: fix mute/micmute LEDs for HP ZBook Power G8
  ALSA: hda/realtek: headphone and mic don't work on an Acer laptop
  ASoC: qcom: lpass-cpu: Fix pop noise during audio capture begin
  ALSA: firewire-lib: fix the context to call snd_pcm_stop_xrun()
  ALSA: hda/realtek: fix mute/micmute LEDs for HP EliteBook 840 Aero G8
  ALSA: hda/realtek: fix mute/micmute LEDs and speaker for HP EliteBook x360 1040 G8
  ALSA: hda/realtek: fix mute/micmute LEDs and speaker for HP Elite Dragonfly G2
  ASoC: rt5682: Fix the fast discharge for headset unplugging in soundwire mode
  ASoC: tas2562: Fix TDM_CFG0_SAMPRATE values
  ASoC: meson: gx-card: fix sound-dai dt schema
  ASoC: AMD Renoir: Remove fix for DMI entry on Lenovo 2020 platforms
  ASoC: AMD Renoir - add DMI entry for Lenovo 2020 AMD platforms
  ASoC: SOF: reset enabled_cores state at suspend
  ASoC: fsl-asoc-card: Set .owner attribute when registering card.
  ASoC: topology: Fix spelling mistake "vesion" -> "version"
  ASoC: rt5659: Fix the lost powers for the HDA header
  ASoC: core: Fix Null-point-dereference in fmt_single_name()
2021-06-11 10:47:10 -07:00
Tor Vic
2398ce8015 x86, lto: Pass -stack-alignment only on LLD < 13.0.0
Since LLVM commit 3787ee4, the '-stack-alignment' flag has been dropped
[1], leading to the following error message when building a LTO kernel
with Clang-13 and LLD-13:

    ld.lld: error: -plugin-opt=-: ld.lld: Unknown command line argument
    '-stack-alignment=8'.  Try 'ld.lld --help'
    ld.lld: Did you mean '--stackrealign=8'?

It also appears that the '-code-model' flag is not necessary anymore
starting with LLVM-9 [2].

Drop '-code-model' and make '-stack-alignment' conditional on LLD < 13.0.0.

These flags were necessary because these flags were not encoded in the
IR properly, so the link would restart optimizations without them. Now
there are properly encoded in the IR, and these flags exposing
implementation details are no longer necessary.

[1] https://reviews.llvm.org/D103048
[2] https://reviews.llvm.org/D52322

Cc: stable@vger.kernel.org
Link: https://github.com/ClangBuiltLinux/linux/issues/1377
Signed-off-by: Tor Vic <torvic9@mailbox.org>
Reviewed-by: Nathan Chancellor <nathan@kernel.org>
Tested-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/f2c018ee-5999-741e-58d4-e482d5246067@mailbox.org
2021-06-11 10:33:45 -07:00
Praneeth Bajjuri
da9ef50f54 net: phy: dp83867: perform soft reset and retain established link
Current logic is performing hard reset and causing the programmed
registers to be wiped out.

as per datasheet: https://www.ti.com/lit/ds/symlink/dp83867cr.pdf
8.6.26 Control Register (CTRL)

do SW_RESTART to perform a reset not including the registers,
If performed when link is already present,
it will drop the link and trigger re-auto negotiation.

Signed-off-by: Praneeth Bajjuri <praneeth@ti.com>
Signed-off-by: Geet Modi <geet.modi@ti.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-11 10:13:03 -07:00
Linus Torvalds
4244b5d872 Merge tag 'hwmon-for-v5.13-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging
Pull hwmon fixes from Guenter Roeck:
 "Fixes for tps23861, scpi-hwmon, and corsair-psu drivers, plus a
  bindings fix for TI ADS7828"

* tag 'hwmon-for-v5.13-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
  hwmon: (tps23861) correct shunt LSB values
  hwmon: (tps23861) set current shunt value
  hwmon: (tps23861) define regmap max register
  hwmon: (scpi-hwmon) shows the negative temperature properly
  hwmon: (corsair-psu) fix suspend behavior
  dt-bindings: hwmon: Fix typo in TI ADS7828 bindings
2021-06-11 10:07:50 -07:00
Linus Torvalds
f30dc8f94e Merge tag 'mmc-v5.13-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc
Pull MMC fixes from Ulf Hansson:
 "A couple of MMC fixes to the Renesas SDHI driver:

   - Fix HS400 on R-Car M3-W+

   - Abort tuning when timeout detected"

* tag 'mmc-v5.13-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc:
  mmc: renesas_sdhi: Fix HS400 on R-Car M3-W+
  mmc: renesas_sdhi: abort tuning when timeout detected
2021-06-11 10:02:30 -07:00
Rafael J. Wysocki
bc8865ab32 Merge branch 'acpi-bus'
* acpi-bus:
  ACPI: Pass the same capabilities to the _OSC regardless of the query flag
2021-06-11 17:57:24 +02:00
Sean Christopherson
654430efde KVM: x86/mmu: Calculate and check "full" mmu_role for nested MMU
Calculate and check the full mmu_role when initializing the MMU context
for the nested MMU, where "full" means the bits and pieces of the role
that aren't handled by kvm_calc_mmu_role_common().  While the nested MMU
isn't used for shadow paging, things like the number of levels in the
guest's page tables are surprisingly important when walking the guest
page tables.  Failure to reinitialize the nested MMU context if L2's
paging mode changes can result in unexpected and/or missed page faults,
and likely other explosions.

E.g. if an L1 vCPU is running both a 32-bit PAE L2 and a 64-bit L2, the
"common" role calculation will yield the same role for both L2s.  If the
64-bit L2 is run after the 32-bit PAE L2, L0 will fail to reinitialize
the nested MMU context, ultimately resulting in a bad walk of L2's page
tables as the MMU will still have a guest root_level of PT32E_ROOT_LEVEL.

  WARNING: CPU: 4 PID: 167334 at arch/x86/kvm/vmx/vmx.c:3075 ept_save_pdptrs+0x15/0xe0 [kvm_intel]
  Modules linked in: kvm_intel]
  CPU: 4 PID: 167334 Comm: CPU 3/KVM Not tainted 5.13.0-rc1-d849817d5673-reqs #185
  Hardware name: ASUS Q87M-E/Q87M-E, BIOS 1102 03/03/2014
  RIP: 0010:ept_save_pdptrs+0x15/0xe0 [kvm_intel]
  Code: <0f> 0b c3 f6 87 d8 02 00f
  RSP: 0018:ffffbba702dbba00 EFLAGS: 00010202
  RAX: 0000000000000011 RBX: 0000000000000002 RCX: ffffffff810a2c08
  RDX: ffff91d7bc30acc0 RSI: 0000000000000011 RDI: ffff91d7bc30a600
  RBP: ffff91d7bc30a600 R08: 0000000000000010 R09: 0000000000000007
  R10: 0000000000000000 R11: 0000000000000000 R12: ffff91d7bc30a600
  R13: ffff91d7bc30acc0 R14: ffff91d67c123460 R15: 0000000115d7e005
  FS:  00007fe8e9ffb700(0000) GS:ffff91d90fb00000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 0000000000000000 CR3: 000000029f15a001 CR4: 00000000001726e0
  Call Trace:
   kvm_pdptr_read+0x3a/0x40 [kvm]
   paging64_walk_addr_generic+0x327/0x6a0 [kvm]
   paging64_gva_to_gpa_nested+0x3f/0xb0 [kvm]
   kvm_fetch_guest_virt+0x4c/0xb0 [kvm]
   __do_insn_fetch_bytes+0x11a/0x1f0 [kvm]
   x86_decode_insn+0x787/0x1490 [kvm]
   x86_decode_emulated_instruction+0x58/0x1e0 [kvm]
   x86_emulate_instruction+0x122/0x4f0 [kvm]
   vmx_handle_exit+0x120/0x660 [kvm_intel]
   kvm_arch_vcpu_ioctl_run+0xe25/0x1cb0 [kvm]
   kvm_vcpu_ioctl+0x211/0x5a0 [kvm]
   __x64_sys_ioctl+0x83/0xb0
   do_syscall_64+0x40/0xb0
   entry_SYSCALL_64_after_hwframe+0x44/0xae

Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: stable@vger.kernel.org
Fixes: bf627a9288 ("x86/kvm/mmu: check if MMU reconfiguration is needed in init_kvm_nested_mmu()")
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20210610220026.1364486-1-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-06-11 11:54:49 -04:00
Arnaldo Carvalho de Melo
36524112ab tools headers cpufeatures: Sync with the kernel sources
To pick the changes in:

  fb35d30fe5 ("x86/cpufeatures: Assign dedicated feature word for CPUID_0x8000001F[EAX]")
  e7b6385b01 ("x86/cpufeatures: Add Intel SGX hardware bits")
  1478b99a76 ("x86/cpufeatures: Mark ENQCMD as disabled when configured out")

That don't cause any change in the tools, just silences this perf build
warning:

  Warning: Kernel ABI header at 'tools/arch/x86/include/asm/disabled-features.h' differs from latest version at 'arch/x86/include/asm/disabled-features.h'
  diff -u tools/arch/x86/include/asm/disabled-features.h arch/x86/include/asm/disabled-features.h

Cc: Borislav Petkov <bp@suse.de>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Sean Christopherson <seanjc@google.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-06-11 12:54:24 -03:00
Leo Yan
197eecb6ec perf session: Correct buffer copying when peeking events
When peeking an event, it has a short path and a long path.  The short
path uses the session pointer "one_mmap_addr" to directly fetch the
event; and the long path needs to read out the event header and the
following event data from file and fill into the buffer pointer passed
through the argument "buf".

The issue is in the long path that it copies the event header and event
data into the same destination address which pointer "buf", this means
the event header is overwritten.  We are just lucky to run into the
short path in most cases, so we don't hit the issue in the long path.

This patch adds the offset "hdr_sz" to the pointer "buf" when copying
the event data, so that it can reserve the event header which can be
used properly by its caller.

Fixes: 5a52f33adf ("perf session: Add perf_session__peek_event()")
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20210605052957.1070720-1-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-06-11 12:54:24 -03:00
Wanpeng Li
dfdc0a714d KVM: X86: Fix x86_emulator slab cache leak
Commit c9b8b07cde (KVM: x86: Dynamically allocate per-vCPU emulation context)
tries to allocate per-vCPU emulation context dynamically, however, the
x86_emulator slab cache is still exiting after the kvm module is unload
as below after destroying the VM and unloading the kvm module.

grep x86_emulator /proc/slabinfo
x86_emulator          36     36   2672   12    8 : tunables    0    0    0 : slabdata      3      3      0

This patch fixes this slab cache leak by destroying the x86_emulator slab cache
when the kvm module is unloaded.

Fixes: c9b8b07cde (KVM: x86: Dynamically allocate per-vCPU emulation context)
Cc: stable@vger.kernel.org
Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
Message-Id: <1623387573-5969-1-git-send-email-wanpengli@tencent.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-06-11 11:53:48 -04:00
Alper Gun
934002cd66 KVM: SVM: Call SEV Guest Decommission if ASID binding fails
Send SEV_CMD_DECOMMISSION command to PSP firmware if ASID binding
fails. If a failure happens after  a successful LAUNCH_START command,
a decommission command should be executed. Otherwise, guest context
will be unfreed inside the AMD SP. After the firmware will not have
memory to allocate more SEV guest context, LAUNCH_START command will
begin to fail with SEV_RET_RESOURCE_LIMIT error.

The existing code calls decommission inside sev_unbind_asid, but it is
not called if a failure happens before guest activation succeeds. If
sev_bind_asid fails, decommission is never called. PSP firmware has a
limit for the number of guests. If sev_asid_binding fails many times,
PSP firmware will not have resources to create another guest context.

Cc: stable@vger.kernel.org
Fixes: 59414c9892 ("KVM: SVM: Add support for KVM_SEV_LAUNCH_START command")
Reported-by: Peter Gonda <pgonda@google.com>
Signed-off-by: Alper Gun <alpergun@google.com>
Reviewed-by: Marc Orr <marcorr@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20210610174604.2554090-1-alpergun@google.com>
2021-06-11 11:52:48 -04:00
Greg Kroah-Hartman
7c4363d394 Merge tag 'usb-serial-5.13-rc6' of https://git.kernel.org/pub/scm/linux/kernel/git/johan/usb-serial into usb-linus
Johan writes:

USB-serial fixes for 5.13-rc6

Here are two fixes for the cp210x driver. The first fixes a regression
with early revisions of the CP2102N which specifically broke some ESP32
development boards. The second makes sure that the pin configuration is
detected properly also for the CP2102N QFN20 package.

Both have been in linux-next over night and with no reported issues.

* tag 'usb-serial-5.13-rc6' of https://git.kernel.org/pub/scm/linux/kernel/git/johan/usb-serial:
  USB: serial: cp210x: fix CP2102N-A01 modem control
  USB: serial: cp210x: fix alternate function for CP2102N QFN20
2021-06-11 12:32:49 +02:00
Greg Kroah-Hartman
abd062886c Revert "usb: gadget: fsl: Re-enable driver for ARM SoCs"
This reverts commit e0e8b6abe8.

Turns out this breaks the build.  We had numerous reports of problems
from linux-next and 0-day about this not working properly, so revert it
for now until it can be figured out properly.

The build errors are:
	arm-linux-gnueabi-ld: fsl_udc_core.c:(.text+0x29d4): undefined reference to `fsl_udc_clk_finalize'
	arm-linux-gnueabi-ld: fsl_udc_core.c:(.text+0x2ba8): undefined reference to `fsl_udc_clk_release'
	fsl_udc_core.c:(.text+0x2848): undefined reference to `fsl_udc_clk_init'
	fsl_udc_core.c:(.text+0xe88): undefined reference to `fsl_udc_clk_release'

Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Reported-by: kernel test robot <lkp@intel.com>
Fixes: e0e8b6abe8 ("usb: gadget: fsl: Re-enable driver for ARM SoCs")
Cc: stable <stable@vger.kernel.org>
Cc: Joel Stanley <joel@jms.id.au>
Cc: Leo Li <leoyang.li@nxp.com>
Cc: Peter Chen <peter.chen@nxp.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Felipe Balbi <balbi@kernel.org>
Cc: Shawn Guo <shawnguo@kernel.org>
Cc: Ran Wang <ran.wang_1@nxp.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-06-11 09:18:47 +02:00
Peter Zijlstra
2d49b721dc objtool: Only rewrite unconditional retpoline thunk calls
It turns out that the compilers generate conditional branches to the
retpoline thunks like:

  5d5:   0f 85 00 00 00 00       jne    5db <cpuidle_reflect+0x22>
	5d7: R_X86_64_PLT32     __x86_indirect_thunk_r11-0x4

while the rewrite can only handle JMP/CALL to the thunks. The result
is the alternative wrecking the code. Make sure to skip writing the
alternatives for conditional branches.

Fixes: 9bc0bb5072 ("objtool/x86: Rewrite retpoline thunk calls")
Reported-by: Lukasz Majczak <lma@semihalf.com>
Reported-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Nathan Chancellor <nathan@kernel.org>
2021-06-11 08:53:06 +02:00
Vitaly Wool
858cf86049 riscv: alternative: fix typo in macro name
alternative-macros.h defines ALT_NEW_CONTENT in its assembly part
and ALT_NEW_CONSTENT in the C part. Most likely it is the latter
that is wrong.

Fixes: 6f4eea9046
	(riscv: Introduce alternative mechanism to apply errata solution)
Signed-off-by: Vitaly Wool <vitaly.wool@konsulko.com>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-06-10 20:35:05 -07:00
Xiao Ni
9be148e408 async_xor: check src_offs is not NULL before updating it
When PAGE_SIZE is greater than 4kB, multiple stripes may share the same
page. Thus, src_offs is added to async_xor_offs() with array of offsets.
However, async_xor() passes NULL src_offs to async_xor_offs(). In such
case, src_offs should not be updated. Add a check before the update.

Fixes: ceaf2966ab08(async_xor: increase src_offs when dropping destination page)
Cc: stable@vger.kernel.org # v5.10+
Reported-by: Oleksandr Shchirskyi <oleksandr.shchirskyi@linux.intel.com>
Tested-by: Oleksandr Shchirskyi <oleksandr.shchirskyi@intel.com>
Signed-off-by: Xiao Ni <xni@redhat.com>
Signed-off-by: Song Liu <song@kernel.org>
2021-06-10 19:40:14 -07:00
Dave Airlie
7de5c0d70c Merge tag 'amd-drm-fixes-5.13-2021-06-09' of https://gitlab.freedesktop.org/agd5f/linux into drm-fixes
amd-drm-fixes-5.13-2021-06-09:

amdgpu:
- Use kvzmalloc in amdgu_bo_create
- Use drm_dbg_kms for reporting failure to get a GEM FB
- Fix some register offsets for Sienna Cichlid
- Fix fall-through warning

radeon:
- memcpy_to/from_io fixes

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210610035631.3943-1-alexander.deucher@amd.com
2021-06-11 11:17:10 +10:00
Dave Airlie
750643a99e Merge tag 'drm-misc-fixes-2021-06-10' of git://anongit.freedesktop.org/drm/drm-misc into drm-fixes
One fix for snu4i that prevents it from probing, two locking fixes for
ttm and drm_auth, one off-by-x1000 fix for mcde and a fix for vc4 to
prevent an out-of-bounds access.

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Maxime Ripard <maxime@cerno.tech>
Link: https://patchwork.freedesktop.org/patch/msgid/20210610171653.lqsoadxrhdk73cdy@gilmour
2021-06-11 10:59:55 +10:00
Dave Airlie
43f44f5bd1 Merge tag 'drm-msm-fixes-2021-06-10' of https://gitlab.freedesktop.org/drm/msm into drm-fixes
- NULL ptr deref fix
- CP_PROTECT reg programming fix
- incorrect register shift fix
- DSI blank screen fix

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Rob Clark <robdclark@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/CAF6AEGvbcz0=QxGYnX9u7cD1SCvFSx20dzrZuOccjtRRBTJd5Q@mail.gmail.com
2021-06-11 10:46:01 +10:00
Vineet Gupta
110febc014 ARC: fix CONFIG_HARDENED_USERCOPY
Currently enabling this triggers a warning

| usercopy: Kernel memory overwrite attempt detected to kernel text (offset 155633, size 11)!
| usercopy: BUG: failure at mm/usercopy.c:99/usercopy_abort()!
|
|gcc generated __builtin_trap
|Path: /bin/busybox
|CPU: 0 PID: 84 Comm: init Not tainted 5.4.22
|
|[ECR ]: 0x00090005 => gcc generated __builtin_trap
|[EFA ]: 0x9024fcaa
|[BLINK ]: usercopy_abort+0x8a/0x8c
|[ERET ]: memfd_fcntl+0x0/0x470
|[STAT32]: 0x80080802 : IE K
|...
|...
|Stack Trace:
| memfd_fcntl+0x0/0x470
| usercopy_abort+0x8a/0x8c
| __check_object_size+0x10e/0x138
| copy_strings+0x1f4/0x38c
| __do_execve_file+0x352/0x848
| EV_Trap+0xcc/0xd0

The issue is triggered by an allocation in "init reclaimed" region.
ARC _stext emcompasses the init region (for historical reasons we wanted
the init.text to be under .text as well). This however trips up
__check_object_size()->check_kernel_text_object() which treats this as
object bleeding into kernel text.

Fix that by rezoning _stext to start from regular kernel .text and leave
out .init altogether.

Fixes: https://github.com/foss-for-synopsys-dwc-arc-processors/linux/issues/15
Reported-by: Evgeniy Didin <didin@synopsys.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2021-06-10 17:37:00 -07:00
Vineet Gupta
96f1b00138 ARCv2: save ABI registers across signal handling
ARCv2 has some configuration dependent registers (r30, r58, r59) which
could be targetted by the compiler. To keep the ABI stable, these were
unconditionally part of the glibc ABI
(sysdeps/unix/sysv/linux/arc/sys/ucontext.h:mcontext_t) however we
missed populating them (by saving/restoring them across signal
handling).

This patch fixes the issue by
 - adding arcv2 ABI regs to kernel struct sigcontext
 - populating them during signal handling

Change to struct sigcontext might seem like a glibc ABI change (although
it primarily uses ucontext_t:mcontext_t) but the fact is
 - it has only been extended (existing fields are not touched)
 - the old sigcontext was ABI incomplete to begin with anyways

Fixes: https://github.com/foss-for-synopsys-dwc-arc-processors/linux/issues/53
Cc: <stable@vger.kernel.org>
Tested-by: kernel test robot <lkp@intel.com>
Reported-by: Vladimir Isaev <isaev@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2021-06-10 17:21:38 -07:00
David S. Miller
232e3683b4 Merge branch 'mptcp-fixes'
Mat Martineau says:

====================
mptcp: More v5.13 fixes

Here's another batch of MPTCP fixes for v5.13.

Patch 1 cleans up memory accounting between the MPTCP-level socket and
the subflows to more reliably transfer forward allocated memory under
pressure.

Patch 2 wakes up socket readers more reliably.

Patch 3 changes a WARN_ONCE to a pr_debug.

Patch 4 changes the selftests to only use syncookies in test cases where
they do not cause spurious failures.

Patch 5 modifies socket error reporting to avoid a possible soft lockup.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-10 16:47:45 -07:00
Paolo Abeni
499ada5073 mptcp: fix soft lookup in subflow_error_report()
Maxim reported a soft lookup in subflow_error_report():

 watchdog: BUG: soft lockup - CPU#0 stuck for 22s! [swapper/0:0]
 RIP: 0010:native_queued_spin_lock_slowpath
 RSP: 0018:ffffa859c0003bc0 EFLAGS: 00000202
 RAX: 0000000000000101 RBX: 0000000000000001 RCX: 0000000000000000
 RDX: ffff9195c2772d88 RSI: 0000000000000000 RDI: ffff9195c2772d88
 RBP: ffff9195c2772d00 R08: 00000000000067b0 R09: c6e31da9eb1e44f4
 R10: ffff9195ef379700 R11: ffff9195edb50710 R12: ffff9195c2772d88
 R13: ffff9195f500e3d0 R14: ffff9195ef379700 R15: ffff9195ef379700
 FS:  0000000000000000(0000) GS:ffff91961f400000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: 000000c000407000 CR3: 0000000002988000 CR4: 00000000000006f0
 Call Trace:
  <IRQ>
 _raw_spin_lock_bh
 subflow_error_report
 mptcp_subflow_data_available
 __mptcp_move_skbs_from_subflow
 mptcp_data_ready
 tcp_data_queue
 tcp_rcv_established
 tcp_v4_do_rcv
 tcp_v4_rcv
 ip_protocol_deliver_rcu
 ip_local_deliver_finish
 __netif_receive_skb_one_core
 netif_receive_skb
 rtl8139_poll 8139too
 __napi_poll
 net_rx_action
 __do_softirq
 __irq_exit_rcu
 common_interrupt
  </IRQ>

The calling function - mptcp_subflow_data_available() - can be invoked
from different contexts:
- plain ssk socket lock
- ssk socket lock + mptcp_data_lock
- ssk socket lock + mptcp_data_lock + msk socket lock.

Since subflow_error_report() tries to acquire the mptcp_data_lock, the
latter two call chains will cause soft lookup.

This change addresses the issue moving the error reporting call to
outer functions, where the held locks list is known and the we can
acquire only the needed one.

Reported-by: Maxim Galaganov <max@internet.ru>
Fixes: 15cc104533 ("mptcp: deliver ssk errors to msk")
Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/199
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-10 16:47:45 -07:00
Paolo Abeni
2395da0e17 selftests: mptcp: enable syncookie only in absence of reorders
Syncookie validation may fail for OoO packets, causing spurious
resets and self-tests failures, so let's force syncookie only
for tests iteration with no OoO.

Fixes: fed61c4b58 ("selftests: mptcp: make 2nd net namespace use tcp syn cookies unconditionally")
Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/198
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-10 16:47:45 -07:00
Paolo Abeni
61e710227e mptcp: do not warn on bad input from the network
warn_bad_map() produces a kernel WARN on bad input coming
from the network. Use pr_debug() to avoid spamming the system
log.

Additionally, when the right bound check fails, warn_bad_map() reports
the wrong ssn value, let's fix it.

Fixes: 648ef4b886 ("mptcp: Implement MPTCP receive path")
Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/107
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-10 16:47:45 -07:00
Paolo Abeni
99d1055ce2 mptcp: wake-up readers only for in sequence data
Currently we rely on the subflow->data_avail field, which is subject to
races:

	ssk1
		skb len = 500 DSS(seq=1, len=1000, off=0)
		# data_avail == MPTCP_SUBFLOW_DATA_AVAIL

	ssk2
		skb len = 500 DSS(seq = 501, len=1000)
		# data_avail == MPTCP_SUBFLOW_DATA_AVAIL

	ssk1
		skb len = 500 DSS(seq = 1, len=1000, off =500)
		# still data_avail == MPTCP_SUBFLOW_DATA_AVAIL,
		# as the skb is covered by a pre-existing map,
		# which was in-sequence at reception time.

Instead we can explicitly check if some has been received in-sequence,
propagating the info from __mptcp_move_skbs_from_subflow().

Additionally add the 'ONCE' annotation to the 'data_avail' memory
access, as msk will read it outside the subflow socket lock.

Fixes: 648ef4b886 ("mptcp: Implement MPTCP receive path")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-10 16:47:44 -07:00
Paolo Abeni
72f961320d mptcp: try harder to borrow memory from subflow under pressure
If the host is under sever memory pressure, and RX forward
memory allocation for the msk fails, we try to borrow the
required memory from the ingress subflow.

The current attempt is a bit flaky: if skb->truesize is less
than SK_MEM_QUANTUM, the ssk will not release any memory, and
the next schedule will fail again.

Instead, directly move the required amount of pages from the
ssk to the msk, if available

Fixes: 9c3f94e168 ("mptcp: add missing memory scheduling in the rx path")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-10 16:47:44 -07:00
Jisheng Zhang
42e0e0b453 riscv: code patching only works on !XIP_KERNEL
Some features which need code patching such as KPROBES, DYNAMIC_FTRACE
KGDB can only work on !XIP_KERNEL. Add dependencies for these features
that rely on code patching.

Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-06-10 16:33:50 -07:00
Vitaly Wool
5e63215c2f riscv: xip: support runtime trap patching
RISCV_ERRATA_ALTERNATIVE patches text at runtime which is currently
not possible when the kernel is executed from the flash in XIP mode.
Since runtime patching concerns only traps at the moment, let's just
have all the traps reside in RAM anyway if RISCV_ERRATA_ALTERNATIVE
is set. Thus, these functions will be patch-able even when the .text
section is in flash.

Signed-off-by: Vitaly Wool <vitaly.wool@konsulko.com>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-06-10 16:16:06 -07:00
Pavel Begunkov
9690557e22 io_uring: add feature flag for rsrc tags
Add IORING_FEAT_RSRC_TAGS indicating that io_uring supports a bunch of
new IORING_REGISTER operations, in particular
IORING_REGISTER_[FILES[,UPDATE]2,BUFFERS[2,UPDATE]] that support rsrc
tagging, and also indicating implemented dynamic fixed buffer updates.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/9b995d4045b6c6b4ab7510ca124fd25ac2203af7.1623339162.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-06-10 16:33:51 -06:00
Pavel Begunkov
992da01aa9 io_uring: change registration/upd/rsrc tagging ABI
There are ABI moments about recently added rsrc registration/update and
tagging that might become a nuisance in the future. First,
IORING_REGISTER_RSRC[_UPD] hide different types of resources under it,
so breaks fine control over them by restrictions. It works for now, but
once those are wanted under restrictions it would require a rework.

It was also inconvenient trying to fit a new resource not supporting
all the features (e.g. dynamic update) into the interface, so better
to return to IORING_REGISTER_* top level dispatching.

Second, register/update were considered to accept a type of resource,
however that's not a good idea because there might be several ways of
registration of a single resource type, e.g. we may want to add
non-contig buffers or anything more exquisite as dma mapped memory.
So, remove IORING_RSRC_[FILE,BUFFER] out of the ABI, and place them
internally for now to limit changes.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/9b554897a7c17ad6e3becc48dfed2f7af9f423d5.1623339162.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-06-10 16:33:51 -06:00
David S. Miller
22488e4550 Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf
Pablo Neira Ayuso says:

====================
Netfilter fixes for net

The following patchset contains Netfilter fixes for net:

1) Fix a crash when stateful expression with its own gc callback
   is used in a set definition.

2) Skip IPv6 packets from any link-local address in IPv6 fib expression.
   Add a selftest for this scenario, from Florian Westphal.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-10 14:33:56 -07:00
David S. Miller
0280f429dc Merge branch 'tcp-options-oob-fixes'
Maxim Mikityanskiy says:

====================
Fix out of bounds when parsing TCP options

This series fixes out-of-bounds access in various places in the kernel
where parsing of TCP options takes place. Fortunately, many more
occurrences don't have this bug.

v2 changes:

synproxy: Added an early return when length < 0 to avoid calling
skb_header_pointer with negative length.

sch_cake: Added doff validation to avoid parsing garbage.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-10 14:26:18 -07:00
Maxim Mikityanskiy
ba91c49ded sch_cake: Fix out of bounds when parsing TCP options and header
The TCP option parser in cake qdisc (cake_get_tcpopt and
cake_tcph_may_drop) could read one byte out of bounds. When the length
is 1, the execution flow gets into the loop, reads one byte of the
opcode, and if the opcode is neither TCPOPT_EOL nor TCPOPT_NOP, it reads
one more byte, which exceeds the length of 1.

This fix is inspired by commit 9609dad263 ("ipv4: tcp_input: fix stack
out of bounds when parsing TCP options.").

v2 changes:

Added doff validation in cake_get_tcphdr to avoid parsing garbage as TCP
header. Although it wasn't strictly an out-of-bounds access (memory was
allocated), garbage values could be read where CAKE expected the TCP
header if doff was smaller than 5.

Cc: Young Xiao <92siuyang@gmail.com>
Fixes: 8b7138814f ("sch_cake: Add optional ACK filter")
Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Acked-by: Toke Høiland-Jørgensen <toke@toke.dk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-10 14:26:18 -07:00
Maxim Mikityanskiy
07718be265 mptcp: Fix out of bounds when parsing TCP options
The TCP option parser in mptcp (mptcp_get_options) could read one byte
out of bounds. When the length is 1, the execution flow gets into the
loop, reads one byte of the opcode, and if the opcode is neither
TCPOPT_EOL nor TCPOPT_NOP, it reads one more byte, which exceeds the
length of 1.

This fix is inspired by commit 9609dad263 ("ipv4: tcp_input: fix stack
out of bounds when parsing TCP options.").

Cc: Young Xiao <92siuyang@gmail.com>
Fixes: cec37a6e41 ("mptcp: Handle MP_CAPABLE options for outgoing connections")
Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-10 14:26:18 -07:00
Maxim Mikityanskiy
5fc177ab75 netfilter: synproxy: Fix out of bounds when parsing TCP options
The TCP option parser in synproxy (synproxy_parse_options) could read
one byte out of bounds. When the length is 1, the execution flow gets
into the loop, reads one byte of the opcode, and if the opcode is
neither TCPOPT_EOL nor TCPOPT_NOP, it reads one more byte, which exceeds
the length of 1.

This fix is inspired by commit 9609dad263 ("ipv4: tcp_input: fix stack
out of bounds when parsing TCP options.").

v2 changes:

Added an early return when length < 0 to avoid calling
skb_header_pointer with negative length.

Cc: Young Xiao <92siuyang@gmail.com>
Fixes: 48b1de4c11 ("netfilter: add SYNPROXY core/target")
Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Reviewed-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-10 14:26:18 -07:00
Eric Dumazet
d1b5bee4c8 net/packet: annotate data race in packet_sendmsg()
There is a known race in packet_sendmsg(), addressed
in commit 32d3182cd2 ("net/packet: fix race in tpacket_snd()")

Now we have data_race(), we can use it to avoid a future KCSAN warning,
as syzbot loves stressing af_packet sockets :)

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-10 14:12:54 -07:00
Eric Dumazet
b71eaed8c0 inet: annotate date races around sk->sk_txhash
UDP sendmsg() path can be lockless, it is possible for another
thread to re-connect an change sk->sk_txhash under us.

There is no serious impact, but we can use READ_ONCE()/WRITE_ONCE()
pair to document the race.

BUG: KCSAN: data-race in __ip4_datagram_connect / skb_set_owner_w

write to 0xffff88813397920c of 4 bytes by task 30997 on cpu 1:
 sk_set_txhash include/net/sock.h:1937 [inline]
 __ip4_datagram_connect+0x69e/0x710 net/ipv4/datagram.c:75
 __ip6_datagram_connect+0x551/0x840 net/ipv6/datagram.c:189
 ip6_datagram_connect+0x2a/0x40 net/ipv6/datagram.c:272
 inet_dgram_connect+0xfd/0x180 net/ipv4/af_inet.c:580
 __sys_connect_file net/socket.c:1837 [inline]
 __sys_connect+0x245/0x280 net/socket.c:1854
 __do_sys_connect net/socket.c:1864 [inline]
 __se_sys_connect net/socket.c:1861 [inline]
 __x64_sys_connect+0x3d/0x50 net/socket.c:1861
 do_syscall_64+0x4a/0x90 arch/x86/entry/common.c:47
 entry_SYSCALL_64_after_hwframe+0x44/0xae

read to 0xffff88813397920c of 4 bytes by task 31039 on cpu 0:
 skb_set_hash_from_sk include/net/sock.h:2211 [inline]
 skb_set_owner_w+0x118/0x220 net/core/sock.c:2101
 sock_alloc_send_pskb+0x452/0x4e0 net/core/sock.c:2359
 sock_alloc_send_skb+0x2d/0x40 net/core/sock.c:2373
 __ip6_append_data+0x1743/0x21a0 net/ipv6/ip6_output.c:1621
 ip6_make_skb+0x258/0x420 net/ipv6/ip6_output.c:1983
 udpv6_sendmsg+0x160a/0x16b0 net/ipv6/udp.c:1527
 inet6_sendmsg+0x5f/0x80 net/ipv6/af_inet6.c:642
 sock_sendmsg_nosec net/socket.c:654 [inline]
 sock_sendmsg net/socket.c:674 [inline]
 ____sys_sendmsg+0x360/0x4d0 net/socket.c:2350
 ___sys_sendmsg net/socket.c:2404 [inline]
 __sys_sendmmsg+0x315/0x4b0 net/socket.c:2490
 __do_sys_sendmmsg net/socket.c:2519 [inline]
 __se_sys_sendmmsg net/socket.c:2516 [inline]
 __x64_sys_sendmmsg+0x53/0x60 net/socket.c:2516
 do_syscall_64+0x4a/0x90 arch/x86/entry/common.c:47
 entry_SYSCALL_64_after_hwframe+0x44/0xae

value changed: 0xbca3c43d -> 0xfdb309e0

Reported by Kernel Concurrency Sanitizer on:
CPU: 0 PID: 31039 Comm: syz-executor.2 Not tainted 5.13.0-rc3-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-10 14:12:54 -07:00
Eric Dumazet
f13ef10059 net: annotate data race in sock_error()
sock_error() is known to be racy. The code avoids
an atomic operation is sk_err is zero, and this field
could be changed under us, this is fine.

Sysbot reported:

BUG: KCSAN: data-race in sock_alloc_send_pskb / unix_release_sock

write to 0xffff888131855630 of 4 bytes by task 9365 on cpu 1:
 unix_release_sock+0x2e9/0x6e0 net/unix/af_unix.c:550
 unix_release+0x2f/0x50 net/unix/af_unix.c:859
 __sock_release net/socket.c:599 [inline]
 sock_close+0x6c/0x150 net/socket.c:1258
 __fput+0x25b/0x4e0 fs/file_table.c:280
 ____fput+0x11/0x20 fs/file_table.c:313
 task_work_run+0xae/0x130 kernel/task_work.c:164
 tracehook_notify_resume include/linux/tracehook.h:189 [inline]
 exit_to_user_mode_loop kernel/entry/common.c:174 [inline]
 exit_to_user_mode_prepare+0x156/0x190 kernel/entry/common.c:208
 __syscall_exit_to_user_mode_work kernel/entry/common.c:290 [inline]
 syscall_exit_to_user_mode+0x20/0x40 kernel/entry/common.c:301
 do_syscall_64+0x56/0x90 arch/x86/entry/common.c:57
 entry_SYSCALL_64_after_hwframe+0x44/0xae

read to 0xffff888131855630 of 4 bytes by task 9385 on cpu 0:
 sock_error include/net/sock.h:2269 [inline]
 sock_alloc_send_pskb+0xe4/0x4e0 net/core/sock.c:2336
 unix_dgram_sendmsg+0x478/0x1610 net/unix/af_unix.c:1671
 unix_seqpacket_sendmsg+0xc2/0x100 net/unix/af_unix.c:2055
 sock_sendmsg_nosec net/socket.c:654 [inline]
 sock_sendmsg net/socket.c:674 [inline]
 ____sys_sendmsg+0x360/0x4d0 net/socket.c:2350
 __sys_sendmsg_sock+0x25/0x30 net/socket.c:2416
 io_sendmsg fs/io_uring.c:4367 [inline]
 io_issue_sqe+0x231a/0x6750 fs/io_uring.c:6135
 __io_queue_sqe+0xe9/0x360 fs/io_uring.c:6414
 __io_req_task_submit fs/io_uring.c:2039 [inline]
 io_async_task_func+0x312/0x590 fs/io_uring.c:5074
 __tctx_task_work fs/io_uring.c:1910 [inline]
 tctx_task_work+0x1d4/0x3d0 fs/io_uring.c:1924
 task_work_run+0xae/0x130 kernel/task_work.c:164
 tracehook_notify_signal include/linux/tracehook.h:212 [inline]
 handle_signal_work kernel/entry/common.c:145 [inline]
 exit_to_user_mode_loop kernel/entry/common.c:171 [inline]
 exit_to_user_mode_prepare+0xf8/0x190 kernel/entry/common.c:208
 __syscall_exit_to_user_mode_work kernel/entry/common.c:290 [inline]
 syscall_exit_to_user_mode+0x20/0x40 kernel/entry/common.c:301
 do_syscall_64+0x56/0x90 arch/x86/entry/common.c:57
 entry_SYSCALL_64_after_hwframe+0x44/0xae

value changed: 0x00000000 -> 0x00000068

Reported by Kernel Concurrency Sanitizer on:
CPU: 0 PID: 9385 Comm: syz-executor.3 Not tainted 5.13.0-rc4-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-10 14:12:54 -07:00
David S. Miller
172947ac67 Merge branch 'bridge-egress-fixes'
Nikolay Aleksandrov says:

====================
net: bridge: vlan tunnel egress path fixes

These two fixes take care of tunnel_dst problems in the vlan tunnel egress
path. Patch 01 fixes a null ptr deref due to the lockless use of tunnel_dst
pointer without checking it first, and patch 02 fixes a use-after-free
issue due to wrong dst refcounting (dst_clone() -> dst_hold_safe()).

Both fix the same commit and should be queued for stable backports:
Fixes: 11538d039a ("bridge: vlan dst_metadata hooks in ingress and egress paths")

v2: no changes, added stable list to CC
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-10 14:06:43 -07:00
Nikolay Aleksandrov
cfc579f9d8 net: bridge: fix vlan tunnel dst refcnt when egressing
The egress tunnel code uses dst_clone() and directly sets the result
which is wrong because the entry might have 0 refcnt or be already deleted,
causing number of problems. It also triggers the WARN_ON() in dst_hold()[1]
when a refcnt couldn't be taken. Fix it by using dst_hold_safe() and
checking if a reference was actually taken before setting the dst.

[1] dmesg WARN_ON log and following refcnt errors
 WARNING: CPU: 5 PID: 38 at include/net/dst.h:230 br_handle_egress_vlan_tunnel+0x10b/0x134 [bridge]
 Modules linked in: 8021q garp mrp bridge stp llc bonding ipv6 virtio_net
 CPU: 5 PID: 38 Comm: ksoftirqd/5 Kdump: loaded Tainted: G        W         5.13.0-rc3+ #360
 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-1.fc33 04/01/2014
 RIP: 0010:br_handle_egress_vlan_tunnel+0x10b/0x134 [bridge]
 Code: e8 85 bc 01 e1 45 84 f6 74 90 45 31 f6 85 db 48 c7 c7 a0 02 19 a0 41 0f 94 c6 31 c9 31 d2 44 89 f6 e8 64 bc 01 e1 85 db 75 02 <0f> 0b 31 c9 31 d2 44 89 f6 48 c7 c7 70 02 19 a0 e8 4b bc 01 e1 49
 RSP: 0018:ffff8881003d39e8 EFLAGS: 00010246
 RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
 RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffffffffa01902a0
 RBP: ffff8881040c6700 R08: 0000000000000000 R09: 0000000000000001
 R10: 2ce93d0054fe0d00 R11: 54fe0d00000e0000 R12: ffff888109515000
 R13: 0000000000000000 R14: 0000000000000001 R15: 0000000000000401
 FS:  0000000000000000(0000) GS:ffff88822bf40000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: 00007f42ba70f030 CR3: 0000000109926000 CR4: 00000000000006e0
 Call Trace:
  br_handle_vlan+0xbc/0xca [bridge]
  __br_forward+0x23/0x164 [bridge]
  deliver_clone+0x41/0x48 [bridge]
  br_handle_frame_finish+0x36f/0x3aa [bridge]
  ? skb_dst+0x2e/0x38 [bridge]
  ? br_handle_ingress_vlan_tunnel+0x3e/0x1c8 [bridge]
  ? br_handle_frame_finish+0x3aa/0x3aa [bridge]
  br_handle_frame+0x2c3/0x377 [bridge]
  ? __skb_pull+0x33/0x51
  ? vlan_do_receive+0x4f/0x36a
  ? br_handle_frame_finish+0x3aa/0x3aa [bridge]
  __netif_receive_skb_core+0x539/0x7c6
  ? __list_del_entry_valid+0x16e/0x1c2
  __netif_receive_skb_list_core+0x6d/0xd6
  netif_receive_skb_list_internal+0x1d9/0x1fa
  gro_normal_list+0x22/0x3e
  dev_gro_receive+0x55b/0x600
  ? detach_buf_split+0x58/0x140
  napi_gro_receive+0x94/0x12e
  virtnet_poll+0x15d/0x315 [virtio_net]
  __napi_poll+0x2c/0x1c9
  net_rx_action+0xe6/0x1fb
  __do_softirq+0x115/0x2d8
  run_ksoftirqd+0x18/0x20
  smpboot_thread_fn+0x183/0x19c
  ? smpboot_unregister_percpu_thread+0x66/0x66
  kthread+0x10a/0x10f
  ? kthread_mod_delayed_work+0xb6/0xb6
  ret_from_fork+0x22/0x30
 ---[ end trace 49f61b07f775fd2b ]---
 dst_release: dst:00000000c02d677a refcnt:-1
 dst_release underflow

Cc: stable@vger.kernel.org
Fixes: 11538d039a ("bridge: vlan dst_metadata hooks in ingress and egress paths")
Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-10 14:06:43 -07:00
Nikolay Aleksandrov
58e2071742 net: bridge: fix vlan tunnel dst null pointer dereference
This patch fixes a tunnel_dst null pointer dereference due to lockless
access in the tunnel egress path. When deleting a vlan tunnel the
tunnel_dst pointer is set to NULL without waiting a grace period (i.e.
while it's still usable) and packets egressing are dereferencing it
without checking. Use READ/WRITE_ONCE to annotate the lockless use of
tunnel_id, use RCU for accessing tunnel_dst and make sure it is read
only once and checked in the egress path. The dst is already properly RCU
protected so we don't need to do anything fancy than to make sure
tunnel_id and tunnel_dst are read only once and checked in the egress path.

Cc: stable@vger.kernel.org
Fixes: 11538d039a ("bridge: vlan dst_metadata hooks in ingress and egress paths")
Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-10 14:06:43 -07:00
Eric W. Biederman
06af867944 coredump: Limit what can interrupt coredumps
Olivier Langlois has been struggling with coredumps being incompletely written in
processes using io_uring.

Olivier Langlois <olivier@trillion01.com> writes:
> io_uring is a big user of task_work and any event that io_uring made a
> task waiting for that occurs during the core dump generation will
> generate a TIF_NOTIFY_SIGNAL.
>
> Here are the detailed steps of the problem:
> 1. io_uring calls vfs_poll() to install a task to a file wait queue
>    with io_async_wake() as the wakeup function cb from io_arm_poll_handler()
> 2. wakeup function ends up calling task_work_add() with TWA_SIGNAL
> 3. task_work_add() sets the TIF_NOTIFY_SIGNAL bit by calling
>    set_notify_signal()

The coredump code deliberately supports being interrupted by SIGKILL,
and depends upon prepare_signal to filter out all other signals.   Now
that signal_pending includes wake ups for TIF_NOTIFY_SIGNAL this hack
in dump_emitted by the coredump code no longer works.

Make the coredump code more robust by explicitly testing for all of
the wakeup conditions the coredump code supports.  This prevents
new wakeup conditions from breaking the coredump code, as well
as fixing the current issue.

The filesystem code that the coredump code uses already limits
itself to only aborting on fatal_signal_pending.  So it should
not develop surprising wake-up reasons either.

v2: Don't remove the now unnecessary code in prepare_signal.

Cc: stable@vger.kernel.org
Fixes: 12db8b6900 ("entry: Add support for TIF_NOTIFY_SIGNAL")
Reported-by: Olivier Langlois <olivier@trillion01.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-06-10 14:02:29 -07:00
Zheng Yongjun
9d44fa3e50 ping: Check return value of function 'ping_queue_rcv_skb'
Function 'ping_queue_rcv_skb' not always return success, which will
also return fail. If not check the wrong return value of it, lead to function
`ping_rcv` return success.

Signed-off-by: Zheng Yongjun <zhengyongjun3@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-10 13:44:55 -07:00
Willem de Bruijn
3bdd5ee0ec skbuff: fix incorrect msg_zerocopy copy notifications
msg_zerocopy signals if a send operation required copying with a flag
in serr->ee.ee_code.

This field can be incorrect as of the below commit, as a result of
both structs uarg and serr pointing into the same skb->cb[].

uarg->zerocopy must be read before skb->cb[] is reinitialized to hold
serr. Similar to other fields len, hi and lo, use a local variable to
temporarily hold the value.

This was not a problem before, when the value was passed as a function
argument.

Fixes: 75518851a2 ("skbuff: Push status and refcounts into sock_zerocopy_callback")
Reported-by: Talal Ahmad <talalahmad@google.com>
Signed-off-by: Willem de Bruijn <willemb@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-10 13:39:57 -07:00
David S. Miller
388fa7f13d Merge tag 'mlx5-fixes-2021-06-09' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux
Saeed Mahameed says:

====================
mlx5-fixes-2021-06-09
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-10 13:38:46 -07:00
Linus Torvalds
f09eacca59 Merge branch 'for-5.13-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup
Pull cgroup fix from Tejun Heo:
 "This is a high priority but low risk fix for a cgroup1 bug where
  rename(2) can change a cgroup's name to something which can break
  parsing of /proc/PID/cgroup"

* 'for-5.13-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
  cgroup1: don't allow '\n' in renaming
2021-06-10 12:01:22 -07:00
Bjorn Andersson
142d0b24c1 usb: typec: mux: Fix copy-paste mistake in typec_mux_match
Fix the copy-paste mistake in the return path of typec_mux_match(),
where dev is considered a member of struct typec_switch rather than
struct typec_mux.

The two structs are identical in regards to having the struct device as
the first entry, so this provides no functional change.

Fixes: 3370db3519 ("usb: typec: Registering real device entries for the muxes")
Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Link: https://lore.kernel.org/r/20210610002132.3088083-1-bjorn.andersson@linaro.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-06-10 20:03:06 +02:00
Mayank Rana
f247f0a82a usb: typec: ucsi: Clear PPM capability data in ucsi_init() error path
If ucsi_init() fails for some reason (e.g. ucsi_register_port()
fails or general communication failure to the PPM), particularly at
any point after the GET_CAPABILITY command had been issued, this
results in unwinding the initialization and returning an error.
However the ucsi structure's ucsi_capability member retains its
current value, including likely a non-zero num_connectors.
And because ucsi_init() itself is done in a workqueue a UCSI
interface driver will be unaware that it failed and may think the
ucsi_register() call was completely successful.  Later, if
ucsi_unregister() is called, due to this stale ucsi->cap value it
would try to access the items in the ucsi->connector array which
might not be in a proper state or not even allocated at all and
results in NULL or invalid pointer dereference.

Fix this by clearing the ucsi->cap value to 0 during the error
path of ucsi_init() in order to prevent a later ucsi_unregister()
from entering the connector cleanup loop.

Fixes: c1b0bc2dab ("usb: typec: Add support for UCSI interface")
Cc: stable@vger.kernel.org
Acked-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Signed-off-by: Mayank Rana <mrana@codeaurora.org>
Signed-off-by: Jack Pham <jackp@codeaurora.org>
Link: https://lore.kernel.org/r/20210609073535.5094-1-jackp@codeaurora.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-06-10 20:02:54 +02:00
Joel Stanley
e0e8b6abe8 usb: gadget: fsl: Re-enable driver for ARM SoCs
The commit a390bef7db ("usb: gadget: fsl_mxc_udc: Remove the driver")
dropped the ARCH_MXC dependency from USB_FSL_USB2, leaving it depending
solely on FSL_SOC.

FSL_SOC is powerpc only; it was briefly available on ARM in 2014 but was
removed by commit cfd074ad86 ("ARM: imx: temporarily remove
CONFIG_SOC_FSL from LS1021A"). Therefore the driver can no longer be
enabled on ARM platforms.

This appears to be a mistake as arm64's ARCH_LAYERSCAPE and arm32
SOC_LS1021A SoCs use this symbol. It's enabled in these defconfigs:

arch/arm/configs/imx_v6_v7_defconfig:CONFIG_USB_FSL_USB2=y
arch/arm/configs/multi_v7_defconfig:CONFIG_USB_FSL_USB2=y
arch/powerpc/configs/mgcoge_defconfig:CONFIG_USB_FSL_USB2=y
arch/powerpc/configs/mpc512x_defconfig:CONFIG_USB_FSL_USB2=y

To fix, expand the dependencies so USB_FSL_USB2 can be enabled on the
ARM platforms, and with COMPILE_TEST.

Fixes: a390bef7db ("usb: gadget: fsl_mxc_udc: Remove the driver")
Signed-off-by: Joel Stanley <joel@jms.id.au>
Link: https://lore.kernel.org/r/20210610034957.93376-1-joel@jms.id.au
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-06-10 20:02:16 +02:00
Andy Shevchenko
d5ab95da2a usb: typec: wcove: Use LE to CPU conversion when accessing msg->header
As LKP noticed the Sparse is not happy about strict type handling:
   .../typec/tcpm/wcove.c:380:50: sparse:     expected unsigned short [usertype] header
   .../typec/tcpm/wcove.c:380:50: sparse:     got restricted __le16 const [usertype] header

Fix this by switching to use pd_header_cnt_le() instead of pd_header_cnt()
in the affected code.

Fixes: ae8a2ca8a2 ("usb: typec: Group all TCPCI/TCPM code together")
Fixes: 3c4fb9f169 ("usb: typec: wcove: start using tcpm for USB PD support")
Reported-by: kernel test robot <lkp@intel.com>
Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://lore.kernel.org/r/20210609172202.83377-1-andriy.shevchenko@linux.intel.com
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-06-10 20:01:23 +02:00
Linus Torvalds
29a877d576 Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma
Pull rdma fixes from Jason Gunthorpe:
 "A mixture of small bug fixes and a small security issue:

   - WARN_ON when IPoIB is automatically moved between namespaces

   - Long standing bug where mlx5 would use the wrong page for the
     doorbell recovery memory if fork is used

   - Security fix for mlx4 that disables the timestamp feature

   - Several crashers for mlx5

   - Plug a recent mlx5 memory leak for the sig_mr"

* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma:
  IB/mlx5: Fix initializing CQ fragments buffer
  RDMA/mlx5: Delete right entry from MR signature database
  RDMA: Verify port when creating flow rule
  RDMA/mlx5: Block FDB rules when not in switchdev mode
  RDMA/mlx4: Do not map the core_clock page to user space unless enabled
  RDMA/mlx5: Use different doorbell memory for different processes
  RDMA/ipoib: Fix warning caused by destroying non-initial netns
2021-06-10 10:53:04 -07:00
Marc Zyngier
382e6e177b irqchip/gic-v3: Workaround inconsistent PMR setting on NMI entry
The arm64 entry code suffers from an annoying issue on taking
a NMI, as it sets PMR to a value that actually allows IRQs
to be acknowledged. This is done for consistency with other parts
of the code, and is in the process of being fixed. This shouldn't
be a problem, as we are not enabling interrupts whilst in NMI
context.

However, in the infortunate scenario that we took a spurious NMI
(retired before the read of IAR) *and* that there is an IRQ pending
at the same time, we'll ack the IRQ in NMI context. Too bad.

In order to avoid deadlocks while running something like perf,
teach the GICv3 driver about this situation: if we were in
a context where no interrupt should have fired, transiently
set PMR to a value that only allows NMIs before acking the pending
interrupt, and restore the original value after that.

This papers over the core issue for the time being, and makes
NMIs great again. Sort of.

Fixes: 4d6a38da8e ("arm64: entry: always set GIC_PRIO_PSR_I_SET during entry")
Co-developed-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Link: https://lore.kernel.org/lkml/20210610145731.1350460-1-maz@kernel.org
2021-06-10 17:54:34 +01:00
Robert Marko
e13d112724 hwmon: (tps23861) correct shunt LSB values
Current shunt LSB values got reversed during in the
original driver commit.

So, correct the current shunt LSB values according to
the datasheet.

This caused reading slightly skewed current values.

Fixes: fff7b8ab22 ("hwmon: add Texas Instruments TPS23861 driver")
Signed-off-by: Robert Marko <robert.marko@sartura.hr>
Link: https://lore.kernel.org/r/20210609220728.499879-3-robert.marko@sartura.hr
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2021-06-10 08:40:09 -07:00
Robert Marko
b325d3526e hwmon: (tps23861) set current shunt value
TPS23861 has a configuration bit for setting of the
current shunt value used on the board.
Its bit 0 of the General Mask 1 register.

According to the datasheet bit values are:
0 for 255 mOhm (Default)
1 for 250 mOhm

So, configure the bit before registering the hwmon
device according to the value passed in the DTS or
default one if none is passed.

This caused potentially reading slightly skewed values
due to max current value being 1.02A when 250mOhm shunt
is used instead of 1.0A when 255mOhm is used.

Fixes: fff7b8ab22 ("hwmon: add Texas Instruments TPS23861 driver")
Signed-off-by: Robert Marko <robert.marko@sartura.hr>
Link: https://lore.kernel.org/r/20210609220728.499879-2-robert.marko@sartura.hr
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2021-06-10 08:38:52 -07:00
Robert Marko
fb8543fb86 hwmon: (tps23861) define regmap max register
Define the max register address the device supports.
This allows reading the whole register space via
regmap debugfs, without it only register 0x0 is visible.

This was forgotten in the original driver commit.

Fixes: fff7b8ab22 ("hwmon: add Texas Instruments TPS23861 driver")
Signed-off-by: Robert Marko <robert.marko@sartura.hr>
Link: https://lore.kernel.org/r/20210609220728.499879-1-robert.marko@sartura.hr
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2021-06-10 08:36:19 -07:00
Takashi Iwai
83e197a841 ALSA: seq: Fix race of snd_seq_timer_open()
The timer instance per queue is exclusive, and snd_seq_timer_open()
should have managed the concurrent accesses.  It looks as if it's
checking the already existing timer instance at the beginning, but
it's not right, because there is no protection, hence any later
concurrent call of snd_seq_timer_open() may override the timer
instance easily.  This may result in UAF, as the leftover timer
instance can keep running while the queue itself gets closed, as
spotted by syzkaller recently.

For avoiding the race, add a proper check at the assignment of
tmr->timeri again, and return -EBUSY if it's been already registered.

Reported-by: syzbot+ddc1260a83ed1cbf6fb5@syzkaller.appspotmail.com
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/000000000000dce34f05c42f110c@google.com
Link: https://lore.kernel.org/r/20210610152059.24633-1-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2021-06-10 17:21:30 +02:00
Johan Hovold
63a8eef70c USB: serial: cp210x: fix CP2102N-A01 modem control
CP2102N revision A01 (firmware version <= 1.0.4) has a buggy
flow-control implementation that uses the ulXonLimit instead of
ulFlowReplace field of the flow-control settings structure (erratum
CP2102N_E104).

A recent change that set the input software flow-control limits
incidentally broke RTS control for these devices when CRTSCTS is not set
as the new limits would always enable hardware flow control.

Fix this by explicitly disabling flow control for the buggy firmware
versions and only updating the input software flow-control limits when
IXOFF is requested. This makes sure that the terminal settings matches
the default zero ulXonLimit (ulFlowReplace) for these devices.

Link: https://lore.kernel.org/r/20210609161509.9459-1-johan@kernel.org
Reported-by: David Frey <dpfrey@gmail.com>
Reported-by: Alex Villacís Lasso <a_villacis@palosanto.com>
Tested-by: Alex Villacís Lasso <a_villacis@palosanto.com>
Fixes: f61309d9c9 ("USB: serial: cp210x: set IXOFF thresholds")
Cc: stable@vger.kernel.org      # 5.12
Signed-off-by: Johan Hovold <johan@kernel.org>
2021-06-10 16:59:00 +02:00
Stephen Boyd
170b763597 drm/msm/dsi: Stash away calculated vco frequency on recalc
A problem was reported on CoachZ devices where the display wouldn't come
up, or it would be distorted. It turns out that the PLL code here wasn't
getting called once dsi_pll_10nm_vco_recalc_rate() started returning the
same exact frequency, down to the Hz, that the bootloader was setting
instead of 0 when the clk was registered with the clk framework.

After commit 001d8dc338 ("drm/msm/dsi: remove temp data from global
pll structure") we use a hardcoded value for the parent clk frequency,
i.e.  VCO_REF_CLK_RATE, and we also hardcode the value for FRAC_BITS,
instead of getting it from the config structure. This combination of
changes to the recalc function allows us to properly calculate the
frequency of the PLL regardless of whether or not the PLL has been
clk_prepare()d or clk_set_rate()d. That's a good improvement.

Unfortunately, this means that now we won't call down into the PLL clk
driver when we call clk_set_rate() because the frequency calculated in
the framework matches the frequency that is set in hardware. If the rate
is the same as what we want it should be OK to not call the set_rate PLL
op. The real problem is that the prepare op in this driver uses a
private struct member to stash away the vco frequency so that it can
call the set_rate op directly during prepare. Once the set_rate op is
never called because recalc_rate told us the rate is the same, we don't
set this private struct member before the prepare op runs, so we try to
call the set_rate function directly with a frequency of 0. This
effectively kills the PLL and configures it for a rate that won't work.
Calling set_rate from prepare is really quite bad and will confuse any
downstream clks about what the rate actually is of their parent. Fixing
that will be a rather large change though so we leave that to later.

For now, let's stash away the rate we calculate during recalc so that
the prepare op knows what frequency to set, instead of 0. This way
things keep working and the display can enable the PLL properly. In the
future, we should remove that code from the prepare op so that it
doesn't even try to call the set rate function.

Cc: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Cc: Abhinav Kumar <abhinavk@codeaurora.org>
Fixes: 001d8dc338 ("drm/msm/dsi: remove temp data from global pll structure")
Signed-off-by: Stephen Boyd <swboyd@chromium.org>
Link: https://lore.kernel.org/r/20210608195519.125561-1-swboyd@chromium.org
Signed-off-by: Rob Clark <robdclark@chromium.org>
2021-06-10 07:57:48 -07:00
Alexander Kuznetsov
b7e24eb1ca cgroup1: don't allow '\n' in renaming
cgroup_mkdir() have restriction on newline usage in names:
$ mkdir $'/sys/fs/cgroup/cpu/test\ntest2'
mkdir: cannot create directory
'/sys/fs/cgroup/cpu/test\ntest2': Invalid argument

But in cgroup1_rename() such check is missed.
This allows us to make /proc/<pid>/cgroup unparsable:
$ mkdir /sys/fs/cgroup/cpu/test
$ mv /sys/fs/cgroup/cpu/test $'/sys/fs/cgroup/cpu/test\ntest2'
$ echo $$ > $'/sys/fs/cgroup/cpu/test\ntest2'
$ cat /proc/self/cgroup
11:pids:/
10:freezer:/
9:hugetlb:/
8:cpuset:/
7:blkio:/user.slice
6:memory:/user.slice
5:net_cls,net_prio:/
4:perf_event:/
3:devices:/user.slice
2:cpu,cpuacct:/test
test2
1:name=systemd:/
0::/

Signed-off-by: Alexander Kuznetsov <wwfq@yandex-team.ru>
Reported-by: Andrey Krasichkov <buglloc@yandex-team.ru>
Acked-by: Dmitry Yakunin <zeil@yandex-team.ru>
Cc: stable@vger.kernel.org
Signed-off-by: Tejun Heo <tj@kernel.org>
2021-06-10 09:58:50 -04:00
Sean Christopherson
78fcb2c91a KVM: x86: Immediately reset the MMU context when the SMM flag is cleared
Immediately reset the MMU context when the vCPU's SMM flag is cleared so
that the SMM flag in the MMU role is always synchronized with the vCPU's
flag.  If RSM fails (which isn't correctly emulated), KVM will bail
without calling post_leave_smm() and leave the MMU in a bad state.

The bad MMU role can lead to a NULL pointer dereference when grabbing a
shadow page's rmap for a page fault as the initial lookups for the gfn
will happen with the vCPU's SMM flag (=0), whereas the rmap lookup will
use the shadow page's SMM flag, which comes from the MMU (=1).  SMM has
an entirely different set of memslots, and so the initial lookup can find
a memslot (SMM=0) and then explode on the rmap memslot lookup (SMM=1).

  general protection fault, probably for non-canonical address 0xdffffc0000000000: 0000 [#1] PREEMPT SMP KASAN
  KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007]
  CPU: 1 PID: 8410 Comm: syz-executor382 Not tainted 5.13.0-rc5-syzkaller #0
  Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
  RIP: 0010:__gfn_to_rmap arch/x86/kvm/mmu/mmu.c:935 [inline]
  RIP: 0010:gfn_to_rmap+0x2b0/0x4d0 arch/x86/kvm/mmu/mmu.c:947
  Code: <42> 80 3c 20 00 74 08 4c 89 ff e8 f1 79 a9 00 4c 89 fb 4d 8b 37 44
  RSP: 0018:ffffc90000ffef98 EFLAGS: 00010246
  RAX: 0000000000000000 RBX: ffff888015b9f414 RCX: ffff888019669c40
  RDX: 0000000000000000 RSI: 0000000000000001 RDI: 0000000000000001
  RBP: 0000000000000001 R08: ffffffff811d9cdb R09: ffffed10065a6002
  R10: ffffed10065a6002 R11: 0000000000000000 R12: dffffc0000000000
  R13: 0000000000000003 R14: 0000000000000001 R15: 0000000000000000
  FS:  000000000124b300(0000) GS:ffff8880b9b00000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 0000000000000000 CR3: 0000000028e31000 CR4: 00000000001526e0
  DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
  DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
  Call Trace:
   rmap_add arch/x86/kvm/mmu/mmu.c:965 [inline]
   mmu_set_spte+0x862/0xe60 arch/x86/kvm/mmu/mmu.c:2604
   __direct_map arch/x86/kvm/mmu/mmu.c:2862 [inline]
   direct_page_fault+0x1f74/0x2b70 arch/x86/kvm/mmu/mmu.c:3769
   kvm_mmu_do_page_fault arch/x86/kvm/mmu.h:124 [inline]
   kvm_mmu_page_fault+0x199/0x1440 arch/x86/kvm/mmu/mmu.c:5065
   vmx_handle_exit+0x26/0x160 arch/x86/kvm/vmx/vmx.c:6122
   vcpu_enter_guest+0x3bdd/0x9630 arch/x86/kvm/x86.c:9428
   vcpu_run+0x416/0xc20 arch/x86/kvm/x86.c:9494
   kvm_arch_vcpu_ioctl_run+0x4e8/0xa40 arch/x86/kvm/x86.c:9722
   kvm_vcpu_ioctl+0x70f/0xbb0 arch/x86/kvm/../../../virt/kvm/kvm_main.c:3460
   vfs_ioctl fs/ioctl.c:51 [inline]
   __do_sys_ioctl fs/ioctl.c:1069 [inline]
   __se_sys_ioctl+0xfb/0x170 fs/ioctl.c:1055
   do_syscall_64+0x3f/0xb0 arch/x86/entry/common.c:47
   entry_SYSCALL_64_after_hwframe+0x44/0xae
  RIP: 0033:0x440ce9

Cc: stable@vger.kernel.org
Reported-by: syzbot+fb0b6a7e8713aeb0319c@syzkaller.appspotmail.com
Fixes: 9ec19493fb ("KVM: x86: clear SMM flags before loading state while leaving SMM")
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20210609185619.992058-2-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-06-10 09:21:12 -04:00
Alaa Hleihel
2ba0aa2fee IB/mlx5: Fix initializing CQ fragments buffer
The function init_cq_frag_buf() can be called to initialize the current CQ
fragments buffer cq->buf, or the temporary cq->resize_buf that is filled
during CQ resize operation.

However, the offending commit started to use function get_cqe() for
getting the CQEs, the issue with this change is that get_cqe() always
returns CQEs from cq->buf, which leads us to initialize the wrong buffer,
and in case of enlarging the CQ we try to access elements beyond the size
of the current cq->buf and eventually hit a kernel panic.

 [exception RIP: init_cq_frag_buf+103]
  [ffff9f799ddcbcd8] mlx5_ib_resize_cq at ffffffffc0835d60 [mlx5_ib]
  [ffff9f799ddcbdb0] ib_resize_cq at ffffffffc05270df [ib_core]
  [ffff9f799ddcbdc0] llt_rdma_setup_qp at ffffffffc0a6a712 [llt]
  [ffff9f799ddcbe10] llt_rdma_cc_event_action at ffffffffc0a6b411 [llt]
  [ffff9f799ddcbe98] llt_rdma_client_conn_thread at ffffffffc0a6bb75 [llt]
  [ffff9f799ddcbec8] kthread at ffffffffa66c5da1
  [ffff9f799ddcbf50] ret_from_fork_nospec_begin at ffffffffa6d95ddd

Fix it by getting the needed CQE by calling mlx5_frag_buf_get_wqe() that
takes the correct source buffer as a parameter.

Fixes: 388ca8be00 ("IB/mlx5: Implement fragmented completion queue (CQ)")
Link: https://lore.kernel.org/r/90a0e8c924093cfa50a482880ad7e7edb73dc19a.1623309971.git.leonro@nvidia.com
Signed-off-by: Alaa Hleihel <alaa@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-06-10 08:59:34 -03:00
Aharon Landau
6466f03fdf RDMA/mlx5: Delete right entry from MR signature database
The value mr->sig is stored in the entry upon mr allocation, however, ibmr
is wrongly entered here as "old", therefore, xa_cmpxchg() does not replace
the entry with NULL, which leads to the following trace:

 WARNING: CPU: 28 PID: 2078 at drivers/infiniband/hw/mlx5/main.c:3643 mlx5_ib_stage_init_cleanup+0x4d/0x60 [mlx5_ib]
 Modules linked in: nvme_rdma nvme_fabrics nvme_core 8021q garp mrp bonding bridge stp llc rfkill rpcrdma sunrpc rdma_ucm ib_srpt ib_isert iscsi_tad
 CPU: 28 PID: 2078 Comm: reboot Tainted: G               X --------- ---  5.13.0-0.rc2.19.el9.x86_64 #1
 Hardware name: Dell Inc. PowerEdge R430/03XKDV, BIOS 2.9.1 12/07/2018
 RIP: 0010:mlx5_ib_stage_init_cleanup+0x4d/0x60 [mlx5_ib]
 Code: 8d bb 70 1f 00 00 be 00 01 00 00 e8 9d 94 ce da 48 3d 00 01 00 00 75 02 5b c3 0f 0b 5b c3 0f 0b 48 83 bb b0 20 00 00 00 74 d5 <0f> 0b eb d1 4
 RSP: 0018:ffffa8db06d33c90 EFLAGS: 00010282
 RAX: 0000000000000000 RBX: ffff97f890a44000 RCX: ffff97f900ec0160
 RDX: 0000000000000000 RSI: 0000000080080001 RDI: ffff97f890a44000
 RBP: ffffffffc0c189b8 R08: 0000000000000001 R09: 0000000000000000
 R10: 0000000000000001 R11: 0000000000000300 R12: ffff97f890a44000
 R13: ffffffffc0c36030 R14: 00000000fee1dead R15: 0000000000000000
 FS:  00007f0d5a8a3b40(0000) GS:ffff98077fb80000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: 0000555acbf4f450 CR3: 00000002a6f56002 CR4: 00000000001706e0
 Call Trace:
  mlx5r_remove+0x39/0x60 [mlx5_ib]
  auxiliary_bus_remove+0x1b/0x30
  __device_release_driver+0x17a/0x230
  device_release_driver+0x24/0x30
  bus_remove_device+0xdb/0x140
  device_del+0x18b/0x3e0
  mlx5_detach_device+0x59/0x90 [mlx5_core]
  mlx5_unload_one+0x22/0x60 [mlx5_core]
  shutdown+0x31/0x3a [mlx5_core]
  pci_device_shutdown+0x34/0x60
  device_shutdown+0x15b/0x1c0
  __do_sys_reboot.cold+0x2f/0x5b
  ? vfs_writev+0xc7/0x140
  ? handle_mm_fault+0xc5/0x290
  ? do_writev+0x6b/0x110
  do_syscall_64+0x40/0x80
  entry_SYSCALL_64_after_hwframe+0x44/0xae

Fixes: e6fb246cca ("RDMA/mlx5: Consolidate MR destruction to mlx5_ib_dereg_mr()")
Link: https://lore.kernel.org/r/f3f585ea0db59c2a78f94f65eedeafc5a2374993.1623309971.git.leonro@nvidia.com
Signed-off-by: Aharon Landau <aharonl@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-06-10 08:59:34 -03:00
Maor Gottlieb
2adcb4c5a5 RDMA: Verify port when creating flow rule
Validate port value provided by the user and with that remove no longer
needed validation by the driver.  The missing check in the mlx5_ib driver
could cause to the below oops.

Call trace:
  _create_flow_rule+0x2d4/0xf28 [mlx5_ib]
  mlx5_ib_create_flow+0x2d0/0x5b0 [mlx5_ib]
  ib_uverbs_ex_create_flow+0x4cc/0x624 [ib_uverbs]
  ib_uverbs_handler_UVERBS_METHOD_INVOKE_WRITE+0xd4/0x150 [ib_uverbs]
  ib_uverbs_cmd_verbs.isra.7+0xb28/0xc50 [ib_uverbs]
  ib_uverbs_ioctl+0x158/0x1d0 [ib_uverbs]
  do_vfs_ioctl+0xd0/0xaf0
  ksys_ioctl+0x84/0xb4
  __arm64_sys_ioctl+0x28/0xc4
  el0_svc_common.constprop.3+0xa4/0x254
  el0_svc_handler+0x84/0xa0
  el0_svc+0x10/0x26c
 Code: b9401260 f9615681 51000400 8b001c20 (f9403c1a)

Fixes: 436f2ad05a ("IB/core: Export ib_create/destroy_flow through uverbs")
Link: https://lore.kernel.org/r/faad30dc5219a01727f47db3dc2f029d07c82c00.1623309971.git.leonro@nvidia.com
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Maor Gottlieb <maorg@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-06-10 08:59:33 -03:00
Gustavo A. R. Silva
551912d286 KVM: x86: Fix fall-through warnings for Clang
In preparation to enable -Wimplicit-fallthrough for Clang, fix a couple
of warnings by explicitly adding break statements instead of just letting
the code fall through to the next case.

Link: https://github.com/KSPP/linux/issues/115
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Message-Id: <20210528200756.GA39320@embeddedor>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-06-10 07:48:45 -04:00
ChenXiaoSong
02ffbe6351 KVM: SVM: fix doc warnings
Fix kernel-doc warnings:

arch/x86/kvm/svm/avic.c:233: warning: Function parameter or member 'activate' not described in 'avic_update_access_page'
arch/x86/kvm/svm/avic.c:233: warning: Function parameter or member 'kvm' not described in 'avic_update_access_page'
arch/x86/kvm/svm/avic.c:781: warning: Function parameter or member 'e' not described in 'get_pi_vcpu_info'
arch/x86/kvm/svm/avic.c:781: warning: Function parameter or member 'kvm' not described in 'get_pi_vcpu_info'
arch/x86/kvm/svm/avic.c:781: warning: Function parameter or member 'svm' not described in 'get_pi_vcpu_info'
arch/x86/kvm/svm/avic.c:781: warning: Function parameter or member 'vcpu_info' not described in 'get_pi_vcpu_info'
arch/x86/kvm/svm/avic.c:1009: warning: This comment starts with '/**', but isn't a kernel-doc comment. Refer Documentation/doc-guide/kernel-doc.rst

Signed-off-by: ChenXiaoSong <chenxiaosong2@huawei.com>
Message-Id: <20210609122217.2967131-1-chenxiaosong2@huawei.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-06-10 07:46:54 -04:00
Yanan Wang
95bf69b400 KVM: selftests: Fix compiling errors when initializing the static structure
Errors like below were produced from test_util.c when compiling the KVM
selftests on my local platform.

lib/test_util.c: In function 'vm_mem_backing_src_alias':
lib/test_util.c:177:12: error: initializer element is not constant
    .flag = anon_flags,
            ^~~~~~~~~~
lib/test_util.c:177:12: note: (near initialization for 'aliases[0].flag')

The reason is that we are using non-const expressions to initialize the
static structure, which will probably trigger a compiling error/warning
on stricter GCC versions. Fix it by converting the two const variables
"anon_flags" and "anon_huge_flags" into more stable macros.

Fixes: b3784bc28c ("KVM: selftests: refactor vm_mem_backing_src_type flags")
Reported-by: Zenghui Yu <yuzenghui@huawei.com>
Signed-off-by: Yanan Wang <wangyanan55@huawei.com>
Message-Id: <20210610085418.35544-1-wangyanan55@huawei.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-06-10 07:46:10 -04:00
Desmond Cheong Zhi Xi
c336a5ee98 drm: Lock pointer access in drm_master_release()
This patch eliminates the following smatch warning:
drivers/gpu/drm/drm_auth.c:320 drm_master_release() warn: unlocked access 'master' (line 318) expected lock '&dev->master_mutex'

The 'file_priv->master' field should be protected by the mutex lock to
'&dev->master_mutex'. This is because other processes can concurrently
modify this field and free the current 'file_priv->master'
pointer. This could result in a use-after-free error when 'master' is
dereferenced in subsequent function calls to
'drm_legacy_lock_master_cleanup()' or to 'drm_lease_revoke()'.

An example of a scenario that would produce this error can be seen
from a similar bug in 'drm_getunique()' that was reported by Syzbot:
https://syzkaller.appspot.com/bug?id=148d2f1dfac64af52ffd27b661981a540724f803

In the Syzbot report, another process concurrently acquired the
device's master mutex in 'drm_setmaster_ioctl()', then overwrote
'fpriv->master' in 'drm_new_set_master()'. The old value of
'fpriv->master' was subsequently freed before the mutex was unlocked.

Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Desmond Cheong Zhi Xi <desmondcheongzx@gmail.com>
Cc: stable@vger.kernel.org
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20210609092119.173590-1-desmondcheongzx@gmail.com
2021-06-10 12:22:02 +02:00
Peter Zijlstra
584fd3b318 objtool: Fix .symtab_shndx handling for elf_create_undef_symbol()
When an ELF object uses extended symbol section indexes (IOW it has a
.symtab_shndx section), these must be kept in sync with the regular
symbol table (.symtab).

So for every new symbol we emit, make sure to also emit a
.symtab_shndx value to keep the arrays of equal size.

Note: since we're writing an UNDEF symbol, most GElf_Sym fields will
be 0 and we can repurpose one (st_size) to host the 0 for the xshndx
value.

Fixes: 2f2f7e47f0 ("objtool: Add elf_create_undef_symbol()")
Reported-by: Nick Desaulniers <ndesaulniers@google.com>
Suggested-by: Fangrui Song <maskray@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Nick Desaulniers <ndesaulniers@google.com>
Link: https://lkml.kernel.org/r/YL3q1qFO9QIRL/BA@hirez.programming.kicks-ass.net
2021-06-10 10:08:24 +02:00
CodyYao-oc
a8383dfb21 x86/nmi_watchdog: Fix old-style NMI watchdog regression on old Intel CPUs
The following commit:

   3a4ac121c2 ("x86/perf: Add hardware performance events support for Zhaoxin CPU.")

Got the old-style NMI watchdog logic wrong and broke it for basically every
Intel CPU where it was active. Which is only truly old CPUs, so few people noticed.

On CPUs with perf events support we turn off the old-style NMI watchdog, so it
was pretty pointless to add the logic for X86_VENDOR_ZHAOXIN to begin with ... :-/

Anyway, the fix is to restore the old logic and add a 'break'.

[ mingo: Wrote a new changelog. ]

Fixes: 3a4ac121c2 ("x86/perf: Add hardware performance events support for Zhaoxin CPU.")
Signed-off-by: CodyYao-oc <CodyYao-oc@zhaoxin.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210607025335.9643-1-CodyYao-oc@zhaoxin.com
2021-06-10 10:04:40 +02:00
Peter Zijlstra
156172a13f irq_work: Make irq_work_queue() NMI-safe again
Someone carelessly put NMI unsafe code in irq_work_queue(), breaking
just about every single user. Also, someone has a terrible comment
style.

Fixes: e2b5bcf9f5 ("irq_work: record irq_work_queue() call stack")
Reported-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/YL+uBq8LzXXZsYVf@hirez.programming.kicks-ass.net
2021-06-10 10:00:08 +02:00
Stefan Agner
6f7ec77cc8 USB: serial: cp210x: fix alternate function for CP2102N QFN20
The QFN20 part has a different GPIO/port function assignment. The
configuration struct bit field ordered as TX/RX/RS485/WAKEUP/CLK
which exactly matches GPIO0-3 for QFN24/28. However, QFN20 has a
different GPIO to primary function assignment.

Special case QFN20 to follow to properly detect which GPIOs are
available.

Signed-off-by: Stefan Agner <stefan@agner.ch>
Link: https://lore.kernel.org/r/51830b2b24118eb0f77c5c9ac64ffb2f519dbb1d.1622218300.git.stefan@agner.ch
Fixes: c8acfe0aad ("USB: serial: cp210x: implement GPIO support for CP2102N")
Cc: stable@vger.kernel.org	# 4.19
Signed-off-by: Johan Hovold <johan@kernel.org>
2021-06-10 09:55:36 +02:00
Thomas Gleixner
efa1655049 x86/fpu: Reset state for all signal restore failures
If access_ok() or fpregs_soft_set() fails in __fpu__restore_sig() then the
function just returns but does not clear the FPU state as it does for all
other fatal failures.

Clear the FPU state for these failures as well.

Fixes: 72a671ced6 ("x86, fpu: Unify signal handling code paths for x86 and x86_64 kernels")
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/87mtryyhhz.ffs@nanos.tec.linutronix.de
2021-06-10 08:04:24 +02:00
Aya Levin
54e1217b90 net/mlx5e: Block offload of outer header csum for GRE tunnel
The device is able to offload either the outer header csum or inner
header csum. The driver utilizes the inner csum offload. So, prohibit
setting of tx-gre-csum-segmentation and let it be: off[fixed].

Fixes: 2729984149 ("net/mlx5e: Support TSO and TX checksum offloads for GRE tunnels")
Signed-off-by: Aya Levin <ayal@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-06-09 17:20:06 -07:00
Aya Levin
6d6727dddc net/mlx5e: Block offload of outer header csum for UDP tunnels
The device is able to offload either the outer header csum or inner
header csum. The driver utilizes the inner csum offload. Hence, block
setting of tx-udp_tnl-csum-segmentation and set it to off[fixed].

Fixes: b49663c8fb ("net/mlx5e: Add support for UDP tunnel segmentation with outer checksum offload")
Signed-off-by: Aya Levin <ayal@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-06-09 17:20:06 -07:00
Shay Drory
7a545077cb Revert "net/mlx5: Arm only EQs with EQEs"
In the scenario described below, an EQ can remain in FIRED state which
can result in missing an interrupt generation.

The scenario:

device                       mlx5_core driver
------                       ----------------
EQ1.eqe generated
EQ1.MSI-X sent
EQ1.state = FIRED
EQ2.eqe generated
                             mlx5_irq()
                               polls - eq1_eqes()
                               arm eq1
                               polls - eq2_eqes()
                               arm eq2
EQ2.MSI-X sent
EQ2.state = FIRED
                              mlx5_irq()
                              polls - eq2_eqes() -- no eqes found
                              driver skips EQ arming;

->EQ2 remains fired, misses generating interrupt.

Hence, always arm the EQ by reverting the cited commit in fixes tag.

Fixes: d894892dda ("net/mlx5: Arm only EQs with EQEs")
Signed-off-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Parav Pandit <parav@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-06-09 17:20:05 -07:00
Aya Levin
a6ee6f5f10 net/mlx5e: Fix select queue to consider SKBTX_HW_TSTAMP
Steering packets to PTP-SQ should be done only if the SKB has
SKBTX_HW_TSTAMP set in the tx_flags. While here, take the function into
a header and inline it.
Set the whole condition to select the PTP-SQ to unlikely.

Fixes: 24c22dd091 ("net/mlx5e: Add states to PTP channel")
Signed-off-by: Aya Levin <ayal@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-06-09 17:20:05 -07:00
Aya Levin
9ae8c18c5e net/mlx5e: Don't update netdev RQs with PTP-RQ
Since the driver opens the PTP-RQ under channel 0, it appears to the
stack as if the SKB was received on rxq0. So from thew stack POV there
are still the same number of RX queues.

Fixes: 960fbfe222 ("net/mlx5e: Allow coexistence of CQE compression and HW TS PTP")
Signed-off-by: Aya Levin <ayal@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-06-09 17:20:05 -07:00
Chris Mi
11f5ac3e05 net/mlx5e: Verify dev is present in get devlink port ndo
When changing eswitch mode, the netdev is detached from the
hardware resources. So verify dev is present in get devlink
port ndo. Otherwise, we will hit the following panic:

[241535.973539] RIP: 0010:__devlink_port_phys_port_name_get+0x13/0x1b0
[241535.976471] RSP: 0018:ffff9eaf0ae1b7c8 EFLAGS: 00010292
[241535.977471] RAX: 000000000002d370 RBX: 000000000002d370 RCX: 0000000000000000
[241535.978479] RDX: 0000000000000010 RSI: ffff9eaf0ae1b858 RDI: 000000000002d370
[241535.979482] RBP: ffff9eaf0ae1b7e0 R08: 000000000000002a R09: ffff8888d54d13da
[241535.980486] R10: 0000000000000000 R11: 0000000000000000 R12: ffff8888e6700000
[241535.981491] R13: ffff9eaf0ae1b858 R14: 0000000000000010 R15: 0000000000000000
[241535.982489] FS:  00007fd374ef3740(0000) GS:ffff88909ea00000(0000) knlGS:0000000000000000
[241535.983494] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[241535.984487] CR2: 000000000002d444 CR3: 000000089fd26006 CR4: 00000000003706e0
[241535.985502] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[241535.986499] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[241535.987477] Call Trace:
[241535.988426]  ? nla_put_64bit+0x71/0xa0
[241535.989368]  devlink_compat_phys_port_name_get+0x50/0xa0
[241535.990312]  dev_get_phys_port_name+0x4b/0x60
[241535.991252]  rtnl_fill_ifinfo+0x57b/0xcb0
[241535.992192]  rtnl_dump_ifinfo+0x58f/0x6d0
[241535.993123]  ? ksize+0x14/0x20
[241535.994033]  ? __alloc_skb+0xe8/0x250
[241535.994935]  netlink_dump+0x17c/0x300
[241535.995821]  netlink_recvmsg+0x1de/0x2c0
[241535.996677]  sock_recvmsg+0x70/0x80
[241535.997518]  ____sys_recvmsg+0x9b/0x1b0
[241535.998360]  ? iovec_from_user+0x82/0x120
[241535.999202]  ? __import_iovec+0x2c/0x130
[241536.000031]  ___sys_recvmsg+0x94/0x130
[241536.000850]  ? __handle_mm_fault+0x56d/0x6e0
[241536.001668]  __sys_recvmsg+0x5f/0xb0
[241536.002464]  ? syscall_enter_from_user_mode+0x2b/0x80
[241536.003242]  __x64_sys_recvmsg+0x1f/0x30
[241536.004008]  do_syscall_64+0x38/0x50
[241536.004767]  entry_SYSCALL_64_after_hwframe+0x44/0xae
[241536.005532] RIP: 0033:0x7fd375014f47

Fixes: 2ff349c5ed ("net/mlx5e: Verify dev is present in some ndos")
Signed-off-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Chris Mi <cmi@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-06-09 17:20:04 -07:00
Maor Gottlieb
4aaf96ac8b net/mlx5: DR, Don't use SW steering when RoCE is not supported
SW steering uses RC QP to write/read to/from ICM, hence it's not
supported when RoCE is not supported as well.

Fixes: 70605ea545 ("net/mlx5: DR, Expose APIs for direct rule managing")
Signed-off-by: Maor Gottlieb <maorg@nvidia.com>
Reviewed-by: Alex Vesker <valex@nvidia.com>
Reviewed-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-06-09 17:20:04 -07:00
Maor Gottlieb
c189716b2a net/mlx5: Consider RoCE cap before init RDMA resources
Check if RoCE is supported by the device before enable it in
the vport context and create all the RDMA steering objects.

Fixes: 80f09dfc23 ("net/mlx5: Eswitch, enable RoCE loopback traffic")
Signed-off-by: Maor Gottlieb <maorg@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-06-09 17:20:04 -07:00
Dima Chumak
a3e5fd9314 net/mlx5e: Fix page reclaim for dead peer hairpin
When adding a hairpin flow, a firmware-side send queue is created for
the peer net device, which claims some host memory pages for its
internal ring buffer. If the peer net device is removed/unbound before
the hairpin flow is deleted, then the send queue is not destroyed which
leads to a stack trace on pci device remove:

[ 748.005230] mlx5_core 0000:08:00.2: wait_func:1094:(pid 12985): MANAGE_PAGES(0x108) timeout. Will cause a leak of a command resource
[ 748.005231] mlx5_core 0000:08:00.2: reclaim_pages:514:(pid 12985): failed reclaiming pages: err -110
[ 748.001835] mlx5_core 0000:08:00.2: mlx5_reclaim_root_pages:653:(pid 12985): failed reclaiming pages (-110) for func id 0x0
[ 748.002171] ------------[ cut here ]------------
[ 748.001177] FW pages counter is 4 after reclaiming all pages
[ 748.001186] WARNING: CPU: 1 PID: 12985 at drivers/net/ethernet/mellanox/mlx5/core/pagealloc.c:685 mlx5_reclaim_startup_pages+0x34b/0x460 [mlx5_core]                      [  +0.002771] Modules linked in: cls_flower mlx5_ib mlx5_core ptp pps_core act_mirred sch_ingress openvswitch nsh xt_conntrack xt_MASQUERADE nf_conntrack_netlink nfnetlink xt_addrtype iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 br_netfilter rpcrdma rdma_ucm ib_iser libiscsi scsi_transport_iscsi rdma_cm ib_umad ib_ipoib iw_cm ib_cm ib_uverbs ib_core overlay fuse [last unloaded: pps_core]
[ 748.007225] CPU: 1 PID: 12985 Comm: tee Not tainted 5.12.0+ #1
[ 748.001376] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
[ 748.002315] RIP: 0010:mlx5_reclaim_startup_pages+0x34b/0x460 [mlx5_core]
[ 748.001679] Code: 28 00 00 00 0f 85 22 01 00 00 48 81 c4 b0 00 00 00 31 c0 5b 5d 41 5c 41 5d 41 5e 41 5f c3 48 c7 c7 40 cc 19 a1 e8 9f 71 0e e2 <0f> 0b e9 30 ff ff ff 48 c7 c7 a0 cc 19 a1 e8 8c 71 0e e2 0f 0b e9
[ 748.003781] RSP: 0018:ffff88815220faf8 EFLAGS: 00010286
[ 748.001149] RAX: 0000000000000000 RBX: ffff8881b4900280 RCX: 0000000000000000
[ 748.001445] RDX: 0000000000000027 RSI: 0000000000000004 RDI: ffffed102a441f51
[ 748.001614] RBP: 00000000000032b9 R08: 0000000000000001 R09: ffffed1054a15ee8
[ 748.001446] R10: ffff8882a50af73b R11: ffffed1054a15ee7 R12: fffffbfff07c1e30
[ 748.001447] R13: dffffc0000000000 R14: ffff8881b492cba8 R15: 0000000000000000
[ 748.001429] FS:  00007f58bd08b580(0000) GS:ffff8882a5080000(0000) knlGS:0000000000000000
[ 748.001695] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 748.001309] CR2: 000055a026351740 CR3: 00000001d3b48006 CR4: 0000000000370ea0
[ 748.001506] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 748.001483] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 748.001654] Call Trace:
[ 748.000576]  ? mlx5_satisfy_startup_pages+0x290/0x290 [mlx5_core]
[ 748.001416]  ? mlx5_cmd_teardown_hca+0xa2/0xd0 [mlx5_core]
[ 748.001354]  ? mlx5_cmd_init_hca+0x280/0x280 [mlx5_core]
[ 748.001203]  mlx5_function_teardown+0x30/0x60 [mlx5_core]
[ 748.001275]  mlx5_uninit_one+0xa7/0xc0 [mlx5_core]
[ 748.001200]  remove_one+0x5f/0xc0 [mlx5_core]
[ 748.001075]  pci_device_remove+0x9f/0x1d0
[ 748.000833]  device_release_driver_internal+0x1e0/0x490
[ 748.001207]  unbind_store+0x19f/0x200
[ 748.000942]  ? sysfs_file_ops+0x170/0x170
[ 748.001000]  kernfs_fop_write_iter+0x2bc/0x450
[ 748.000970]  new_sync_write+0x373/0x610
[ 748.001124]  ? new_sync_read+0x600/0x600
[ 748.001057]  ? lock_acquire+0x4d6/0x700
[ 748.000908]  ? lockdep_hardirqs_on_prepare+0x400/0x400
[ 748.001126]  ? fd_install+0x1c9/0x4d0
[ 748.000951]  vfs_write+0x4d0/0x800
[ 748.000804]  ksys_write+0xf9/0x1d0
[ 748.000868]  ? __x64_sys_read+0xb0/0xb0
[ 748.000811]  ? filp_open+0x50/0x50
[ 748.000919]  ? syscall_enter_from_user_mode+0x1d/0x50
[ 748.001223]  do_syscall_64+0x3f/0x80
[ 748.000892]  entry_SYSCALL_64_after_hwframe+0x44/0xae
[ 748.001026] RIP: 0033:0x7f58bcfb22f7
[ 748.000944] Code: 0d 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b7 0f 1f 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 51 c3 48 83 ec 28 48 89 54 24 18 48 89 74 24
[ 748.003925] RSP: 002b:00007fffd7f2aaa8 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
[ 748.001732] RAX: ffffffffffffffda RBX: 000000000000000d RCX: 00007f58bcfb22f7
[ 748.001426] RDX: 000000000000000d RSI: 00007fffd7f2abc0 RDI: 0000000000000003
[ 748.001746] RBP: 00007fffd7f2abc0 R08: 0000000000000000 R09: 0000000000000001
[ 748.001631] R10: 00000000000001b6 R11: 0000000000000246 R12: 000000000000000d
[ 748.001537] R13: 00005597ac2c24a0 R14: 000000000000000d R15: 00007f58bd084700
[ 748.001564] irq event stamp: 0
[ 748.000787] hardirqs last  enabled at (0): [<0000000000000000>] 0x0
[ 748.001399] hardirqs last disabled at (0): [<ffffffff813132cf>] copy_process+0x146f/0x5eb0
[ 748.001854] softirqs last  enabled at (0): [<ffffffff8131330e>] copy_process+0x14ae/0x5eb0
[ 748.013431] softirqs last disabled at (0): [<0000000000000000>] 0x0
[ 748.001492] ---[ end trace a6fabd773d1c51ae ]---

Fix by destroying the send queue of a hairpin peer net device that is
being removed/unbound, which returns the allocated ring buffer pages to
the host.

Fixes: 4d8fcf216c ("net/mlx5e: Avoid unbounded peer devices when unpairing TC hairpin rules")
Signed-off-by: Dima Chumak <dchumak@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-06-09 17:20:03 -07:00
Huy Nguyen
8ad893e516 net/mlx5e: Remove dependency in IPsec initialization flows
Currently, IPsec feature is disabled because mlx5e_build_nic_netdev
is required to be called after mlx5e_ipsec_init. This requirement is
invalid as mlx5e_build_nic_netdev and mlx5e_ipsec_init initialize
independent resources.

Remove ipsec pointer check in mlx5e_build_nic_netdev so that the
two functions can be called at any order.

Fixes: 547eede070 ("net/mlx5e: IPSec, Innova IPSec offload infrastructure")
Signed-off-by: Huy Nguyen <huyn@nvidia.com>
Reviewed-by: Raed Salem <raeds@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-06-09 17:20:03 -07:00
Vlad Buslov
fb1a3132ee net/mlx5e: Fix use-after-free of encap entry in neigh update handler
Function mlx5e_rep_neigh_update() wasn't updated to accommodate rtnl lock
removal from TC filter update path and properly handle concurrent encap
entry insertion/deletion which can lead to following use-after-free:

 [23827.464923] ==================================================================
 [23827.469446] BUG: KASAN: use-after-free in mlx5e_encap_take+0x72/0x140 [mlx5_core]
 [23827.470971] Read of size 4 at addr ffff8881d132228c by task kworker/u20:6/21635
 [23827.472251]
 [23827.472615] CPU: 9 PID: 21635 Comm: kworker/u20:6 Not tainted 5.13.0-rc3+ #5
 [23827.473788] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
 [23827.475639] Workqueue: mlx5e mlx5e_rep_neigh_update [mlx5_core]
 [23827.476731] Call Trace:
 [23827.477260]  dump_stack+0xbb/0x107
 [23827.477906]  print_address_description.constprop.0+0x18/0x140
 [23827.478896]  ? mlx5e_encap_take+0x72/0x140 [mlx5_core]
 [23827.479879]  ? mlx5e_encap_take+0x72/0x140 [mlx5_core]
 [23827.480905]  kasan_report.cold+0x7c/0xd8
 [23827.481701]  ? mlx5e_encap_take+0x72/0x140 [mlx5_core]
 [23827.482744]  kasan_check_range+0x145/0x1a0
 [23827.493112]  mlx5e_encap_take+0x72/0x140 [mlx5_core]
 [23827.494054]  ? mlx5e_tc_tun_encap_info_equal_generic+0x140/0x140 [mlx5_core]
 [23827.495296]  mlx5e_rep_neigh_update+0x41e/0x5e0 [mlx5_core]
 [23827.496338]  ? mlx5e_rep_neigh_entry_release+0xb80/0xb80 [mlx5_core]
 [23827.497486]  ? read_word_at_a_time+0xe/0x20
 [23827.498250]  ? strscpy+0xa0/0x2a0
 [23827.498889]  process_one_work+0x8ac/0x14e0
 [23827.499638]  ? lockdep_hardirqs_on_prepare+0x400/0x400
 [23827.500537]  ? pwq_dec_nr_in_flight+0x2c0/0x2c0
 [23827.501359]  ? rwlock_bug.part.0+0x90/0x90
 [23827.502116]  worker_thread+0x53b/0x1220
 [23827.502831]  ? process_one_work+0x14e0/0x14e0
 [23827.503627]  kthread+0x328/0x3f0
 [23827.504254]  ? _raw_spin_unlock_irq+0x24/0x40
 [23827.505065]  ? __kthread_bind_mask+0x90/0x90
 [23827.505912]  ret_from_fork+0x1f/0x30
 [23827.506621]
 [23827.506987] Allocated by task 28248:
 [23827.507694]  kasan_save_stack+0x1b/0x40
 [23827.508476]  __kasan_kmalloc+0x7c/0x90
 [23827.509197]  mlx5e_attach_encap+0xde1/0x1d40 [mlx5_core]
 [23827.510194]  mlx5e_tc_add_fdb_flow+0x397/0xc40 [mlx5_core]
 [23827.511218]  __mlx5e_add_fdb_flow+0x519/0xb30 [mlx5_core]
 [23827.512234]  mlx5e_configure_flower+0x191c/0x4870 [mlx5_core]
 [23827.513298]  tc_setup_cb_add+0x1d5/0x420
 [23827.514023]  fl_hw_replace_filter+0x382/0x6a0 [cls_flower]
 [23827.514975]  fl_change+0x2ceb/0x4a51 [cls_flower]
 [23827.515821]  tc_new_tfilter+0x89a/0x2070
 [23827.516548]  rtnetlink_rcv_msg+0x644/0x8c0
 [23827.517300]  netlink_rcv_skb+0x11d/0x340
 [23827.518021]  netlink_unicast+0x42b/0x700
 [23827.518742]  netlink_sendmsg+0x743/0xc20
 [23827.519467]  sock_sendmsg+0xb2/0xe0
 [23827.520131]  ____sys_sendmsg+0x590/0x770
 [23827.520851]  ___sys_sendmsg+0xd8/0x160
 [23827.521552]  __sys_sendmsg+0xb7/0x140
 [23827.522238]  do_syscall_64+0x3a/0x70
 [23827.522907]  entry_SYSCALL_64_after_hwframe+0x44/0xae
 [23827.523797]
 [23827.524163] Freed by task 25948:
 [23827.524780]  kasan_save_stack+0x1b/0x40
 [23827.525488]  kasan_set_track+0x1c/0x30
 [23827.526187]  kasan_set_free_info+0x20/0x30
 [23827.526968]  __kasan_slab_free+0xed/0x130
 [23827.527709]  slab_free_freelist_hook+0xcf/0x1d0
 [23827.528528]  kmem_cache_free_bulk+0x33a/0x6e0
 [23827.529317]  kfree_rcu_work+0x55f/0xb70
 [23827.530024]  process_one_work+0x8ac/0x14e0
 [23827.530770]  worker_thread+0x53b/0x1220
 [23827.531480]  kthread+0x328/0x3f0
 [23827.532114]  ret_from_fork+0x1f/0x30
 [23827.532785]
 [23827.533147] Last potentially related work creation:
 [23827.534007]  kasan_save_stack+0x1b/0x40
 [23827.534710]  kasan_record_aux_stack+0xab/0xc0
 [23827.535492]  kvfree_call_rcu+0x31/0x7b0
 [23827.536206]  mlx5e_tc_del_fdb_flow+0x577/0xef0 [mlx5_core]
 [23827.537305]  mlx5e_flow_put+0x49/0x80 [mlx5_core]
 [23827.538290]  mlx5e_delete_flower+0x6d1/0xe60 [mlx5_core]
 [23827.539300]  tc_setup_cb_destroy+0x18e/0x2f0
 [23827.540144]  fl_hw_destroy_filter+0x1d2/0x310 [cls_flower]
 [23827.541148]  __fl_delete+0x4dc/0x660 [cls_flower]
 [23827.541985]  fl_delete+0x97/0x160 [cls_flower]
 [23827.542782]  tc_del_tfilter+0x7ab/0x13d0
 [23827.543503]  rtnetlink_rcv_msg+0x644/0x8c0
 [23827.544257]  netlink_rcv_skb+0x11d/0x340
 [23827.544981]  netlink_unicast+0x42b/0x700
 [23827.545700]  netlink_sendmsg+0x743/0xc20
 [23827.546424]  sock_sendmsg+0xb2/0xe0
 [23827.547084]  ____sys_sendmsg+0x590/0x770
 [23827.547850]  ___sys_sendmsg+0xd8/0x160
 [23827.548606]  __sys_sendmsg+0xb7/0x140
 [23827.549303]  do_syscall_64+0x3a/0x70
 [23827.549969]  entry_SYSCALL_64_after_hwframe+0x44/0xae
 [23827.550853]
 [23827.551217] The buggy address belongs to the object at ffff8881d1322200
 [23827.551217]  which belongs to the cache kmalloc-256 of size 256
 [23827.553341] The buggy address is located 140 bytes inside of
 [23827.553341]  256-byte region [ffff8881d1322200, ffff8881d1322300)
 [23827.555747] The buggy address belongs to the page:
 [23827.556847] page:00000000898762aa refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x1d1320
 [23827.558651] head:00000000898762aa order:2 compound_mapcount:0 compound_pincount:0
 [23827.559961] flags: 0x2ffff800010200(slab|head|node=0|zone=2|lastcpupid=0x1ffff)
 [23827.561243] raw: 002ffff800010200 dead000000000100 dead000000000122 ffff888100042b40
 [23827.562653] raw: 0000000000000000 0000000000200020 00000001ffffffff 0000000000000000
 [23827.564112] page dumped because: kasan: bad access detected
 [23827.565439]
 [23827.565932] Memory state around the buggy address:
 [23827.566917]  ffff8881d1322180: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
 [23827.568485]  ffff8881d1322200: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
 [23827.569818] >ffff8881d1322280: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
 [23827.571143]                       ^
 [23827.571879]  ffff8881d1322300: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
 [23827.573283]  ffff8881d1322380: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
 [23827.574654] ==================================================================

Most of the necessary logic is already correctly implemented by
mlx5e_get_next_valid_encap() helper that is used in neigh stats update
handler. Make the handler generic by renaming it to
mlx5e_get_next_matching_encap() and use callback to test whether flow is
matching instead of hardcoded check for 'valid' flag value. Implement
mlx5e_get_next_valid_encap() by calling mlx5e_get_next_matching_encap()
with callback that tests encap MLX5_ENCAP_ENTRY_VALID flag. Implement new
mlx5e_get_next_init_encap() helper by calling
mlx5e_get_next_matching_encap() with callback that tests encap completion
result to be non-error and use it in mlx5e_rep_neigh_update() to safely
iterate over nhe->encap_list.

Remove encap completion logic from mlx5e_rep_update_flows() since the encap
entries passed to this function are already guaranteed to be properly
initialized by similar code in mlx5e_get_next_init_encap().

Fixes: 2a1f1768fa ("net/mlx5e: Refactor neigh update for concurrent execution")
Signed-off-by: Vlad Buslov <vladbu@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-06-09 17:20:03 -07:00
Yang Li
2bf8d2ae34 net/mlx5e: Fix an error code in mlx5e_arfs_create_tables()
When the code execute 'if (!priv->fs.arfs->wq)', the value of err is 0.
So, we use -ENOMEM to indicate that the function
create_singlethread_workqueue() return NULL.

Clean up smatch warning:
drivers/net/ethernet/mellanox/mlx5/core/en_arfs.c:373
mlx5e_arfs_create_tables() warn: missing error code 'err'.

Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Fixes: f6755b80d6 ("net/mlx5e: Dynamic alloc arfs table for netdev when needed")
Signed-off-by: Yang Li <yang.lee@linux.alibaba.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-06-09 17:20:02 -07:00
David S. Miller
6cde05ab93 Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue
Tony Nguyen says:

====================
Intel Wired LAN Driver Updates 2021-06-09

This series contains updates to ice driver only.

Maciej informs the user when XDP is not supported due to the driver
being in the 'safe mode' state. He also adds a parameter to Tx queue
configuration to resolve an issue in configuring XDP queues as it cannot
rely on using the number Tx or Rx queues.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-09 15:45:16 -07:00
Marcelo Ricardo Leitner
13c62f5371 net/sched: act_ct: handle DNAT tuple collision
This this the counterpart of 8aa7b526dc ("openvswitch: handle DNAT
tuple collision") for act_ct. From that commit changelog:

"""
With multiple DNAT rules it's possible that after destination
translation the resulting tuples collide.

...

Netfilter handles this case by allocating a null binding for SNAT at
egress by default.  Perform the same operation in openvswitch for DNAT
if no explicit SNAT is requested by the user and allocate a null binding
for SNAT for packets in the "original" direction.
"""

Fixes: 95219afbb9 ("act_ct: support asymmetric conntrack")
Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-09 15:34:51 -07:00
Linus Torvalds
cd1245d75c Merge tag 'platform-drivers-x86-v5.13-3' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86
Pull x86 platform driver fixes from Hans de Goede:
 "Assorted pdx86 bug-fixes and some hardware-id additions for 5.13.

  The mlxreg-hotplug revert is a regression-fix"

* tag 'platform-drivers-x86-v5.13-3' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86:
  platform/mellanox: mlxreg-hotplug: Revert "move to use request_irq by IRQF_NO_AUTOEN flag"
  platform/surface: dtx: Add missing mutex_destroy() call in failure path
  platform/surface: aggregator: Fix event disable function
  platform/x86: thinkpad_acpi: Add X1 Carbon Gen 9 second fan support
  platform/surface: aggregator_registry: Add support for 13" Intel Surface Laptop 4
  platform/surface: aggregator_registry: Update comments for 15" AMD Surface Laptop 4
2021-06-09 15:23:32 -07:00
Ido Schimmel
d2e381c496 rtnetlink: Fix regression in bridge VLAN configuration
Cited commit started returning errors when notification info is not
filled by the bridge driver, resulting in the following regression:

 # ip link add name br1 type bridge vlan_filtering 1
 # bridge vlan add dev br1 vid 555 self pvid untagged
 RTNETLINK answers: Invalid argument

As long as the bridge driver does not fill notification info for the
bridge device itself, an empty notification should not be considered as
an error. This is explained in commit 59ccaaaa49 ("bridge: dont send
notification when skb->len == 0 in rtnl_bridge_notify").

Fix by removing the error and add a comment to avoid future bugs.

Fixes: a8db57c1d2 ("rtnetlink: Fix missing error code in rtnl_bridge_notify()")
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Nikolay Aleksandrov <nikolay@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-09 14:58:26 -07:00
Linus Torvalds
a4c30b8691 Merge tag 'compiler-attributes-for-linus-v5.13-rc6' of git://github.com/ojeda/linux
Pull compiler attribute update from Miguel Ojeda:
 "A trivial update to the compiler attributes: Add 'continue' keyword to
  documentation in comment (from Wei Ming Chen)"

* tag 'compiler-attributes-for-linus-v5.13-rc6' of git://github.com/ojeda/linux:
  Compiler Attributes: Add continue in comment
2021-06-09 14:48:29 -07:00
Linus Torvalds
a25b088c4f Merge tag 'clang-format-for-linus-v5.13-rc6' of git://github.com/ojeda/linux
Pull clang-format update from Miguel Ojeda:
 "The usual update for `clang-format`"

* tag 'clang-format-for-linus-v5.13-rc6' of git://github.com/ojeda/linux:
  clang-format: Update with the latest for_each macro list
2021-06-09 14:47:16 -07:00
David S. Miller
93124d4a90 Merge tag 'mac80211-for-net-2021-06-09' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211
Johannes berg says:

====================
A fair number of fixes:
 * fix more fallout from RTNL locking changes
 * fixes for some of the bugs found by syzbot
 * drop multicast fragments in mac80211 to align
   with the spec and what drivers are doing now
 * fix NULL-ptr deref in radiotap injection
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-09 14:46:21 -07:00
Jim Mattson
218bf772bd kvm: LAPIC: Restore guard to prevent illegal APIC register access
Per the SDM, "any access that touches bytes 4 through 15 of an APIC
register may cause undefined behavior and must not be executed."
Worse, such an access in kvm_lapic_reg_read can result in a leak of
kernel stack contents. Prior to commit 01402cf810 ("kvm: LAPIC:
write down valid APIC registers"), such an access was explicitly
disallowed. Restore the guard that was removed in that commit.

Fixes: 01402cf810 ("kvm: LAPIC: write down valid APIC registers")
Signed-off-by: Jim Mattson <jmattson@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Message-Id: <20210602205224.3189316-1-jmattson@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-06-09 17:25:37 -04:00
Paolo Abeni
a8b897c7bc udp: fix race between close() and udp_abort()
Kaustubh reported and diagnosed a panic in udp_lib_lookup().
The root cause is udp_abort() racing with close(). Both
racing functions acquire the socket lock, but udp{v6}_destroy_sock()
release it before performing destructive actions.

We can't easily extend the socket lock scope to avoid the race,
instead use the SOCK_DEAD flag to prevent udp_abort from doing
any action when the critical race happens.

Diagnosed-and-tested-by: Kaustubh Pandey <kapandey@codeaurora.org>
Fixes: 5d77dca828 ("net: diag: support SOCK_DESTROY for UDP sockets")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-09 14:08:41 -07:00
Eric Dumazet
dcd01eeac1 inet: annotate data race in inet_send_prepare() and inet_dgram_connect()
Both functions are known to be racy when reading inet_num
as we do not want to grab locks for the common case the socket
has been bound already. The race is resolved in inet_autobind()
by reading again inet_num under the socket lock.

syzbot reported:
BUG: KCSAN: data-race in inet_send_prepare / udp_lib_get_port

write to 0xffff88812cba150e of 2 bytes by task 24135 on cpu 0:
 udp_lib_get_port+0x4b2/0xe20 net/ipv4/udp.c:308
 udp_v6_get_port+0x5e/0x70 net/ipv6/udp.c:89
 inet_autobind net/ipv4/af_inet.c:183 [inline]
 inet_send_prepare+0xd0/0x210 net/ipv4/af_inet.c:807
 inet6_sendmsg+0x29/0x80 net/ipv6/af_inet6.c:639
 sock_sendmsg_nosec net/socket.c:654 [inline]
 sock_sendmsg net/socket.c:674 [inline]
 ____sys_sendmsg+0x360/0x4d0 net/socket.c:2350
 ___sys_sendmsg net/socket.c:2404 [inline]
 __sys_sendmmsg+0x315/0x4b0 net/socket.c:2490
 __do_sys_sendmmsg net/socket.c:2519 [inline]
 __se_sys_sendmmsg net/socket.c:2516 [inline]
 __x64_sys_sendmmsg+0x53/0x60 net/socket.c:2516
 do_syscall_64+0x4a/0x90 arch/x86/entry/common.c:47
 entry_SYSCALL_64_after_hwframe+0x44/0xae

read to 0xffff88812cba150e of 2 bytes by task 24132 on cpu 1:
 inet_send_prepare+0x21/0x210 net/ipv4/af_inet.c:806
 inet6_sendmsg+0x29/0x80 net/ipv6/af_inet6.c:639
 sock_sendmsg_nosec net/socket.c:654 [inline]
 sock_sendmsg net/socket.c:674 [inline]
 ____sys_sendmsg+0x360/0x4d0 net/socket.c:2350
 ___sys_sendmsg net/socket.c:2404 [inline]
 __sys_sendmmsg+0x315/0x4b0 net/socket.c:2490
 __do_sys_sendmmsg net/socket.c:2519 [inline]
 __se_sys_sendmmsg net/socket.c:2516 [inline]
 __x64_sys_sendmmsg+0x53/0x60 net/socket.c:2516
 do_syscall_64+0x4a/0x90 arch/x86/entry/common.c:47
 entry_SYSCALL_64_after_hwframe+0x44/0xae

value changed: 0x0000 -> 0x9db4

Reported by Kernel Concurrency Sanitizer on:
CPU: 1 PID: 24132 Comm: syz-executor.2 Not tainted 5.13.0-rc4-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-09 13:59:53 -07:00
Austin Kim
80ec82e3d2 net: ethtool: clear heap allocations for ethtool function
Several ethtool functions leave heap uncleared (potentially) by
drivers. This will leave the unused portion of heap unchanged and
might copy the full contents back to userspace.

Signed-off-by: Austin Kim <austindh.kim@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-09 13:53:31 -07:00
Linus Torvalds
cc6cf827dd Merge tag 'for-5.13-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux
Pull btrfs fixes from David Sterba:
 "A few more fixes that people hit during testing.

  Zoned mode fix:

   - fix 32bit value wrapping when calculating superblock offsets

  Error handling fixes:

   - properly check filesystema and device uuids

   - properly return errors when marking extents as written

   - do not write supers if we have an fs error"

* tag 'for-5.13-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
  btrfs: promote debugging asserts to full-fledged checks in validate_super
  btrfs: return value from btrfs_mark_extent_written() in case of error
  btrfs: zoned: fix zone number to sector/physical calculation
  btrfs: do not write supers if we have an fs error
2021-06-09 13:34:48 -07:00
Maciej Fijalkowski
2e84f6b377 ice: parameterize functions responsible for Tx ring management
Commit ae15e0ba1b ("ice: Change number of XDP Tx queues to match
number of Rx queues") tried to address the incorrect setting of XDP
queue count that was based on the Tx queue count, whereas in theory we
should provide the XDP queue per Rx queue. However, the routines that
setup and destroy the set of Tx resources are still based on the
vsi->num_txq.

Ice supports the asynchronous Tx/Rx queue count, so for a setup where
vsi->num_txq > vsi->num_rxq, ice_vsi_stop_tx_rings and ice_vsi_cfg_txqs
will be accessing the vsi->xdp_rings out of the bounds.

Parameterize two mentioned functions so they get the size of Tx resources
array as the input.

Fixes: ae15e0ba1b ("ice: Change number of XDP Tx queues to match number of Rx queues")
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Tested-by: Kiran Bhandare <kiranx.bhandare@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-06-09 13:15:10 -07:00
Maciej Fijalkowski
ebc5399ea1 ice: add ndo_bpf callback for safe mode netdev ops
ice driver requires a programmable pipeline firmware package in order to
have a support for advanced features. Otherwise, driver falls back to so
called 'safe mode'. For that mode, ndo_bpf callback is not exposed and
when user tries to load XDP program, the following happens:

$ sudo ./xdp1 enp179s0f1
libbpf: Kernel error message: Underlying driver does not support XDP in native mode
link set xdp fd failed

which is sort of confusing, as there is a native XDP support, but not in
the current mode. Improve the user experience by providing the specific
ndo_bpf callback dedicated for safe mode which will make use of extack
to explicitly let the user know that the DDP package is missing and
that's the reason that the XDP can't be loaded onto interface currently.

Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Fixes: efc2214b60 ("ice: Add support for XDP")
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Tested-by: Kiran Bhandare <kiranx.bhandare@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-06-09 13:15:10 -07:00
Linus Torvalds
2f673816b2 Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull kvm fixes from Paolo Bonzini:
 "Bugfixes, including a TLB flush fix that affects processors without
  nested page tables"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  kvm: fix previous commit for 32-bit builds
  kvm: avoid speculation-based attacks from out-of-range memslot accesses
  KVM: x86: Unload MMU on guest TLB flush if TDP disabled to force MMU sync
  KVM: x86: Ensure liveliness of nested VM-Enter fail tracepoint message
  selftests: kvm: Add support for customized slot0 memory size
  KVM: selftests: introduce P47V64 for s390x
  KVM: x86: Ensure PV TLB flush tracepoint reflects KVM behavior
  KVM: X86: MMU: Use the correct inherited permissions to get shadow page
  KVM: LAPIC: Write 0 to TMICT should also cancel vmx-preemption timer
  KVM: SVM: Fix SEV SEND_START session length & SEND_UPDATE_DATA query length after commit 238eca821c
2021-06-09 13:09:57 -07:00
Florian Westphal
12f36e9bf6 netfilter: nft_fib_ipv6: skip ipv6 packets from any to link-local
The ip6tables rpfilter match has an extra check to skip packets with
"::" source address.

Extend this to ipv6 fib expression.  Else ipv6 duplicate address detection
packets will fail rpf route check -- lookup returns -ENETUNREACH.

While at it, extend the prerouting check to also cover the ingress hook.

Closes: https://bugzilla.netfilter.org/show_bug.cgi?id=1543
Fixes: f6d0cbcf09 ("netfilter: nf_tables: add fib expression")
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2021-06-09 21:11:03 +02:00
Florian Westphal
8294442124 selftests: netfilter: add fib test case
There is a bug report on netfilter.org bugzilla pointing to fib
expression dropping ipv6 DAD packets.

Add a test case that demonstrates this problem.

Next patch excludes icmpv6 packets coming from any to linklocal.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2021-06-09 21:06:35 +02:00
Pablo Neira Ayuso
ad9f151e56 netfilter: nf_tables: initialize set before expression setup
nft_set_elem_expr_alloc() needs an initialized set if expression sets on
the NFT_EXPR_GC flag. Move set fields initialization before expression
setup.

[4512935.019450] ==================================================================
[4512935.019456] BUG: KASAN: null-ptr-deref in nft_set_elem_expr_alloc+0x84/0xd0 [nf_tables]
[4512935.019487] Read of size 8 at addr 0000000000000070 by task nft/23532
[4512935.019494] CPU: 1 PID: 23532 Comm: nft Not tainted 5.12.0-rc4+ #48
[...]
[4512935.019502] Call Trace:
[4512935.019505]  dump_stack+0x89/0xb4
[4512935.019512]  ? nft_set_elem_expr_alloc+0x84/0xd0 [nf_tables]
[4512935.019536]  ? nft_set_elem_expr_alloc+0x84/0xd0 [nf_tables]
[4512935.019560]  kasan_report.cold.12+0x5f/0xd8
[4512935.019566]  ? nft_set_elem_expr_alloc+0x84/0xd0 [nf_tables]
[4512935.019590]  nft_set_elem_expr_alloc+0x84/0xd0 [nf_tables]
[4512935.019615]  nf_tables_newset+0xc7f/0x1460 [nf_tables]

Reported-by: syzbot+ce96ca2b1d0b37c6422d@syzkaller.appspotmail.com
Fixes: 65038428b2 ("netfilter: nf_tables: allow to specify stateful expression in set definition")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2021-06-09 21:06:35 +02:00
Riwen Lu
78d1355234 hwmon: (scpi-hwmon) shows the negative temperature properly
The scpi hwmon shows the sub-zero temperature in an unsigned integer,
which would confuse the users when the machine works in low temperature
environment. This shows the sub-zero temperature in an signed value and
users can get it properly from sensors.

Signed-off-by: Riwen Lu <luriwen@kylinos.cn>
Tested-by: Xin Chen <chenxin@kylinos.cn>
Link: https://lore.kernel.org/r/20210604030959.736379-1-luriwen@kylinos.cn
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2021-06-09 11:51:30 -07:00
Wilken Gottwalt
7656cd2177 hwmon: (corsair-psu) fix suspend behavior
During standby some PSUs turn off the microcontroller. A re-init is
required during resume or the microcontroller stays unresponsive.

Fixes: d115b51e0e ("hwmon: add Corsair PSU HID controller driver")
Signed-off-by: Wilken Gottwalt <wilken.gottwalt@posteo.net>
Link: https://lore.kernel.org/r/YLjCJiVtu5zgTabI@monster.powergraphx.local
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2021-06-09 11:51:30 -07:00
Nobuhiro Iwamatsu
faffc5d857 dt-bindings: hwmon: Fix typo in TI ADS7828 bindings
Fix typo in example for DT binding, changed from 'comatible'
to 'compatible'.

Signed-off-by: Nobuhiro Iwamatsu <iwamatsu@nigauri.org>
Link: https://lore.kernel.org/r/20210531134655.720462-1-iwamatsu@nigauri.org
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2021-06-09 11:51:30 -07:00
Ricky Wu
3df4fce739 misc: rtsx: separate aspm mode into MODE_REG and MODE_CFG
aspm (Active State Power Management)
rtsx_comm_set_aspm: this function is for driver to make sure
not enter power saving when processing of init and card_detcct
ASPM_MODE_CFG: 8411 5209 5227 5229 5249 5250
Change back to use original way to control aspm
ASPM_MODE_REG: 5227A 524A 5250A 5260 5261 5228
Keep the new way to control aspm

Fixes: 121e9c6b5c ("misc: rtsx: modify and fix init_hw function")
Reported-by: Chris Chiu <chris.chiu@canonical.com>
Tested-by: Gordon Lack <gordon.lack@dsl.pipex.com>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Ricky Wu <ricky_wu@realtek.com>
Link: https://lore.kernel.org/r/20210607101634.4948-1-ricky_wu@realtek.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-06-09 19:10:22 +02:00
Loic Poulain
5f0c2ee1fe bus: mhi: pci-generic: Fix hibernation
This patch fixes crash after resuming from hibernation. The issue
occurs when mhi stack is builtin and so part of the 'restore-kernel',
causing the device to be resumed from 'restored kernel' with a no
more valid context (memory mappings etc...) and leading to spurious
crashes.

This patch fixes the issue by implementing proper freeze/restore
callbacks.

Link: https://lore.kernel.org/r/1622571445-4505-1-git-send-email-loic.poulain@linaro.org
Reported-by: Shujun Wang <wsj20369@163.com>
Cc: stable <stable@vger.kernel.org>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Signed-off-by: Loic Poulain <loic.poulain@linaro.org>
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Link: https://lore.kernel.org/r/20210606153741.20725-4-manivannan.sadhasivam@linaro.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-06-09 19:04:04 +02:00
Wei Yongjun
0b67808ade bus: mhi: pci_generic: Fix possible use-after-free in mhi_pci_remove()
This driver's remove path calls del_timer(). However, that function
does not wait until the timer handler finishes. This means that the
timer handler may still be running after the driver's remove function
has finished, which would result in a use-after-free.

Fix by calling del_timer_sync(), which makes sure the timer handler
has finished, and unable to re-schedule itself.

Link: https://lore.kernel.org/r/20210413160318.2003699-1-weiyongjun1@huawei.com
Fixes: 8562d4fe34 ("mhi: pci_generic: Add health-check")
Cc: stable <stable@vger.kernel.org>
Reported-by: Hulk Robot <hulkci@huawei.com>
Reviewed-by: Hemant kumar <hemantk@codeaurora.org>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Reviewed-by: Loic Poulain <loic.poulain@linaro.org>
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Link: https://lore.kernel.org/r/20210606153741.20725-3-manivannan.sadhasivam@linaro.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-06-09 19:04:00 +02:00
Jarvis Jiang
c7711c22c6 bus: mhi: pci_generic: T99W175: update channel name from AT to DUN
According to MHI v1.1 specification, change the channel name of T99W175
from "AT" to "DUN" (Dial-up networking) for both channel 32 and 33,
so that the channels can be bound to the Qcom WWAN control driver, and
device node such as /dev/wwan0p3DUN will be generated, which is very useful
for debugging modem

Link: https://lore.kernel.org/r/20210429014226.21017-1-jarvis.w.jiang@gmail.com
[mani: changed the dev node to /dev/wwan0p3DUN]
Fixes: aac426562f ("bus: mhi: pci_generic: Introduce Foxconn T99W175 support")
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Signed-off-by: Jarvis Jiang <jarvis.w.jiang@gmail.com>
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Link: https://lore.kernel.org/r/20210606153741.20725-2-manivannan.sadhasivam@linaro.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-06-09 18:58:52 +02:00
Johannes Berg
a9799541ca mac80211: drop multicast fragments
These are not permitted by the spec, just drop them.

Link: https://lore.kernel.org/r/20210609161305.23def022b750.Ibd6dd3cdce573dae262fcdc47f8ac52b883a9c50@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2021-06-09 16:17:45 +02:00
Johannes Berg
f5baf287f5 mac80211: move interface shutdown out of wiphy lock
When reconfiguration fails, we shut down everything, but we
cannot call cfg80211_shutdown_all_interfaces() with the wiphy
mutex held. Since cfg80211 now calls it on resume errors, we
only need to do likewise for where we call reconfig (whether
directly or indirectly), but not under the wiphy lock.

Cc: stable@vger.kernel.org
Fixes: 2fe8ef1062 ("cfg80211: change netdev registration/unregistration semantics")
Link: https://lore.kernel.org/r/20210608113226.78233c80f548.Iecc104aceb89f0568f50e9670a9cb191a1c8887b@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2021-06-09 16:09:21 +02:00
Johannes Berg
65bec836da cfg80211: shut down interfaces on failed resume
If resume fails, we should shut down all interfaces as the
hardware is probably dead. This was/is already done now in
mac80211, but we need to change that due to locking issues,
so move it here and do it without the wiphy lock held.

Cc: stable@vger.kernel.org
Fixes: 2fe8ef1062 ("cfg80211: change netdev registration/unregistration semantics")
Link: https://lore.kernel.org/r/20210608113226.d564ca69de7c.I2e3c3e5d410b72a4f63bade4fb075df041b3d92f@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2021-06-09 16:09:20 +02:00
Johannes Berg
43076c1e07 cfg80211: fix phy80211 symlink creation
When I moved around the code here, I neglected that we could still
call register_netdev() or similar without the wiphy mutex held,
which then calls cfg80211_register_wdev() - that's also done from
cfg80211_register_netdevice(), but the phy80211 symlink creation
was only there. Now, the symlink isn't needed for a *pure* wdev,
but a netdev not registered via cfg80211_register_wdev() should
still have the symlink, so move the creation to the right place.

Cc: stable@vger.kernel.org
Fixes: 2fe8ef1062 ("cfg80211: change netdev registration/unregistration semantics")
Link: https://lore.kernel.org/r/20210608113226.a5dc4c1e488c.Ia42fe663cefe47b0883af78c98f284c5555bbe5d@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2021-06-09 16:09:20 +02:00
Johannes Berg
adaed1b9da mac80211: fix 'reset' debugfs locking
cfg80211 now calls suspend/resume with the wiphy lock
held, and while there's a problem with that needing
to be fixed, we should do the same in debugfs.

Cc: stable@vger.kernel.org
Fixes: a05829a722 ("cfg80211: avoid holding the RTNL when calling the driver")
Link: https://lore.kernel.org/r/20210608113226.14020430e449.I78e19db0a55a8295a376e15ac4cf77dbb4c6fb51@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2021-06-09 16:09:18 +02:00
Andy Shevchenko
7c3e8d9df2 serial: 8250_exar: Avoid NULL pointer dereference at ->exit()
It's possible that during ->exit() the private_data is NULL,
for instance when there was no GPIO device instantiated.
Due to this we may not dereference it. Add a respective check.

Note, for now ->exit() only makes sense when GPIO device
was instantiated, that's why we may use the check for entire
function.

Fixes: 81171e7d31 ("serial: 8250_exar: Constify the software nodes")
Reported-by: Maxim Levitsky <mlevitsk@redhat.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Tested-by: Maxim Levitsky <mlevitsk@redhat.com>
Link: https://lore.kernel.org/r/20210608144239.12697-1-andriy.shevchenko@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-06-09 14:40:48 +02:00
Mika Westerberg
159d8c274f ACPI: Pass the same capabilities to the _OSC regardless of the query flag
Commit 719e1f561a ("ACPI: Execute platform _OSC also with query bit
clear") makes acpi_bus_osc_negotiate_platform_control() not only query
the platforms capabilities but it also commits the result back to the
firmware to report which capabilities are supported by the OS back to
the firmware

On certain systems the BIOS loads SSDT tables dynamically based on the
capabilities the OS claims to support. However, on these systems the
_OSC actually clears some of the bits (under certain conditions) so what
happens is that now when we call the _OSC twice the second time we pass
the cleared values and that results errors like below to appear on the
system log:

  ACPI BIOS Error (bug): Could not resolve symbol [\_PR.PR00._CPC], AE_NOT_FOUND (20210105/psargs-330)
  ACPI Error: Aborting method \_PR.PR01._CPC due to previous error (AE_NOT_FOUND) (20210105/psparse-529)

In addition the ACPI 6.4 spec says following [1]:

  If the OS declares support of a feature in the Support Field in one
  call to _OSC, then it must preserve the set state of that bit
  (declaring support for that feature) in all subsequent calls.

Based on the above we can fix the issue by passing the same set of
capabilities to the platform wide _OSC in both calls regardless of the
query flag.

While there drop the context.ret.length checks which were wrong to begin
with (as the length is number of bytes not elements). This is already
checked in acpi_run_osc() that also returns an error in that case.

Includes fixes by Hans de Goede.

[1] https://uefi.org/specs/ACPI/6.4/06_Device_Configuration/Device_Configuration.html#sequence-of-osc-calls

BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=213023
BugLink: https://bugzilla.redhat.com/show_bug.cgi?id=1963717
Fixes: 719e1f561a ("ACPI: Execute platform _OSC also with query bit clear")
Cc: 5.12+ <stable@vger.kernel.org> # 5.12+
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-06-09 14:20:11 +02:00
Linus Walleij
c8a5704439 drm/mcde: Fix off by 10^3 in calculation
The calclulation of how many bytes we stuff into the
DSI pipeline for video mode panels is off by three
orders of magnitude because we did not account for the
fact that the DRM mode clock is in kilohertz rather
than hertz.

This used to be:
drm_mode_vrefresh(mode) * mode->htotal * mode->vtotal
which would become for example for s6e63m0:
60 x 514 x 831 = 25628040 Hz, but mode->clock is
25628 as it is in kHz.

This affects only the Samsung GT-I8190 "Golden" phone
right now since it is the only MCDE device with a video
mode display.

Curiously some specimen work with this code and wild
settings in the EOL and empty packets at the end of the
display, but I have noticed an eeire flicker until now.
Others were not so lucky and got black screens.

Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reported-by: Stephan Gerhold <stephan@gerhold.net>
Fixes: 920dd1b142 ("drm/mcde: Use mode->clock instead of reverse calculating it from the vrefresh")
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Tested-by: Stephan Gerhold <stephan@gerhold.net>
Reviewed-by: Stephan Gerhold <stephan@gerhold.net>
Link: https://patchwork.freedesktop.org/patch/msgid/20210608213318.3897858-1-linus.walleij@linaro.org
2021-06-09 14:03:29 +02:00
Bjorn Andersson
30e9857a13 pinctrl: qcom: Make it possible to select SC8180x TLMM
It's currently not possible to select the SC8180x TLMM driver, due to it
selecting PINCTRL_MSM, rather than depending on the same. Fix this.

Fixes: 97423113ec ("pinctrl: qcom: Add sc8180x TLMM driver")
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Link: https://lore.kernel.org/r/20210608180702.2064253-1-bjorn.andersson@linaro.org
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2021-06-09 13:15:20 +02:00
Thomas Gleixner
510b80a6a0 x86/pkru: Write hardware init value to PKRU when xstate is init
When user space brings PKRU into init state, then the kernel handling is
broken:

  T1 user space
     xsave(state)
     state.header.xfeatures &= ~XFEATURE_MASK_PKRU;
     xrstor(state)

  T1 -> kernel
     schedule()
       XSAVE(S) -> T1->xsave.header.xfeatures[PKRU] == 0
       T1->flags |= TIF_NEED_FPU_LOAD;

       wrpkru();

     schedule()
       ...
       pk = get_xsave_addr(&T1->fpu->state.xsave, XFEATURE_PKRU);
       if (pk)
	 wrpkru(pk->pkru);
       else
	 wrpkru(DEFAULT_PKRU);

Because the xfeatures bit is 0 and therefore the value in the xsave
storage is not valid, get_xsave_addr() returns NULL and switch_to()
writes the default PKRU. -> FAIL #1!

So that wrecks any copy_to/from_user() on the way back to user space
which hits memory which is protected by the default PKRU value.

Assumed that this does not fail (pure luck) then T1 goes back to user
space and because TIF_NEED_FPU_LOAD is set it ends up in

  switch_fpu_return()
      __fpregs_load_activate()
        if (!fpregs_state_valid()) {
  	 load_XSTATE_from_task();
        }

But if nothing touched the FPU between T1 scheduling out and back in,
then the fpregs_state is still valid which means switch_fpu_return()
does nothing and just clears TIF_NEED_FPU_LOAD. Back to user space with
DEFAULT_PKRU loaded. -> FAIL #2!

The fix is simple: if get_xsave_addr() returns NULL then set the
PKRU value to 0 instead of the restrictive default PKRU value in
init_pkru_value.

 [ bp: Massage in minor nitpicks from folks. ]

Fixes: 0cecca9d03 ("x86/fpu: Eager switch PKRU state")
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Dave Hansen <dave.hansen@linux.intel.com>
Acked-by: Rik van Riel <riel@surriel.com>
Tested-by: Babu Moger <babu.moger@amd.com>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20210608144346.045616965@linutronix.de
2021-06-09 12:12:45 +02:00
Lars-Peter Clausen
e9de1ecade staging: ralink-gdma: Remove incorrect author information
Lars did not write the ralink-gdma driver. Looks like his name just got
copy&pasted from another similar DMA driver. Remove his name from the
copyright and MODULE_AUTHOR.

Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Link: https://lore.kernel.org/r/20210607100119.26983-1-lars@metafoo.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-06-09 12:07:52 +02:00
Wenli Looi
43c85d770d staging: rtl8723bs: Fix uninitialized variables
The sinfo.pertid and sinfo.generation variables are not initialized and
it causes a crash when we use this as a wireless access point.

[  456.873025] ------------[ cut here ]------------
[  456.878198] kernel BUG at mm/slub.c:3968!
[  456.882680] Internal error: Oops - BUG: 0 [#1] PREEMPT SMP ARM

  [ snip ]

[  457.271004] Backtrace:
[  457.273733] [<c02b7ee4>] (kfree) from [<c0e2a470>] (nl80211_send_station+0x954/0xfc4)
[  457.282481]  r9:eccca0c0 r8:e8edfec0 r7:00000000 r6:00000011 r5:e80a9480 r4:e8edfe00
[  457.291132] [<c0e29b1c>] (nl80211_send_station) from [<c0e2b18c>] (cfg80211_new_sta+0x90/0x1cc)
[  457.300850]  r10:e80a9480 r9:e8edfe00 r8:ea678cca r7:00000a20 r6:00000000 r5:ec46d000
[  457.309586]  r4:ec46d9e0
[  457.312433] [<c0e2b0fc>] (cfg80211_new_sta) from [<bf086684>] (rtw_cfg80211_indicate_sta_assoc+0x80/0x9c [r8723bs])
[  457.324095]  r10:00009930 r9:e85b9d80 r8:bf091050 r7:00000000 r6:00000000 r5:0000001c
[  457.332831]  r4:c1606788
[  457.335692] [<bf086604>] (rtw_cfg80211_indicate_sta_assoc [r8723bs]) from [<bf03df38>] (rtw_stassoc_event_callback+0x1c8/0x1d4 [r8723bs])
[  457.349489]  r7:ea678cc0 r6:000000a1 r5:f1225f84 r4:f086b000
[  457.355845] [<bf03dd70>] (rtw_stassoc_event_callback [r8723bs]) from [<bf048e4c>] (mlme_evt_hdl+0x8c/0xb4 [r8723bs])
[  457.367601]  r7:c1604900 r6:f086c4b8 r5:00000000 r4:f086c000
[  457.373959] [<bf048dc0>] (mlme_evt_hdl [r8723bs]) from [<bf03693c>] (rtw_cmd_thread+0x198/0x3d8 [r8723bs])
[  457.384744]  r5:f086e000 r4:f086c000
[  457.388754] [<bf0367a4>] (rtw_cmd_thread [r8723bs]) from [<c014a214>] (kthread+0x170/0x174)
[  457.398083]  r10:ed7a57e8 r9:bf0367a4 r8:f086b000 r7:e8ede000 r6:00000000 r5:e9975200
[  457.406828]  r4:e8369900
[  457.409653] [<c014a0a4>] (kthread) from [<c01010e8>] (ret_from_fork+0x14/0x2c)
[  457.417718] Exception stack(0xe8edffb0 to 0xe8edfff8)
[  457.423356] ffa0:                                     00000000 00000000 00000000 00000000
[  457.432492] ffc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[  457.441618] ffe0: 00000000 00000000 00000000 00000000 00000013 00000000
[  457.449006]  r10:00000000 r9:00000000 r8:00000000 r7:00000000 r6:00000000 r5:c014a0a4
[  457.457750]  r4:e9975200
[  457.460574] Code: 1a000003 e5953004 e3130001 1a000000 (e7f001f2)
[  457.467381] ---[ end trace 4acbc8c15e9e6aa7 ]---

Link: https://forum.armbian.com/topic/14727-wifi-ap-kernel-bug-in-kernel-5444/
Fixes: 8689c051a2 ("cfg80211: dynamically allocate per-tid stats for station info")
Fixes: f5ea9120be ("nl80211: add generation number to all dumps")
Signed-off-by: Wenli Looi <wlooi@ucalgary.ca>
Reviewed-by: Dan Carpenter <dan.carpenter@oracle.com>
Cc: stable <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20210608064620.74059-1-wlooi@ucalgary.ca
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-06-09 12:06:19 +02:00
Yang Yingliang
fbf649cd6d usb: misc: brcmstb-usb-pinmap: check return value after calling platform_get_resource()
It will cause null-ptr-deref if platform_get_resource() returns NULL,
we need check the return value.

Fixes: 517c4c44b3 ("usb: Add driver to allow any GPIO to be used for 7211 USB signals")
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Link: https://lore.kernel.org/r/20210605080914.2057758-1-yangyingliang@huawei.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-06-09 11:04:27 +02:00
Marian-Cristian Rotariu
d00889080a usb: dwc3: ep0: fix NULL pointer exception
There is no validation of the index from dwc3_wIndex_to_dep() and we might
be referring a non-existing ep and trigger a NULL pointer exception. In
certain configurations we might use fewer eps and the index might wrongly
indicate a larger ep index than existing.

By adding this validation from the patch we can actually report a wrong
index back to the caller.

In our usecase we are using a composite device on an older kernel, but
upstream might use this fix also. Unfortunately, I cannot describe the
hardware for others to reproduce the issue as it is a proprietary
implementation.

[   82.958261] Unable to handle kernel NULL pointer dereference at virtual address 00000000000000a4
[   82.966891] Mem abort info:
[   82.969663]   ESR = 0x96000006
[   82.972703]   Exception class = DABT (current EL), IL = 32 bits
[   82.978603]   SET = 0, FnV = 0
[   82.981642]   EA = 0, S1PTW = 0
[   82.984765] Data abort info:
[   82.987631]   ISV = 0, ISS = 0x00000006
[   82.991449]   CM = 0, WnR = 0
[   82.994409] user pgtable: 4k pages, 39-bit VAs, pgdp = 00000000c6210ccc
[   83.000999] [00000000000000a4] pgd=0000000053aa5003, pud=0000000053aa5003, pmd=0000000000000000
[   83.009685] Internal error: Oops: 96000006 [#1] PREEMPT SMP
[   83.026433] Process irq/62-dwc3 (pid: 303, stack limit = 0x000000003985154c)
[   83.033470] CPU: 0 PID: 303 Comm: irq/62-dwc3 Not tainted 4.19.124 #1
[   83.044836] pstate: 60000085 (nZCv daIf -PAN -UAO)
[   83.049628] pc : dwc3_ep0_handle_feature+0x414/0x43c
[   83.054558] lr : dwc3_ep0_interrupt+0x3b4/0xc94

...

[   83.141788] Call trace:
[   83.144227]  dwc3_ep0_handle_feature+0x414/0x43c
[   83.148823]  dwc3_ep0_interrupt+0x3b4/0xc94
[   83.181546] ---[ end trace aac6b5267d84c32f ]---

Signed-off-by: Marian-Cristian Rotariu <marian.c.rotariu@gmail.com>
Cc: stable <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20210608162650.58426-1-marian.c.rotariu@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-06-09 11:03:32 +02:00
Linyu Yuan
305f670846 usb: gadget: eem: fix wrong eem header operation
when skb_clone() or skb_copy_expand() fail,
it should pull skb with lengh indicated by header,
or not it will read network data and check it as header.

Cc: <stable@vger.kernel.org>
Signed-off-by: Linyu Yuan <linyyuan@codeaurora.com>
Link: https://lore.kernel.org/r/20210608233547.3767-1-linyyuan@codeaurora.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-06-09 10:57:52 +02:00
Andy Shevchenko
184fa76b87 usb: typec: intel_pmc_mux: Put ACPI device using acpi_dev_put()
For ACPI devices we have a symmetric API to put them, so use it in the driver.

Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Signed-off-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Link: https://lore.kernel.org/r/20210607205007.71458-3-andy.shevchenko@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-06-09 10:56:14 +02:00
Andy Shevchenko
843fabdd76 usb: typec: intel_pmc_mux: Add missed error check for devm_ioremap_resource()
devm_ioremap_resource() can return an error, add missed check for it.

Fixes: 43d596e322 ("usb: typec: intel_pmc_mux: Check the port status before connect")
Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Signed-off-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Cc: stable <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20210607205007.71458-2-andy.shevchenko@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-06-09 10:56:11 +02:00
Andy Shevchenko
1a85b350a7 usb: typec: intel_pmc_mux: Put fwnode in error case during ->probe()
device_get_next_child_node() bumps a reference counting of a returned variable.
We have to balance it whenever we return to the caller.

Fixes: 6701adfa96 ("usb: typec: driver for Intel PMC mux control")
Cc: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Signed-off-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Cc: stable <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20210607205007.71458-1-andy.shevchenko@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-06-09 10:56:08 +02:00
Kyle Tso
5ab14ab1f2 usb: typec: tcpm: Do not finish VDM AMS for retrying Responses
If the VDM responses couldn't be sent successfully, it doesn't need to
finish the AMS until the retry count reaches the limit.

Fixes: 0908c5aca3 ("usb: typec: tcpm: AMS and Collision Avoidance")
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Cc: stable <stable@vger.kernel.org>
Acked-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Signed-off-by: Kyle Tso <kyletso@google.com>
Link: https://lore.kernel.org/r/20210606081452.764032-1-kyletso@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-06-09 10:52:15 +02:00
Maciej Żenczykowski
032e288097 usb: fix various gadget panics on 10gbps cabling
usb_assign_descriptors() is called with 5 parameters,
the last 4 of which are the usb_descriptor_header for:
  full-speed (USB1.1 - 12Mbps [including USB1.0 low-speed @ 1.5Mbps),
  high-speed (USB2.0 - 480Mbps),
  super-speed (USB3.0 - 5Gbps),
  super-speed-plus (USB3.1 - 10Gbps).

The differences between full/high/super-speed descriptors are usually
substantial (due to changes in the maximum usb block size from 64 to 512
to 1024 bytes and other differences in the specs), while the difference
between 5 and 10Gbps descriptors may be as little as nothing
(in many cases the same tuning is simply good enough).

However if a gadget driver calls usb_assign_descriptors() with
a NULL descriptor for super-speed-plus and is then used on a max 10gbps
configuration, the kernel will crash with a null pointer dereference,
when a 10gbps capable device port + cable + host port combination shows up.
(This wouldn't happen if the gadget max-speed was set to 5gbps, but
it of course defaults to the maximum, and there's no real reason to
artificially limit it)

The fix is to simply use the 5gbps descriptor as the 10gbps descriptor,
if a 10gbps descriptor wasn't provided.

Obviously this won't fix the problem if the 5gbps descriptor is also
NULL, but such cases can't be so trivially solved (and any such gadgets
are unlikely to be used with USB3 ports any way).

Cc: Felipe Balbi <balbi@kernel.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Maciej Żenczykowski <maze@google.com>
Cc: stable <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20210609024459.1126080-1-zenczykowski@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-06-09 10:40:08 +02:00
Thomas Gleixner
12f7764ac6 x86/process: Check PF_KTHREAD and not current->mm for kernel threads
switch_fpu_finish() checks current->mm as indicator for kernel threads.
That's wrong because kernel threads can temporarily use a mm of a user
process via kthread_use_mm().

Check the task flags for PF_KTHREAD instead.

Fixes: 0cecca9d03 ("x86/fpu: Eager switch PKRU state")
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Dave Hansen <dave.hansen@linux.intel.com>
Acked-by: Rik van Riel <riel@surriel.com>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20210608144345.912645927@linutronix.de
2021-06-09 10:39:04 +02:00
Maciej Żenczykowski
90c4d05780 usb: fix various gadgets null ptr deref on 10gbps cabling.
This avoids a null pointer dereference in
f_{ecm,eem,hid,loopback,printer,rndis,serial,sourcesink,subset,tcm}
by simply reusing the 5gbps config for 10gbps.

Fixes: eaef50c760 ("usb: gadget: Update usb_assign_descriptors for SuperSpeedPlus")
Cc: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Cc: Felipe Balbi <balbi@kernel.org>
Cc: Gustavo A. R. Silva <gustavoars@kernel.org>
Cc: Lorenzo Colitti <lorenzo@google.com>
Cc: Martin K. Petersen <martin.petersen@oracle.com>
Cc: Michael R Sweet <msweet@msweet.org>
Cc: Mike Christie <michael.christie@oracle.com>
Cc: Pawel Laszczak <pawell@cadence.com>
Cc: Peter Chen <peter.chen@nxp.com>
Cc: Sudhakar Panneerselvam <sudhakar.panneerselvam@oracle.com>
Cc: Wei Ming Chen <jj251510319013@gmail.com>
Cc: Will McVicker <willmcvicker@google.com>
Cc: Zqiang <qiang.zhang@windriver.com>
Reviewed-By: Lorenzo Colitti <lorenzo@google.com>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Maciej Żenczykowski <maze@google.com>
Link: https://lore.kernel.org/r/20210608044141.3898496-1-zenczykowski@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-06-09 10:37:13 +02:00
Mario Limonciello
d1658268e4 usb: pci-quirks: disable D3cold on xhci suspend for s2idle on AMD Renoir
The XHCI controller is required to enter D3hot rather than D3cold for AMD
s2idle on this hardware generation.

Otherwise, the 'Controller Not Ready' (CNR) bit is not being cleared by
host in resume and eventually this results in xhci resume failures during
the s2idle wakeup.

Link: https://lore.kernel.org/linux-usb/1612527609-7053-1-git-send-email-Prike.Liang@amd.com/
Suggested-by: Prike Liang <Prike.Liang@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Cc: stable <stable@vger.kernel.org> # 5.11+
Link: https://lore.kernel.org/r/20210527154534.8900-1-mario.limonciello@amd.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-06-09 10:36:00 +02:00
Maciej Żenczykowski
1958ff5ad2 usb: f_ncm: only first packet of aggregate needs to start timer
The reasoning for this change is that if we already had
a packet pending, then we also already had a pending timer,
and as such there is no need to reschedule it.

This also prevents packets getting delayed 60 ms worst case
under a tiny packet every 290us transmit load, by keeping the
timeout always relative to the first queued up packet.
(300us delay * 16KB max aggregation / 80 byte packet =~ 60 ms)

As such the first packet is now at most delayed by 300us.

Under low transmit load, this will simply result in us sending
a shorter aggregate, as originally intended.

This patch has the benefit of greatly reducing (by ~10 factor
with 1500 byte frames aggregated into 16 kiB) the number of
(potentially pretty costly) updates to the hrtimer.

Cc: Brooke Basile <brookebasile@gmail.com>
Cc: Bryan O'Donoghue <bryan.odonoghue@linaro.org>
Cc: Felipe Balbi <balbi@kernel.org>
Cc: Lorenzo Colitti <lorenzo@google.com>
Signed-off-by: Maciej Żenczykowski <maze@google.com>
Link: https://lore.kernel.org/r/20210608085438.813960-1-zenczykowski@gmail.com
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-06-09 10:34:18 +02:00
Maciej Żenczykowski
3370139745 USB: f_ncm: ncm_bitrate (speed) is unsigned
[  190.544755] configfs-gadget gadget: notify speed -44967296

This is because 4250000000 - 2**32 is -44967296.

Fixes: 9f6ce4240a ("usb: gadget: f_ncm.c added")
Cc: Brooke Basile <brookebasile@gmail.com>
Cc: Bryan O'Donoghue <bryan.odonoghue@linaro.org>
Cc: Felipe Balbi <balbi@kernel.org>
Cc: Lorenzo Colitti <lorenzo@google.com>
Cc: Yauheni Kaliuta <yauheni.kaliuta@nokia.com>
Cc: Linux USB Mailing List <linux-usb@vger.kernel.org>
Acked-By: Lorenzo Colitti <lorenzo@google.com>
Signed-off-by: Maciej Żenczykowski <maze@google.com>
Cc: stable <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20210608005344.3762668-1-zenczykowski@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-06-09 10:26:43 +02:00
Rui Miguel Silva
40d9e03f41 MAINTAINERS: usb: add entry for isp1760
Giving support for isp1763 made a little revival to this driver, add
entry in the MAINTAINERS file with me as maintainer.

Acked-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Rui Miguel Silva <rui.silva@linaro.org>
Link: https://lore.kernel.org/r/20210607170054.220975-1-rui.silva@linaro.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-06-09 10:08:50 +02:00
Greg Kroah-Hartman
a39b7ba35d Merge tag 'usb-v5.13-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/peter.chen/usb into usb-linus
Peter writes:

Two bug fixes for cdns3 and cdnsp

* tag 'usb-v5.13-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/peter.chen/usb:
  usb: cdnsp: Fix deadlock issue in cdnsp_thread_irq_handler
  usb: cdns3: Enable TDL_CHK only for OUT ep
2021-06-09 10:05:01 +02:00
Greg Kroah-Hartman
1ca01c0805 Merge tag 'usb-serial-5.13-rc5' of https://git.kernel.org/pub/scm/linux/kernel/git/johan/usb-serial into usb-linus
Jonah writes:

USB-serial fixes for 5.13-rc5

Here's a fix for some pipe-direction mismatches in the quatech2 driver,
and a couple of new device ids for ftdi_sio and omninet (and a related
trivial cleanup).

All but the ftdi_sio commit have been in linux-next, and with no
reported issues.

* tag 'usb-serial-5.13-rc5' of https://git.kernel.org/pub/scm/linux/kernel/git/johan/usb-serial:
  USB: serial: ftdi_sio: add NovaTech OrionMX product ID
  USB: serial: omninet: update driver description
  USB: serial: omninet: add device id for Zyxel Omni 56K Plus
  USB: serial: quatech2: fix control-request directions
2021-06-09 10:04:17 +02:00
Andy Lutomirski
d8778e393a x86/fpu: Invalidate FPU state after a failed XRSTOR from a user buffer
Both Intel and AMD consider it to be architecturally valid for XRSTOR to
fail with #PF but nonetheless change the register state.  The actual
conditions under which this might occur are unclear [1], but it seems
plausible that this might be triggered if one sibling thread unmaps a page
and invalidates the shared TLB while another sibling thread is executing
XRSTOR on the page in question.

__fpu__restore_sig() can execute XRSTOR while the hardware registers
are preserved on behalf of a different victim task (using the
fpu_fpregs_owner_ctx mechanism), and, in theory, XRSTOR could fail but
modify the registers.

If this happens, then there is a window in which __fpu__restore_sig()
could schedule out and the victim task could schedule back in without
reloading its own FPU registers. This would result in part of the FPU
state that __fpu__restore_sig() was attempting to load leaking into the
victim task's user-visible state.

Invalidate preserved FPU registers on XRSTOR failure to prevent this
situation from corrupting any state.

[1] Frequent readers of the errata lists might imagine "complex
    microarchitectural conditions".

Fixes: 1d731e731c ("x86/fpu: Add a fastpath to __fpu__restore_sig()")
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Dave Hansen <dave.hansen@linux.intel.com>
Acked-by: Rik van Riel <riel@surriel.com>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20210608144345.758116583@linutronix.de
2021-06-09 09:49:38 +02:00
Thomas Gleixner
484cea4f36 x86/fpu: Prevent state corruption in __fpu__restore_sig()
The non-compacted slowpath uses __copy_from_user() and copies the entire
user buffer into the kernel buffer, verbatim.  This means that the kernel
buffer may now contain entirely invalid state on which XRSTOR will #GP.
validate_user_xstate_header() can detect some of that corruption, but that
leaves the onus on callers to clear the buffer.

Prior to XSAVES support, it was possible just to reinitialize the buffer,
completely, but with supervisor states that is not longer possible as the
buffer clearing code split got it backwards. Fixing that is possible but
not corrupting the state in the first place is more robust.

Avoid corruption of the kernel XSAVE buffer by using copy_user_to_xstate()
which validates the XSAVE header contents before copying the actual states
to the kernel. copy_user_to_xstate() was previously only called for
compacted-format kernel buffers, but it works for both compacted and
non-compacted forms.

Using it for the non-compacted form is slower because of multiple
__copy_from_user() operations, but that cost is less important than robust
code in an already slow path.

[ Changelog polished by Dave Hansen ]

Fixes: b860eb8dce ("x86/fpu/xstate: Define new functions for clearing fpregs and xstates")
Reported-by: syzbot+2067e764dbcd10721e2e@syzkaller.appspotmail.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
Acked-by: Dave Hansen <dave.hansen@linux.intel.com>
Acked-by: Rik van Riel <riel@surriel.com>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20210608144345.611833074@linutronix.de
2021-06-09 09:28:21 +02:00
Paolo Bonzini
4422829e80 kvm: fix previous commit for 32-bit builds
array_index_nospec does not work for uint64_t on 32-bit builds.
However, the size of a memory slot must be less than 20 bits wide
on those system, since the memory slot must fit in the user
address space.  So just store it in an unsigned long.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-06-09 01:49:13 -04:00
Aleksander Jan Bajkowski
f2386cf7c5 net: lantiq: disable interrupt before sheduling NAPI
This patch fixes TX hangs with threaded NAPI enabled. The scheduled
NAPI seems to be executed in parallel with the interrupt on second
thread. Sometimes it happens that ltq_dma_disable_irq() is executed
after xrx200_tx_housekeeping(). The symptom is that TX interrupts
are disabled in the DMA controller. As a result, the TX hangs after
a few seconds of the iperf test. Scheduling NAPI after disabling
interrupts fixes this issue.

Tested on Lantiq xRX200 (BT Home Hub 5A).

Fixes: 9423361da5 ("net: lantiq: Disable IRQs only if NAPI gets scheduled ")
Signed-off-by: Aleksander Jan Bajkowski <olek2@wp.pl>
Acked-by: Hauke Mehrtens <hauke@hauke-m.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-08 19:16:32 -07:00
Fabrizio Castro
8929ef8d4d media: dt-bindings: media: renesas,drif: Fix fck definition
dt_binding_check reports the below error with the latest schema:

Documentation/devicetree/bindings/media/renesas,drif.yaml:
  properties:clock-names:maxItems: False schema does not allow 1
Documentation/devicetree/bindings/media/renesas,drif.yaml:
  ignoring, error in schema: properties: clock-names: maxItems

This patch fixes the problem.

Signed-off-by: Fabrizio Castro <fabrizio.castro.jz@renesas.com>
Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Reviewed-by: Rob Herring <robh@kernel.org>
Signed-off-by: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20210408202436.3706-1-fabrizio.castro.jz@renesas.com
2021-06-08 20:03:57 -05:00
Shay Agroskin
504fd6a539 net: ena: fix DMA mapping function issues in XDP
This patch fixes several bugs found when (DMA/LLQ) mapping a packet for
transmission. The mapping procedure makes the transmitted packet
accessible by the device.
When using LLQ, this requires copying the packet's header to push header
(which would be passed to LLQ) and creating DMA mapping for the payload
(if the packet doesn't fit the maximum push length).
When not using LLQ, we map the whole packet with DMA.

The following bugs are fixed in the code:
    1. Add support for non-LLQ machines:
       The ena_xdp_tx_map_frame() function assumed that LLQ is
       supported, and never mapped the whole packet using DMA. On some
       instances, which don't support LLQ, this causes loss of traffic.

    2. Wrong DMA buffer length passed to device:
       When using LLQ, the first 'tx_max_header_size' bytes of the
       packet would be copied to push header. The rest of the packet
       would be copied to a DMA'd buffer.

    3. Freeing the XDP buffer twice in case of a mapping error:
       In case a buffer DMA mapping fails, the function uses
       xdp_return_frame_rx_napi() to free the RX buffer and returns from
       the function with an error. XDP frames that fail to xmit get
       freed by the kernel and so there is no need for this call.

Fixes: 548c4940b9 ("net: ena: Implement XDP_TX action")
Signed-off-by: Shay Agroskin <shayagr@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-08 16:41:02 -07:00
Vladimir Oltean
1650bdb1c5 net: dsa: felix: re-enable TX flow control in ocelot_port_flush()
Because flow control is set up statically in ocelot_init_port(), and not
in phylink_mac_link_up(), what happens is that after the blamed commit,
the flow control remains disabled after the port flushing procedure.

Fixes: eb4733d7cf ("net: dsa: felix: implement port flushing on .phylink_mac_link_down")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-08 16:34:23 -07:00
Pavel Skripkin
49bfcbfd98 net: rds: fix memory leak in rds_recvmsg
Syzbot reported memory leak in rds. The problem
was in unputted refcount in case of error.

int rds_recvmsg(struct socket *sock, struct msghdr *msg, size_t size,
		int msg_flags)
{
...

	if (!rds_next_incoming(rs, &inc)) {
		...
	}

After this "if" inc refcount incremented and

	if (rds_cmsg_recv(inc, msg, rs)) {
		ret = -EFAULT;
		goto out;
	}
...
out:
	return ret;
}

in case of rds_cmsg_recv() fail the refcount won't be
decremented. And it's easy to see from ftrace log, that
rds_inc_addref() don't have rds_inc_put() pair in
rds_recvmsg() after rds_cmsg_recv()

 1)               |  rds_recvmsg() {
 1)   3.721 us    |    rds_inc_addref();
 1)   3.853 us    |    rds_message_inc_copy_to_user();
 1) + 10.395 us   |    rds_cmsg_recv();
 1) + 34.260 us   |  }

Fixes: bdbe6fbc6a ("RDS: recv.c")
Reported-and-tested-by: syzbot+5134cdf021c4ed5aaa5f@syzkaller.appspotmail.com
Signed-off-by: Pavel Skripkin <paskripkin@gmail.com>
Reviewed-by: Håkon Bugge <haakon.bugge@oracle.com>
Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-08 16:32:17 -07:00
Paolo Bonzini
da27a83fd6 kvm: avoid speculation-based attacks from out-of-range memslot accesses
KVM's mechanism for accessing guest memory translates a guest physical
address (gpa) to a host virtual address using the right-shifted gpa
(also known as gfn) and a struct kvm_memory_slot.  The translation is
performed in __gfn_to_hva_memslot using the following formula:

      hva = slot->userspace_addr + (gfn - slot->base_gfn) * PAGE_SIZE

It is expected that gfn falls within the boundaries of the guest's
physical memory.  However, a guest can access invalid physical addresses
in such a way that the gfn is invalid.

__gfn_to_hva_memslot is called from kvm_vcpu_gfn_to_hva_prot, which first
retrieves a memslot through __gfn_to_memslot.  While __gfn_to_memslot
does check that the gfn falls within the boundaries of the guest's
physical memory or not, a CPU can speculate the result of the check and
continue execution speculatively using an illegal gfn. The speculation
can result in calculating an out-of-bounds hva.  If the resulting host
virtual address is used to load another guest physical address, this
is effectively a Spectre gadget consisting of two consecutive reads,
the second of which is data dependent on the first.

Right now it's not clear if there are any cases in which this is
exploitable.  One interesting case was reported by the original author
of this patch, and involves visiting guest page tables on x86.  Right
now these are not vulnerable because the hva read goes through get_user(),
which contains an LFENCE speculation barrier.  However, there are
patches in progress for x86 uaccess.h to mask kernel addresses instead of
using LFENCE; once these land, a guest could use speculation to read
from the VMM's ring 3 address space.  Other architectures such as ARM
already use the address masking method, and would be susceptible to
this same kind of data-dependent access gadgets.  Therefore, this patch
proactively protects from these attacks by masking out-of-bounds gfns
in __gfn_to_hva_memslot, which blocks speculation of invalid hvas.

Sean Christopherson noted that this patch does not cover
kvm_read_guest_offset_cached.  This however is limited to a few bytes
past the end of the cache, and therefore it is unlikely to be useful in
the context of building a chain of data dependent accesses.

Reported-by: Artemiy Margaritov <artemiy.margaritov@gmail.com>
Co-developed-by: Artemiy Margaritov <artemiy.margaritov@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-06-08 17:12:05 -04:00
Lai Jiangshan
b53e84eed0 KVM: x86: Unload MMU on guest TLB flush if TDP disabled to force MMU sync
When using shadow paging, unload the guest MMU when emulating a guest TLB
flush to ensure all roots are synchronized.  From the guest's perspective,
flushing the TLB ensures any and all modifications to its PTEs will be
recognized by the CPU.

Note, unloading the MMU is overkill, but is done to mirror KVM's existing
handling of INVPCID(all) and ensure the bug is squashed.  Future cleanup
can be done to more precisely synchronize roots when servicing a guest
TLB flush.

If TDP is enabled, synchronizing the MMU is unnecessary even if nested
TDP is in play, as a "legacy" TLB flush from L1 does not invalidate L1's
TDP mappings.  For EPT, an explicit INVEPT is required to invalidate
guest-physical mappings; for NPT, guest mappings are always tagged with
an ASID and thus can only be invalidated via the VMCB's ASID control.

This bug has existed since the introduction of KVM_VCPU_FLUSH_TLB.
It was only recently exposed after Linux guests stopped flushing the
local CPU's TLB prior to flushing remote TLBs (see commit 4ce94eabac,
"x86/mm/tlb: Flush remote and local TLBs concurrently"), but is also
visible in Windows 10 guests.

Tested-by: Maxim Levitsky <mlevitsk@redhat.com>
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Fixes: f38a7b7526 ("KVM: X86: support paravirtualized help for TLB shootdowns")
Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
[sean: massaged comment and changelog]
Message-Id: <20210531172256.2908-1-jiangshanlai@gmail.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-06-08 17:10:21 -04:00
Coly Li
41fe8d088e bcache: avoid oversized read request in cache missing code path
In the cache missing code path of cached device, if a proper location
from the internal B+ tree is matched for a cache miss range, function
cached_dev_cache_miss() will be called in cache_lookup_fn() in the
following code block,
[code block 1]
  526         unsigned int sectors = KEY_INODE(k) == s->iop.inode
  527                 ? min_t(uint64_t, INT_MAX,
  528                         KEY_START(k) - bio->bi_iter.bi_sector)
  529                 : INT_MAX;
  530         int ret = s->d->cache_miss(b, s, bio, sectors);

Here s->d->cache_miss() is the call backfunction pointer initialized as
cached_dev_cache_miss(), the last parameter 'sectors' is an important
hint to calculate the size of read request to backing device of the
missing cache data.

Current calculation in above code block may generate oversized value of
'sectors', which consequently may trigger 2 different potential kernel
panics by BUG() or BUG_ON() as listed below,

1) BUG_ON() inside bch_btree_insert_key(),
[code block 2]
   886         BUG_ON(b->ops->is_extents && !KEY_SIZE(k));
2) BUG() inside biovec_slab(),
[code block 3]
   51         default:
   52                 BUG();
   53                 return NULL;

All the above panics are original from cached_dev_cache_miss() by the
oversized parameter 'sectors'.

Inside cached_dev_cache_miss(), parameter 'sectors' is used to calculate
the size of data read from backing device for the cache missing. This
size is stored in s->insert_bio_sectors by the following lines of code,
[code block 4]
  909    s->insert_bio_sectors = min(sectors, bio_sectors(bio) + reada);

Then the actual key inserting to the internal B+ tree is generated and
stored in s->iop.replace_key by the following lines of code,
[code block 5]
  911   s->iop.replace_key = KEY(s->iop.inode,
  912                    bio->bi_iter.bi_sector + s->insert_bio_sectors,
  913                    s->insert_bio_sectors);
The oversized parameter 'sectors' may trigger panic 1) by BUG_ON() from
the above code block.

And the bio sending to backing device for the missing data is allocated
with hint from s->insert_bio_sectors by the following lines of code,
[code block 6]
  926    cache_bio = bio_alloc_bioset(GFP_NOWAIT,
  927                 DIV_ROUND_UP(s->insert_bio_sectors, PAGE_SECTORS),
  928                 &dc->disk.bio_split);
The oversized parameter 'sectors' may trigger panic 2) by BUG() from the
agove code block.

Now let me explain how the panics happen with the oversized 'sectors'.
In code block 5, replace_key is generated by macro KEY(). From the
definition of macro KEY(),
[code block 7]
  71 #define KEY(inode, offset, size)                                  \
  72 ((struct bkey) {                                                  \
  73      .high = (1ULL << 63) | ((__u64) (size) << 20) | (inode),     \
  74      .low = (offset)                                              \
  75 })

Here 'size' is 16bits width embedded in 64bits member 'high' of struct
bkey. But in code block 1, if "KEY_START(k) - bio->bi_iter.bi_sector" is
very probably to be larger than (1<<16) - 1, which makes the bkey size
calculation in code block 5 is overflowed. In one bug report the value
of parameter 'sectors' is 131072 (= 1 << 17), the overflowed 'sectors'
results the overflowed s->insert_bio_sectors in code block 4, then makes
size field of s->iop.replace_key to be 0 in code block 5. Then the 0-
sized s->iop.replace_key is inserted into the internal B+ tree as cache
missing check key (a special key to detect and avoid a racing between
normal write request and cache missing read request) as,
[code block 8]
  915   ret = bch_btree_insert_check_key(b, &s->op, &s->iop.replace_key);

Then the 0-sized s->iop.replace_key as 3rd parameter triggers the bkey
size check BUG_ON() in code block 2, and causes the kernel panic 1).

Another kernel panic is from code block 6, is by the bvecs number
oversized value s->insert_bio_sectors from code block 4,
        min(sectors, bio_sectors(bio) + reada)
There are two possibility for oversized reresult,
- bio_sectors(bio) is valid, but bio_sectors(bio) + reada is oversized.
- sectors < bio_sectors(bio) + reada, but sectors is oversized.

From a bug report the result of "DIV_ROUND_UP(s->insert_bio_sectors,
PAGE_SECTORS)" from code block 6 can be 344, 282, 946, 342 and many
other values which larther than BIO_MAX_VECS (a.k.a 256). When calling
bio_alloc_bioset() with such larger-than-256 value as the 2nd parameter,
this value will eventually be sent to biovec_slab() as parameter
'nr_vecs' in following code path,
   bio_alloc_bioset() ==> bvec_alloc() ==> biovec_slab()
Because parameter 'nr_vecs' is larger-than-256 value, the panic by BUG()
in code block 3 is triggered inside biovec_slab().

From the above analysis, we know that the 4th parameter 'sector' sent
into cached_dev_cache_miss() may cause overflow in code block 5 and 6,
and finally cause kernel panic in code block 2 and 3. And if result of
bio_sectors(bio) + reada exceeds valid bvecs number, it may also trigger
kernel panic in code block 3 from code block 6.

Now the almost-useless readahead size for cache missing request back to
backing device is removed, this patch can fix the oversized issue with
more simpler method.
- add a local variable size_limit,  set it by the minimum value from
  the max bkey size and max bio bvecs number.
- set s->insert_bio_sectors by the minimum value from size_limit,
  sectors, and the sectors size of bio.
- replace sectors by s->insert_bio_sectors to do bio_next_split.

By the above method with size_limit, s->insert_bio_sectors will never
result oversized replace_key size or bio bvecs number. And split bio
'miss' from bio_next_split() will always match the size of 'cache_bio',
that is the current maximum bio size we can sent to backing device for
fetching the cache missing data.

Current problmatic code can be partially found since Linux v3.13-rc1,
therefore all maintained stable kernels should try to apply this fix.

Reported-by: Alexander Ullrich <ealex1979@gmail.com>
Reported-by: Diego Ercolani <diego.ercolani@gmail.com>
Reported-by: Jan Szubiak <jan.szubiak@linuxpolska.pl>
Reported-by: Marco Rebhan <me@dblsaiko.net>
Reported-by: Matthias Ferdinand <bcache@mfedv.net>
Reported-by: Victor Westerhuis <victor@westerhu.is>
Reported-by: Vojtech Pavlik <vojtech@suse.cz>
Reported-and-tested-by: Rolf Fokkens <rolf@rolffokkens.nl>
Reported-and-tested-by: Thorsten Knabe <linux@thorsten-knabe.de>
Signed-off-by: Coly Li <colyli@suse.de>
Cc: stable@vger.kernel.org
Cc: Christoph Hellwig <hch@lst.de>
Cc: Kent Overstreet <kent.overstreet@gmail.com>
Cc: Nix <nix@esperi.org.uk>
Cc: Takashi Iwai <tiwai@suse.com>
Link: https://lore.kernel.org/r/20210607125052.21277-3-colyli@suse.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-06-08 15:06:03 -06:00
Coly Li
1616a4c2ab bcache: remove bcache device self-defined readahead
For read cache missing, bcache defines a readahead size for the read I/O
request to the backing device for the missing data. This readahead size
is initialized to 0, and almost no one uses it to avoid unnecessary read
amplifying onto backing device and write amplifying onto cache device.
Considering upper layer file system code has readahead logic allready
and works fine with readahead_cache_policy sysfile interface, we don't
have to keep bcache self-defined readahead anymore.

This patch removes the bcache self-defined readahead for cache missing
request for backing device, and the readahead sysfs file interfaces are
removed as well.

This is the preparation for next patch to fix potential kernel panic due
to oversized request in a simpler method.

Reported-by: Alexander Ullrich <ealex1979@gmail.com>
Reported-by: Diego Ercolani <diego.ercolani@gmail.com>
Reported-by: Jan Szubiak <jan.szubiak@linuxpolska.pl>
Reported-by: Marco Rebhan <me@dblsaiko.net>
Reported-by: Matthias Ferdinand <bcache@mfedv.net>
Reported-by: Victor Westerhuis <victor@westerhu.is>
Reported-by: Vojtech Pavlik <vojtech@suse.cz>
Reported-and-tested-by: Rolf Fokkens <rolf@rolffokkens.nl>
Reported-and-tested-by: Thorsten Knabe <linux@thorsten-knabe.de>
Signed-off-by: Coly Li <colyli@suse.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Cc: stable@vger.kernel.org
Cc: Kent Overstreet <kent.overstreet@gmail.com>
Cc: Nix <nix@esperi.org.uk>
Cc: Takashi Iwai <tiwai@suse.com>
Link: https://lore.kernel.org/r/20210607125052.21277-2-colyli@suse.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-06-08 15:06:03 -06:00
Liangyan
3e08a9f976 tracing: Correct the length check which causes memory corruption
We've suffered from severe kernel crashes due to memory corruption on
our production environment, like,

Call Trace:
[1640542.554277] general protection fault: 0000 [#1] SMP PTI
[1640542.554856] CPU: 17 PID: 26996 Comm: python Kdump: loaded Tainted:G
[1640542.556629] RIP: 0010:kmem_cache_alloc+0x90/0x190
[1640542.559074] RSP: 0018:ffffb16faa597df8 EFLAGS: 00010286
[1640542.559587] RAX: 0000000000000000 RBX: 0000000000400200 RCX:
0000000006e931bf
[1640542.560323] RDX: 0000000006e931be RSI: 0000000000400200 RDI:
ffff9a45ff004300
[1640542.560996] RBP: 0000000000400200 R08: 0000000000023420 R09:
0000000000000000
[1640542.561670] R10: 0000000000000000 R11: 0000000000000000 R12:
ffffffff9a20608d
[1640542.562366] R13: ffff9a45ff004300 R14: ffff9a45ff004300 R15:
696c662f65636976
[1640542.563128] FS:  00007f45d7c6f740(0000) GS:ffff9a45ff840000(0000)
knlGS:0000000000000000
[1640542.563937] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[1640542.564557] CR2: 00007f45d71311a0 CR3: 000000189d63e004 CR4:
00000000003606e0
[1640542.565279] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
0000000000000000
[1640542.566069] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7:
0000000000000400
[1640542.566742] Call Trace:
[1640542.567009]  anon_vma_clone+0x5d/0x170
[1640542.567417]  __split_vma+0x91/0x1a0
[1640542.567777]  do_munmap+0x2c6/0x320
[1640542.568128]  vm_munmap+0x54/0x70
[1640542.569990]  __x64_sys_munmap+0x22/0x30
[1640542.572005]  do_syscall_64+0x5b/0x1b0
[1640542.573724]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[1640542.575642] RIP: 0033:0x7f45d6e61e27

James Wang has reproduced it stably on the latest 4.19 LTS.
After some debugging, we finally proved that it's due to ftrace
buffer out-of-bound access using a debug tool as follows:
[   86.775200] BUG: Out-of-bounds write at addr 0xffff88aefe8b7000
[   86.780806]  no_context+0xdf/0x3c0
[   86.784327]  __do_page_fault+0x252/0x470
[   86.788367]  do_page_fault+0x32/0x140
[   86.792145]  page_fault+0x1e/0x30
[   86.795576]  strncpy_from_unsafe+0x66/0xb0
[   86.799789]  fetch_memory_string+0x25/0x40
[   86.804002]  fetch_deref_string+0x51/0x60
[   86.808134]  kprobe_trace_func+0x32d/0x3a0
[   86.812347]  kprobe_dispatcher+0x45/0x50
[   86.816385]  kprobe_ftrace_handler+0x90/0xf0
[   86.820779]  ftrace_ops_assist_func+0xa1/0x140
[   86.825340]  0xffffffffc00750bf
[   86.828603]  do_sys_open+0x5/0x1f0
[   86.832124]  do_syscall_64+0x5b/0x1b0
[   86.835900]  entry_SYSCALL_64_after_hwframe+0x44/0xa9

commit b220c049d5 ("tracing: Check length before giving out
the filter buffer") adds length check to protect trace data
overflow introduced in 0fc1b09ff1, seems that this fix can't prevent
overflow entirely, the length check should also take the sizeof
entry->array[0] into account, since this array[0] is filled the
length of trace data and occupy addtional space and risk overflow.

Link: https://lkml.kernel.org/r/20210607125734.1770447-1-liangyan.peng@linux.alibaba.com

Cc: stable@vger.kernel.org
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Xunlei Pang <xlpang@linux.alibaba.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Fixes: b220c049d5 ("tracing: Check length before giving out the filter buffer")
Reviewed-by: Xunlei Pang <xlpang@linux.alibaba.com>
Reviewed-by: yinbinbin <yinbinbin@alibabacloud.com>
Reviewed-by: Wetp Zhang <wetp.zy@linux.alibaba.com>
Tested-by: James Wang <jnwang@linux.alibaba.com>
Signed-off-by: Liangyan <liangyan.peng@linux.alibaba.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-06-08 16:44:00 -04:00
Steven Rostedt (VMware)
6c14133d2d ftrace: Do not blindly read the ip address in ftrace_bug()
It was reported that a bug on arm64 caused a bad ip address to be used for
updating into a nop in ftrace_init(), but the error path (rightfully)
returned -EINVAL and not -EFAULT, as the bug caused more than one error to
occur. But because -EINVAL was returned, the ftrace_bug() tried to report
what was at the location of the ip address, and read it directly. This
caused the machine to panic, as the ip was not pointing to a valid memory
address.

Instead, read the ip address with copy_from_kernel_nofault() to safely
access the memory, and if it faults, report that the address faulted,
otherwise report what was in that location.

Link: https://lore.kernel.org/lkml/20210607032329.28671-1-mark-pk.tsai@mediatek.com/

Cc: stable@vger.kernel.org
Fixes: 05736a427f ("ftrace: warn on failure to disable mcount callers")
Reported-by: Mark-PK Tsai <mark-pk.tsai@mediatek.com>
Tested-by: Mark-PK Tsai <mark-pk.tsai@mediatek.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-06-08 16:44:00 -04:00
Masami Hiramatsu
824afd55e9 tools/bootconfig: Fix a build error accroding to undefined fallthrough
Since the "fallthrough" is defined only in the kernel, building
lib/bootconfig.c as a part of user-space tools causes a build
error.

Add a dummy fallthrough to avoid the build error.

Link: https://lkml.kernel.org/r/162087519356.442660.11385099982318160180.stgit@devnote2

Cc: Ingo Molnar <mingo@kernel.org>
Cc: stable@vger.kernel.org
Fixes: 4c1ca831ad ("Revert "lib: Revert use of fallthrough pseudo-keyword in lib/"")
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-06-08 16:44:00 -04:00
Zhen Lei
e8ba0b2b64 tools/bootconfig: Fix error return code in apply_xbc()
Fix to return a negative error code from the error handling case instead
of 0, as done elsewhere in this function.

Link: https://lkml.kernel.org/r/20210508034216.2277-1-thunder.leizhen@huawei.com

Fixes: a995e6bc05 ("tools/bootconfig: Fix to check the write failure correctly")
Reported-by: Hulk Robot <hulkci@huawei.com>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-06-08 16:43:59 -04:00
Mark Bloch
edc0b0bccc RDMA/mlx5: Block FDB rules when not in switchdev mode
Allow creating FDB steering rules only when in switchdev mode.

The only software model where a userspace application can manipulate
FDB entries is when it manages the eswitch. This is only possible in
switchdev mode where we expose a single RDMA device with representors
for all the vports that are connected to the eswitch.

Fixes: 52438be441 ("RDMA/mlx5: Allow inserting a steering rule to the FDB")
Link: https://lore.kernel.org/r/e928ae7c58d07f104716a2a8d730963d1bd01204.1623052923.git.leonro@nvidia.com
Reviewed-by: Maor Gottlieb <maorg@nvidia.com>
Signed-off-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-06-08 17:02:15 -03:00
David S. Miller
df693f13a1 Merge tag 'batadv-net-pullrequest-20210608' of git://git.open-mesh.org/linux-merge
Simon Wunderlich says:

====================
Here is a batman-adv bugfix:

 - Avoid WARN_ON timing related checks, by Sven Eckelmann
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-08 12:11:21 -07:00
Nicolas Dichtel
9bb392f624 vrf: fix maximum MTU
My initial goal was to fix the default MTU, which is set to 65536, ie above
the maximum defined in the driver: 65535 (ETH_MAX_MTU).

In fact, it's seems more consistent, wrt min_mtu, to set the max_mtu to
IP6_MAX_MTU (65535 + sizeof(struct ipv6hdr)) and use it by default.

Let's also, for consistency, set the mtu in vrf_setup(). This function
calls ether_setup(), which set the mtu to 1500. Thus, the whole mtu config
is done in the same function.

Before the patch:
$ ip link add blue type vrf table 1234
$ ip link list blue
9: blue: <NOARP,MASTER> mtu 65536 qdisc noop state DOWN mode DEFAULT group default qlen 1000
    link/ether fa:f5:27:70:24:2a brd ff:ff:ff:ff:ff:ff
$ ip link set dev blue mtu 65535
$ ip link set dev blue mtu 65536
Error: mtu greater than device maximum.

Fixes: 5055376a3b ("net: vrf: Fix ping failed when vrf mtu is set to 0")
CC: Miaohe Lin <linmiaohe@huawei.com>
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Reviewed-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-08 11:46:23 -07:00
gushengxian
d439aa33a9 net: appletalk: fix the usage of preposition
The preposition "for" should be changed to preposition "of".

Signed-off-by: gushengxian <gushengxian@yulong.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-08 11:37:41 -07:00
Zheng Yongjun
5ac6b198d7 net: ipv4: Remove unneed BUG() function
When 'nla_parse_nested_deprecated' failed, it's no need to
BUG() here, return -EINVAL is ok.

Signed-off-by: Zheng Yongjun <zhengyongjun3@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-08 11:36:48 -07:00
Nanyong Sun
d612c3f3fa net: ipv4: fix memory leak in netlbl_cipsov4_add_std
Reported by syzkaller:
BUG: memory leak
unreferenced object 0xffff888105df7000 (size 64):
comm "syz-executor842", pid 360, jiffies 4294824824 (age 22.546s)
hex dump (first 32 bytes):
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
backtrace:
[<00000000e67ed558>] kmalloc include/linux/slab.h:590 [inline]
[<00000000e67ed558>] kzalloc include/linux/slab.h:720 [inline]
[<00000000e67ed558>] netlbl_cipsov4_add_std net/netlabel/netlabel_cipso_v4.c:145 [inline]
[<00000000e67ed558>] netlbl_cipsov4_add+0x390/0x2340 net/netlabel/netlabel_cipso_v4.c:416
[<0000000006040154>] genl_family_rcv_msg_doit.isra.0+0x20e/0x320 net/netlink/genetlink.c:739
[<00000000204d7a1c>] genl_family_rcv_msg net/netlink/genetlink.c:783 [inline]
[<00000000204d7a1c>] genl_rcv_msg+0x2bf/0x4f0 net/netlink/genetlink.c:800
[<00000000c0d6a995>] netlink_rcv_skb+0x134/0x3d0 net/netlink/af_netlink.c:2504
[<00000000d78b9d2c>] genl_rcv+0x24/0x40 net/netlink/genetlink.c:811
[<000000009733081b>] netlink_unicast_kernel net/netlink/af_netlink.c:1314 [inline]
[<000000009733081b>] netlink_unicast+0x4a0/0x6a0 net/netlink/af_netlink.c:1340
[<00000000d5fd43b8>] netlink_sendmsg+0x789/0xc70 net/netlink/af_netlink.c:1929
[<000000000a2d1e40>] sock_sendmsg_nosec net/socket.c:654 [inline]
[<000000000a2d1e40>] sock_sendmsg+0x139/0x170 net/socket.c:674
[<00000000321d1969>] ____sys_sendmsg+0x658/0x7d0 net/socket.c:2350
[<00000000964e16bc>] ___sys_sendmsg+0xf8/0x170 net/socket.c:2404
[<000000001615e288>] __sys_sendmsg+0xd3/0x190 net/socket.c:2433
[<000000004ee8b6a5>] do_syscall_64+0x37/0x90 arch/x86/entry/common.c:47
[<00000000171c7cee>] entry_SYSCALL_64_after_hwframe+0x44/0xae

The memory of doi_def->map.std pointing is allocated in
netlbl_cipsov4_add_std, but no place has freed it. It should be
freed in cipso_v4_doi_free which frees the cipso DOI resource.

Fixes: 96cb8e3313 ("[NetLabel]: CIPSOv4 and Unlabeled packet integration")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Nanyong Sun <sunnanyong@huawei.com>
Acked-by: Paul Moore <paul@paul-moore.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-08 11:36:04 -07:00
Jonathan Marek
ce86c239e4 drm/msm/a6xx: avoid shadow NULL reference in failure path
If a6xx_hw_init() fails before creating the shadow_bo, the a6xx_pm_suspend
code referencing it will crash. Change the condition to one that avoids
this problem (note: creation of shadow_bo is behind this same condition)

Fixes: e8b0b994c3 ("drm/msm/a6xx: Clear shadow on suspend")
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Akhil P Oommen <akhilpo@codeaurora.org>
Link: https://lore.kernel.org/r/20210513171431.18632-6-jonathan@marek.ca
Signed-off-by: Rob Clark <robdclark@chromium.org>
2021-06-08 11:26:45 -07:00
Jonathan Marek
b4387eaf38 drm/msm/a6xx: fix incorrectly set uavflagprd_inv field for A650
Value was shifted in the wrong direction, resulting in the field always
being zero, which is incorrect for A650.

Fixes: d0bac4e9cd ("drm/msm/a6xx: set ubwc config for A640 and A650")
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Akhil P Oommen <akhilpo@codeaurora.org>
Link: https://lore.kernel.org/r/20210513171431.18632-4-jonathan@marek.ca
Signed-off-by: Rob Clark <robdclark@chromium.org>
2021-06-08 11:26:45 -07:00
Jonathan Marek
4084340369 drm/msm/a6xx: update/fix CP_PROTECT initialization
Update CP_PROTECT register programming based on downstream.

A6XX_PROTECT_RW is renamed to A6XX_PROTECT_NORDWR to make things aligned
and also be more clear about what it does.

Note that this required switching to use the CP_ALWAYS_ON_COUNTER as the
GMU counter is not accessible from the cmdstream.  Which also means
using the CPU counter for the msm_gpu_submit_flush() tracepoint (as
catapult depends on being able to compare this to the start/end values
captured in cmdstream).  This may need to be revisited when IFPC is
enabled.

Also, compared to downstream, this opens up CP_PERFCTR_CP_SEL as the
userspace performance tooling (fdperf and pps-producer) expect to be
able to configure the CP counters.

Fixes: 4b565ca5a2 ("drm/msm: Add A6XX device support")
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Akhil P Oommen <akhilpo@codeaurora.org>
Link: https://lore.kernel.org/r/20210513171431.18632-5-jonathan@marek.ca
[switch to CP_ALWAYS_ON_COUNTER, open up CP_PERFCNTR_CP_SEL, and spiff
 up commit msg]
Signed-off-by: Rob Clark <robdclark@chromium.org>
2021-06-08 11:26:45 -07:00
Chen Li
ab8363d387 radeon: use memcpy_to/fromio for UVD fw upload
I met a gpu addr bug recently and the kernel log
tells me the pc is memcpy/memset and link register is
radeon_uvd_resume.

As we know, in some architectures, optimized memcpy/memset
may not work well on device memory. Trival memcpy_toio/memset_io
can fix this problem.

BTW, amdgpu has already done it in:
commit ba0b2275a6 ("drm/amdgpu: use memcpy_to/fromio for UVD fw upload"),
that's why it has no this issue on the same gpu and platform.

Signed-off-by: Chen Li <chenli@uniontech.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-08 14:05:11 -04:00
Gustavo A. R. Silva
924f41e52f drm/amd/pm: Fix fall-through warning for Clang
In preparation to enable -Wimplicit-fallthrough for Clang, fix a warning
by explicitly adding a break statement instead of letting the code fall
through to the next case.

Link: https://github.com/KSPP/linux/issues/115
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-08 14:04:46 -04:00
Rohit Khaire
c247c021b1 drm/amdgpu: Fix incorrect register offsets for Sienna Cichlid
RLC_CP_SCHEDULERS and RLC_SPARE_INT0 have different
offsets for Sienna Cichlid

Signed-off-by: Rohit Khaire <rohit.khaire@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-08 14:03:55 -04:00
Michel Dänzer
b71a52f447 drm/amdgpu: Use drm_dbg_kms for reporting failure to get a GEM FB
drm_err meant broken user space could spam dmesg.

Fixes: f258907fdd "drm/amdgpu: Verify bo size can fit framebuffer size on init."
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Michel Dänzer <mdaenzer@redhat.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-08 14:03:09 -04:00
Changfeng
2a48b5911c drm/amdgpu: switch kzalloc to kvzalloc in amdgpu_bo_create
It will cause error when alloc memory larger than 128KB in
amdgpu_bo_create->kzalloc. So it needs to switch kzalloc to kvzalloc.

Call Trace:
   alloc_pages_current+0x6a/0xe0
   kmalloc_order+0x32/0xb0
   kmalloc_order_trace+0x1e/0x80
   __kmalloc+0x249/0x2d0
   amdgpu_bo_create+0x102/0x500 [amdgpu]
   ? xas_create+0x264/0x3e0
   amdgpu_bo_create_vm+0x32/0x60 [amdgpu]
   amdgpu_vm_pt_create+0xf5/0x260 [amdgpu]
   amdgpu_vm_init+0x1fd/0x4d0 [amdgpu]

Signed-off-by: Changfeng <Changfeng.Zhu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-08 14:01:56 -04:00
Sean Christopherson
f31500b0d4 KVM: x86: Ensure liveliness of nested VM-Enter fail tracepoint message
Use the __string() machinery provided by the tracing subystem to make a
copy of the string literals consumed by the "nested VM-Enter failed"
tracepoint.  A complete copy is necessary to ensure that the tracepoint
can't outlive the data/memory it consumes and deference stale memory.

Because the tracepoint itself is defined by kvm, if kvm-intel and/or
kvm-amd are built as modules, the memory holding the string literals
defined by the vendor modules will be freed when the module is unloaded,
whereas the tracepoint and its data in the ring buffer will live until
kvm is unloaded (or "indefinitely" if kvm is built-in).

This bug has existed since the tracepoint was added, but was recently
exposed by a new check in tracing to detect exactly this type of bug.

  fmt: '%s%s
  ' current_buffer: ' vmx_dirty_log_t-140127  [003] ....  kvm_nested_vmenter_failed: '
  WARNING: CPU: 3 PID: 140134 at kernel/trace/trace.c:3759 trace_check_vprintf+0x3be/0x3e0
  CPU: 3 PID: 140134 Comm: less Not tainted 5.13.0-rc1-ce2e73ce600a-req #184
  Hardware name: ASUS Q87M-E/Q87M-E, BIOS 1102 03/03/2014
  RIP: 0010:trace_check_vprintf+0x3be/0x3e0
  Code: <0f> 0b 44 8b 4c 24 1c e9 a9 fe ff ff c6 44 02 ff 00 49 8b 97 b0 20
  RSP: 0018:ffffa895cc37bcb0 EFLAGS: 00010282
  RAX: 0000000000000000 RBX: ffffa895cc37bd08 RCX: 0000000000000027
  RDX: 0000000000000027 RSI: 00000000ffffdfff RDI: ffff9766cfad74f8
  RBP: ffffffffc0a041d4 R08: ffff9766cfad74f0 R09: ffffa895cc37bad8
  R10: 0000000000000001 R11: 0000000000000001 R12: ffffffffc0a041d4
  R13: ffffffffc0f4dba8 R14: 0000000000000000 R15: ffff976409f2c000
  FS:  00007f92fa200740(0000) GS:ffff9766cfac0000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 0000559bd11b0000 CR3: 000000019fbaa002 CR4: 00000000001726e0
  Call Trace:
   trace_event_printf+0x5e/0x80
   trace_raw_output_kvm_nested_vmenter_failed+0x3a/0x60 [kvm]
   print_trace_line+0x1dd/0x4e0
   s_show+0x45/0x150
   seq_read_iter+0x2d5/0x4c0
   seq_read+0x106/0x150
   vfs_read+0x98/0x180
   ksys_read+0x5f/0xe0
   do_syscall_64+0x40/0xb0
   entry_SYSCALL_64_after_hwframe+0x44/0xae

Cc: Steven Rostedt <rostedt@goodmis.org>
Fixes: 380e0055bc ("KVM: nVMX: trace nested VM-Enter failures detected by H/W")
Signed-off-by: Sean Christopherson <seanjc@google.com>
Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Message-Id: <20210607175748.674002-1-seanjc@google.com>
2021-06-08 13:30:49 -04:00
Linus Torvalds
368094df48 Merge tag 'for-linus-5.13b-rc6-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip
Pull xen fix from Juergen Gross:
 "A single patch fixing a Xen related security bug: a malicious guest
  might be able to trigger a 'use after free' issue in the xen-netback
  driver"

* tag 'for-linus-5.13b-rc6-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
  xen-netback: take a reference to the RX task thread
2021-06-08 10:29:39 -07:00
Zhenzhong Duan
f53b16ad64 selftests: kvm: Add support for customized slot0 memory size
Until commit 39fe2fc966 ("selftests: kvm: make allocation of extra
memory take effect", 2021-05-27), parameter extra_mem_pages was used
only to calculate the page table size for all the memory chunks,
because real memory allocation happened with calls of
vm_userspace_mem_region_add() after vm_create_default().

Commit 39fe2fc966 however changed the meaning of extra_mem_pages to
the size of memory slot 0.  This makes the memory allocation more
flexible, but makes it harder to account for the number of
pages needed for the page tables.  For example, memslot_perf_test
has a small amount of memory in slot 0 but a lot in other slots,
and adding that memory twice (both in slot 0 and with later
calls to vm_userspace_mem_region_add()) causes an error that
was fixed in commit 000ac42953 ("selftests: kvm: fix overlapping
addresses in memslot_perf_test", 2021-05-29)

Since both uses are sensible, add a new parameter slot0_mem_pages
to vm_create_with_vcpus() and some comments to clarify the meaning of
slot0_mem_pages and extra_mem_pages.  With this change,
memslot_perf_test can go back to passing the number of memory
pages as extra_mem_pages.

Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Message-Id: <20210608233816.423958-4-zhenzhong.duan@intel.com>
[Squashed in a single patch and rewrote the commit message. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-06-08 13:29:10 -04:00
Linus Torvalds
374aeb91db Merge tag 'orphans-v5.13-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux
Pull orphan section fixes from Kees Cook:
 "These two corner case fixes have been in -next for about a week:

   - Avoid orphan section in ARM cpuidle (Arnd Bergmann)

   - Avoid orphan section with !SMP (Nathan Chancellor)"

* tag 'orphans-v5.13-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
  vmlinux.lds.h: Avoid orphan section with !SMP
  ARM: cpuidle: Avoid orphan section warning
2021-06-08 10:25:20 -07:00
Kees Cook
591a22c14d proc: Track /proc/$pid/attr/ opener mm_struct
Commit bfb819ea20 ("proc: Check /proc/$pid/attr/ writes against file opener")
tried to make sure that there could not be a confusion between the opener of
a /proc/$pid/attr/ file and the writer. It used struct cred to make sure
the privileges didn't change. However, there were existing cases where a more
privileged thread was passing the opened fd to a differently privileged thread
(during container setup). Instead, use mm_struct to track whether the opener
and writer are still the same process. (This is what several other proc files
already do, though for different reasons.)

Reported-by: Christian Brauner <christian.brauner@ubuntu.com>
Reported-by: Andrea Righi <andrea.righi@canonical.com>
Tested-by: Andrea Righi <andrea.righi@canonical.com>
Fixes: bfb819ea20 ("proc: Check /proc/$pid/attr/ writes against file opener")
Cc: stable@vger.kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-06-08 10:24:09 -07:00
Christian Borntraeger
1bc603af73 KVM: selftests: introduce P47V64 for s390x
s390x can have up to 47bits of physical guest and 64bits of virtual
address  bits. Add a new address mode to avoid errors of testcases
going beyond 47bits.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Message-Id: <20210608123954.10991-1-borntraeger@de.ibm.com>
Fixes: ef4c9f4f65 ("KVM: selftests: Fix 32-bit truncation of vm_get_max_gfn()")
Cc: stable@vger.kernel.org
Reviewed-by: David Matlack <dmatlack@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-06-08 13:19:19 -04:00
Lai Jiangshan
af3511ff7f KVM: x86: Ensure PV TLB flush tracepoint reflects KVM behavior
In record_steal_time(), st->preempted is read twice, and
trace_kvm_pv_tlb_flush() might output result inconsistent if
kvm_vcpu_flush_tlb_guest() see a different st->preempted later.

It is a very trivial problem and hardly has actual harm and can be
avoided by reseting and reading st->preempted in atomic way via xchg().

Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>

Message-Id: <20210531174628.10265-1-jiangshanlai@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-06-08 13:15:20 -04:00
Alexey Minnekhanov
45f5669005 drm/msm: Init mm_list before accessing it for use_vram path
Fix NULL pointer dereference caused by update_inactive()
trying to list_del() an uninitialized mm_list who's
prev/next pointers are NULL.

Fixes: 64fcbde772 ("drm/msm: Track potentially evictable objects")
Signed-off-by: Alexey Minnekhanov <alexeymin@postmarketos.org>
Link: https://lore.kernel.org/r/20210518102624.1193955-1-alexeymin@postmarketos.org
Signed-off-by: Rob Clark <robdclark@chromium.org>
2021-06-08 10:08:04 -07:00
Linus Torvalds
4c8684fe55 Merge tag 'spi-fix-v5.13-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi
Pull spi fixes from Mark Brown:
 "A small set of SPI fixes that have come up since the merge window, all
  fairly small fixes for rare cases"

* tag 'spi-fix-v5.13-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi:
  spi: stm32-qspi: Always wait BUSY bit to be cleared in stm32_qspi_wait_cmd()
  spi: spi-zynq-qspi: Fix some wrong goto jumps & missing error code
  spi: Cleanup on failure of initial setup
  spi: bcm2835: Fix out-of-bounds access with more than 4 slaves
2021-06-08 09:45:00 -07:00
Linus Torvalds
9b1111fa80 Merge tag 'regulator-fix-v5.13-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator
Pull regulator fixes from Mark Brown:
 "A collection of fixes for the regulator API that have come up since
  the merge window, including a big batch of fixes from Axel Lin's usual
  careful and detailed review.

  The one stand out fix here is Dmitry Baryshkov's fix for an issue
  where we fail to power on the parents of always on regulators during
  system startup if they weren't already powered on"

* tag 'regulator-fix-v5.13-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator: (21 commits)
  regulator: rt4801: Fix NULL pointer dereference if priv->enable_gpios is NULL
  regulator: hi6421v600: Fix .vsel_mask setting
  regulator: bd718x7: Fix the BUCK7 voltage setting on BD71837
  regulator: atc260x: Fix n_voltages and min_sel for pickable linear ranges
  regulator: rtmv20: Fix to make regcache value first reading back from HW
  regulator: mt6315: Fix function prototype for mt6315_map_mode
  regulator: rtmv20: Add Richtek to Kconfig text
  regulator: rtmv20: Fix .set_current_limit/.get_current_limit callbacks
  regulator: hisilicon: use the correct HiSilicon copyright
  regulator: bd71828: Fix .n_voltages settings
  regulator: bd70528: Fix off-by-one for buck123 .n_voltages setting
  regulator: max77620: Silence deferred probe error
  regulator: max77620: Use device_set_of_node_from_dev()
  regulator: scmi: Fix off-by-one for linear regulators .n_voltages setting
  regulator: core: resolve supply for boot-on/always-on regulators
  regulator: fixed: Ensure enable_counter is correct if reg_domain_disable fails
  regulator: Check ramp_delay_table for regulator_set_ramp_delay_regmap
  regulator: fan53880: Fix missing n_voltages setting
  regulator: da9121: Return REGULATOR_MODE_INVALID for invalid mode
  regulator: fan53555: fix TCS4525 voltage calulation
  ...
2021-06-08 09:41:16 -07:00
Lai Jiangshan
b1bd5cba33 KVM: X86: MMU: Use the correct inherited permissions to get shadow page
When computing the access permissions of a shadow page, use the effective
permissions of the walk up to that point, i.e. the logic AND of its parents'
permissions.  Two guest PxE entries that point at the same table gfn need to
be shadowed with different shadow pages if their parents' permissions are
different.  KVM currently uses the effective permissions of the last
non-leaf entry for all non-leaf entries.  Because all non-leaf SPTEs have
full ("uwx") permissions, and the effective permissions are recorded only
in role.access and merged into the leaves, this can lead to incorrect
reuse of a shadow page and eventually to a missing guest protection page
fault.

For example, here is a shared pagetable:

   pgd[]   pud[]        pmd[]            virtual address pointers
                     /->pmd1(u--)->pte1(uw-)->page1 <- ptr1 (u--)
        /->pud1(uw-)--->pmd2(uw-)->pte2(uw-)->page2 <- ptr2 (uw-)
   pgd-|           (shared pmd[] as above)
        \->pud2(u--)--->pmd1(u--)->pte1(uw-)->page1 <- ptr3 (u--)
                     \->pmd2(uw-)->pte2(uw-)->page2 <- ptr4 (u--)

  pud1 and pud2 point to the same pmd table, so:
  - ptr1 and ptr3 points to the same page.
  - ptr2 and ptr4 points to the same page.

(pud1 and pud2 here are pud entries, while pmd1 and pmd2 here are pmd entries)

- First, the guest reads from ptr1 first and KVM prepares a shadow
  page table with role.access=u--, from ptr1's pud1 and ptr1's pmd1.
  "u--" comes from the effective permissions of pgd, pud1 and
  pmd1, which are stored in pt->access.  "u--" is used also to get
  the pagetable for pud1, instead of "uw-".

- Then the guest writes to ptr2 and KVM reuses pud1 which is present.
  The hypervisor set up a shadow page for ptr2 with pt->access is "uw-"
  even though the pud1 pmd (because of the incorrect argument to
  kvm_mmu_get_page in the previous step) has role.access="u--".

- Then the guest reads from ptr3.  The hypervisor reuses pud1's
  shadow pmd for pud2, because both use "u--" for their permissions.
  Thus, the shadow pmd already includes entries for both pmd1 and pmd2.

- At last, the guest writes to ptr4.  This causes no vmexit or pagefault,
  because pud1's shadow page structures included an "uw-" page even though
  its role.access was "u--".

Any kind of shared pagetable might have the similar problem when in
virtual machine without TDP enabled if the permissions are different
from different ancestors.

In order to fix the problem, we change pt->access to be an array, and
any access in it will not include permissions ANDed from child ptes.

The test code is: https://lore.kernel.org/kvm/20210603050537.19605-1-jiangshanlai@gmail.com/
Remember to test it with TDP disabled.

The problem had existed long before the commit 41074d07c7 ("KVM: MMU:
Fix inherited permissions for emulated guest pte updates"), and it
is hard to find which is the culprit.  So there is no fixes tag here.

Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
Message-Id: <20210603052455.21023-1-jiangshanlai@gmail.com>
Cc: stable@vger.kernel.org
Fixes: cea0f0e7ea ("[PATCH] KVM: MMU: Shadow page table caching")
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-06-08 12:29:53 -04:00
Wanpeng Li
e898da784a KVM: LAPIC: Write 0 to TMICT should also cancel vmx-preemption timer
According to the SDM 10.5.4.1:

  A write of 0 to the initial-count register effectively stops the local
  APIC timer, in both one-shot and periodic mode.

However, the lapic timer oneshot/periodic mode which is emulated by vmx-preemption
timer doesn't stop by writing 0 to TMICT since vmx->hv_deadline_tsc is still
programmed and the guest will receive the spurious timer interrupt later. This
patch fixes it by also cancelling the vmx-preemption timer when writing 0 to
the initial-count register.

Reviewed-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
Message-Id: <1623050385-100988-1-git-send-email-wanpengli@tencent.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-06-08 12:22:26 -04:00
Ashish Kalra
4f13d471e5 KVM: SVM: Fix SEV SEND_START session length & SEND_UPDATE_DATA query length after commit 238eca821c
Commit 238eca821c ("KVM: SVM: Allocate SEV command structures on local stack")
uses the local stack to allocate the structures used to communicate with the PSP,
which were earlier being kzalloced. This breaks SEV live migration for
computing the SEND_START session length and SEND_UPDATE_DATA query length as
session_len and trans_len and hdr_len fields are not zeroed respectively for
the above commands before issuing the SEV Firmware API call, hence the
firmware returns incorrect session length and update data header or trans length.

Also the SEV Firmware API returns SEV_RET_INVALID_LEN firmware error
for these length query API calls, and the return value and the
firmware error needs to be passed to the userspace as it is, so
need to remove the return check in the KVM code.

Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
Message-Id: <20210607061532.27459-1-Ashish.Kalra@amd.com>
Fixes: 238eca821c ("KVM: SVM: Allocate SEV command structures on local stack")
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-06-08 12:21:55 -04:00
Desmond Cheong Zhi Xi
b436acd1cf drm: Fix use-after-free read in drm_getunique()
There is a time-of-check-to-time-of-use error in drm_getunique() due
to retrieving file_priv->master prior to locking the device's master
mutex.

An example can be seen in the crash report of the use-after-free error
found by Syzbot:
https://syzkaller.appspot.com/bug?id=148d2f1dfac64af52ffd27b661981a540724f803

In the report, the master pointer was used after being freed. This is
because another process had acquired the device's master mutex in
drm_setmaster_ioctl(), then overwrote fpriv->master in
drm_new_set_master(). The old value of fpriv->master was subsequently
freed before the mutex was unlocked.

To fix this, we lock the device's master mutex before retrieving the
pointer from from fpriv->master. This patch passes the Syzbot
reproducer test.

Reported-by: syzbot+c3a706cec1ea99e1c693@syzkaller.appspotmail.com
Signed-off-by: Desmond Cheong Zhi Xi <desmondcheongzx@gmail.com>
Cc: stable@vger.kernel.org
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20210608110436.239583-1-desmondcheongzx@gmail.com
2021-06-08 18:18:58 +02:00
Mark Rutland
8a11e84b80 drm/vc4: fix vc4_atomic_commit_tail() logic
In vc4_atomic_commit_tail() we iterate of the set of old CRTCs, and
attempt to wait on any channels which are still in use. When we iterate
over the CRTCs, we have:

* `i` - the index of the CRTC
* `channel` - the channel a CRTC is using

When we check the channel state, we consult:

  old_hvs_state->fifo_state[channel].in_use

... but when we wait for the channel, we erroneously wait on:

  old_hvs_state->fifo_state[i].pending_commit

... rather than:

   old_hvs_state->fifo_state[channel].pending_commit

... and this bogus access has been observed to result in boot-time hangs
on some arm64 configurations, and can be detected using KASAN. FIx this
by using the correct index.

I've tested this on a Raspberry Pi 3 model B v1.2 with KASAN.

Trimmed KASAN splat:

| ==================================================================
| BUG: KASAN: slab-out-of-bounds in vc4_atomic_commit_tail+0x1cc/0x910
| Read of size 8 at addr ffff000007360440 by task kworker/u8:0/7
| CPU: 2 PID: 7 Comm: kworker/u8:0 Not tainted 5.13.0-rc3-00009-g694c523e7267 #3
|
| Hardware name: Raspberry Pi 3 Model B (DT)
| Workqueue: events_unbound deferred_probe_work_func
| Call trace:
|  dump_backtrace+0x0/0x2b4
|  show_stack+0x1c/0x30
|  dump_stack+0xfc/0x168
|  print_address_description.constprop.0+0x2c/0x2c0
|  kasan_report+0x1dc/0x240
|  __asan_load8+0x98/0xd4
|  vc4_atomic_commit_tail+0x1cc/0x910
|  commit_tail+0x100/0x210
| ...
|
| Allocated by task 7:
|  kasan_save_stack+0x2c/0x60
|  __kasan_kmalloc+0x90/0xb4
|  vc4_hvs_channels_duplicate_state+0x60/0x1a0
|  drm_atomic_get_private_obj_state+0x144/0x230
|  vc4_atomic_check+0x40/0x73c
|  drm_atomic_check_only+0x998/0xe60
|  drm_atomic_commit+0x34/0x94
|  drm_client_modeset_commit_atomic+0x2f4/0x3a0
|  drm_client_modeset_commit_locked+0x8c/0x230
|  drm_client_modeset_commit+0x38/0x60
|  drm_fb_helper_set_par+0x104/0x17c
|  fbcon_init+0x43c/0x970
|  visual_init+0x14c/0x1e4
| ...
|
| The buggy address belongs to the object at ffff000007360400
|  which belongs to the cache kmalloc-128 of size 128
| The buggy address is located 64 bytes inside of
|  128-byte region [ffff000007360400, ffff000007360480)
| The buggy address belongs to the page:
| page:(____ptrval____) refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x7360
| flags: 0x3fffc0000000200(slab|node=0|zone=0|lastcpupid=0xffff)
| raw: 03fffc0000000200 dead000000000100 dead000000000122 ffff000004c02300
| raw: 0000000000000000 0000000000100010 00000001ffffffff 0000000000000000
| page dumped because: kasan: bad access detected
|
| Memory state around the buggy address:
|  ffff000007360300: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
|  ffff000007360380: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
| >ffff000007360400: 00 00 00 00 00 00 00 fc fc fc fc fc fc fc fc fc
|                                            ^
|  ffff000007360480: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
|  ffff000007360500: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
| ==================================================================

Link: https://lore.kernel.org/r/4d0c8318-bad8-2be7-e292-fc8f70c198de@samsung.com
Link: https://lore.kernel.org/linux-arm-kernel/20210607151740.moncryl5zv3ahq4s@gilmour
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reported-by: Marek Szyprowski <m.szyprowski@samsung.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: David Airlie <airlied@linux.ie>
Cc: Emma Anholt <emma@anholt.net>
Cc: Maxime Ripard <maxime@cerno.tech>
Cc: Will Deacon <will@kernel.org>
Cc: dri-devel@lists.freedesktop.org
Acked-by: Arnd Bergmann <arnd@arndb.de>
Tested-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Maxime Ripard <maxime@cerno.tech>
Link: https://patchwork.freedesktop.org/patch/msgid/20210608085513.2069-1-mark.rutland@arm.com
2021-06-08 17:02:17 +02:00
Takashi Iwai
a0309c3448 Merge tag 'asoc-fix-v5.13-rc4' of https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into for-linus
ASoC: Fixes for v5.13

A collection of fixes and device ID updates that have come up in the
past few -rcs, none of which stand out particularly.
2021-06-08 16:59:19 +02:00
Tom Lendacky
8d651ee9c7 x86/ioremap: Map EFI-reserved memory as encrypted for SEV
Some drivers require memory that is marked as EFI boot services
data. In order for this memory to not be re-used by the kernel
after ExitBootServices(), efi_mem_reserve() is used to preserve it
by inserting a new EFI memory descriptor and marking it with the
EFI_MEMORY_RUNTIME attribute.

Under SEV, memory marked with the EFI_MEMORY_RUNTIME attribute needs to
be mapped encrypted by Linux, otherwise the kernel might crash at boot
like below:

  EFI Variables Facility v0.08 2004-May-17
  general protection fault, probably for non-canonical address 0x3597688770a868b2: 0000 [#1] SMP NOPTI
  CPU: 13 PID: 1 Comm: swapper/0 Not tainted 5.12.4-2-default #1 openSUSE Tumbleweed
  Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
  RIP: 0010:efi_mokvar_entry_next
  [...]
  Call Trace:
   efi_mokvar_sysfs_init
   ? efi_mokvar_table_init
   do_one_initcall
   ? __kmalloc
   kernel_init_freeable
   ? rest_init
   kernel_init
   ret_from_fork

Expand the __ioremap_check_other() function to additionally check for
this other type of boot data reserved at runtime and indicate that it
should be mapped encrypted for an SEV guest.

 [ bp: Massage commit message. ]

Fixes: 58c909022a ("efi: Support for MOK variable config table")
Reported-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Tested-by: Joerg Roedel <jroedel@suse.de>
Cc: <stable@vger.kernel.org> # 5.10+
Link: https://lkml.kernel.org/r/20210608095439.12668-2-joro@8bytes.org
2021-06-08 16:26:55 +02:00
Geert Uytterhoeven
6687cd72aa mmc: renesas_sdhi: Fix HS400 on R-Car M3-W+
R-Car M3-W ES3.0 is marketed as R-Car M3-W+ (R8A77961), and has its own
compatible value "renesas,r8a77961".

Hence using soc_device_match() with soc_id = "r8a7796" and revision =
"ES3.*" does not actually match running on an R-Car M3-W+ SoC.

Fix this by matching with soc_id = "r8a77961" instead.

Fixes: a38c078fea ("mmc: renesas_sdhi: Avoid bad TAP in HS400")
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Reviewed-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
Reviewed-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Link: https://lore.kernel.org/r/ee8af5d631f5331139ffea714539030d97352e93.1622811525.git.geert+renesas@glider.be
Cc: stable@vger.kernel.org
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2021-06-08 14:56:54 +02:00
Jon Hunter
aceda401e8 spi: tegra20-slink: Ensure SPI controller reset is deasserted
Commit 4782c0a5dd ("clk: tegra: Don't deassert reset on enabling
clocks") removed some legacy code for handling resets on Tegra from
within the Tegra clock code. This exposed an issue in the Tegra20 slink
driver where the SPI controller reset was not being deasserted as needed
during probe. This is causing the Tegra30 Cardhu platform to hang on
boot. Fix this by ensuring the SPI controller reset is deasserted during
probe.

Fixes: 4782c0a5dd ("clk: tegra: Don't deassert reset on enabling clocks")
Signed-off-by: Jon Hunter <jonathanh@nvidia.com>
Link: https://lore.kernel.org/r/20210608071518.93037-1-jonathanh@nvidia.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2021-06-08 13:48:38 +01:00
Wolfram Sang
2c9017d0b5 mmc: renesas_sdhi: abort tuning when timeout detected
We have to bring the eMMC from sending-data state back to transfer state
once we detected a CRC error (timeout) during tuning. So, send a stop
command via mmc_abort_tuning().

Fixes: 4f11997773 ("mmc: tmio: Add tuning support")
Reported-by Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Reviewed-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
Reviewed-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Tested-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Link: https://lore.kernel.org/r/20210602073435.5955-1-wsa+renesas@sang-engineering.com
Cc: stable@vger.kernel.org
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2021-06-08 14:34:27 +02:00
Jeremy Szu
600dd2a7e8 ALSA: hda/realtek: fix mute/micmute LEDs for HP ZBook Power G8
The HP ZBook Power G8 using ALC236 codec which using 0x02 to
control mute LED and 0x01 to control micmute LED.
Therefore, add a quirk to make it works.

Signed-off-by: Jeremy Szu <jeremy.szu@canonical.com>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20210608114750.32009-1-jeremy.szu@canonical.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2021-06-08 14:01:18 +02:00
Hui Wang
57c9e21a49 ALSA: hda/realtek: headphone and mic don't work on an Acer laptop
There are 2 issues on this machine, the 1st one is mic's plug/unplug
can't be detected, that is because the mic is set to manual detecting
mode, need to apply ALC255_FIXUP_XIAOMI_HEADSET_MIC to set it to auto
detecting mode. The other one is headphone's plug/unplug can't be
detected by pulseaudio, that is because the pulseaudio will use
ucm2/sof-hda-dsp on this machine, and the ucm2 only handle
'Headphone Jack', but on this machine the headphone's pincfg sets the
location to Front, then the alsa mixer name is "Front Headphone Jack"
instead of "Headphone Jack", so override the pincfg to change location
to Left.

BugLink: http://bugs.launchpad.net/bugs/1930188
Cc: <stable@vger.kernel.org>
Signed-off-by: Hui Wang <hui.wang@canonical.com>
Link: https://lore.kernel.org/r/20210608024600.6198-1-hui.wang@canonical.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2021-06-08 14:00:41 +02:00
Christian König
2d2ddb589d drm/ttm: fix deref of bo->ttm without holding the lock v2
We need to grab the resv lock first before doing that check.

v2 (chk): simplify the change for -fixes

Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210528130041.1683-1-christian.koenig@amd.com
2021-06-08 13:24:15 +02:00
Johannes Berg
d5befb224e mac80211: fix deadlock in AP/VLAN handling
Syzbot reports that when you have AP_VLAN interfaces that are up
and close the AP interface they belong to, we get a deadlock. No
surprise - since we dev_close() them with the wiphy mutex held,
which goes back into the netdev notifier in cfg80211 and tries to
acquire the wiphy mutex there.

To fix this, we need to do two things:
 1) prevent changing iftype while AP_VLANs are up, we can't
    easily fix this case since cfg80211 already calls us with
    the wiphy mutex held, but change_interface() is relatively
    rare in drivers anyway, so changing iftype isn't used much
    (and userspace has to fall back to down/change/up anyway)
 2) pull the dev_close() loop over VLANs out of the wiphy mutex
    section in the normal stop case

Cc: stable@vger.kernel.org
Reported-by: syzbot+452ea4fbbef700ff0a56@syzkaller.appspotmail.com
Fixes: a05829a722 ("cfg80211: avoid holding the RTNL when calling the driver")
Link: https://lore.kernel.org/r/20210517160322.9b8f356c0222.I392cb0e2fa5a1a94cf2e637555d702c7e512c1ff@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2021-06-08 11:33:07 +02:00
Ming Lei
1e0d4e6225 scsi: core: Only put parent device if host state differs from SHOST_CREATED
get_device(shost->shost_gendev.parent) is called after host state has
switched to SHOST_RUNNING. scsi_host_dev_release() shouldn't release the
parent device if host state is still SHOST_CREATED.

Link: https://lore.kernel.org/r/20210602133029.2864069-5-ming.lei@redhat.com
Cc: Bart Van Assche <bvanassche@acm.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Hannes Reinecke <hare@suse.de>
Tested-by: John Garry <john.garry@huawei.com>
Reviewed-by: John Garry <john.garry@huawei.com>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-06-07 22:12:44 -04:00
Ming Lei
11714026c0 scsi: core: Put .shost_dev in failure path if host state changes to RUNNING
scsi_host_dev_release() only frees dev_name when host state is
SHOST_CREATED. After host state has changed to SHOST_RUNNING,
scsi_host_dev_release() no longer cleans up.

Fix this by doing a put_device(&shost->shost_dev) in the failure path when
host state is SHOST_RUNNING. Move get_device(&shost->shost_gendev) before
device_add(&shost->shost_dev) so that scsi_host_cls_release() can do a put
on this reference.

Link: https://lore.kernel.org/r/20210602133029.2864069-4-ming.lei@redhat.com
Cc: Bart Van Assche <bvanassche@acm.org>
Cc: Hannes Reinecke <hare@suse.de>
Reported-by: John Garry <john.garry@huawei.com>
Tested-by: John Garry <john.garry@huawei.com>
Reviewed-by: John Garry <john.garry@huawei.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-06-07 22:12:44 -04:00
Ming Lei
3719f4ff04 scsi: core: Fix failure handling of scsi_add_host_with_dma()
When scsi_add_host_with_dma() returns failure, the caller will call
scsi_host_put(shost) to release everything allocated for this host
instance. Consequently we can't also free allocated stuff in
scsi_add_host_with_dma(), otherwise we will end up with a double free.

Strictly speaking, host resource allocations should have been done in
scsi_host_alloc(). However, the allocations may need information which is
not yet provided by the driver when that function is called. So leave the
allocations where they are but rely on host device's release handler to
free resources.

Link: https://lore.kernel.org/r/20210602133029.2864069-3-ming.lei@redhat.com
Cc: Bart Van Assche <bvanassche@acm.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Hannes Reinecke <hare@suse.de>
Tested-by: John Garry <john.garry@huawei.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: John Garry <john.garry@huawei.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-06-07 22:12:44 -04:00
Ming Lei
66a834d092 scsi: core: Fix error handling of scsi_host_alloc()
After device is initialized via device_initialize(), or its name is set via
dev_set_name(), the device has to be freed via put_device().  Otherwise
device name will be leaked because it is allocated dynamically in
dev_set_name().

Fix the leak by replacing kfree() with put_device(). Since
scsi_host_dev_release() properly handles IDA and kthread removal, remove
special-casing these from the error handling as well.

Link: https://lore.kernel.org/r/20210602133029.2864069-2-ming.lei@redhat.com
Cc: Bart Van Assche <bvanassche@acm.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Hannes Reinecke <hare@suse.de>
Tested-by: John Garry <john.garry@huawei.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: John Garry <john.garry@huawei.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-06-07 22:12:44 -04:00
Kev Jackson
11fc79fc9f libbpf: Fixes incorrect rx_ring_setup_done
When calling xsk_socket__create_shared(), the logic at line 1097 marks a
boolean flag true within the xsk_umem structure to track setup progress
in order to support multiple calls to the function.  However, instead of
marking umem->tx_ring_setup_done, the code incorrectly sets
umem->rx_ring_setup_done.  This leads to improper behaviour when
creating and destroying xsk and umem structures.

Multiple calls to this function is documented as supported.

Fixes: ca7a83e248 ("libbpf: Only create rx and tx XDP rings when necessary")
Signed-off-by: Kev Jackson <foamdino@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/YL4aU4f3Aaik7CN0@linux-dev
2021-06-07 17:44:03 -07:00
David Ahern
7a6b1ab747 neighbour: allow NUD_NOARP entries to be forced GCed
IFF_POINTOPOINT interfaces use NUD_NOARP entries for IPv6. It's possible to
fill up the neighbour table with enough entries that it will overflow for
valid connections after that.

This behaviour is more prevalent after commit 58956317c8 ("neighbor:
Improve garbage collection") is applied, as it prevents removal from
entries that are not NUD_FAILED, unless they are more than 5s old.

Fixes: 58956317c8 (neighbor: Improve garbage collection)
Reported-by: Kasper Dupont <kasperd@gjkwv.06.feb.2021.kasperd.net>
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
Signed-off-by: David Ahern <dsahern@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-07 15:25:47 -07:00
Pavel Skripkin
a47c397bb2 revert "net: kcm: fix memory leak in kcm_sendmsg"
In commit c47cc30499 ("net: kcm: fix memory leak in kcm_sendmsg")
I misunderstood the root case of the memory leak and came up with
completely broken fix.

So, simply revert this commit to avoid GPF reported by
syzbot.

Im so sorry for this situation.

Fixes: c47cc30499 ("net: kcm: fix memory leak in kcm_sendmsg")
Reported-by: syzbot+65badd5e74ec62cb67dc@syzkaller.appspotmail.com
Signed-off-by: Pavel Skripkin <paskripkin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-07 13:34:37 -07:00
David S. Miller
aaab3076d7 Merge branch 'mlxsw-fixes'
Merge branch 'mlxsw-fixes'

Ido Schimmel says:

====================
mlxsw: Thermal and qdisc fixes

Patches #1-#2 fix wrong validation of burst size in qdisc code and a
user triggerable WARN_ON().

Patch #3 fixes a regression in thermal monitoring of transceiver modules
and gearboxes.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-07 13:12:08 -07:00
Mykola Kostenok
2fd8d84ce3 mlxsw: core: Set thermal zone polling delay argument to real value at init
Thermal polling delay argument for modules and gearboxes thermal zones
used to be initialized with zero value, while actual delay was used to
be set by mlxsw_thermal_set_mode() by thermal operation callback
set_mode(). After operations set_mode()/get_mode() have been removed by
cited commits, modules and gearboxes thermal zones always have polling
time set to zero and do not perform temperature monitoring.

Set non-zero "polling_delay" in thermal_zone_device_register() routine,
thus, the relevant thermal zones will perform thermal monitoring.

Cc: Andrzej Pietrasiewicz <andrzej.p@collabora.com>
Fixes: 5d7bd8aa7c ("thermal: Simplify or eliminate unnecessary set_mode() methods")
Fixes: 1ee14820fd ("thermal: remove get_mode() operation of drivers")
Signed-off-by: Mykola Kostenok <c_mykolak@nvidia.com>
Acked-by: Vadim Pasternak <vadimp@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-07 13:11:41 -07:00
Petr Machata
d566ed04e4 mlxsw: spectrum_qdisc: Pass handle, not band number to find_class()
In mlxsw Qdisc offload, find_class() is an operation that yields a qdisc
offload descriptor given a parental qdisc descriptor and a class handle. In
__mlxsw_sp_qdisc_ets_graft() however, a band number is passed to that
function instead of a handle. This can lead to a trigger of a WARN_ON
with the following splat:

 WARNING: CPU: 3 PID: 808 at drivers/net/ethernet/mellanox/mlxsw/spectrum_qdisc.c:1356 __mlxsw_sp_qdisc_ets_graft+0x115/0x130 [mlxsw_spectrum]
 [...]
 Call Trace:
  mlxsw_sp_setup_tc_prio+0xe3/0x100 [mlxsw_spectrum]
  qdisc_offload_graft_helper+0x35/0xa0
  prio_graft+0x176/0x290 [sch_prio]
  qdisc_graft+0xb3/0x540
  tc_modify_qdisc+0x56a/0x8a0
  rtnetlink_rcv_msg+0x12c/0x370
  netlink_rcv_skb+0x49/0xf0
  netlink_unicast+0x1f6/0x2b0
  netlink_sendmsg+0x1fb/0x410
  ____sys_sendmsg+0x1f3/0x220
  ___sys_sendmsg+0x70/0xb0
  __sys_sendmsg+0x54/0xa0
  do_syscall_64+0x3a/0x70
  entry_SYSCALL_64_after_hwframe+0x44/0xae

Since the parent handle is not passed with the offload information, compute
it from the band number and qdisc handle.

Fixes: 28052e618b04 ("mlxsw: spectrum_qdisc: Track children per qdisc")
Reported-by: Maksym Yaremchuk <maksymy@nvidia.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-07 13:11:41 -07:00
Petr Machata
306b9228c0 mlxsw: reg: Spectrum-3: Enforce lowest max-shaper burst size of 11
A max-shaper is the HW component responsible for delaying egress traffic
above a configured transmission rate. Burst size is the amount of traffic
that is allowed to pass without accounting. The burst size value needs to
be such that it can be expressed as 2^BS * 512 bits, where BS lies in a
certain ASIC-dependent range. mlxsw enforces that this holds before
attempting to configure the shaper.

The assumption for Spectrum-3 was that the lower limit of BS would be 5,
like for Spectrum-1. But as of now, the limit is still 11. Therefore fix
the driver accordingly, so that incorrect values are rejected early with a
proper message.

Fixes: 23effa2479 ("mlxsw: reg: Add max_shaper_bs to QoS ETS Element Configuration")
Reported-by: Maksym Yaremchuk <maksymy@nvidia.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-07 13:11:41 -07:00
Ido Schimmel
51c96a561f ethtool: Fix NULL pointer dereference during module EEPROM dump
When get_module_eeprom_by_page() is not implemented by the driver, NULL
pointer dereference can occur [1].

Fix by testing if get_module_eeprom_by_page() is implemented instead of
get_module_info().

[1]
 BUG: kernel NULL pointer dereference, address: 0000000000000000
 [...]
 CPU: 0 PID: 251 Comm: ethtool Not tainted 5.13.0-rc3-custom-00940-g3822d0670c9d #989
 Call Trace:
  eeprom_prepare_data+0x101/0x2d0
  ethnl_default_doit+0xc2/0x290
  genl_family_rcv_msg_doit+0xdc/0x140
  genl_rcv_msg+0xd7/0x1d0
  netlink_rcv_skb+0x49/0xf0
  genl_rcv+0x1f/0x30
  netlink_unicast+0x1f6/0x2c0
  netlink_sendmsg+0x1f9/0x400
  __sys_sendto+0xe1/0x130
  __x64_sys_sendto+0x1b/0x20
  do_syscall_64+0x3a/0x70
  entry_SYSCALL_64_after_hwframe+0x44/0xae

Fixes: c97a31f66e ("ethtool: wire in generic SFP module access")
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Acked-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-07 13:10:34 -07:00
Marc Dionne
dc2557308e afs: Fix partial writeback of large files on fsync and close
In commit e87b03f583 ("afs: Prepare for use of THPs"), the return
value for afs_write_back_from_locked_page was changed from a number
of pages to a length in bytes.  The loop in afs_writepages_region uses
the return value to compute the index that will be used to find dirty
pages in the next iteration, but treats it as a number of pages and
wrongly multiplies it by PAGE_SIZE.  This gives a very large index value,
potentially skipping any dirty data that was not covered in the first
pass, which is limited to 256M.

This causes fsync(), and indirectly close(), to only do a partial
writeback of a large file's dirty data.  The rest is eventually written
back by background threads after dirty_expire_centisecs.

Fixes: e87b03f583 ("afs: Prepare for use of THPs")
Signed-off-by: Marc Dionne <marc.dionne@auristor.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Jeffrey Altman <jaltman@auristor.com>
cc: linux-afs@lists.infradead.org
Link: https://lore.kernel.org/r/20210604175504.4055-1-marc.c.dionne@gmail.com/
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-06-07 12:56:05 -07:00
Srinivasa Rao Mandadapu
c8a4556d98 ASoC: qcom: lpass-cpu: Fix pop noise during audio capture begin
This patch fixes PoP noise of around 15ms observed during audio
capture begin.
Enables BCLK and LRCLK in snd_soc_dai_ops prepare call for
introducing some delay before capture start.

(am from https://patchwork.kernel.org/patch/12276369/)
(also found at https://lore.kernel.org/r/20210524142114.18676-1-srivasam@codeaurora.org)

Co-developed-by: Judy Hsiao <judyhsiao@chromium.org>
Signed-off-by: Judy Hsiao <judyhsiao@chromium.org>
Signed-off-by: Srinivasa Rao Mandadapu <srivasam@codeaurora.org>
Reviewed-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
Link: https://lore.kernel.org/r/20210604154545.1198337-1-judyhsiao@chromium.org
Signed-off-by: Mark Brown <broonie@kernel.org>
2021-06-07 15:54:08 +01:00
Roger Pau Monne
107866a8eb xen-netback: take a reference to the RX task thread
Do this in order to prevent the task from being freed if the thread
returns (which can be triggered by the frontend) before the call to
kthread_stop done as part of the backend tear down. Not taking the
reference will lead to a use-after-free in that scenario. Such
reference was taken before but dropped as part of the rework done in
2ac061ce97.

Reintroduce the reference taking and add a comment this time
explaining why it's needed.

This is XSA-374 / CVE-2021-28691.

Fixes: 2ac061ce97 ('xen/netback: cleanup init and deinit code')
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Cc: stable@vger.kernel.org
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
2021-06-07 15:13:15 +02:00
Zhang Rui
f1ffa9d4cc Revert "ACPI: sleep: Put the FACS table after using it"
Commit 95722237cb ("ACPI: sleep: Put the FACS table after using it")
puts the FACS table during initialization.

But the hardware signature bits in the FACS table need to be accessed,
after every hibernation, to compare with the original hardware
signature.

So there is no reason to release the FACS table mapping after
initialization.

This reverts commit 95722237cb.

An alternative solution is to use acpi_gbl_FACS variable instead, which
is mapped by the ACPICA core and never released.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=212277
Reported-by: Stephan Hohe <sth.dev@tejp.de>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Cc: 5.8+ <stable@vger.kernel.org> # 5.8+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-06-07 15:00:52 +02:00
Saravana Kannan
9bf3797796 drm/sun4i: dw-hdmi: Make HDMI PHY into a platform device
On sunxi boards that use HDMI output, HDMI device probe keeps being
avoided indefinitely with these repeated messages in dmesg:

  platform 1ee0000.hdmi: probe deferral - supplier 1ef0000.hdmi-phy
    not ready

There's a fwnode_link being created with fw_devlink=on between hdmi
and hdmi-phy nodes, because both nodes have 'compatible' property set.

Fw_devlink code assumes that nodes that have compatible property
set will also have a device associated with them by some driver
eventually. This is not the case with the current sun8i-hdmi
driver.

This commit makes sun8i-hdmi-phy into a proper platform device
and fixes the display pipeline probe on sunxi boards that use HDMI.

More context: https://lkml.org/lkml/2021/5/16/203

Signed-off-by: Saravana Kannan <saravanak@google.com>
Signed-off-by: Ondrej Jirman <megous@megous.com>
Tested-by: Andre Przywara <andre.przywara@arm.com>
Signed-off-by: Maxime Ripard <maxime@cerno.tech>
Link: https://patchwork.freedesktop.org/patch/msgid/20210607085836.2827429-1-megous@megous.com
2021-06-07 13:41:31 +02:00
Alexander Gordeev
1874cb13d5 s390/mcck: fix invalid KVM guest condition check
Wrong condition check is used to decide if a machine check hit
while in KVM guest. As result of this check the instruction
following the SIE critical section might be considered as still
in KVM guest and _CIF_MCCK_GUEST CPU flag mistakenly set as
result.

Fixes: c929500d7a ("s390/nmi: s390: New low level handling for machine check happening in guest")
Cc: <stable@vger.kernel.org>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2021-06-07 12:12:03 +02:00
Alexander Gordeev
5bcbe3285f s390/mcck: fix calculation of SIE critical section size
The size of SIE critical section is calculated wrongly
as result of a missed subtraction in commit 0b0ed657fe
("s390: remove critical section cleanup from entry.S")

Fixes: 0b0ed657fe ("s390: remove critical section cleanup from entry.S")
Cc: <stable@vger.kernel.org>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2021-06-07 12:12:03 +02:00
Sergio Paracuellos
eb367d875f pinctrl: ralink: rt2880: avoid to error in calls is pin is already enabled
In 'rt2880_pmx_group_enable' driver is printing an error and returning
-EBUSY if a pin has been already enabled. This begets anoying messages
in the caller when this happens like the following:

rt2880-pinmux pinctrl: pcie is already enabled
mt7621-pci 1e140000.pcie: Error applying setting, reverse things back

To avoid this just print the already enabled message in the pinctrl
driver and return 0 instead to not confuse the user with a real
bad problem.

Signed-off-by: Sergio Paracuellos <sergio.paracuellos@gmail.com>
Link: https://lore.kernel.org/r/20210604055337.20407-1-sergio.paracuellos@gmail.com
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2021-06-07 09:08:53 +02:00
Guillaume Ranquet
9041575348 dmaengine: mediatek: use GFP_NOWAIT instead of GFP_ATOMIC in prep_dma
As recommended by the doc in:
Documentation/drivers-api/dmaengine/provider.rst

Use GFP_NOWAIT to not deplete the emergency pool.

Signed-off-by: Guillaume Ranquet <granquet@baylibre.com>

Link: https://lore.kernel.org/r/20210513192642.29446-4-granquet@baylibre.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
2021-06-07 12:23:47 +05:30
Guillaume Ranquet
2537b40b0a dmaengine: mediatek: do not issue a new desc if one is still current
Avoid issuing a new desc if one is still being processed as this can
lead to some desc never being marked as completed.

Signed-off-by: Guillaume Ranquet <granquet@baylibre.com>

Link: https://lore.kernel.org/r/20210513192642.29446-3-granquet@baylibre.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
2021-06-07 12:23:47 +05:30
Guillaume Ranquet
0a2ff58f9f dmaengine: mediatek: free the proper desc in desc_free handler
The desc_free handler assumed that the desc we want to free was always
 the current one associated with the channel.

This is seldom the case and this is causing use after free crashes in
 multiple places (tx/rx/terminate...).

  BUG: KASAN: use-after-free in mtk_uart_apdma_rx_handler+0x120/0x304

  Call trace:
   dump_backtrace+0x0/0x1b0
   show_stack+0x24/0x34
   dump_stack+0xe0/0x150
   print_address_description+0x8c/0x55c
   __kasan_report+0x1b8/0x218
   kasan_report+0x14/0x20
   __asan_load4+0x98/0x9c
   mtk_uart_apdma_rx_handler+0x120/0x304
   mtk_uart_apdma_irq_handler+0x50/0x80
   __handle_irq_event_percpu+0xe0/0x210
   handle_irq_event+0x8c/0x184
   handle_fasteoi_irq+0x1d8/0x3ac
   __handle_domain_irq+0xb0/0x110
   gic_handle_irq+0x50/0xb8
   el0_irq_naked+0x60/0x6c

  Allocated by task 3541:
   __kasan_kmalloc+0xf0/0x1b0
   kasan_kmalloc+0x10/0x1c
   kmem_cache_alloc_trace+0x90/0x2dc
   mtk_uart_apdma_prep_slave_sg+0x6c/0x1a0
   mtk8250_dma_rx_complete+0x220/0x2e4
   vchan_complete+0x290/0x340
   tasklet_action_common+0x220/0x298
   tasklet_action+0x28/0x34
   __do_softirq+0x158/0x35c

  Freed by task 3541:
   __kasan_slab_free+0x154/0x224
   kasan_slab_free+0x14/0x24
   slab_free_freelist_hook+0xf8/0x15c
   kfree+0xb4/0x278
   mtk_uart_apdma_desc_free+0x34/0x44
   vchan_complete+0x1bc/0x340
   tasklet_action_common+0x220/0x298
   tasklet_action+0x28/0x34
   __do_softirq+0x158/0x35c

  The buggy address belongs to the object at ffff000063606800
   which belongs to the cache kmalloc-256 of size 256
  The buggy address is located 176 bytes inside of
   256-byte region [ffff000063606800, ffff000063606900)
  The buggy address belongs to the page:
  page:fffffe00016d8180 refcount:1 mapcount:0 mapping:ffff00000302f600 index:0x0 compound_mapcount: 0
  flags: 0xffff00000010200(slab|head)
  raw: 0ffff00000010200 dead000000000100 dead000000000122 ffff00000302f600
  raw: 0000000000000000 0000000080100010 00000001ffffffff 0000000000000000
  page dumped because: kasan: bad access detected

Signed-off-by: Guillaume Ranquet <granquet@baylibre.com>

Link: https://lore.kernel.org/r/20210513192642.29446-2-granquet@baylibre.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
2021-06-07 12:23:47 +05:30
Manivannan Sadhasivam
0e4bf265b1 pinctrl: qcom: Fix duplication in gpio_groups
"gpio52" and "gpio53" are duplicated in gpio_groups, fix them!

Fixes: ac43c44a7a ("pinctrl: qcom: Add SDX55 pincontrol driver")
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Reviewed-by: Vinod Koul <vkoul@kernel.org>
Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Link: https://lore.kernel.org/r/20210526082857.174682-1-manivannan.sadhasivam@linaro.org
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2021-06-07 00:18:55 +02:00
Christophe Leroy
8e11d62e2e powerpc/mem: Add back missing header to fix 'no previous prototype' error
Commit b26e8f2725 ("powerpc/mem: Move cache flushing functions into
mm/cacheflush.c") removed asm/sparsemem.h which is required when
CONFIG_MEMORY_HOTPLUG is selected to get the declaration of
create_section_mapping().

Add it back.

Fixes: b26e8f2725 ("powerpc/mem: Move cache flushing functions into mm/cacheflush.c")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/3e5b63bb3daab54a1eb9c20221c2e9528c4db9b3.1622883330.git.christophe.leroy@csgroup.eu
2021-06-06 21:43:11 +10:00
Takashi Sakamoto
9981b20a5e ALSA: firewire-lib: fix the context to call snd_pcm_stop_xrun()
In the workqueue to queue wake-up event, isochronous context is not
processed, thus it's useless to check context for the workqueue to switch
status of runtime for PCM substream to XRUN. On the other hand, in
software IRQ context of 1394 OHCI, it's needed.

This commit fixes the bug introduced when tasklet was replaced with
workqueue.

Cc: <stable@vger.kernel.org>
Fixes: 2b3d2987d8 ("ALSA: firewire: Replace tasklet with work")
Signed-off-by: Takashi Sakamoto <o-takashi@sakamocchi.jp>
Link: https://lore.kernel.org/r/20210605091054.68866-1-o-takashi@sakamocchi.jp
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2021-06-05 14:52:58 +02:00
Jeremy Szu
dfb06401b4 ALSA: hda/realtek: fix mute/micmute LEDs for HP EliteBook 840 Aero G8
The HP EliteBook 840 Aero G8 using ALC285 codec which using 0x04 to
control mute LED and 0x01 to control micmute LED.
In the other hand, there is no output from right channel of speaker.
Therefore, add a quirk to make it works.

Signed-off-by: Jeremy Szu <jeremy.szu@canonical.com>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20210605082539.41797-3-jeremy.szu@canonical.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2021-06-05 14:51:32 +02:00
Jeremy Szu
61d3e87468 ALSA: hda/realtek: fix mute/micmute LEDs and speaker for HP EliteBook x360 1040 G8
The HP EliteBook x360 1040 G8 using ALC285 codec which using 0x04 to control
mute LED and 0x01 to control micmute LED.
In the other hand, there is no output from right channel of speaker.
Therefore, add a quirk to make it works.

Signed-off-by: Jeremy Szu <jeremy.szu@canonical.com>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20210605082539.41797-2-jeremy.szu@canonical.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2021-06-05 14:51:15 +02:00
Jeremy Szu
15d295b560 ALSA: hda/realtek: fix mute/micmute LEDs and speaker for HP Elite Dragonfly G2
The HP Elite Dragonfly G2 using ALC285 codec which using 0x04 to control
mute LED and 0x01 to control micmute LED.
In the other hand, there is no output from right channel of speaker.
Therefore, add a quirk to make it works.

Signed-off-by: Jeremy Szu <jeremy.szu@canonical.com>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20210605082539.41797-1-jeremy.szu@canonical.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2021-06-05 14:50:52 +02:00
George McCollister
bc96c72df3 USB: serial: ftdi_sio: add NovaTech OrionMX product ID
Add PID for the NovaTech OrionMX so it can be automatically detected.

Signed-off-by: George McCollister <george.mccollister@gmail.com>
Cc: stable@vger.kernel.org
Signed-off-by: Johan Hovold <johan@kernel.org>
2021-06-05 12:26:01 +02:00
Mykola Kostenok
701b54bcb7 platform/mellanox: mlxreg-hotplug: Revert "move to use request_irq by IRQF_NO_AUTOEN flag"
It causes mlxreg-hotplug probing failure: request_threaded_irq()
 returns -EINVAL due to true value of condition:
((irqflags & IRQF_SHARED) && (irqflags & IRQF_NO_AUTOEN))
after flag "IRQF_NO_AUTOEN" has been added to:
	err = devm_request_irq(&pdev->dev, priv->irq,
			       mlxreg_hotplug_irq_handler, IRQF_TRIGGER_FALLING
			       | IRQF_SHARED | IRQF_NO_AUTOEN,
			       "mlxreg-hotplug", priv);

This reverts commit bee3ecfed0 ("platform/mellanox: mlxreg-hotplug: move
to use request_irq by IRQF_NO_AUTOEN flag").

Signed-off-by: Mykola Kostenok <c_mykolak@nvidia.com>
Acked-by: Vadim Pasternak <vadimp@nvidia.com>
Link: https://lore.kernel.org/r/20210603172827.2599908-1-c_mykolak@nvidia.com
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2021-06-04 22:03:13 +02:00
Maximilian Luz
6325ce1542 platform/surface: dtx: Add missing mutex_destroy() call in failure path
When we fail to open the device file due to DTX being shut down, the
mutex is initialized but never destroyed. We are destroying it when
releasing the file, so add the missing call in the failure path as well.

Fixes: 1d60999283 ("platform/surface: Add DTX driver")
Signed-off-by: Maximilian Luz <luzmaximilian@gmail.com>
Link: https://lore.kernel.org/r/20210604132540.533036-1-luzmaximilian@gmail.com
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2021-06-04 22:00:27 +02:00
Dietmar Eggemann
f501b6a231 debugfs: Fix debugfs_read_file_str()
Read the entire size of the buffer, including the trailing new line
character.
Discovered while reading the sched domain names of CPU0:

before:

cat /sys/kernel/debug/sched/domains/cpu0/domain*/name
SMTMCDIE

after:

cat /sys/kernel/debug/sched/domains/cpu0/domain*/name
SMT
MC
DIE

Fixes: 9af0440ec8 ("debugfs: Implement debugfs_create_str()")
Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Link: https://lore.kernel.org/r/20210527091105.258457-1-dietmar.eggemann@arm.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-06-04 15:01:08 +02:00
Oder Chiou
49783c6f4a ASoC: rt5682: Fix the fast discharge for headset unplugging in soundwire mode
Based on ("5a15cd7fce20b1fd4aece6a0240e2b58cd6a225d"), the setting also
should be set in soundwire mode.

Signed-off-by: Oder Chiou <oder_chiou@realtek.com>
Link: https://lore.kernel.org/r/20210604063150.29925-1-oder_chiou@realtek.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2021-06-04 12:54:19 +01:00
Wesley Cheng
6fc1db5e62 usb: gadget: f_fs: Ensure io_completion_wq is idle during unbind
During unbind, ffs_func_eps_disable() will be executed, resulting in
completion callbacks for any pending USB requests.  When using AIO,
irrespective of the completion status, io_data work is queued to
io_completion_wq to evaluate and handle the completed requests.  Since
work runs asynchronously to the unbind() routine, there can be a
scenario where the work runs after the USB gadget has been fully
removed, resulting in accessing of a resource which has been already
freed. (i.e. usb_ep_free_request() accessing the USB ep structure)

Explicitly drain the io_completion_wq, instead of relying on the
destroy_workqueue() (in ffs_data_put()) to make sure no pending
completion work items are running.

Signed-off-by: Wesley Cheng <wcheng@codeaurora.org>
Cc: stable <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/1621644261-1236-1-git-send-email-wcheng@codeaurora.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-06-04 13:40:46 +02:00
Li Jun
024236abeb usb: typec: tcpm: cancel send discover hrtimer when unregister tcpm port
Like the state_machine_timer, we should also cancel possible pending
send discover identity hrtimer when unregister tcpm port.

Fixes: c34e85fa69 ("usb: typec: tcpm: Send DISCOVER_IDENTITY from dedicated work")
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Li Jun <jun.li@nxp.com>
Link: https://lore.kernel.org/r/1622627829-11070-3-git-send-email-jun.li@nxp.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-06-04 13:28:01 +02:00
Li Jun
7ade4805e2 usb: typec: tcpm: cancel frs hrtimer when unregister tcpm port
Like the state_machine_timer, we should also cancel possible pending
frs hrtimer when unregister tcpm port.

Fixes: 8dc4bd0736 ("usb: typec: tcpm: Add support for Sink Fast Role SWAP(FRS)")
Cc: stable <stable@vger.kernel.org>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Li Jun <jun.li@nxp.com>
Link: https://lore.kernel.org/r/1622627829-11070-2-git-send-email-jun.li@nxp.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-06-04 13:27:58 +02:00
Li Jun
3a13ff7ef4 usb: typec: tcpm: cancel vdm and state machine hrtimer when unregister tcpm port
A pending hrtimer may expire after the kthread_worker of tcpm port
is destroyed, see below kernel dump when do module unload, fix it
by cancel the 2 hrtimers.

[  111.517018] Unable to handle kernel paging request at virtual address ffff8000118cb880
[  111.518786] blk_update_request: I/O error, dev sda, sector 60061185 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0
[  111.526594] Mem abort info:
[  111.526597]   ESR = 0x96000047
[  111.526600]   EC = 0x25: DABT (current EL), IL = 32 bits
[  111.526604]   SET = 0, FnV = 0
[  111.526607]   EA = 0, S1PTW = 0
[  111.526610] Data abort info:
[  111.526612]   ISV = 0, ISS = 0x00000047
[  111.526615]   CM = 0, WnR = 1
[  111.526619] swapper pgtable: 4k pages, 48-bit VAs, pgdp=0000000041d75000
[  111.526623] [ffff8000118cb880] pgd=10000001bffff003, p4d=10000001bffff003, pud=10000001bfffe003, pmd=10000001bfffa003, pte=0000000000000000
[  111.526642] Internal error: Oops: 96000047 [#1] PREEMPT SMP
[  111.526647] Modules linked in: dwc3_imx8mp dwc3 phy_fsl_imx8mq_usb [last unloaded: tcpci]
[  111.526663] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.13.0-rc4-00927-gebbe9dbd802c-dirty #36
[  111.526670] Hardware name: NXP i.MX8MPlus EVK board (DT)
[  111.526674] pstate: 800000c5 (Nzcv daIF -PAN -UAO -TCO BTYPE=--)
[  111.526681] pc : queued_spin_lock_slowpath+0x1a0/0x390
[  111.526695] lr : _raw_spin_lock_irqsave+0x88/0xb4
[  111.526703] sp : ffff800010003e20
[  111.526706] x29: ffff800010003e20 x28: ffff00017f380180
[  111.537156] buffer_io_error: 6 callbacks suppressed
[  111.537162] Buffer I/O error on dev sda1, logical block 60040704, async page read
[  111.539932]  x27: ffff00017f3801c0
[  111.539938] x26: ffff800010ba2490 x25: 0000000000000000 x24: 0000000000000001
[  111.543025] blk_update_request: I/O error, dev sda, sector 60061186 op 0x0:(READ) flags 0x0 phys_seg 7 prio class 0
[  111.548304]
[  111.548306] x23: 00000000000000c0 x22: ffff0000c2a9f184 x21: ffff00017f380180
[  111.551374] Buffer I/O error on dev sda1, logical block 60040705, async page read
[  111.554499]
[  111.554503] x20: ffff0000c5f14210 x19: 00000000000000c0 x18: 0000000000000000
[  111.557391] Buffer I/O error on dev sda1, logical block 60040706, async page read
[  111.561218]
[  111.561222] x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000000000
[  111.564205] Buffer I/O error on dev sda1, logical block 60040707, async page read
[  111.570887] x14: 00000000000000f5 x13: 0000000000000001 x12: 0000000000000040
[  111.570902] x11: ffff0000c05ac6d8
[  111.583420] Buffer I/O error on dev sda1, logical block 60040708, async page read
[  111.588978]  x10: 0000000000000000 x9 : 0000000000040000
[  111.588988] x8 : 0000000000000000
[  111.597173] Buffer I/O error on dev sda1, logical block 60040709, async page read
[  111.605766]  x7 : ffff00017f384880 x6 : ffff8000118cb880
[  111.605777] x5 : ffff00017f384880
[  111.611094] Buffer I/O error on dev sda1, logical block 60040710, async page read
[  111.617086]  x4 : 0000000000000000 x3 : ffff0000c2a9f184
[  111.617096] x2 : ffff8000118cb880
[  111.622242] Buffer I/O error on dev sda1, logical block 60040711, async page read
[  111.626927]  x1 : ffff8000118cb880 x0 : ffff00017f384888
[  111.626938] Call trace:
[  111.626942]  queued_spin_lock_slowpath+0x1a0/0x390
[  111.795809]  kthread_queue_work+0x30/0xc0
[  111.799828]  state_machine_timer_handler+0x20/0x30
[  111.804624]  __hrtimer_run_queues+0x140/0x1e0
[  111.808990]  hrtimer_interrupt+0xec/0x2c0
[  111.813004]  arch_timer_handler_phys+0x38/0x50
[  111.817456]  handle_percpu_devid_irq+0x88/0x150
[  111.821991]  __handle_domain_irq+0x80/0xe0
[  111.826093]  gic_handle_irq+0xc0/0x140
[  111.829848]  el1_irq+0xbc/0x154
[  111.832991]  arch_cpu_idle+0x1c/0x2c
[  111.836572]  default_idle_call+0x24/0x6c
[  111.840497]  do_idle+0x238/0x2ac
[  111.843729]  cpu_startup_entry+0x2c/0x70
[  111.847657]  rest_init+0xdc/0xec
[  111.850890]  arch_call_rest_init+0x14/0x20
[  111.854988]  start_kernel+0x508/0x540
[  111.858659] Code: 910020e0 8b0200c2 f861d884 aa0203e1 (f8246827)
[  111.864760] ---[ end trace 308b9a4a3dcb73ac ]---
[  111.869381] Kernel panic - not syncing: Oops: Fatal exception in interrupt
[  111.876258] SMP: stopping secondary CPUs
[  111.880185] Kernel Offset: disabled
[  111.883673] CPU features: 0x00001001,20000846
[  111.888031] Memory Limit: none
[  111.891090] ---[ end Kernel panic - not syncing: Oops: Fatal exception in interrupt ]---

Fixes: 3ed8e1c2ac ("usb: typec: tcpm: Migrate workqueue to RT priority for processing events")
Cc: stable <stable@vger.kernel.org>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Li Jun <jun.li@nxp.com>
Link: https://lore.kernel.org/r/1622627829-11070-1-git-send-email-jun.li@nxp.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-06-04 13:27:52 +02:00
Kyle Tso
063933f47a usb: typec: tcpm: Properly handle Alert and Status Messages
When receiving Alert Message, if it is not unexpected but is
unsupported for some reason, the port should return Not_Supported
Message response.

Also, according to PD3.0 Spec 6.5.2.1.4 Event Flags Field, the
OTP/OVP/OCP flags in the Event Flags field in Status Message no longer
require Get_PPS_Status Message to clear them. Thus remove it when
receiving Status Message with those flags being set.

In addition, add the missing AMS operations for Status Message.

Fixes: 64f7c494a3 ("typec: tcpm: Add support for sink PPS related messages")
Fixes: 0908c5aca3 ("usb: typec: tcpm: AMS and Collision Avoidance")
Signed-off-by: Kyle Tso <kyletso@google.com>
Link: https://lore.kernel.org/r/20210531164928.2368606-1-kyletso@google.com
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-06-04 13:24:16 +02:00
Nikolay Borisov
aefd7f7065 btrfs: promote debugging asserts to full-fledged checks in validate_super
Syzbot managed to trigger this assert while performing its fuzzing.
Turns out it's better to have those asserts turned into full-fledged
checks so that in case buggy btrfs images are mounted the users gets
an error and mounting is stopped. Alternatively with CONFIG_BTRFS_ASSERT
disabled such image would have been erroneously allowed to be mounted.

Reported-by: syzbot+a6bf271c02e4fe66b4e4@syzkaller.appspotmail.com
CC: stable@vger.kernel.org # 5.4+
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
[ add uuids to the messages ]
Signed-off-by: David Sterba <dsterba@suse.com>
2021-06-04 13:12:06 +02:00
Ritesh Harjani
e7b2ec3d3d btrfs: return value from btrfs_mark_extent_written() in case of error
We always return 0 even in case of an error in btrfs_mark_extent_written().
Fix it to return proper error value in case of a failure. All callers
handle it.

CC: stable@vger.kernel.org # 4.4+
Signed-off-by: Ritesh Harjani <riteshh@linux.ibm.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-06-04 13:11:58 +02:00
Naohiro Aota
5b434df877 btrfs: zoned: fix zone number to sector/physical calculation
In btrfs_get_dev_zone_info(), we have "u32 sb_zone" and calculate "sector_t
sector" by shifting it. But, this "sector" is calculated in 32bit, leading
it to be 0 for the 2nd superblock copy.

Since zone number is u32, shifting it to sector (sector_t) or physical
address (u64) can easily trigger a missing cast bug like this.

This commit introduces helpers to convert zone number to sector/LBA, so we
won't fall into the same pitfall again.

Reported-by: Dmitry Fomichev <Dmitry.Fomichev@wdc.com>
Fixes: 12659251ca ("btrfs: implement log-structured superblock for ZONED mode")
CC: stable@vger.kernel.org # 5.11+
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-06-04 13:11:50 +02:00
Josef Bacik
165ea85f14 btrfs: do not write supers if we have an fs error
Error injection testing uncovered a pretty severe problem where we could
end up committing a super that pointed to the wrong tree roots,
resulting in transid mismatch errors.

The way we commit the transaction is we update the super copy with the
current generations and bytenrs of the important roots, and then copy
that into our super_for_commit.  Then we allow transactions to continue
again, we write out the dirty pages for the transaction, and then we
write the super.  If the write out fails we'll bail and skip writing the
supers.

However since we've allowed a new transaction to start, we can have a
log attempting to sync at this point, which would be blocked on
fs_info->tree_log_mutex.  Once the commit fails we're allowed to do the
log tree commit, which uses super_for_commit, which now points at fs
tree's that were not written out.

Fix this by checking BTRFS_FS_STATE_ERROR once we acquire the
tree_log_mutex.  This way if the transaction commit fails we're sure to
see this bit set and we can skip writing the super out.  This patch
fixes this specific transid mismatch error I was seeing with this
particular error path.

CC: stable@vger.kernel.org # 5.12+
Reviewed-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-06-04 13:11:38 +02:00
Neil Armstrong
4d2aa178d2 usb: dwc3-meson-g12a: fix usb2 PHY glue init when phy0 is disabled
When only PHY1 is used (for example on Odroid-HC4), the regmap init code
uses the usb2 ports when doesn't initialize the PHY1 regmap entry.

This fixes:
Unable to handle kernel NULL pointer dereference at virtual address 0000000000000020
...
pc : regmap_update_bits_base+0x40/0xa0
lr : dwc3_meson_g12a_usb2_init_phy+0x4c/0xf8
...
Call trace:
regmap_update_bits_base+0x40/0xa0
dwc3_meson_g12a_usb2_init_phy+0x4c/0xf8
dwc3_meson_g12a_usb2_init+0x7c/0xc8
dwc3_meson_g12a_usb_init+0x28/0x48
dwc3_meson_g12a_probe+0x298/0x540
platform_probe+0x70/0xe0
really_probe+0xf0/0x4d8
driver_probe_device+0xfc/0x168
...

Fixes: 013af227f5 ("usb: dwc3: meson-g12a: handle the phy and glue registers separately")
Reviewed-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Signed-off-by: Neil Armstrong <narmstrong@baylibre.com>
Cc: stable <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20210601084830.260196-1-narmstrong@baylibre.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-06-04 12:58:55 +02:00
Christophe JAILLET
1d0d3d818e usb: dwc3: meson-g12a: Disable the regulator in the error handling path of the probe
If an error occurs after a successful 'regulator_enable()' call,
'regulator_disable()' must be called.

Fix the error handling path of the probe accordingly.

The remove function doesn't need to be fixed, because the
'regulator_disable()' call is already hidden in 'dwc3_meson_g12a_suspend()'
which is called via 'pm_runtime_set_suspended()' in the remove function.

Fixes: c99993376f ("usb: dwc3: Add Amlogic G12A DWC3 glue")
Reviewed-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Acked-by: Neil Armstrong <narmstrong@baylibre.com>
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Link: https://lore.kernel.org/r/79df054046224bbb0716a8c5c2082650290eec86.1621616013.git.christophe.jaillet@wanadoo.fr
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-06-04 12:55:27 +02:00
Greg Kroah-Hartman
757d2e6065 Merge tag 'phy-fixes-5.13' of git://git.kernel.org/pub/scm/linux/kernel/git/phy/linux-phy into char-misc-linus
Vinod writes:

phy: fixes for 5.13

Phy driver fixes for few drivers: cadence, mtk-tphy, sparx5, wiz mostly
fixing error code and checking return codes etc

* tag 'phy-fixes-5.13' of git://git.kernel.org/pub/scm/linux/kernel/git/phy/linux-phy:
  phy: Sparx5 Eth SerDes: check return value after calling platform_get_resource()
  phy: ralink: phy-mt7621-pci: drop 'of_match_ptr' to fix -Wunused-const-variable
  phy: ti: Fix an error code in wiz_probe()
  phy: phy-mtk-tphy: Fix some resource leaks in mtk_phy_init()
  phy: cadence: Sierra: Fix error return code in cdns_sierra_phy_probe()
  phy: usb: Fix misuse of IS_ENABLED
2021-06-04 11:43:33 +02:00
Kyle Tso
80137c1873 usb: typec: tcpm: Fix misuses of AMS invocation
tcpm_ams_start is used to initiate an AMS as well as checking Collision
Avoidance conditions but not for flagging passive AMS (initiated by the
port partner). Fix the misuses of tcpm_ams_start in tcpm_pd_svdm.

ATTENTION doesn't need responses so the AMS flag is not needed here.

Fixes: 0bc3ee9288 ("usb: typec: tcpm: Properly interrupt VDM AMS")
Signed-off-by: Kyle Tso <kyletso@google.com>
Link: https://lore.kernel.org/r/20210601123151.3441914-5-kyletso@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-06-04 11:43:01 +02:00
Kyle Tso
7ac5051035 usb: typec: tcpm: Introduce snk_vdo_v1 for SVDM version 1.0
The ID Header VDO and Product VDOs defined in USB PD Spec rev 2.0 and
rev 3.1 are quite different. Add an additional array snk_vdo_v1 and
send it as the response to the port partner if it only supports SVDM
version 1.0.

Acked-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Signed-off-by: Kyle Tso <kyletso@google.com>
Link: https://lore.kernel.org/r/20210601123151.3441914-4-kyletso@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-06-04 11:43:01 +02:00
Kyle Tso
55b54c269b dt-bindings: connector: Add PD rev 2.0 VDO definition
Add the VDO definition for USB PD rev 2.0 in the bindings and define a
new property snk-vdos-v1 containing legacy VDOs as the responses to the
port partner which only supports PD rev 2.0.

Reviewed-by: Rob Herring <robh@kernel.org>
Signed-off-by: Kyle Tso <kyletso@google.com>
Link: https://lore.kernel.org/r/20210601123151.3441914-3-kyletso@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-06-04 11:43:01 +02:00
Kyle Tso
f41bfc7e9c usb: typec: tcpm: Correct the responses in SVDM Version 2.0 DFP
In USB PD Spec Rev 3.1 Ver 1.0, section "6.12.5 Applicability of
Structured VDM Commands", DFP is allowed and recommended to respond to
Discovery Identity with ACK. And in section "6.4.4.2.5.1 Commands other
than Attention", NAK should be returned only when receiving Messages
with invalid fields, Messages in wrong situation, or unrecognize
Messages.

Still keep the original design for SVDM Version 1.0 for backward
compatibilities.

Fixes: 193a68011f ("staging: typec: tcpm: Respond to Discover Identity commands")
Acked-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Signed-off-by: Kyle Tso <kyletso@google.com>
Link: https://lore.kernel.org/r/20210601123151.3441914-2-kyletso@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-06-04 11:43:01 +02:00
Alexandru Elisei
8f11fe7e40 Revert "usb: dwc3: core: Add shutdown callback for dwc3"
This reverts commit 568262bf54.

The commit causes the following panic when shutting down a rockpro64-v2
board:

[..]
[   41.684569] xhci-hcd xhci-hcd.2.auto: USB bus 1 deregistered
[   41.686301] Unable to handle kernel NULL pointer dereference at virtual address 00000000000000a0
[   41.687096] Mem abort info:
[   41.687345]   ESR = 0x96000004
[   41.687615]   EC = 0x25: DABT (current EL), IL = 32 bits
[   41.688082]   SET = 0, FnV = 0
[   41.688352]   EA = 0, S1PTW = 0
[   41.688628] Data abort info:
[   41.688882]   ISV = 0, ISS = 0x00000004
[   41.689219]   CM = 0, WnR = 0
[   41.689481] user pgtable: 4k pages, 48-bit VAs, pgdp=00000000073b2000
[   41.690046] [00000000000000a0] pgd=0000000000000000, p4d=0000000000000000
[   41.690654] Internal error: Oops: 96000004 [#1] PREEMPT SMP
[   41.691143] Modules linked in:
[   41.691416] CPU: 5 PID: 1 Comm: shutdown Not tainted 5.13.0-rc4 #43
[   41.691966] Hardware name: Pine64 RockPro64 v2.0 (DT)
[   41.692409] pstate: 60000005 (nZCv daif -PAN -UAO -TCO BTYPE=--)
[   41.692937] pc : down_read_interruptible+0xec/0x200
[   41.693373] lr : simple_recursive_removal+0x48/0x280
[   41.693815] sp : ffff800011fab910
[   41.694107] x29: ffff800011fab910 x28: ffff0000008fe480 x27: ffff0000008fe4d8
[   41.694736] x26: ffff800011529a90 x25: 00000000000000a0 x24: ffff800011edd030
[   41.695364] x23: 0000000000000080 x22: 0000000000000000 x21: ffff800011f23994
[   41.695992] x20: ffff800011f23998 x19: ffff0000008fe480 x18: ffffffffffffffff
[   41.696620] x17: 000c0400bb44ffff x16: 0000000000000009 x15: ffff800091faba3d
[   41.697248] x14: 0000000000000004 x13: 0000000000000000 x12: 0000000000000020
[   41.697875] x11: 0101010101010101 x10: 7f7f7f7f7f7f7f7f x9 : 6f6c746364716e62
[   41.698502] x8 : 7f7f7f7f7f7f7f7f x7 : fefefeff6364626d x6 : 0000000000000440
[   41.699130] x5 : 0000000000000000 x4 : 0000000000000000 x3 : 00000000000000a0
[   41.699758] x2 : 0000000000000001 x1 : 0000000000000000 x0 : 00000000000000a0
[   41.700386] Call trace:
[   41.700602]  down_read_interruptible+0xec/0x200
[   41.701003]  debugfs_remove+0x5c/0x80
[   41.701328]  dwc3_debugfs_exit+0x1c/0x6c
[   41.701676]  dwc3_remove+0x34/0x1a0
[   41.701988]  platform_remove+0x28/0x60
[   41.702322]  __device_release_driver+0x188/0x22c
[   41.702730]  device_release_driver+0x2c/0x44
[   41.703106]  bus_remove_device+0x124/0x130
[   41.703468]  device_del+0x16c/0x424
[   41.703777]  platform_device_del.part.0+0x1c/0x90
[   41.704193]  platform_device_unregister+0x28/0x44
[   41.704608]  of_platform_device_destroy+0xe8/0x100
[   41.705031]  device_for_each_child_reverse+0x64/0xb4
[   41.705470]  of_platform_depopulate+0x40/0x84
[   41.705853]  __dwc3_of_simple_teardown+0x20/0xd4
[   41.706260]  dwc3_of_simple_shutdown+0x14/0x20
[   41.706652]  platform_shutdown+0x28/0x40
[   41.706998]  device_shutdown+0x158/0x330
[   41.707344]  kernel_power_off+0x38/0x7c
[   41.707684]  __do_sys_reboot+0x16c/0x2a0
[   41.708029]  __arm64_sys_reboot+0x28/0x34
[   41.708383]  invoke_syscall+0x48/0x114
[   41.708716]  el0_svc_common.constprop.0+0x44/0xdc
[   41.709131]  do_el0_svc+0x28/0x90
[   41.709426]  el0_svc+0x2c/0x54
[   41.709698]  el0_sync_handler+0xa4/0x130
[   41.710045]  el0_sync+0x198/0x1c0
[   41.710342] Code: c8047c62 35ffff84 17fffe5f f9800071 (c85ffc60)
[   41.710881] ---[ end trace 406377df5178f75c ]---
[   41.711299] Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b
[   41.712084] Kernel Offset: disabled
[   41.712391] CPU features: 0x10001031,20000846
[   41.712775] Memory Limit: none
[   41.713049] ---[ end Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b ]---

As Felipe explained: "dwc3_shutdown() is just called dwc3_remove()
directly, then we end up calling debugfs_remove_recursive() twice."

Reverting the commit fixes the panic.

Fixes: 568262bf54 ("usb: dwc3: core: Add shutdown callback for dwc3")
Acked-by: Felipe Balbi <balbi@kernel.org>
Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
Link: https://lore.kernel.org/r/20210603151742.298243-1-alexandru.elisei@arm.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-06-04 10:40:57 +02:00
Kyle Tso
9257bd80b9 dt-bindings: connector: Replace BIT macro with generic bit ops
BIT macro is not defined. Replace it with generic bit operations.

Fixes: 630dce2810 ("dt-bindings: connector: Add SVDM VDO properties")
Reviewed-by: Rob Herring <robh@kernel.org>
Signed-off-by: Kyle Tso <kyletso@google.com>
Link: https://lore.kernel.org/r/20210527121029.583611-1-kyletso@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-06-04 10:40:39 +02:00
Axel Lin
cb2381cbec regulator: rt4801: Fix NULL pointer dereference if priv->enable_gpios is NULL
devm_gpiod_get_array_optional may return NULL if no GPIO was assigned.

Signed-off-by: Axel Lin <axel.lin@ingics.com>
Link: https://lore.kernel.org/r/20210603094944.1114156-1-axel.lin@ingics.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2021-06-03 19:35:48 +01:00
Jack Pham
8d396bb0a5 usb: dwc3: debugfs: Add and remove endpoint dirs dynamically
The DWC3 DebugFS directory and files are currently created once
during probe.  This includes creation of subdirectories for each
of the gadget's endpoints.  This works fine for peripheral-only
controllers, as dwc3_core_init_mode() calls dwc3_gadget_init()
just prior to calling dwc3_debugfs_init().

However, for dual-role controllers, dwc3_core_init_mode() will
instead call dwc3_drd_init() which is problematic in a few ways.
First, the initial state must be determined, then dwc3_set_mode()
will have to schedule drd_work and by then dwc3_debugfs_init()
could have already been invoked.  Even if the initial mode is
peripheral, dwc3_gadget_init() happens after the DebugFS files
are created, and worse so if the initial state is host and the
controller switches to peripheral much later.  And secondly,
even if the gadget endpoints' debug entries were successfully
created, if the controller exits peripheral mode, its dwc3_eps
are freed so the debug files would now hold stale references.

So it is best if the DebugFS endpoint entries are created and
removed dynamically at the same time the underlying dwc3_eps are.
Do this by calling dwc3_debugfs_create_endpoint_dir() as each
endpoint is created, and conversely remove the DebugFS entry when
the endpoint is freed.

Fixes: 41ce1456e1 ("usb: dwc3: core: make dwc3_set_mode() work properly")
Cc: stable <stable@vger.kernel.org>
Reviewed-by: Peter Chen <peter.chen@kernel.org>
Signed-off-by: Jack Pham <jackp@codeaurora.org>
Link: https://lore.kernel.org/r/20210529192932.22912-1-jackp@codeaurora.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-06-03 20:28:23 +02:00
Shay Drory
404e5a1269 RDMA/mlx4: Do not map the core_clock page to user space unless enabled
Currently when mlx4 maps the hca_core_clock page to the user space there
are read-modifiable registers, one of which is semaphore, on this page as
well as the clock counter. If user reads the wrong offset, it can modify
the semaphore and hang the device.

Do not map the hca_core_clock page to the user space unless the device has
been put in a backwards compatibility mode to support this feature.

After this patch, mlx4 core_clock won't be mapped to user space on the
majority of existing devices and the uverbs device time feature in
ibv_query_rt_values_ex() will be disabled.

Fixes: 52033cfb5a ("IB/mlx4: Add mmap call to map the hardware clock")
Link: https://lore.kernel.org/r/9632304e0d6790af84b3b706d8c18732bc0d5e27.1622726305.git.leonro@nvidia.com
Signed-off-by: Shay Drory <shayd@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-06-03 14:19:53 -03:00
Mark Zhang
a0ffb4c12f RDMA/mlx5: Use different doorbell memory for different processes
In a fork scenario, the parent and child can have same virtual address and
also share the uverbs fd.  That causes to the list_for_each_entry search
return same doorbell physical page for all processes, even though that
page has been COW' or copied.

This patch takes the mm_struct into consideration during search, to make
sure that VA's belonging to different processes are not intermixed.

Resolves the malfunction of uverbs after fork in some specific cases.

Fixes: e126ba97db ("mlx5: Add driver for Mellanox Connect-IB adapters")
Link: https://lore.kernel.org/r/feacc23fe0bc6e1088c6824d5583798745e72405.1622726212.git.leonro@nvidia.com
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Mark Zhang <markzhang@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-06-03 14:19:53 -03:00
Trond Myklebust
c3aba897c6 NFSv4: Fix second deadlock in nfs4_evict_inode()
If the inode is being evicted but has to return a layout first, then
that too can cause a deadlock in the corner case where the server
reboots.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2021-06-03 10:14:42 -04:00
Trond Myklebust
dfe1fe75e0 NFSv4: Fix deadlock between nfs4_evict_inode() and nfs4_opendata_get_inode()
If the inode is being evicted, but has to return a delegation first,
then it can cause a deadlock in the corner case where the server reboots
before the delegreturn completes, but while the call to iget5_locked() in
nfs4_opendata_get_inode() is waiting for the inode free to complete.
Since the open call still holds a session slot, the reboot recovery
cannot proceed.

In order to break the logjam, we can turn the delegation return into a
privileged operation for the case where we're evicting the inode. We
know that in that case, there can be no other state recovery operation
that conflicts.

Reported-by: zhangxiaoxu (A) <zhangxiaoxu5@huawei.com>
Fixes: 5fcdfacc01 ("NFSv4: Return delegations synchronously in evict_inode")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2021-06-03 10:14:42 -04:00
Chuck Lever
d1b5c230e9 NFS: FMODE_READ and friends are C macros, not enum types
Address a sparse warning:

  CHECK   fs/nfs/nfstrace.c
fs/nfs/nfstrace.c: note: in included file (through /home/cel/src/linux/rpc-over-tls/include/trace/trace_events.h, /home/cel/src/linux/rpc-over-tls/include/trace/define_trace.h, ...):
fs/nfs/./nfstrace.h:424:1: warning: incorrect type in initializer (different base types)
fs/nfs/./nfstrace.h:424:1:    expected unsigned long eval_value
fs/nfs/./nfstrace.h:424:1:    got restricted fmode_t [usertype]
fs/nfs/./nfstrace.h:425:1: warning: incorrect type in initializer (different base types)
fs/nfs/./nfstrace.h:425:1:    expected unsigned long eval_value
fs/nfs/./nfstrace.h:425:1:    got restricted fmode_t [usertype]
fs/nfs/./nfstrace.h:426:1: warning: incorrect type in initializer (different base types)
fs/nfs/./nfstrace.h:426:1:    expected unsigned long eval_value
fs/nfs/./nfstrace.h:426:1:    got restricted fmode_t [usertype]

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2021-06-03 10:14:42 -04:00
Dan Carpenter
09226e8303 NFS: Fix a potential NULL dereference in nfs_get_client()
None of the callers are expecting NULL returns from nfs_get_client() so
this code will lead to an Oops.  It's better to return an error
pointer.  I expect that this is dead code so hopefully no one is
affected.

Fixes: 31434f496a ("nfs: check hostname in nfs_get_client")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2021-06-03 10:14:42 -04:00
Anna Schumaker
476bdb04c5 NFS: Fix use-after-free in nfs4_init_client()
KASAN reports a use-after-free when attempting to mount two different
exports through two different NICs that belong to the same server.

Olga was able to hit this with kernels starting somewhere between 5.7
and 5.10, but I traced the patch that introduced the clear_bit() call to
4.13. So something must have changed in the refcounting of the clp
pointer to make this call to nfs_put_client() the very last one.

Fixes: 8dcbec6d20 ("NFSv41: Handle EXCHID4_FLAG_CONFIRMED_R during NFSv4.1 migration")
Cc: stable@vger.kernel.org # 4.13+
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2021-06-03 10:14:42 -04:00
Scott Mayhew
0b4f132b15 NFS: Ensure the NFS_CAP_SECURITY_LABEL capability is set when appropriate
Commit ce62b114bb ("NFS: Split attribute support out from the server
capabilities") removed the logic from _nfs4_server_capabilities() that
sets the NFS_CAP_SECURITY_LABEL capability based on the presence of
FATTR4_WORD2_SECURITY_LABEL in the attr_bitmask of the server's response.
Now NFS_CAP_SECURITY_LABEL is never set, which breaks labelled NFS.

This was replaced with logic that clears the NFS_ATTR_FATTR_V4_SECURITY_LABEL
bit in the newly added fattr_valid field based on the absence of
FATTR4_WORD2_SECURITY_LABEL in the attr_bitmask of the server's response.
This essentially has no effect since there's nothing looks for that bit
in fattr_supported.

So revert that part of the commit, but adding the logic that sets
NFS_CAP_SECURITY_LABEL near where the other capabilities are set in
_nfs4_server_capabilities().

Fixes: ce62b114bb ("NFS: Split attribute support out from the server capabilities")
Signed-off-by: Scott Mayhew <smayhew@redhat.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2021-06-03 10:14:42 -04:00
Dietmar Eggemann
68d7a19068 sched/fair: Fix util_est UTIL_AVG_UNCHANGED handling
The util_est internal UTIL_AVG_UNCHANGED flag which is used to prevent
unnecessary util_est updates uses the LSB of util_est.enqueued. It is
exposed via _task_util_est() (and task_util_est()).

Commit 92a801e5d5 ("sched/fair: Mask UTIL_AVG_UNCHANGED usages")
mentions that the LSB is lost for util_est resolution but
find_energy_efficient_cpu() checks if task_util_est() returns 0 to
return prev_cpu early.

_task_util_est() returns the max value of util_est.ewma and
util_est.enqueued or'ed w/ UTIL_AVG_UNCHANGED.
So task_util_est() returning the max of task_util() and
_task_util_est() will never return 0 under the default
SCHED_FEAT(UTIL_EST, true).

To fix this use the MSB of util_est.enqueued instead and keep the flag
util_est internal, i.e. don't export it via _task_util_est().

The maximal possible util_avg value for a task is 1024 so the MSB of
'unsigned int util_est.enqueued' isn't used to store a util value.

As a caveat the code behind the util_est_se trace point has to filter
UTIL_AVG_UNCHANGED to see the real util_est.enqueued value which should
be easy to do.

This also fixes an issue report by Xuewen Yan that util_est_update()
only used UTIL_AVG_UNCHANGED for the subtrahend of the equation:

  last_enqueued_diff = ue.enqueued - (task_util() | UTIL_AVG_UNCHANGED)

Fixes: b89997aa88 sched/pelt: Fix task util_est update filtering
Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Xuewen Yan <xuewen.yan@unisoc.com>
Reviewed-by: Vincent Donnefort <vincent.donnefort@arm.com>
Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org>
Link: https://lore.kernel.org/r/20210602145808.1562603-1-dietmar.eggemann@arm.com
2021-06-03 15:47:23 +02:00
Patrice Chotard
d38fa9a155 spi: stm32-qspi: Always wait BUSY bit to be cleared in stm32_qspi_wait_cmd()
In U-boot side, an issue has been encountered when QSPI source clock is
running at low frequency (24 MHz for example), waiting for TCF bit to be
set didn't ensure that all data has been send out the FIFO, we should also
wait that BUSY bit is cleared.

To prevent similar issue in kernel driver, we implement similar behavior
by always waiting BUSY bit to be cleared.

Signed-off-by: Patrice Chotard <patrice.chotard@foss.st.com>
Link: https://lore.kernel.org/r/20210603073421.8441-1-patrice.chotard@foss.st.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2021-06-03 13:55:36 +01:00
Axel Lin
50bec7fb4c regulator: hi6421v600: Fix .vsel_mask setting
Take ldo3_voltages as example, the ARRAY_SIZE(ldo3_voltages) is 16.
i.e. the valid selector is 0 ~ 0xF.
But in current code the vsel_mask is "(1 << 15) - 1", i.e. 0x7FFF. Fix it.

Signed-off-by: Axel Lin <axel.lin@ingics.com>
Link: https://lore.kernel.org/r/20210529013236.373847-1-axel.lin@ingics.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2021-06-03 13:55:31 +01:00
Richard Weinberger
8bef925e37 ASoC: tas2562: Fix TDM_CFG0_SAMPRATE values
TAS2562_TDM_CFG0_SAMPRATE_MASK starts at bit 1, not 0.
So all values need to be left shifted by 1.

Signed-off-by: Richard Weinberger <richard@nod.at>
Link: https://lore.kernel.org/r/20210530203446.19022-1-richard@nod.at
Signed-off-by: Mark Brown <broonie@kernel.org>
2021-06-03 13:55:21 +01:00
Jerome Brunet
d031d99b02 ASoC: meson: gx-card: fix sound-dai dt schema
There is a fair amount of warnings when running 'make dtbs_check' with
amlogic,gx-sound-card.yaml.

Ex:
arch/arm64/boot/dts/amlogic/meson-gxm-q200.dt.yaml: sound: dai-link-0:sound-dai:0:1: missing phandle tag in 0
arch/arm64/boot/dts/amlogic/meson-gxm-q200.dt.yaml: sound: dai-link-0:sound-dai:0:2: missing phandle tag in 0
arch/arm64/boot/dts/amlogic/meson-gxm-q200.dt.yaml: sound: dai-link-0:sound-dai:0: [66, 0, 0] is too long

The reason is that the sound-dai phandle provided has cells, and in such
case the schema should use 'phandle-array' instead of 'phandle'.

Fixes: fd00366b8e ("ASoC: meson: gx: add sound card dt-binding documentation")
Signed-off-by: Jerome Brunet <jbrunet@baylibre.com>
Link: https://lore.kernel.org/r/20210524093448.357140-1-jbrunet@baylibre.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2021-06-03 13:55:20 +01:00
Mark Pearson
320232caf1 ASoC: AMD Renoir: Remove fix for DMI entry on Lenovo 2020 platforms
Unfortunately the previous patch to fix issues using the AMD ACP bridge
has the side effect of breaking the dmic in other cases and needs to be
reverted.

Removing the changes while we revisit the fix and find something better.
Apologies for the churn.

Suggested-by: Gabriel Craciunescu <unix.or.die@gmail.com>
Signed-off-by: Mark Pearson <markpearson@lenovo.com>
Link: https://lore.kernel.org/r/20210602171251.3243-1-markpearson@lenovo.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2021-06-03 13:55:19 +01:00
Yang Yingliang
acbef0922c dmaengine: ipu: fix doc warning in ipu_irq.c
Fix the following make W=1 warning and correct description:

  drivers/dma/ipu/ipu_irq.c:238: warning: expecting prototype for ipu_irq_map(). Prototype was for ipu_irq_unmap() instead

Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Link: https://lore.kernel.org/r/20210603072425.2973570-1-yangyingliang@huawei.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
2021-06-03 18:17:26 +05:30
Zou Wei
dea8464ddf dmaengine: rcar-dmac: Fix PM reference leak in rcar_dmac_probe()
pm_runtime_get_sync will increment pm usage counter even it failed.
Forgetting to putting operation will result in reference leak here.
Fix it by replacing it with pm_runtime_resume_and_get to keep usage
counter balanced.

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Zou Wei <zou_wei@huawei.com>
Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Link: https://lore.kernel.org/r/1622442963-54095-1-git-send-email-zou_wei@huawei.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
2021-06-03 16:37:37 +05:30
Maximilian Luz
b430e1d65e platform/surface: aggregator: Fix event disable function
Disabling events silently fails due to the wrong command ID being used.
Instead of the command ID for the disable call, the command ID for the
enable call was being used. This causes the disable call to enable the
event instead. As the event is already enabled when we call this
function, the EC silently drops this command and does nothing.

Use the correct command ID for disabling the event to fix this.

Fixes: c167b9c7e3 ("platform/surface: Add Surface Aggregator subsystem")
Signed-off-by: Maximilian Luz <luzmaximilian@gmail.com>
Link: https://lore.kernel.org/r/20210603000636.568846-1-luzmaximilian@gmail.com
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2021-06-03 13:07:34 +02:00
Vincent Guittot
fcf6631f37 sched/pelt: Ensure that *_sum is always synced with *_avg
Rounding in PELT calculation happening when entities are attached/detached
of a cfs_rq can result into situations where util/runnable_avg is not null
but util/runnable_sum is. This is normally not possible so we need to
ensure that util/runnable_sum stays synced with util/runnable_avg.

detach_entity_load_avg() is the last place where we don't sync
util/runnable_sum with util/runnbale_avg when moving some sched_entities

Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20210601085832.12626-1-vincent.guittot@linaro.org
2021-06-03 12:55:55 +02:00
Arnd Bergmann
dad7b9896a ARM: 9081/1: fix gcc-10 thumb2-kernel regression
When building the kernel wtih gcc-10 or higher using the
CONFIG_CC_OPTIMIZE_FOR_PERFORMANCE=y flag, the compiler picks a slightly
different set of registers for the inline assembly in cpu_init() that
subsequently results in a corrupt kernel stack as well as remaining in
FIQ mode. If a banked register is used for the last argument, the wrong
version of that register gets loaded into CPSR_c.  When building in Arm
mode, the arguments are passed as immediate values and the bug cannot
happen.

This got introduced when Daniel reworked the FIQ handling and was
technically always broken, but happened to work with both clang and gcc
before gcc-10 as long as they picked one of the lower registers.
This is probably an indication that still very few people build the
kernel in Thumb2 mode.

Marek pointed out the problem on IRC, Arnd narrowed it down to this
inline assembly and Russell pinpointed the exact bug.

Change the constraints to force the final mode switch to use a non-banked
register for the argument to ensure that the correct constant gets loaded.
Another alternative would be to always use registers for the constant
arguments to avoid the #ifdef that has now become more complex.

Cc: <stable@vger.kernel.org> # v3.18+
Cc: Daniel Thompson <daniel.thompson@linaro.org>
Reported-by: Marek Vasut <marek.vasut@gmail.com>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Fixes: c0e7f7ee71 ("ARM: 8150/3: fiq: Replace default FIQ handler")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
2021-06-03 11:39:36 +01:00
Jiapeng Chong
99b18e88a1 dmaengine: idxd: Fix missing error code in idxd_cdev_open()
The error code is missing in this code scenario, add the error code
'-EINVAL' to the return value 'rc'.

Eliminate the follow smatch warning:

drivers/dma/idxd/cdev.c:113 idxd_cdev_open() warn: missing error code
'rc'.

Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Acked-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/1622628446-87909-1-git-send-email-jiapeng.chong@linux.alibaba.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
2021-06-03 12:28:37 +05:30
Yang Yingliang
d1ce245fe4 phy: Sparx5 Eth SerDes: check return value after calling platform_get_resource()
It will cause null-ptr-deref if platform_get_resource() returns NULL,
we need check the return value.

Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Link: https://lore.kernel.org/r/20210603051014.2674744-1-yangyingliang@huawei.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
2021-06-03 11:18:19 +05:30
Sergio Paracuellos
d6e9e8e5dd phy: ralink: phy-mt7621-pci: drop 'of_match_ptr' to fix -Wunused-const-variable
The of_device_id is included unconditionally by of.h header and used
in the driver as well. Remove of_match_ptr to fix W=1 compile test
warning with !CONFIG_OF:

drivers/phy/ralink/phy-mt7621-pci.c:341:34: warning: unused variable 'mt7621_pci_phy_ids' [-Wunused-const-variable]

Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Sergio Paracuellos <sergio.paracuellos@gmail.com>
Link: https://lore.kernel.org/r/20210603043219.32646-1-sergio.paracuellos@gmail.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
2021-06-03 11:18:14 +05:30
Nathan Chancellor
d4c6399900 vmlinux.lds.h: Avoid orphan section with !SMP
With x86_64_defconfig and the following configs, there is an orphan
section warning:

CONFIG_SMP=n
CONFIG_AMD_MEM_ENCRYPT=y
CONFIG_HYPERVISOR_GUEST=y
CONFIG_KVM=y
CONFIG_PARAVIRT=y

ld: warning: orphan section `.data..decrypted' from `arch/x86/kernel/cpu/vmware.o' being placed in section `.data..decrypted'
ld: warning: orphan section `.data..decrypted' from `arch/x86/kernel/kvm.o' being placed in section `.data..decrypted'

These sections are created with DEFINE_PER_CPU_DECRYPTED, which
ultimately turns into __PCPU_ATTRS, which in turn has a section
attribute with a value of PER_CPU_BASE_SECTION + the section name. When
CONFIG_SMP is not set, the base section is .data and that is not
currently handled in any linker script.

Add .data..decrypted to PERCPU_DECRYPTED_SECTION, which is included in
PERCPU_INPUT -> PERCPU_SECTION, which is include in the x86 linker
script when either CONFIG_X86_64 or CONFIG_SMP is unset, taking care of
the warning.

Fixes: ac26963a11 ("percpu: Introduce DEFINE_PER_CPU_DECRYPTED")
Link: https://github.com/ClangBuiltLinux/linux/issues/1360
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Tested-by: Nick Desaulniers <ndesaulniers@google.com> # build
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20210506001410.1026691-1-nathan@kernel.org
2021-06-02 12:43:55 -07:00
Arnd Bergmann
d94b93a910 ARM: cpuidle: Avoid orphan section warning
Since commit 83109d5d5f ("x86/build: Warn on orphan section placement"),
we get a warning for objects in orphan sections. The cpuidle implementation
for OMAP causes this when CONFIG_CPU_IDLE is disabled:

arm-linux-gnueabi-ld: warning: orphan section `__cpuidle_method_of_table' from `arch/arm/mach-omap2/pm33xx-core.o' being placed in section `__cpuidle_method_of_table'
arm-linux-gnueabi-ld: warning: orphan section `__cpuidle_method_of_table' from `arch/arm/mach-omap2/pm33xx-core.o' being placed in section `__cpuidle_method_of_table'
arm-linux-gnueabi-ld: warning: orphan section `__cpuidle_method_of_table' from `arch/arm/mach-omap2/pm33xx-core.o' being placed in section `__cpuidle_method_of_table'

Change the definition of CPUIDLE_METHOD_OF_DECLARE() to silently
drop the table and all code referenced from it when CONFIG_CPU_IDLE
is disabled.

Fixes: 06ee7a950b ("ARM: OMAP2+: pm33xx-core: Add cpuidle_ops for am335x/am437x")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Miguel Ojeda <ojeda@kernel.org>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20201230155506.1085689-1-arnd@kernel.org
2021-06-02 12:41:03 -07:00
Kamal Heib
a3e74fb924 RDMA/ipoib: Fix warning caused by destroying non-initial netns
After the commit 5ce2dced8e ("RDMA/ipoib: Set rtnl_link_ops for ipoib
interfaces"), if the IPoIB device is moved to non-initial netns,
destroying that netns lets the device vanish instead of moving it back to
the initial netns, This is happening because default_device_exit() skips
the interfaces due to having rtnl_link_ops set.

Steps to reporoduce:
  ip netns add foo
  ip link set mlx5_ib0 netns foo
  ip netns delete foo

WARNING: CPU: 1 PID: 704 at net/core/dev.c:11435 netdev_exit+0x3f/0x50
Modules linked in: xt_CHECKSUM xt_MASQUERADE xt_conntrack ipt_REJECT
nf_reject_ipv4 nft_compat nft_counter nft_chain_nat nf_nat nf_conntrack
nf_defrag_ipv6 nf_defrag_ipv4 nf_tables nfnetlink tun d
 fuse
CPU: 1 PID: 704 Comm: kworker/u64:3 Tainted: G S      W  5.13.0-rc1+ #1
Hardware name: Dell Inc. PowerEdge R630/02C2CP, BIOS 2.1.5 04/11/2016
Workqueue: netns cleanup_net
RIP: 0010:netdev_exit+0x3f/0x50
Code: 48 8b bb 30 01 00 00 e8 ef 81 b1 ff 48 81 fb c0 3a 54 a1 74 13 48
8b 83 90 00 00 00 48 81 c3 90 00 00 00 48 39 d8 75 02 5b c3 <0f> 0b 5b
c3 66 66 2e 0f 1f 84 00 00 00 00 00 66 90 0f 1f 44 00
RSP: 0018:ffffb297079d7e08 EFLAGS: 00010206
RAX: ffff8eb542c00040 RBX: ffff8eb541333150 RCX: 000000008010000d
RDX: 000000008010000e RSI: 000000008010000d RDI: ffff8eb440042c00
RBP: ffffb297079d7e48 R08: 0000000000000001 R09: ffffffff9fdeac00
R10: ffff8eb5003be000 R11: 0000000000000001 R12: ffffffffa1545620
R13: ffffffffa1545628 R14: 0000000000000000 R15: ffffffffa1543b20
FS:  0000000000000000(0000) GS:ffff8ed37fa00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00005601b5f4c2e8 CR3: 0000001fc8c10002 CR4: 00000000003706e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 ops_exit_list.isra.9+0x36/0x70
 cleanup_net+0x234/0x390
 process_one_work+0x1cb/0x360
 ? process_one_work+0x360/0x360
 worker_thread+0x30/0x370
 ? process_one_work+0x360/0x360
 kthread+0x116/0x130
 ? kthread_park+0x80/0x80
 ret_from_fork+0x22/0x30

To avoid the above warning and later on the kernel panic that could happen
on shutdown due to a NULL pointer dereference, make sure to set the
netns_refund flag that was introduced by commit 3a5ca85707 ("can: dev:
Move device back to init netns on owning netns delete") to properly
restore the IPoIB interfaces to the initial netns.

Fixes: 5ce2dced8e ("RDMA/ipoib: Set rtnl_link_ops for ipoib interfaces")
Link: https://lore.kernel.org/r/20210525150134.139342-1-kamalheib1@gmail.com
Signed-off-by: Kamal Heib <kamalheib1@gmail.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-06-02 15:20:39 -03:00
Kyle Tso
6490fa5655 usb: pd: Set PD_T_SINK_WAIT_CAP to 310ms
Current timer PD_T_SINK_WAIT_CAP is set to 240ms which will violate the
SinkWaitCapTimer (tTypeCSinkWaitCap 310 - 620 ms) defined in the PD
Spec if the port is faster enough when running the state machine. Set it
to the lower bound 310ms to ensure the timeout is in Spec.

Fixes: f0690a25a1 ("staging: typec: USB Type-C Port Manager (tcpm)")
Cc: stable <stable@vger.kernel.org>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Kyle Tso <kyletso@google.com>
Link: https://lore.kernel.org/r/20210528081613.730661-1-kyletso@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-06-02 16:59:29 +02:00
Thomas Petazzoni
b65ba0c362 usb: musb: fix MUSB_QUIRK_B_DISCONNECT_99 handling
In commit 92af4fc6ec ("usb: musb: Fix suspend with devices
connected for a64"), the logic to support the
MUSB_QUIRK_B_DISCONNECT_99 quirk was modified to only conditionally
schedule the musb->irq_work delayed work.

This commit badly breaks ECM Gadget on AM335X. Indeed, with this
commit, one can observe massive packet loss:

$ ping 192.168.0.100
...
15 packets transmitted, 3 received, 80% packet loss, time 14316ms

Reverting this commit brings back a properly functioning ECM
Gadget. An analysis of the commit seems to indicate that a mistake was
made: the previous code was not falling through into the
MUSB_QUIRK_B_INVALID_VBUS_91, but now it is, unless the condition is
taken.

Changing the logic to be as it was before the problematic commit *and*
only conditionally scheduling musb->irq_work resolves the regression:

$ ping 192.168.0.100
...
64 packets transmitted, 64 received, 0% packet loss, time 64475ms

Fixes: 92af4fc6ec ("usb: musb: Fix suspend with devices connected for a64")
Cc: stable@vger.kernel.org
Tested-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Tested-by: Drew Fustini <drew@beagleboard.org>
Acked-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@bootlin.com>
Link: https://lore.kernel.org/r/20210528140446.278076-1-thomas.petazzoni@bootlin.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-06-02 16:58:08 +02:00
Jack Pham
03715ea2e3 usb: dwc3: gadget: Bail from dwc3_gadget_exit() if dwc->gadget is NULL
There exists a possible scenario in which dwc3_gadget_init() can fail:
during during host -> peripheral mode switch in dwc3_set_mode(), and
a pending gadget driver fails to bind.  Then, if the DRD undergoes
another mode switch from peripheral->host the resulting
dwc3_gadget_exit() will attempt to reference an invalid and dangling
dwc->gadget pointer as well as call dma_free_coherent() on unmapped
DMA pointers.

The exact scenario can be reproduced as follows:
 - Start DWC3 in peripheral mode
 - Configure ConfigFS gadget with FunctionFS instance (or use g_ffs)
 - Run FunctionFS userspace application (open EPs, write descriptors, etc)
 - Bind gadget driver to DWC3's UDC
 - Switch DWC3 to host mode
   => dwc3_gadget_exit() is called. usb_del_gadget() will put the
	ConfigFS driver instance on the gadget_driver_pending_list
 - Stop FunctionFS application (closes the ep files)
 - Switch DWC3 to peripheral mode
   => dwc3_gadget_init() fails as usb_add_gadget() calls
	check_pending_gadget_drivers() and attempts to rebind the UDC
	to the ConfigFS gadget but fails with -19 (-ENODEV) because the
	FFS instance is not in FFS_ACTIVE state (userspace has not
	re-opened and written the descriptors yet, i.e. desc_ready!=0).
 - Switch DWC3 back to host mode
   => dwc3_gadget_exit() is called again, but this time dwc->gadget
	is invalid.

Although it can be argued that userspace should take responsibility
for ensuring that the FunctionFS application be ready prior to
allowing the composite driver bind to the UDC, failure to do so
should not result in a panic from the kernel driver.

Fix this by setting dwc->gadget to NULL in the failure path of
dwc3_gadget_init() and add a check to dwc3_gadget_exit() to bail out
unless the gadget pointer is valid.

Fixes: e81a7018d9 ("usb: dwc3: allocate gadget structure dynamically")
Cc: <stable@vger.kernel.org>
Reviewed-by: Peter Chen <peter.chen@kernel.org>
Signed-off-by: Jack Pham <jackp@codeaurora.org>
Link: https://lore.kernel.org/r/20210528160405.17550-1-jackp@codeaurora.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-06-02 16:50:54 +02:00
Wesley Cheng
8212937305 usb: dwc3: gadget: Disable gadget IRQ during pullup disable
Current sequence utilizes dwc3_gadget_disable_irq() alongside
synchronize_irq() to ensure that no further DWC3 events are generated.
However, the dwc3_gadget_disable_irq() API only disables device
specific events.  Endpoint events can still be generated.  Briefly
disable the interrupt line, so that the cleanup code can run to
prevent device and endpoint events. (i.e. __dwc3_gadget_stop() and
dwc3_stop_active_transfers() respectively)

Without doing so, it can lead to both the interrupt handler and the
pullup disable routine both writing to the GEVNTCOUNT register, which
will cause an incorrect count being read from future interrupts.

Fixes: ae7e86108b ("usb: dwc3: Stop active transfers before halting the controller")
Signed-off-by: Wesley Cheng <wcheng@codeaurora.org>
Link: https://lore.kernel.org/r/1621571037-1424-1-git-send-email-wcheng@codeaurora.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-06-02 16:47:08 +02:00
zpershuai
f131767eef spi: spi-zynq-qspi: Fix some wrong goto jumps & missing error code
In zynq_qspi_probe function, when enable the device clock is done,
the return of all the functions should goto the clk_dis_all label.

If num_cs is not right then this should return a negative error
code but currently it returns success.

Signed-off-by: zpershuai <zpershuai@gmail.com>
Link: https://lore.kernel.org/r/1622110857-21812-1-git-send-email-zpershuai@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2021-06-02 12:03:26 +01:00
Matti Vaittinen
bc537e65b0 regulator: bd718x7: Fix the BUCK7 voltage setting on BD71837
Changing the BD71837 voltages for other regulators except the first 4 BUCKs
should be forbidden when the regulator is enabled. There may be out-of-spec
voltage spikes if the voltage of these "non DVS" bucks is changed when
enabled. This restriction was accidentally removed when the LDO voltage
change was allowed for BD71847. (It was not noticed that the BD71837
BUCK7 used same voltage setting function as LDOs).

Additionally this bug causes incorrect voltage monitoring register access.
The voltage change function accidentally used for bd71837 BUCK7 is
intended to only handle LDO voltage changes. A BD71847 LDO specific
voltage monitoring disabling code gets executed on BD71837 and register
offsets are wrongly calculated as regulator is assumed to be an LDO.

Prevent the BD71837 BUCK7 voltage change when BUCK7 is enabled by using
the correct voltage setting operation.

Fixes: 9bcbabafa1 ("regulator: bd718x7: remove voltage change restriction from BD71847 LDOs")
Signed-off-by: Matti Vaittinen <matti.vaittinen@fi.rohmeurope.com>
Link: https://lore.kernel.org/r/bd8c00931421fafa57e3fdf46557a83075b7cc17.1622610103.git.matti.vaittinen@fi.rohmeurope.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2021-06-02 12:03:21 +01:00
Mark Pearson
19a0aa9b04 ASoC: AMD Renoir - add DMI entry for Lenovo 2020 AMD platforms
More laptops identified where the AMD ACP bridge needs to be blocked
or the microphone will not work when connected to HDMI.

Use DMI to block the microphone PCM device for these platforms.

Suggested-by: Gabriel Craciunescu <nix.or.die@gmail.com>
Signed-off-by: Mark Pearson <markpearson@lenovo.com>
Link: https://lore.kernel.org/r/20210531145502.6079-1-markpearson@lenovo.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2021-06-02 12:03:19 +01:00
Dai Ngo
f8849e206e NFSv4: nfs4_proc_set_acl needs to restore NFS_CAP_UIDGID_NOMAP on error.
Currently if __nfs4_proc_set_acl fails with NFS4ERR_BADOWNER it
re-enables the idmapper by clearing NFS_CAP_UIDGID_NOMAP before
retrying again. The NFS_CAP_UIDGID_NOMAP remains cleared even if
the retry fails. This causes problem for subsequent setattr
requests for v4 server that does not have idmapping configured.

This patch modifies nfs4_proc_set_acl to detect NFS4ERR_BADOWNER
and NFS4ERR_BADNAME and skips the retry, since the kernel isn't
involved in encoding the ACEs, and return -EINVAL.

Steps to reproduce the problem:

 # mount -o vers=4.1,sec=sys server:/export/test /tmp/mnt
 # touch /tmp/mnt/file1
 # chown 99 /tmp/mnt/file1
 # nfs4_setfacl -a A::unknown.user@xyz.com:wrtncy /tmp/mnt/file1
 Failed setxattr operation: Invalid argument
 # chown 99 /tmp/mnt/file1
 chown: changing ownership of ‘/tmp/mnt/file1’: Invalid argument
 # umount /tmp/mnt
 # mount -o vers=4.1,sec=sys server:/export/test /tmp/mnt
 # chown 99 /tmp/mnt/file1
 #

v2: detect NFS4ERR_BADOWNER and NFS4ERR_BADNAME and skip retry
       in nfs4_proc_set_acl.
Signed-off-by: Dai Ngo <dai.ngo@oracle.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2021-06-01 13:16:17 -04:00
Kan Liang
848ff37686 perf/x86/intel/uncore: Fix M2M event umask for Ice Lake server
Perf tool errors out with the latest event list for the Ice Lake server.

event syntax error: 'unc_m2m_imc_reads.to_pmm'
                           \___ value too big for format, maximum is 255

The same as the Snow Ridge server, the M2M uncore unit in the Ice Lake
server has the unit mask extension field as well.

Fixes: 2b3b76b5ec ("perf/x86/intel/uncore: Add Ice Lake server uncore support")
Reported-by: Jin Yao <yao.jin@linux.intel.com>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/1622552943-119174-1-git-send-email-kan.liang@linux.intel.com
2021-06-01 16:00:05 +02:00
Lukas Wunner
2ec6f20b33 spi: Cleanup on failure of initial setup
Commit c7299fea67 ("spi: Fix spi device unregister flow") changed the
SPI core's behavior if the ->setup() hook returns an error upon adding
an spi_device:  Before, the ->cleanup() hook was invoked to free any
allocations that were made by ->setup().  With the commit, that's no
longer the case, so the ->setup() hook is expected to free the
allocations itself.

I've identified 5 drivers which depend on the old behavior and am fixing
them up hereinafter: spi-bitbang.c spi-fsl-spi.c spi-omap-uwire.c
spi-omap2-mcspi.c spi-pxa2xx.c

Importantly, ->setup() is not only invoked on spi_device *addition*:
It may subsequently be called to *change* SPI parameters.  If changing
these SPI parameters fails, freeing memory allocations would be wrong.
That should only be done if the spi_device is finally destroyed.
I am therefore using a bool "initial_setup" in 4 of the affected drivers
to differentiate between the invocation on *adding* the spi_device and
any subsequent invocations: spi-bitbang.c spi-fsl-spi.c spi-omap-uwire.c
spi-omap2-mcspi.c

In spi-pxa2xx.c, it seems the ->setup() hook can only fail on spi_device
addition, not any subsequent calls.  It therefore doesn't need the bool.

It's worth noting that 5 other drivers already perform a cleanup if the
->setup() hook fails.  Before c7299fea67, they caused a double-free
if ->setup() failed on spi_device addition.  Since the commit, they're
fine.  These drivers are: spi-mpc512x-psc.c spi-pl022.c spi-s3c64xx.c
spi-st-ssc4.c spi-tegra114.c

(spi-pxa2xx.c also already performs a cleanup, but only in one of
several error paths.)

Fixes: c7299fea67 ("spi: Fix spi device unregister flow")
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Cc: Saravana Kannan <saravanak@google.com>
Acked-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> # pxa2xx
Link: https://lore.kernel.org/r/f76a0599469f265b69c371538794101fa37b5536.1622149321.git.lukas@wunner.de
Signed-off-by: Mark Brown <broonie@kernel.org>
2021-06-01 14:03:12 +01:00
Axel Lin
1963fa67d7 regulator: atc260x: Fix n_voltages and min_sel for pickable linear ranges
The .n_voltages was missed for pickable linear ranges, fix it.
The min_sel for each pickable range should be starting from 0.
Also fix atc260x_ldo_voltage_range_sel setting (bit 5 - LDO<N>_VOL_SEL
in datasheet).

Fixes: 3b15ccac16 ("regulator: Add regulator driver for ATC260x PMICs")
Signed-off-by: Axel Lin <axel.lin@ingics.com>
Reviewed-by: Cristian Ciocaltea <cristian.ciocaltea@gmail.com>
Link: https://lore.kernel.org/r/20210528230147.363974-1-axel.lin@ingics.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2021-06-01 14:03:06 +01:00
ChiYuan Huang
46639a5e68 regulator: rtmv20: Fix to make regcache value first reading back from HW
- Fix to make regcache value first reading back from HW.

Signed-off-by: ChiYuan Huang <cy_huang@richtek.com>
Link: https://lore.kernel.org/r/1622542155-6373-1-git-send-email-u0084500@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2021-06-01 14:03:05 +01:00
Axel Lin
89082179ec regulator: mt6315: Fix function prototype for mt6315_map_mode
The .of_map_mode should has below function prototype:
	unsigned int (*of_map_mode)(unsigned int mode);

Signed-off-by: Axel Lin <axel.lin@ingics.com>
Link: https://lore.kernel.org/r/20210530022109.425054-1-axel.lin@ingics.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2021-06-01 14:03:04 +01:00
Axel Lin
5f01de6ffa regulator: rtmv20: Add Richtek to Kconfig text
The other Richtek drivers has Richtek prefix, make it consistent.

Signed-off-by: Axel Lin <axel.lin@ingics.com>
Link: https://lore.kernel.org/r/20210530124101.477727-2-axel.lin@ingics.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2021-06-01 14:03:04 +01:00
Axel Lin
86ab21cc39 regulator: rtmv20: Fix .set_current_limit/.get_current_limit callbacks
Current code does not set .curr_table and .n_linear_ranges settings,
so it cannot use the regulator_get/set_current_limit_regmap helpers.
If we setup the curr_table, it will has 200 entries.
Implement customized .set_current_limit/.get_current_limit callbacks
instead.

Fixes: b8c054a5ea ("regulator: rtmv20: Adds support for Richtek RTMV20 load switch regulator")
Signed-off-by: Axel Lin <axel.lin@ingics.com>
Reviewed-by: ChiYuan Huang <cy_huang@richtek.com>
Link: https://lore.kernel.org/r/20210530124101.477727-1-axel.lin@ingics.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2021-06-01 14:03:03 +01:00
Kai Vehmanen
b640e8a4bd ASoC: SOF: reset enabled_cores state at suspend
The recent changes to use common code to power up/down DSP cores also
removed the reset of the core state at suspend. It turns out this is
still needed. When the firmware state is reset to
SOF_FW_BOOT_NOT_STARTED, also enabled_cores should be reset, and
existing DSP drivers depend on this.

BugLink: https://github.com/thesofproject/linux/issues/2824
Fixes: 42077f08b3 ("ASoC: SOF: update dsp core power status in common APIs")
Signed-off-by: Kai Vehmanen <kai.vehmanen@linux.intel.com>
Reviewed-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
Reviewed-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com>
Link: https://lore.kernel.org/r/20210528144330.2551-1-kai.vehmanen@linux.intel.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2021-06-01 14:03:01 +01:00
Nicolas Cavallari
a8437f0538 ASoC: fsl-asoc-card: Set .owner attribute when registering card.
Otherwise, when compiled as module, a WARN_ON is triggered:

WARNING: CPU: 0 PID: 5 at sound/core/init.c:208 snd_card_new+0x310/0x39c [snd]
[...]
CPU: 0 PID: 5 Comm: kworker/0:0 Not tainted 5.10.39 #1
Hardware name: Freescale i.MX6 Quad/DualLite (Device Tree)
Workqueue: events deferred_probe_work_func
[<c0111988>] (unwind_backtrace) from [<c010c8ac>] (show_stack+0x10/0x14)
[<c010c8ac>] (show_stack) from [<c092784c>] (dump_stack+0xdc/0x104)
[<c092784c>] (dump_stack) from [<c0129710>] (__warn+0xd8/0x114)
[<c0129710>] (__warn) from [<c0922a48>] (warn_slowpath_fmt+0x5c/0xc4)
[<c0922a48>] (warn_slowpath_fmt) from [<bf0496f8>] (snd_card_new+0x310/0x39c [snd])
[<bf0496f8>] (snd_card_new [snd]) from [<bf1d7df8>] (snd_soc_bind_card+0x334/0x9c4 [snd_soc_core])
[<bf1d7df8>] (snd_soc_bind_card [snd_soc_core]) from [<bf1e9cd8>] (devm_snd_soc_register_card+0x30/0x6c [snd_soc_core])
[<bf1e9cd8>] (devm_snd_soc_register_card [snd_soc_core]) from [<bf22d964>] (fsl_asoc_card_probe+0x550/0xcc8 [snd_soc_fsl_asoc_card])
[<bf22d964>] (fsl_asoc_card_probe [snd_soc_fsl_asoc_card]) from [<c060c930>] (platform_drv_probe+0x48/0x98)
[...]

Signed-off-by: Nicolas Cavallari <nicolas.cavallari@green-communications.fr>
Acked-by: Shengjiu Wang <shengjiu.wang@gmail.com>
Link: https://lore.kernel.org/r/20210527163409.22049-1-nicolas.cavallari@green-communications.fr
Signed-off-by: Mark Brown <broonie@kernel.org>
2021-06-01 14:03:00 +01:00
Colin Ian King
ce1f25718b ASoC: topology: Fix spelling mistake "vesion" -> "version"
There are spelling mistakes in comments. Fix them.

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Link: https://lore.kernel.org/r/20210601103506.9477-1-colin.king@canonical.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2021-06-01 14:02:59 +01:00
Mathy Vanhoef
bddc0c411a mac80211: Fix NULL ptr deref for injected rate info
The commit cb17ed29a7 ("mac80211: parse radiotap header when selecting Tx
queue") moved the code to validate the radiotap header from
ieee80211_monitor_start_xmit to ieee80211_parse_tx_radiotap. This made is
possible to share more code with the new Tx queue selection code for
injected frames. But at the same time, it now required the call of
ieee80211_parse_tx_radiotap at the beginning of functions which wanted to
handle the radiotap header. And this broke the rate parser for radiotap
header parser.

The radiotap parser for rates is operating most of the time only on the
data in the actual radiotap header. But for the 802.11a/b/g rates, it must
also know the selected band from the chandef information. But this
information is only written to the ieee80211_tx_info at the end of the
ieee80211_monitor_start_xmit - long after ieee80211_parse_tx_radiotap was
already called. The info->band information was therefore always 0
(NL80211_BAND_2GHZ) when the parser code tried to access it.

For a 5GHz only device, injecting a frame with 802.11a rates would cause a
NULL pointer dereference because local->hw.wiphy->bands[NL80211_BAND_2GHZ]
would most likely have been NULL when the radiotap parser searched for the
correct rate index of the driver.

Cc: stable@vger.kernel.org
Reported-by: Ben Greear <greearb@candelatech.com>
Fixes: cb17ed29a7 ("mac80211: parse radiotap header when selecting Tx queue")
Signed-off-by: Mathy Vanhoef <Mathy.Vanhoef@kuleuven.be>
[sven@narfation.org: added commit message]
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Link: https://lore.kernel.org/r/20210530133226.40587-1-sven@narfation.org
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2021-05-31 21:40:17 +02:00
Du Cheng
e298aa358f mac80211: fix skb length check in ieee80211_scan_rx()
Replace hard-coded compile-time constants for header length check
with dynamic determination based on the frame type. Otherwise, we
hit a validation WARN_ON in cfg80211 later.

Fixes: cd418ba63f ("mac80211: convert S1G beacon to scan results")
Reported-by: syzbot+405843667e93b9790fc1@syzkaller.appspotmail.com
Signed-off-by: Du Cheng <ducheng2@gmail.com>
Link: https://lore.kernel.org/r/20210510041649.589754-1-ducheng2@gmail.com
[style fixes, reword commit message]
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2021-05-31 21:39:10 +02:00
Johannes Berg
b90f51e8e1 staging: rtl8723bs: fix monitor netdev register/unregister
Due to the locking changes and callbacks happening inside
cfg80211, we need to use cfg80211 versions of the register
and unregister functions if called within cfg80211 methods,
otherwise deadlocks occur.

Fixes: a05829a722 ("cfg80211: avoid holding the RTNL when calling the driver")
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Link: https://lore.kernel.org/r/20210426212801.3d902cc9e6f4.Ie0b1e0c545920c61400a4b7d0f384ea61feb645a@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2021-05-31 21:27:15 +02:00
Du Cheng
a64b6a25dd cfg80211: call cfg80211_leave_ocb when switching away from OCB
If the userland switches back-and-forth between NL80211_IFTYPE_OCB and
NL80211_IFTYPE_ADHOC via send_msg(NL80211_CMD_SET_INTERFACE), there is a
chance where the cleanup cfg80211_leave_ocb() is not called. This leads
to initialization of in-use memory (e.g. init u.ibss while in-use by
u.ocb) due to a shared struct/union within ieee80211_sub_if_data:

struct ieee80211_sub_if_data {
    ...
    union {
        struct ieee80211_if_ap ap;
        struct ieee80211_if_vlan vlan;
        struct ieee80211_if_managed mgd;
        struct ieee80211_if_ibss ibss; // <- shares address
        struct ieee80211_if_mesh mesh;
        struct ieee80211_if_ocb ocb; // <- shares address
        struct ieee80211_if_mntr mntr;
        struct ieee80211_if_nan nan;
    } u;
    ...
}

Therefore add handling of otype == NL80211_IFTYPE_OCB, during
cfg80211_change_iface() to perform cleanup when leaving OCB mode.

link to syzkaller bug:
https://syzkaller.appspot.com/bug?id=0612dbfa595bf4b9b680ff7b4948257b8e3732d5

Reported-by: syzbot+105896fac213f26056f9@syzkaller.appspotmail.com
Signed-off-by: Du Cheng <ducheng2@gmail.com>
Link: https://lore.kernel.org/r/20210428063941.105161-1-ducheng2@gmail.com
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2021-05-31 21:27:15 +02:00
Brian Norris
34fb4db5ab mac80211: correct ieee80211_iterate_active_interfaces_mtx() locking comments
Commit a05829a722 ("cfg80211: avoid holding the RTNL when calling the
driver") dropped usage of RTNL here and replaced it with
hw->wiphy->mutex. But we didn't update the comments.

Fixes: a05829a722 ("cfg80211: avoid holding the RTNL when calling the driver")
Signed-off-by: Brian Norris <briannorris@chromium.org>
Link: https://lore.kernel.org/r/20210505202829.1039400-1-briannorris@chromium.org
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2021-05-31 21:27:15 +02:00
Johannes Berg
bd18de5179 mac80211_hwsim: drop pending frames on stop
Syzbot reports that we may be able to get into a situation where
mac80211 has pending ACK frames on shutdown with hwsim. It appears
that the reason for this is that syzbot uses the wmediumd hooks to
intercept/injection frames, and may shut down hwsim, removing the
radio(s), while frames are pending in the air simulation.

Clean out the pending queue when the interface is stopped, after
this the frames can't be reported back to mac80211 properly anyway.

Reported-by: syzbot+a063bbf0b15737362592@syzkaller.appspotmail.com
Link: https://lore.kernel.org/r/20210517170429.b0f85ab0eda1.Ie42a6ec6b940c971f3441286aeaaae2fe368e29a@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2021-05-31 21:27:07 +02:00
Johannes Berg
0ee4d55534 mac80211: remove warning in ieee80211_get_sband()
Syzbot reports that it's possible to hit this from userspace,
by trying to add a station before any other connection setup
has been done. Instead of trying to catch this in some other
way simply remove the warning, that will appropriately reject
the call from userspace.

Reported-by: syzbot+7716dbc401d9a437890d@syzkaller.appspotmail.com
Link: https://lore.kernel.org/r/20210517164715.f537da276d17.Id05f40ec8761d6a8cc2df87f1aa09c651988a586@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2021-05-31 21:27:05 +02:00
Yang Li
b8203ec7f5 phy: ti: Fix an error code in wiz_probe()
When the code execute this if statement, the value of ret is 0.
However, we can see from the dev_err() log that the value of
ret should be -EINVAL.

Clean up smatch warning:

drivers/phy/ti/phy-j721e-wiz.c:1216 wiz_probe() warn: missing error code 'ret'

Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Fixes: c9f9eba066 ("phy: ti: j721e-wiz: Manage typec-gpio-dir")
Signed-off-by: Yang Li <yang.lee@linux.alibaba.com>
Link: https://lore.kernel.org/r/1621939832-65535-1-git-send-email-yang.lee@linux.alibaba.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
2021-05-31 14:11:44 +05:30
Tiezhu Yang
aaac9a1bd3 phy: phy-mtk-tphy: Fix some resource leaks in mtk_phy_init()
Use clk_disable_unprepare() in the error path of mtk_phy_init() to fix
some resource leaks.

Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Reviewed-by: Chunfeng Yun <chunfeng.yun@mediatek.com>
Link: https://lore.kernel.org/r/1621420659-15858-1-git-send-email-yangtiezhu@loongson.cn
Signed-off-by: Vinod Koul <vkoul@kernel.org>
2021-05-31 13:58:28 +05:30
Wang Wensheng
6411e386db phy: cadence: Sierra: Fix error return code in cdns_sierra_phy_probe()
Fix to return a negative error code from the error handling
case instead of 0, as done elsewhere in this function.

Fixes: a43f72ae13 ("phy: cadence: Sierra: Change MAX_LANES of Sierra to 16")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Wang Wensheng <wangwensheng4@huawei.com>
Link: https://lore.kernel.org/r/20210517015749.127799-1-wangwensheng4@huawei.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
2021-05-31 13:50:05 +05:30
Kan Liang
4a0e3ff309 perf/x86/intel/uncore: Fix a kernel WARNING triggered by maxcpus=1
A kernel WARNING may be triggered when setting maxcpus=1.

The uncore counters are Die-scope. When probing a PCI device, only the
BUS information can be retrieved. The uncore driver has to maintain a
mapping table used to calculate the logical Die ID from a given BUS#.

Before the patch ba9506be4e, the mapping table stores the mapping
information from the BUS# -> a Physical Socket ID. To calculate the
logical die ID, perf does,
- In snbep_pci2phy_map_init(), retrieve the BUS# -> a Physical Socket ID
  from the UBOX PCI configure space.
- Calculate the mapping information (a BUS# -> a Physical Socket ID) for
  the other PCI BUS.
- In the uncore_pci_probe(), get the physical Socket ID from a given BUS
  and the mapping table.
- Calculate the logical Die ID

Since only the logical Die ID is required, with the patch ba9506be4e,
the mapping table stores the mapping information from the BUS# -> a
logical Die ID. Now perf does,
- In snbep_pci2phy_map_init(), retrieve the BUS# -> a Physical Socket ID
  from the UBOX PCI configure space.
- Calculate the logical Die ID
- Calculate the mapping information (a BUS# -> a logical Die ID) for the
  other PCI BUS.
- In the uncore_pci_probe(), get the logical die ID from a given BUS and
  the mapping table.

When calculating the logical Die ID, -1 may be returned, especially when
maxcpus=1. Here, -1 means the logical Die ID is not found. But when
calculating the mapping information for the other PCI BUS, -1 indicates
that it's the other PCI BUS that requires the calculation of the
mapping. The driver will mistakenly do the calculation.

Uses the -ENODEV to indicate the case which the logical Die ID is not
found. The driver will not mess up the mapping table anymore.

Fixes: ba9506be4e ("perf/x86/intel/uncore: Store the logical die id instead of the physical die id.")
Reported-by: John Donnelly <john.p.donnelly@oracle.com>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: John Donnelly <john.p.donnelly@oracle.com>
Tested-by: John Donnelly <john.p.donnelly@oracle.com>
Link: https://lkml.kernel.org/r/1622037527-156028-1-git-send-email-kan.liang@linux.intel.com
2021-05-31 10:14:51 +02:00
Marco Elver
6c605f8371 perf: Fix data race between pin_count increment/decrement
KCSAN reports a data race between increment and decrement of pin_count:

  write to 0xffff888237c2d4e0 of 4 bytes by task 15740 on cpu 1:
   find_get_context		kernel/events/core.c:4617
   __do_sys_perf_event_open	kernel/events/core.c:12097 [inline]
   __se_sys_perf_event_open	kernel/events/core.c:11933
   ...
  read to 0xffff888237c2d4e0 of 4 bytes by task 15743 on cpu 0:
   perf_unpin_context		kernel/events/core.c:1525 [inline]
   __do_sys_perf_event_open	kernel/events/core.c:12328 [inline]
   __se_sys_perf_event_open	kernel/events/core.c:11933
   ...

Because neither read-modify-write here is atomic, this can lead to one
of the operations being lost, resulting in an inconsistent pin_count.
Fix it by adding the missing locking in the CPU-event case.

Fixes: fe4b04fa31 ("perf: Cure task_oncpu_function_call() races")
Reported-by: syzbot+142c9018f5962db69c7e@syzkaller.appspotmail.com
Signed-off-by: Marco Elver <elver@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20210527104711.2671610-1-elver@google.com
2021-05-31 10:14:51 +02:00
Frederic Weisbecker
f268c3737e tick/nohz: Only check for RCU deferred wakeup on user/guest entry when needed
Checking for and processing RCU-nocb deferred wakeup upon user/guest
entry is only relevant when nohz_full runs on the local CPU, otherwise
the periodic tick should take care of it.

Make sure we don't needlessly pollute these fast-paths as a -3%
performance regression on a will-it-scale.per_process_ops has been
reported so far.

Fixes: 47b8ff194c (entry: Explicitly flush pending rcuog wakeup before last rescheduling point)
Fixes: 4ae7dc97f7 (entry/kvm: Explicitly flush pending rcuog wakeup before last rescheduling point)
Reported-by: kernel test robot <oliver.sang@intel.com>
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Paul E. McKenney <paulmck@kernel.org>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20210527113441.465489-1-frederic@kernel.org
2021-05-31 10:14:49 +02:00
Vincent Guittot
02da26ad5e sched/fair: Make sure to update tg contrib for blocked load
During the update of fair blocked load (__update_blocked_fair()), we
update the contribution of the cfs in tg->load_avg if cfs_rq's pelt
has decayed.  Nevertheless, the pelt values of a cfs_rq could have
been recently updated while propagating the change of a child. In this
case, cfs_rq's pelt will not decayed because it has already been
updated and we don't update tg->load_avg.

__update_blocked_fair
  ...
  for_each_leaf_cfs_rq_safe: child cfs_rq
    update cfs_rq_load_avg() for child cfs_rq
    ...
    update_load_avg(cfs_rq_of(se), se, 0)
      ...
      update cfs_rq_load_avg() for parent cfs_rq
		-propagation of child's load makes parent cfs_rq->load_sum
		 becoming null
        -UPDATE_TG is not set so it doesn't update parent
		 cfs_rq->tg_load_avg_contrib
  ..
  for_each_leaf_cfs_rq_safe: parent cfs_rq
    update cfs_rq_load_avg() for parent cfs_rq
      - nothing to do because parent cfs_rq has already been updated
		recently so cfs_rq->tg_load_avg_contrib is not updated
    ...
    parent cfs_rq is decayed
      list_del_leaf_cfs_rq parent cfs_rq
	  - but it still contibutes to tg->load_avg

we must set UPDATE_TG flags when propagting pending load to the parent

Fixes: 039ae8bcf7 ("sched/fair: Fix O(nr_cgroups) in the load balancing path")
Reported-by: Odin Ugedal <odin@uged.al>
Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Odin Ugedal <odin@uged.al>
Link: https://lkml.kernel.org/r/20210527122916.27683-3-vincent.guittot@linaro.org
2021-05-31 10:14:48 +02:00
Vincent Guittot
7c7ad626d9 sched/fair: Keep load_avg and load_sum synced
when removing a cfs_rq from the list we only check _sum value so we must
ensure that _avg and _sum stay synced so load_sum can't be null whereas
load_avg is not after propagating load in the cgroup hierarchy.

Use load_avg to compute load_sum similarly to what is done for util_sum
and runnable_sum.

Fixes: 0e2d2aaaae ("sched/fair: Rewrite PELT migration propagation")
Reported-by: Odin Ugedal <odin@uged.al>
Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Odin Ugedal <odin@uged.al>
Link: https://lkml.kernel.org/r/20210527122916.27683-2-vincent.guittot@linaro.org
2021-05-31 10:14:48 +02:00
Yang Yingliang
fffdaba402 dmaengine: stedma40: add missing iounmap() on error in d40_probe()
Add the missing iounmap() before return from d40_probe()
in the error handling case.

Fixes: 8d318a50b3 ("DMAENGINE: Support for ST-Ericssons DMA40 block v3")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Link: https://lore.kernel.org/r/20210518141108.1324127-1-yangyingliang@huawei.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
2021-05-31 09:47:27 +05:30
Randy Dunlap
8e2e4f3c58 dmaengine: SF_PDMA depends on HAS_IOMEM
When CONFIG_HAS_IOMEM is not set/enabled, certain iomap() family
functions [including ioremap(), devm_ioremap(), etc.] are not
available.
Drivers that use these functions should depend on HAS_IOMEM so that
they do not cause build errors.

Mends this build error:
s390-linux-ld: drivers/dma/sf-pdma/sf-pdma.o: in function `sf_pdma_probe':
sf-pdma.c:(.text+0x1668): undefined reference to `devm_ioremap_resource'

Fixes: 6973886ad5 ("dmaengine: sf-pdma: add platform DMA support for HiFive Unleashed A00")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Reported-by: kernel test robot <lkp@intel.com>
Cc: Green Wan <green.wan@sifive.com>
Cc: Vinod Koul <vkoul@kernel.org>
Cc: dmaengine@vger.kernel.org
Link: https://lore.kernel.org/r/20210522021313.16405-4-rdunlap@infradead.org
Signed-off-by: Vinod Koul <vkoul@kernel.org>
2021-05-31 09:44:14 +05:30
Randy Dunlap
0cfbb589d6 dmaengine: QCOM_HIDMA_MGMT depends on HAS_IOMEM
When CONFIG_HAS_IOMEM is not set/enabled, certain iomap() family
functions [including ioremap(), devm_ioremap(), etc.] are not
available.
Drivers that use these functions should depend on HAS_IOMEM so that
they do not cause build errors.

Rectifies these build errors:
s390-linux-ld: drivers/dma/qcom/hidma_mgmt.o: in function `hidma_mgmt_probe':
hidma_mgmt.c:(.text+0x780): undefined reference to `devm_ioremap_resource'
s390-linux-ld: drivers/dma/qcom/hidma_mgmt.o: in function `hidma_mgmt_init':
hidma_mgmt.c:(.init.text+0x126): undefined reference to `of_address_to_resource'
s390-linux-ld: hidma_mgmt.c:(.init.text+0x16e): undefined reference to `of_address_to_resource'

Fixes: 67a2003e06 ("dmaengine: add Qualcomm Technologies HIDMA channel driver")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Reported-by: kernel test robot <lkp@intel.com>
Cc: Sinan Kaya <okaya@codeaurora.org>
Cc: Vinod Koul <vkoul@kernel.org>
Cc: dmaengine@vger.kernel.org
Link: https://lore.kernel.org/r/20210522021313.16405-3-rdunlap@infradead.org
Signed-off-by: Vinod Koul <vkoul@kernel.org>
2021-05-31 09:44:14 +05:30
Randy Dunlap
253697b93c dmaengine: ALTERA_MSGDMA depends on HAS_IOMEM
When CONFIG_HAS_IOMEM is not set/enabled, certain iomap() family
functions [including ioremap(), devm_ioremap(), etc.] are not
available.
Drivers that use these functions should depend on HAS_IOMEM so that
they do not cause build errors.

Repairs this build error:
s390-linux-ld: drivers/dma/altera-msgdma.o: in function `request_and_map':
altera-msgdma.c:(.text+0x14b0): undefined reference to `devm_ioremap'

Fixes: a85c6f1b29 ("dmaengine: Add driver for Altera / Intel mSGDMA IP core")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Reported-by: kernel test robot <lkp@intel.com>
Cc: Stefan Roese <sr@denx.de>
Cc: Vinod Koul <vkoul@kernel.org>
Cc: dmaengine@vger.kernel.org
Reviewed-by: Stefan Roese <sr@denx.de>
Phone: (+49)-8142-66989-51 Fax: (+49)-8142-66989-80 Email: sr@denx.de
Link: https://lore.kernel.org/r/20210522021313.16405-2-rdunlap@infradead.org
Signed-off-by: Vinod Koul <vkoul@kernel.org>
2021-05-31 09:44:14 +05:30
Dave Jiang
ddf742d4f3 dmaengine: idxd: Add missing cleanup for early error out in probe call
The probe call stack is missing some cleanup when things fail in the
middle. Add the appropriate cleanup routines to make sure we exit
gracefully.

Fixes: a39c7cd043 ("dmaengine: idxd: removal of pcim managed mmio mapping")
Reported-by: Nikhil Rao <nikhil.rao@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/162197061707.392656.15760573520817310791.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
2021-05-31 09:39:14 +05:30
Laurent Pinchart
9f007e7b66 dmaengine: xilinx: dpdma: Limit descriptor IDs to 16 bits
While the descriptor ID is stored in a 32-bit field in the hardware
descriptor, only 16 bits are used by the hardware and are reported
through the XILINX_DPDMA_CH_DESC_ID register. Failure to handle the
wrap-around results in a descriptor ID mismatch after 65536 frames. Fix
it.

Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Tested-by: Jianqiang Chen <jianqiang.chen@xilinx.com>
Reviewed-by: Jianqiang Chen <jianqiang.chen@xilinx.com>
Link: https://lore.kernel.org/r/20210520152420.23986-5-laurent.pinchart@ideasonboard.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
2021-05-31 09:37:17 +05:30
Laurent Pinchart
32828b82fb dmaengine: xilinx: dpdma: Add missing dependencies to Kconfig
The driver depends on both OF and IOMEM support, express those
dependencies in Kconfig. This fixes a build failure on S390 reported by
the 0day bot.

Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Tested-by: Jianqiang Chen <jianqiang.chen@xilinx.com>
Reviewed-by: Jianqiang Chen <jianqiang.chen@xilinx.com>
Link: https://lore.kernel.org/r/20210520152420.23986-2-laurent.pinchart@ideasonboard.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
2021-05-31 09:37:17 +05:30
Yu Kuai
83eb4868d3 dmaengine: stm32-mdma: fix PM reference leak in stm32_mdma_alloc_chan_resourc()
pm_runtime_get_sync will increment pm usage counter even it failed.
Forgetting to putting operation will result in reference leak here.
Fix it by replacing it with pm_runtime_resume_and_get to keep usage
counter balanced.

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Yu Kuai <yukuai3@huawei.com>
Link: https://lore.kernel.org/r/20210517081826.1564698-2-yukuai3@huawei.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
2021-05-31 09:33:09 +05:30
Yu Kuai
8982d48af3 dmaengine: zynqmp_dma: Fix PM reference leak in zynqmp_dma_alloc_chan_resourc()
pm_runtime_get_sync will increment pm usage counter even it failed.
Forgetting to putting operation will result in reference leak here.
Fix it by replacing it with pm_runtime_resume_and_get to keep usage
counter balanced.

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Yu Kuai <yukuai3@huawei.com>
Link: https://lore.kernel.org/r/20210517081826.1564698-4-yukuai3@huawei.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
2021-05-31 09:33:09 +05:30
Jack Yu
6308c44ed6 ASoC: rt5659: Fix the lost powers for the HDA header
The power of "LDO2", "MICBIAS1" and "Mic Det Power" were powered off after
the DAPM widgets were added, and these powers were set by the JD settings
"RT5659_JD_HDA_HEADER" in the probe function. In the codec probe function,
these powers were ignored to prevent them controlled by DAPM.

Signed-off-by: Oder Chiou <oder_chiou@realtek.com>
Signed-off-by: Jack Yu <jack.yu@realtek.com>
Message-Id: <15fced51977b458798ca4eebf03dafb9@realtek.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
2021-05-27 16:24:58 +01:00
Til Jasper Ullrich
c0e0436cb4 platform/x86: thinkpad_acpi: Add X1 Carbon Gen 9 second fan support
The X1 Carbon Gen 9 uses two fans instead of one like the previous
generation. This adds support for the second fan. It has been tested
on my X1 Carbon Gen 9 (20XXS00100) and works fine.

Signed-off-by: Til Jasper Ullrich <tju@tju.me>
Link: https://lore.kernel.org/r/20210525150950.14805-1-tju@tju.me
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2021-05-27 11:23:57 +02:00
Srinivas Kandagatla
dbec64b11c gpio: wcd934x: Fix shift-out-of-bounds error
bit-mask for pins 0 to 4 is BIT(0) to BIT(4) however we ended up with BIT(n - 1)
which is not right, and this was caught by below usban check

UBSAN: shift-out-of-bounds in drivers/gpio/gpio-wcd934x.c:34:14

Fixes: 59c3246834 ("gpio: wcd934x: Add support to wcd934x gpio controller")
Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>
2021-05-27 09:51:35 +02:00
Pawel Laszczak
a9aecef198 usb: cdnsp: Fix deadlock issue in cdnsp_thread_irq_handler
Patch fixes the following critical issue caused by deadlock which has been
detected during testing NCM class:

smp: csd: Detected non-responsive CSD lock (#1) on CPU#0
smp:     csd: CSD lock (#1) unresponsive.
....
RIP: 0010:native_queued_spin_lock_slowpath+0x61/0x1d0
RSP: 0018:ffffbc494011cde0 EFLAGS: 00000002
RAX: 0000000000000101 RBX: ffff9ee8116b4a68 RCX: 0000000000000000
RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff9ee8116b4658
RBP: ffffbc494011cde0 R08: 0000000000000001 R09: 0000000000000000
R10: ffff9ee8116b4670 R11: 0000000000000000 R12: ffff9ee8116b4658
R13: ffff9ee8116b4670 R14: 0000000000000246 R15: ffff9ee8116b4658
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f7bcc41a830 CR3: 000000007a612003 CR4: 00000000001706e0
Call Trace:
 <IRQ>
 do_raw_spin_lock+0xc0/0xd0
 _raw_spin_lock_irqsave+0x95/0xa0
 cdnsp_gadget_ep_queue.cold+0x88/0x107 [cdnsp_udc_pci]
 usb_ep_queue+0x35/0x110
 eth_start_xmit+0x220/0x3d0 [u_ether]
 ncm_tx_timeout+0x34/0x40 [usb_f_ncm]
 ? ncm_free_inst+0x50/0x50 [usb_f_ncm]
 __hrtimer_run_queues+0xac/0x440
 hrtimer_run_softirq+0x8c/0xb0
 __do_softirq+0xcf/0x428
 asm_call_irq_on_stack+0x12/0x20
 </IRQ>
 do_softirq_own_stack+0x61/0x70
 irq_exit_rcu+0xc1/0xd0
 sysvec_apic_timer_interrupt+0x52/0xb0
 asm_sysvec_apic_timer_interrupt+0x12/0x20
RIP: 0010:do_raw_spin_trylock+0x18/0x40
RSP: 0018:ffffbc494138bda8 EFLAGS: 00000246
RAX: 0000000000000000 RBX: ffff9ee8116b4658 RCX: 0000000000000000
RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffff9ee8116b4658
RBP: ffffbc494138bda8 R08: 0000000000000001 R09: 0000000000000000
R10: ffff9ee8116b4670 R11: 0000000000000000 R12: ffff9ee8116b4658
R13: ffff9ee8116b4670 R14: ffff9ee7b5c73d80 R15: ffff9ee8116b4000
 _raw_spin_lock+0x3d/0x70
 ? cdnsp_thread_irq_handler.cold+0x32/0x112c [cdnsp_udc_pci]
 cdnsp_thread_irq_handler.cold+0x32/0x112c [cdnsp_udc_pci]
 ? cdnsp_remove_request+0x1f0/0x1f0 [cdnsp_udc_pci]
 ? cdnsp_thread_irq_handler+0x5/0xa0 [cdnsp_udc_pci]
 ? irq_thread+0xa0/0x1c0
 irq_thread_fn+0x28/0x60
 irq_thread+0x105/0x1c0
 ? __kthread_parkme+0x42/0x90
 ? irq_forced_thread_fn+0x90/0x90
 ? wake_threads_waitq+0x30/0x30
 ? irq_thread_check_affinity+0xe0/0xe0
 kthread+0x12a/0x160
 ? kthread_park+0x90/0x90
 ret_from_fork+0x22/0x30

The root cause of issue is spin_lock/spin_unlock instruction instead
spin_lock_irqsave/spin_lock_irqrestore in cdnsp_thread_irq_handler
function.

Cc: stable@vger.kernel.org
Fixes: 3d82904559 ("usb: cdnsp: cdns3 Add main part of Cadence USBSSP DRD Driver")
Signed-off-by: Pawel Laszczak <pawell@cadence.com>

Link: https://lore.kernel.org/r/20210526060527.7197-1-pawell@gli-login.cadence.com
Signed-off-by: Peter Chen <peter.chen@kernel.org>
2021-05-27 09:36:20 +08:00
Maximilian Luz
2f26dc05af platform/surface: aggregator_registry: Add support for 13" Intel Surface Laptop 4
Add support for the 13" Intel version of the Surface Laptop 4.

Use the existing node group for the Surface Laptop 3 since the 15" AMD
version already shares its WSID HID with its predecessor and there don't
seem to be any significant differences with regards to SAM.

Signed-off-by: Maximilian Luz <luzmaximilian@gmail.com>
Link: https://lore.kernel.org/r/20210523134528.798887-3-luzmaximilian@gmail.com
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2021-05-25 14:51:28 +02:00
Maximilian Luz
5fafeeb4da platform/surface: aggregator_registry: Update comments for 15" AMD Surface Laptop 4
The 15" AMD version of the Surface Laptop 4 shares its WSID HID with the
15" AMD version of the Surface Laptop 3. Update the comments
accordingly.

Signed-off-by: Maximilian Luz <luzmaximilian@gmail.com>
Link: https://lore.kernel.org/r/20210523134528.798887-2-luzmaximilian@gmail.com
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2021-05-25 14:51:21 +02:00
Alexandre GRIVEAUX
56df0c758a USB: serial: omninet: update driver description
With the inclusion of Omni 56K Plus, this driver seem to be more common
among the family of Zyxel omni modem. Update the driver and module
descriptions.

Signed-off-by: Alexandre GRIVEAUX <agriveaux@deutnet.info>
[ johan: amend commit message ]
Signed-off-by: Johan Hovold <johan@kernel.org>
2021-05-25 08:59:17 +02:00
Alexandre GRIVEAUX
fc0b3dc9a1 USB: serial: omninet: add device id for Zyxel Omni 56K Plus
Add device id for Zyxel Omni 56K Plus modem, this modem include:

USB chip:
NetChip
NET2888

Main chip:
901041A
F721501APGF

Another modem using the same chips is the Zyxel Omni 56K DUO/NEO,
could be added with the right USB ID.

Signed-off-by: Alexandre GRIVEAUX <agriveaux@deutnet.info>
Cc: stable@vger.kernel.org
Signed-off-by: Johan Hovold <johan@kernel.org>
2021-05-25 08:56:02 +02:00
Johan Hovold
eb8dbe8032 USB: serial: quatech2: fix control-request directions
The direction of the pipe argument must match the request-type direction
bit or control requests may fail depending on the host-controller-driver
implementation.

Fix the three requests which erroneously used usb_rcvctrlpipe().

Fixes: f7a33e608d ("USB: serial: add quatech2 usb to serial driver")
Cc: stable@vger.kernel.org      # 3.5
Signed-off-by: Johan Hovold <johan@kernel.org>
2021-05-25 07:58:10 +02:00
Sanket Parmar
d6eef88690 usb: cdns3: Enable TDL_CHK only for OUT ep
ZLP gets stuck if TDL_CHK bit is set and TDL_FROM_TRB is used
as TDL source for IN endpoints. To fix it, TDL_CHK is only
enabled for OUT endpoints.

Fixes: 7733f6c32e ("usb: cdns3: Add Cadence USB3 DRD Driver")
Reported-by: Aswath Govindraju <a-govindraju@ti.com>
Signed-off-by: Sanket Parmar <sparmar@cadence.com>
Link: https://lore.kernel.org/r/1621263912-13175-1-git-send-email-sparmar@cadence.com
Signed-off-by: Peter Chen <peter.chen@kernel.org>
2021-05-25 08:55:29 +08:00
Mark Brown
a072cbda97 Merge series "Fix MAX77620 regulator driver regression" from Dmitry Osipenko <digetx@gmail.com>:
Hi,

The next-20210521 started to fail on Nexus 7 because of the change to
regulator core that caused regression of the MAX77620 regulator driver.
The regulator driver is now getting a deferred probe and turned out
driver wasn't ready for this. The root of the problem is that OF node
of the PMIC MFD sub-device is shared with the PINCTRL sub-device and we
need to convey this information to the driver core, otherwise it will
try to claim GPIO pin that is already claimed by PINCTRL and fail the
probe.

Dmitry Osipenko (2):
  regulator: max77620: Use device_set_of_node_from_dev()
  regulator: max77620: Silence deferred probe error

 drivers/regulator/max77620-regulator.c | 17 +++++++++++------
 1 file changed, 11 insertions(+), 6 deletions(-)

--
2.30.2
2021-05-24 12:54:44 +01:00
Lukas Wunner
13817d466e spi: bcm2835: Fix out-of-bounds access with more than 4 slaves
Commit 571e31fa60 ("spi: bcm2835: Cache CS register value for
->prepare_message()") limited the number of slaves to 3 at compile-time.
The limitation was necessitated by a statically-sized array prepare_cs[]
in the driver private data which contains a per-slave register value.

The commit sought to enforce the limitation at run-time by setting the
controller's num_chipselect to 3:  Slaves with a higher chipselect are
rejected by spi_add_device().

However the commit neglected that num_chipselect only limits the number
of *native* chipselects.  If GPIO chipselects are specified in the
device tree for more than 3 slaves, num_chipselect is silently raised by
of_spi_get_gpio_numbers() and the result are out-of-bounds accesses to
the statically-sized array prepare_cs[].

As a bandaid fix which is backportable to stable, raise the number of
allowed slaves to 24 (which "ought to be enough for anybody"), enforce
the limitation on slave ->setup and revert num_chipselect to 3 (which is
the number of native chipselects supported by the controller).
An upcoming for-next commit will allow an arbitrary number of slaves.

Fixes: 571e31fa60 ("spi: bcm2835: Cache CS register value for ->prepare_message()")
Reported-by: Joe Burmeister <joe.burmeister@devtank.co.uk>
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Cc: stable@vger.kernel.org # v5.4+
Cc: Phil Elwell <phil@raspberrypi.com>
Link: https://lore.kernel.org/r/75854affc1923309fde05e47494263bde73e5592.1621703210.git.lukas@wunner.de
Signed-off-by: Mark Brown <broonie@kernel.org>
2021-05-24 09:50:36 +01:00
Hao Fang
8d6ee30c11 regulator: hisilicon: use the correct HiSilicon copyright
s/Hisilicon/HiSilicon/.
It should use capital S, according to the official website
https://www.hisilicon.com/en.

Signed-off-by: Hao Fang <fanghao11@huawei.com>
Link: https://lore.kernel.org/r/1621679151-15617-1-git-send-email-fanghao11@huawei.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2021-05-24 09:50:31 +01:00
Axel Lin
4c668630bf regulator: bd71828: Fix .n_voltages settings
Current .n_voltages settings do not cover the latest 2 valid selectors,
so it fails to set voltage for the hightest voltage support.
The latest linear range has step_uV = 0, so it does not matter if we
count the .n_voltages to maximum selector + 1 or the first selector of
latest linear range + 1.
To simplify calculating the n_voltages, let's just set the
.n_voltages to maximum selector + 1.

Fixes: 522498f8cb ("regulator: bd71828: Basic support for ROHM bd71828 PMIC regulators")
Signed-off-by: Axel Lin <axel.lin@ingics.com>
Reviewed-by: Matti Vaittinen <matti.vaittinen@fi.rohmeurope.com>
Link: https://lore.kernel.org/r/20210523071045.2168904-2-axel.lin@ingics.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2021-05-24 09:50:23 +01:00
Axel Lin
0514582a1a regulator: bd70528: Fix off-by-one for buck123 .n_voltages setting
The valid selectors for bd70528 bucks are 0 ~ 0xf, so the .n_voltages
should be 16 (0x10). Use 0x10 to make it consistent with BD70528_LDO_VOLTS.
Also remove redundant defines for BD70528_BUCK_VOLTS.

Signed-off-by: Axel Lin <axel.lin@ingics.com>
Acked-by: Matti Vaittinen <matti.vaittinen@fi.rohmeurope.com>
Link: https://lore.kernel.org/r/20210523071045.2168904-1-axel.lin@ingics.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2021-05-24 09:50:22 +01:00
Dmitry Osipenko
62499a94ce regulator: max77620: Silence deferred probe error
One of previous changes to regulator core causes PMIC regulators to
re-probe until supply regulator is registered. Silence noisy error
message about the deferred probe.

Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Link: https://lore.kernel.org/r/20210523224243.13219-3-digetx@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2021-05-24 09:50:20 +01:00
Dmitry Osipenko
6f55c5dd11 regulator: max77620: Use device_set_of_node_from_dev()
The MAX77620 driver fails to re-probe on deferred probe because driver
core tries to claim resources that are already claimed by the PINCTRL
device. Use device_set_of_node_from_dev() helper which marks OF node as
reused, skipping erroneous execution of pinctrl_bind_pins() for the PMIC
device on the re-probe.

Fixes: aea6cb9970 ("regulator: resolve supply after creating regulator")
Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Link: https://lore.kernel.org/r/20210523224243.13219-2-digetx@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2021-05-24 09:50:19 +01:00
Kefeng Wang
41daf6ba59 ASoC: core: Fix Null-point-dereference in fmt_single_name()
Check the return value of devm_kstrdup() in case of
Null-point-dereference.

Fixes: 45dd9943fc ("ASoC: core: remove artificial component and DAI name constraint")
Cc: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Link: https://lore.kernel.org/r/20210524024941.159952-1-wangkefeng.wang@huawei.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2021-05-24 09:49:28 +01:00
Axel Lin
36cb555fae regulator: scmi: Fix off-by-one for linear regulators .n_voltages setting
For linear regulators, the .n_voltages is (max_uv - min_uv) / uv_step + 1.

Fixes: 0fbeae70ee ("regulator: add SCMI driver")
Signed-off-by: Axel Lin <axel.lin@ingics.com>
Reviewed-by: Cristian Marussi <cristian.marussi@arm.com>
Link: https://lore.kernel.org/r/20210521073020.1944981-1-axel.lin@ingics.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2021-05-21 20:05:48 +01:00
Dmitry Baryshkov
98e48cd928 regulator: core: resolve supply for boot-on/always-on regulators
For the boot-on/always-on regulators the set_machine_constrainst() is
called before resolving rdev->supply. Thus the code would try to enable
rdev before enabling supplying regulator. Enforce resolving supply
regulator before enabling rdev.

Fixes: aea6cb9970 ("regulator: resolve supply after creating regulator")
Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Link: https://lore.kernel.org/r/20210519221224.2868496-1-dmitry.baryshkov@linaro.org
Signed-off-by: Mark Brown <broonie@kernel.org>
2021-05-20 22:00:11 +01:00
Axel Lin
855bfff9d6 regulator: fixed: Ensure enable_counter is correct if reg_domain_disable fails
dev_pm_genpd_set_performance_state() may fail, so had better to check it's
return value before decreasing priv->enable_counter.

Fixes: bf3a28cf42 ("regulator: fixed: support using power domain for enable/disable")
Signed-off-by: Axel Lin <axel.lin@ingics.com>
Link: https://lore.kernel.org/r/20210520111811.1806293-1-axel.lin@ingics.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2021-05-20 17:57:23 +01:00
Axel Lin
687c9e3b1a regulator: Check ramp_delay_table for regulator_set_ramp_delay_regmap
Return -EINVAL if ramp_delay_table is NULL.
Also add WARN_ON since the driver code needs fix if this happened.

Signed-off-by: Axel Lin <axel.lin@ingics.com>
Link: https://lore.kernel.org/r/20210519132255.1683863-1-axel.lin@ingics.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2021-05-19 14:31:43 +01:00
Souptick Joarder
333944c7c3 pinctrl: aspeed: Fix minor documentation error
Kernel test robot throws below warning ->

drivers/pinctrl/aspeed/pinctrl-aspeed-g5.c:2705: warning: This comment
starts with '/**', but isn't a kernel-doc comment. Refer
Documentation/doc-guide/kernel-doc.rst
drivers/pinctrl/aspeed/pinctrl-aspeed-g6.c:2614: warning: This comment
starts with '/**', but isn't a kernel-doc comment. Refer
Documentation/doc-guide/kernel-doc.rst
drivers/pinctrl/aspeed/pinctrl-aspeed.c:111: warning: This comment
starts with '/**', but isn't a kernel-doc comment. Refer
Documentation/doc-guide/kernel-doc.rst
drivers/pinctrl/aspeed/pinmux-aspeed.c:24: warning: This comment starts
with '/**', but isn't a kernel-doc comment. Refer
Documentation/doc-guide/kernel-doc.rst

Fix minor documentation error.

Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Souptick Joarder <jrdr.linux@gmail.com>
Cc: Randy Dunlap <rdunlap@infradead.org>
Reviewed-by: Randy Dunlap <rdunlap@infradead.org>
Acked-by: Andrew Jeffery <andrew@aj.id.au>
Link: https://lore.kernel.org/r/1619353584-8196-1-git-send-email-jrdr.linux@gmail.com
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2021-05-19 02:11:18 +02:00
Sven Eckelmann
9f460ae31c batman-adv: Avoid WARN_ON timing related checks
The soft/batadv interface for a queued OGM can be changed during the time
the OGM was queued for transmission and when the OGM is actually
transmitted by the worker.

But WARN_ON must be used to denote kernel bugs and not to print simple
warnings. A warning can simply be printed using pr_warn.

Reported-by: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
Reported-by: syzbot+c0b807de416427ff3dd1@syzkaller.appspotmail.com
Fixes: ef0a937f7a ("batman-adv: consider outgoing interface in OGM sending")
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
2021-05-18 21:10:01 +02:00
Axel Lin
34991ee96f regulator: fan53880: Fix missing n_voltages setting
Fixes: e6dea51e2d ("regulator: fan53880: Add initial support")
Signed-off-by: Axel Lin <axel.lin@ingics.com>
Acked-by: Christoph Fritz <chf.fritz@googlemail.com>
Link: https://lore.kernel.org/r/20210517105325.1227393-1-axel.lin@ingics.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2021-05-18 14:04:22 +01:00
Axel Lin
0b1e552673 regulator: da9121: Return REGULATOR_MODE_INVALID for invalid mode
-EINVAL is not a valid return value for .of_map_mode, return
REGULATOR_MODE_INVALID instead.

Fixes: 65ac97042d ("regulator: da9121: add mode support")
Signed-off-by: Axel Lin <axel.lin@ingics.com>
Acked-by: Adam Ward <Adam.Ward.opensource@diasemi.com>
Link: https://lore.kernel.org/r/20210517052721.1063375-1-axel.lin@ingics.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2021-05-18 14:04:21 +01:00
Chen Li
7c2fc79250 phy: usb: Fix misuse of IS_ENABLED
While IS_ENABLED() is perfectly fine for CONFIG_* symbols, it is not
for other symbols such as __BIG_ENDIAN that is provided directly by
the compiler.

Switch to use CONFIG_CPU_BIG_ENDIAN instead of __BIG_ENDIAN.

Signed-off-by: Chen Li <chenli@uniontech.com>
Reviewed-by: Al Cooper <alcooperx@gmail.com>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Fixes: 94583a4104 ("phy: usb: Restructure in preparation for adding 7216 USB support")
Link: https://lore.kernel.org/r/87czuggpra.wl-chenli@uniontech.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
2021-05-14 17:40:37 +05:30
Miguel Ojeda
4792f9dd12 clang-format: Update with the latest for_each macro list
Re-run the shell fragment that generated the original list.

Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
2021-05-12 23:32:39 +02:00
Wei Ming Chen
ca0760e7d7 Compiler Attributes: Add continue in comment
Add "continue;" for switch/case block according to Doc[1]

[1] https://www.kernel.org/doc/html/latest/process/deprecated.html?highlight=fallthrough#implicit-switch-case-fall-through

Signed-off-by: Wei Ming Chen <jj251510319013@gmail.com>
Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
2021-05-12 20:18:46 +02:00
Mark Brown
adf1471b2f Merge series "regulator: fan53555: tcs4525 fix and cleanup" from Peter Geis <pgwipeout@gmail.com>:
The tcs4525 voltage calculation is incorrect, which leads to a deadlock
on the rk3566-quartz64 board when loading cpufreq.
Fix the voltage calculation to correct the deadlock.
While we are at it, add a safety check and clean up the function names
to be more accurate.

Peter Geis (3):
  regulator: fan53555: fix TCS4525 voltage calulation
  regulator: fan53555: only bind tcs4525 to correct chip id
  regulator: fan53555: fix tcs4525 function names

 drivers/regulator/fan53555.c | 44 ++++++++++++++++++++++--------------
 1 file changed, 27 insertions(+), 17 deletions(-)

--
2.25.1
2021-05-12 16:22:53 +01:00
Peter Geis
f8c8871f5e regulator: fan53555: fix TCS4525 voltage calulation
The TCS4525 has 128 voltage steps. With the calculation set to 127 the
most significant bit is disregarded which leads to a miscalculation of
the voltage by about 200mv.

Fix the calculation to end deadlock on the rk3566-quartz64 which uses
this as the cpu regulator.

Fixes: 914df8faa7 ("regulator: fan53555: Add TCS4525 DCDC support")
Signed-off-by: Peter Geis <pgwipeout@gmail.com>
Link: https://lore.kernel.org/r/20210511211335.2935163-2-pgwipeout@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2021-05-12 16:22:21 +01:00
Axel Lin
3d681804ef regulator: cros-ec: Fix error code in dev_err message
Show proper error code instead of 0.

Signed-off-by: Axel Lin <axel.lin@ingics.com>
Link: https://lore.kernel.org/r/20210512075824.620580-1-axel.lin@ingics.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2021-05-12 13:01:41 +01:00
Quanyang Wang
538ea65a9f dmaengine: xilinx: dpdma: initialize registers before request_irq
In some scenarios (kdump), dpdma hardware irqs has been enabled when
calling request_irq in probe function, and then the dpdma irq handler
xilinx_dpdma_irq_handler is invoked to access xdev->chan[i]. But at
this moment xdev->chan[i] hasn't been initialized.

We should ensure the dpdma controller to be in a consistent and
clean state before further initialization. So add dpdma_hw_init()
to do this.

Furthermore, in xilinx_dpdma_disable_irq, disable all interrupts
instead of error interrupts.

This patch is to fix the kdump kernel crash as below:

[    3.696128] Unable to handle kernel NULL pointer dereference at virtual address 000000000000012c
[    3.696710] xilinx-zynqmp-dpdma fd4c0000.dma-controller: Xilinx DPDMA engine is probed
[    3.704900] Mem abort info:
[    3.704902]   ESR = 0x96000005
[    3.704905]   EC = 0x25: DABT (current EL), IL = 32 bits
[    3.704907]   SET = 0, FnV = 0
[    3.704912]   EA = 0, S1PTW = 0
[    3.713800] ahci-ceva fd0c0000.ahci: supply ahci not found, using dummy regulator
[    3.715585] Data abort info:
[    3.715587]   ISV = 0, ISS = 0x00000005
[    3.715589]   CM = 0, WnR = 0
[    3.715592] [000000000000012c] user address but active_mm is swapper
[    3.715596] Internal error: Oops: 96000005 [#1] SMP
[    3.715599] Modules linked in:
[    3.715608] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.10.0-12170-g60894882155f-dirty #77
[    3.723937] Hardware name: ZynqMP ZCU102 Rev1.0 (DT)
[    3.723942] pstate: 80000085 (Nzcv daIf -PAN -UAO -TCO BTYPE=--)
[    3.723956] pc : xilinx_dpdma_irq_handler+0x418/0x560
[    3.793049] lr : xilinx_dpdma_irq_handler+0x3d8/0x560
[    3.798089] sp : ffffffc01186bdf0
[    3.801388] x29: ffffffc01186bdf0 x28: ffffffc011836f28
[    3.806692] x27: ffffff8023e0ac80 x26: 0000000000000080
[    3.811996] x25: 0000000008000408 x24: 0000000000000003
[    3.817300] x23: ffffffc01186be70 x22: ffffffc011291740
[    3.822604] x21: 0000000000000000 x20: 0000000008000408
[    3.827908] x19: 0000000000000000 x18: 0000000000000010
[    3.833212] x17: 0000000000000000 x16: 0000000000000000
[    3.838516] x15: 0000000000000000 x14: ffffffc011291740
[    3.843820] x13: ffffffc02eb4d000 x12: 0000000034d4d91d
[    3.849124] x11: 0000000000000040 x10: ffffffc0112d2d48
[    3.854428] x9 : ffffffc0112d2d40 x8 : ffffff8021c00268
[    3.859732] x7 : 0000000000000000 x6 : ffffffc011836000
[    3.865036] x5 : 0000000000000003 x4 : 0000000000000000
[    3.870340] x3 : 0000000000000001 x2 : 0000000000000000
[    3.875644] x1 : 0000000000000000 x0 : 000000000000012c
[    3.880948] Call trace:
[    3.883382]  xilinx_dpdma_irq_handler+0x418/0x560
[    3.888079]  __handle_irq_event_percpu+0x5c/0x178
[    3.892774]  handle_irq_event_percpu+0x34/0x98
[    3.897210]  handle_irq_event+0x44/0xb8
[    3.901030]  handle_fasteoi_irq+0xd0/0x190
[    3.905117]  generic_handle_irq+0x30/0x48
[    3.909111]  __handle_domain_irq+0x64/0xc0
[    3.913192]  gic_handle_irq+0x78/0xa0
[    3.916846]  el1_irq+0xc4/0x180
[    3.919982]  cpuidle_enter_state+0x134/0x2f8
[    3.924243]  cpuidle_enter+0x38/0x50
[    3.927810]  call_cpuidle+0x1c/0x40
[    3.931290]  do_idle+0x20c/0x270
[    3.934502]  cpu_startup_entry+0x28/0x58
[    3.938410]  rest_init+0xbc/0xcc
[    3.941631]  arch_call_rest_init+0x10/0x1c
[    3.945718]  start_kernel+0x51c/0x558

Fixes: 7cbb0c63de ("dmaengine: xilinx: dpdma: Add the Xilinx DisplayPort DMA engine driver")
Signed-off-by: Quanyang Wang <quanyang.wang@windriver.com>
Link: https://lore.kernel.org/r/20210430064041.4058180-1-quanyang.wang@windriver.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
2021-05-10 21:40:30 +05:30
Bumyong Lee
4ad5dd2d78 dmaengine: pl330: fix wrong usage of spinlock flags in dma_cyclc
flags varible which is the input parameter of pl330_prep_dma_cyclic()
should not be used by spinlock_irq[save/restore] function.

Signed-off-by: Jongho Park <jongho7.park@samsung.com>
Signed-off-by: Bumyong Lee <bumyong.lee@samsung.com>
Signed-off-by: Chanho Park <chanho61.park@samsung.com>
Link: https://lore.kernel.org/r/20210507063647.111209-1-chanho61.park@samsung.com
Fixes: f6f2421c0a ("dmaengine: pl330: Merge dma_pl330_dmac and pl330_dmac structs")
Cc: stable@vger.kernel.org
Signed-off-by: Vinod Koul <vkoul@kernel.org>
2021-05-10 21:38:46 +05:30
Zhen Lei
17866bc6b2 dmaengine: fsl-dpaa2-qdma: Fix error return code in two functions
Fix to return a negative error code from the error handling case instead
of 0, as done elsewhere in the function where it is.

Fixes: 7fdf9b05c7 ("dmaengine: fsl-dpaa2-qdma: Add NXP dpaa2 qDMA controller driver for Layerscape SoCs")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Link: https://lore.kernel.org/r/20210508030056.2027-1-thunder.leizhen@huawei.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
2021-05-10 21:34:49 +05:30
Dave Jiang
077cdb355b dmaengine: idxd: add missing dsa driver unregister
The idxd_unregister_driver() has never been called for the idxd driver upon
removal. Add fix to call unregister driver on module removal.

Fixes: c52ca47823 ("dmaengine: idxd: add configuration component of driver")
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/161947994449.1053102.13189942817915448216.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
2021-05-10 19:48:12 +05:30
Dave Jiang
1c4841ccbd dmaengine: idxd: add engine 'struct device' missing bus type assignment
engine 'struct device' setup is missing assigning the bus type. Add it to
dsa_bus_type.

Fixes: 75b9113090 ("dmaengine: idxd: fix engine conf_dev lifetime")
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/161947841562.984844.17505646725993659651.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
2021-05-10 19:48:12 +05:30
521 changed files with 5067 additions and 2401 deletions

View File

@@ -109,8 +109,8 @@ ForEachMacros:
- 'css_for_each_child'
- 'css_for_each_descendant_post'
- 'css_for_each_descendant_pre'
- 'cxl_for_each_cmd'
- 'device_for_each_child_node'
- 'displayid_iter_for_each'
- 'dma_fence_chain_for_each'
- 'do_for_each_ftrace_op'
- 'drm_atomic_crtc_for_each_plane'
@@ -136,6 +136,7 @@ ForEachMacros:
- 'drm_mm_for_each_node_in_range'
- 'drm_mm_for_each_node_safe'
- 'flow_action_for_each'
- 'for_each_acpi_dev_match'
- 'for_each_active_dev_scope'
- 'for_each_active_drhd_unit'
- 'for_each_active_iommu'
@@ -171,7 +172,6 @@ ForEachMacros:
- 'for_each_dapm_widgets'
- 'for_each_dev_addr'
- 'for_each_dev_scope'
- 'for_each_displayid_db'
- 'for_each_dma_cap_mask'
- 'for_each_dpcm_be'
- 'for_each_dpcm_be_rollback'
@@ -179,6 +179,7 @@ ForEachMacros:
- 'for_each_dpcm_fe'
- 'for_each_drhd_unit'
- 'for_each_dss_dev'
- 'for_each_dtpm_table'
- 'for_each_efi_memory_desc'
- 'for_each_efi_memory_desc_in_map'
- 'for_each_element'
@@ -215,6 +216,7 @@ ForEachMacros:
- 'for_each_migratetype_order'
- 'for_each_msi_entry'
- 'for_each_msi_entry_safe'
- 'for_each_msi_vector'
- 'for_each_net'
- 'for_each_net_continue_reverse'
- 'for_each_netdev'
@@ -270,6 +272,12 @@ ForEachMacros:
- 'for_each_prime_number_from'
- 'for_each_process'
- 'for_each_process_thread'
- 'for_each_prop_codec_conf'
- 'for_each_prop_dai_codec'
- 'for_each_prop_dai_cpu'
- 'for_each_prop_dlc_codecs'
- 'for_each_prop_dlc_cpus'
- 'for_each_prop_dlc_platforms'
- 'for_each_property_of_node'
- 'for_each_registered_fb'
- 'for_each_requested_gpio'
@@ -430,6 +438,7 @@ ForEachMacros:
- 'queue_for_each_hw_ctx'
- 'radix_tree_for_each_slot'
- 'radix_tree_for_each_tagged'
- 'rb_for_each'
- 'rbtree_postorder_for_each_entry_safe'
- 'rdma_for_each_block'
- 'rdma_for_each_port'

View File

@@ -212,6 +212,8 @@ Manivannan Sadhasivam <mani@kernel.org> <manivannanece23@gmail.com>
Manivannan Sadhasivam <mani@kernel.org> <manivannan.sadhasivam@linaro.org>
Marcin Nowakowski <marcin.nowakowski@mips.com> <marcin.nowakowski@imgtec.com>
Marc Zyngier <maz@kernel.org> <marc.zyngier@arm.com>
Marek Behún <kabel@kernel.org> <marek.behun@nic.cz>
Marek Behún <kabel@kernel.org> Marek Behun <marek.behun@nic.cz>
Mark Brown <broonie@sirena.org.uk>
Mark Starovoytov <mstarovo@pm.me> <mstarovoitov@marvell.com>
Mark Yao <markyao0591@gmail.com> <mark.yao@rock-chips.com>

View File

@@ -149,6 +149,17 @@ properties:
maxItems: 6
$ref: /schemas/types.yaml#/definitions/uint32-array
sink-vdos-v1:
description: An array of u32 with each entry, a Vendor Defined Message Object (VDO),
providing additional information corresponding to the product, the detailed bit
definitions and the order of each VDO can be found in
"USB Power Delivery Specification Revision 2.0, Version 1.3" chapter 6.4.4.3.1 Discover
Identity. User can specify the VDO array via VDO_IDH/_CERT/_PRODUCT/_CABLE/_AMA defined in
dt-bindings/usb/pd.h.
minItems: 3
maxItems: 6
$ref: /schemas/types.yaml#/definitions/uint32-array
op-sink-microwatt:
description: Sink required operating power in microwatt, if source can't
offer the power, Capability Mismatch is set. Required for power sink and
@@ -207,6 +218,10 @@ properties:
SNK_READY for non-pd link.
type: boolean
dependencies:
sink-vdos-v1: [ 'sink-vdos' ]
sink-vdos: [ 'sink-vdos-v1' ]
required:
- compatible

View File

@@ -49,7 +49,7 @@ examples:
#size-cells = <0>;
adc@48 {
comatible = "ti,ads7828";
compatible = "ti,ads7828";
reg = <0x48>;
vref-supply = <&vref>;
ti,differential-input;

View File

@@ -67,9 +67,7 @@ properties:
maxItems: 1
clock-names:
maxItems: 1
items:
- const: fck
const: fck
resets:
maxItems: 1

View File

@@ -57,7 +57,7 @@ patternProperties:
rate
sound-dai:
$ref: /schemas/types.yaml#/definitions/phandle
$ref: /schemas/types.yaml#/definitions/phandle-array
description: phandle of the CPU DAI
patternProperties:
@@ -71,7 +71,7 @@ patternProperties:
properties:
sound-dai:
$ref: /schemas/types.yaml#/definitions/phandle
$ref: /schemas/types.yaml#/definitions/phandle-array
description: phandle of the codec DAI
required:

View File

@@ -58,6 +58,6 @@ RISC-V Linux Kernel SV39
|
____________________________________________________________|____________________________________________________________
| | | |
ffffffff00000000 | -4 GB | ffffffff7fffffff | 2 GB | modules
ffffffff80000000 | -2 GB | ffffffffffffffff | 2 GB | kernel, BPF
ffffffff00000000 | -4 GB | ffffffff7fffffff | 2 GB | modules, BPF
ffffffff80000000 | -2 GB | ffffffffffffffff | 2 GB | kernel
__________________|____________|__________________|_________|____________________________________________________________

View File

@@ -171,8 +171,8 @@ Shadow pages contain the following information:
shadow pages) so role.quadrant takes values in the range 0..3. Each
quadrant maps 1GB virtual address space.
role.access:
Inherited guest access permissions in the form uwx. Note execute
permission is positive, not negative.
Inherited guest access permissions from the parent ptes in the form uwx.
Note execute permission is positive, not negative.
role.invalid:
The page is invalid and should not be used. It is a root page that is
currently pinned (by a cpu hardware register pointing to it); once it is

View File

@@ -181,7 +181,7 @@ SLUB Debug output
Here is a sample of slub debug output::
====================================================================
BUG kmalloc-8: Redzone overwritten
BUG kmalloc-8: Right Redzone overwritten
--------------------------------------------------------------------
INFO: 0xc90f6d28-0xc90f6d2b. First byte 0x00 instead of 0xcc
@@ -189,10 +189,10 @@ Here is a sample of slub debug output::
INFO: Object 0xc90f6d20 @offset=3360 fp=0xc90f6d58
INFO: Allocated in get_modalias+0x61/0xf5 age=53 cpu=1 pid=554
Bytes b4 0xc90f6d10: 00 00 00 00 00 00 00 00 5a 5a 5a 5a 5a 5a 5a 5a ........ZZZZZZZZ
Object 0xc90f6d20: 31 30 31 39 2e 30 30 35 1019.005
Redzone 0xc90f6d28: 00 cc cc cc .
Padding 0xc90f6d50: 5a 5a 5a 5a 5a 5a 5a 5a ZZZZZZZZ
Bytes b4 (0xc90f6d10): 00 00 00 00 00 00 00 00 5a 5a 5a 5a 5a 5a 5a 5a ........ZZZZZZZZ
Object (0xc90f6d20): 31 30 31 39 2e 30 30 35 1019.005
Redzone (0xc90f6d28): 00 cc cc cc .
Padding (0xc90f6d50): 5a 5a 5a 5a 5a 5a 5a 5a ZZZZZZZZ
[<c010523d>] dump_trace+0x63/0x1eb
[<c01053df>] show_trace_log_lvl+0x1a/0x2f

View File

@@ -1816,7 +1816,7 @@ F: drivers/pinctrl/pinctrl-gemini.c
F: drivers/rtc/rtc-ftrtc010.c
ARM/CZ.NIC TURRIS SUPPORT
M: Marek Behun <kabel@kernel.org>
M: Marek Behún <kabel@kernel.org>
S: Maintained
W: https://www.turris.cz/
F: Documentation/ABI/testing/debugfs-moxtet
@@ -7354,7 +7354,6 @@ F: drivers/net/ethernet/freescale/fs_enet/
F: include/linux/fs_enet_pd.h
FREESCALE SOC SOUND DRIVERS
M: Timur Tabi <timur@kernel.org>
M: Nicolin Chen <nicoleotsuka@gmail.com>
M: Xiubo Li <Xiubo.Lee@gmail.com>
R: Fabio Estevam <festevam@gmail.com>
@@ -10946,7 +10945,7 @@ F: include/linux/mv643xx.h
MARVELL MV88X3310 PHY DRIVER
M: Russell King <linux@armlinux.org.uk>
M: Marek Behun <marek.behun@nic.cz>
M: Marek Behún <kabel@kernel.org>
L: netdev@vger.kernel.org
S: Maintained
F: drivers/net/phy/marvell10g.c
@@ -16560,6 +16559,7 @@ F: drivers/misc/sgi-xp/
SHARED MEMORY COMMUNICATIONS (SMC) SOCKETS
M: Karsten Graul <kgraul@linux.ibm.com>
M: Guvenc Gulce <guvenc@linux.ibm.com>
L: linux-s390@vger.kernel.org
S: Supported
W: http://www.ibm.com/developerworks/linux/linux390/
@@ -18873,6 +18873,13 @@ S: Maintained
F: drivers/usb/host/isp116x*
F: include/linux/usb/isp116x.h
USB ISP1760 DRIVER
M: Rui Miguel Silva <rui.silva@linaro.org>
L: linux-usb@vger.kernel.org
S: Maintained
F: drivers/usb/isp1760/*
F: Documentation/devicetree/bindings/usb/nxp,isp1760.yaml
USB LAN78XX ETHERNET DRIVER
M: Woojung Huh <woojung.huh@microchip.com>
M: UNGLinuxDriver@microchip.com

View File

@@ -2,8 +2,8 @@
VERSION = 5
PATCHLEVEL = 13
SUBLEVEL = 0
EXTRAVERSION = -rc5
NAME = Frozen Wasteland
EXTRAVERSION =
NAME = Opossums on Parade
# *DOCUMENTATION*
# To see a list of typical targets execute "make help"
@@ -929,11 +929,14 @@ CC_FLAGS_LTO += -fvisibility=hidden
# Limit inlining across translation units to reduce binary size
KBUILD_LDFLAGS += -mllvm -import-instr-limit=5
# Check for frame size exceeding threshold during prolog/epilog insertion.
# Check for frame size exceeding threshold during prolog/epilog insertion
# when using lld < 13.0.0.
ifneq ($(CONFIG_FRAME_WARN),0)
ifeq ($(shell test $(CONFIG_LLD_VERSION) -lt 130000; echo $$?),0)
KBUILD_LDFLAGS += -plugin-opt=-warn-stack-size=$(CONFIG_FRAME_WARN)
endif
endif
endif
ifdef CONFIG_LTO
KBUILD_CFLAGS += -fno-lto $(CC_FLAGS_LTO)

View File

@@ -18,6 +18,7 @@
*/
struct sigcontext {
struct user_regs_struct regs;
struct user_regs_arcv2 v2abi;
};
#endif /* _ASM_ARC_SIGCONTEXT_H */

View File

@@ -61,6 +61,41 @@ struct rt_sigframe {
unsigned int sigret_magic;
};
static int save_arcv2_regs(struct sigcontext *mctx, struct pt_regs *regs)
{
int err = 0;
#ifndef CONFIG_ISA_ARCOMPACT
struct user_regs_arcv2 v2abi;
v2abi.r30 = regs->r30;
#ifdef CONFIG_ARC_HAS_ACCL_REGS
v2abi.r58 = regs->r58;
v2abi.r59 = regs->r59;
#else
v2abi.r58 = v2abi.r59 = 0;
#endif
err = __copy_to_user(&mctx->v2abi, &v2abi, sizeof(v2abi));
#endif
return err;
}
static int restore_arcv2_regs(struct sigcontext *mctx, struct pt_regs *regs)
{
int err = 0;
#ifndef CONFIG_ISA_ARCOMPACT
struct user_regs_arcv2 v2abi;
err = __copy_from_user(&v2abi, &mctx->v2abi, sizeof(v2abi));
regs->r30 = v2abi.r30;
#ifdef CONFIG_ARC_HAS_ACCL_REGS
regs->r58 = v2abi.r58;
regs->r59 = v2abi.r59;
#endif
#endif
return err;
}
static int
stash_usr_regs(struct rt_sigframe __user *sf, struct pt_regs *regs,
sigset_t *set)
@@ -94,6 +129,10 @@ stash_usr_regs(struct rt_sigframe __user *sf, struct pt_regs *regs,
err = __copy_to_user(&(sf->uc.uc_mcontext.regs.scratch), &uregs.scratch,
sizeof(sf->uc.uc_mcontext.regs.scratch));
if (is_isa_arcv2())
err |= save_arcv2_regs(&(sf->uc.uc_mcontext), regs);
err |= __copy_to_user(&sf->uc.uc_sigmask, set, sizeof(sigset_t));
return err ? -EFAULT : 0;
@@ -109,6 +148,10 @@ static int restore_usr_regs(struct pt_regs *regs, struct rt_sigframe __user *sf)
err |= __copy_from_user(&uregs.scratch,
&(sf->uc.uc_mcontext.regs.scratch),
sizeof(sf->uc.uc_mcontext.regs.scratch));
if (is_isa_arcv2())
err |= restore_arcv2_regs(&(sf->uc.uc_mcontext), regs);
if (err)
return -EFAULT;

View File

@@ -57,7 +57,6 @@ SECTIONS
.init.ramfs : { INIT_RAM_FS }
. = ALIGN(PAGE_SIZE);
_stext = .;
HEAD_TEXT_SECTION
INIT_TEXT_SECTION(L1_CACHE_BYTES)
@@ -83,6 +82,7 @@ SECTIONS
.text : {
_text = .;
_stext = .;
TEXT_TEXT
SCHED_TEXT
CPUIDLE_TEXT

View File

@@ -7,9 +7,11 @@
#ifdef CONFIG_CPU_IDLE
extern int arm_cpuidle_simple_enter(struct cpuidle_device *dev,
struct cpuidle_driver *drv, int index);
#define __cpuidle_method_section __used __section("__cpuidle_method_of_table")
#else
static inline int arm_cpuidle_simple_enter(struct cpuidle_device *dev,
struct cpuidle_driver *drv, int index) { return -ENODEV; }
#define __cpuidle_method_section __maybe_unused /* drop silently */
#endif
/* Common ARM WFI state */
@@ -42,8 +44,7 @@ struct of_cpuidle_method {
#define CPUIDLE_METHOD_OF_DECLARE(name, _method, _ops) \
static const struct of_cpuidle_method __cpuidle_method_of_table_##name \
__used __section("__cpuidle_method_of_table") \
= { .method = _method, .ops = _ops }
__cpuidle_method_section = { .method = _method, .ops = _ops }
extern int arm_cpuidle_suspend(int index);

View File

@@ -545,9 +545,11 @@ void notrace cpu_init(void)
* In Thumb-2, msr with an immediate value is not allowed.
*/
#ifdef CONFIG_THUMB2_KERNEL
#define PLC "r"
#define PLC_l "l"
#define PLC_r "r"
#else
#define PLC "I"
#define PLC_l "I"
#define PLC_r "I"
#endif
/*
@@ -569,15 +571,15 @@ void notrace cpu_init(void)
"msr cpsr_c, %9"
:
: "r" (stk),
PLC (PSR_F_BIT | PSR_I_BIT | IRQ_MODE),
PLC_r (PSR_F_BIT | PSR_I_BIT | IRQ_MODE),
"I" (offsetof(struct stack, irq[0])),
PLC (PSR_F_BIT | PSR_I_BIT | ABT_MODE),
PLC_r (PSR_F_BIT | PSR_I_BIT | ABT_MODE),
"I" (offsetof(struct stack, abt[0])),
PLC (PSR_F_BIT | PSR_I_BIT | UND_MODE),
PLC_r (PSR_F_BIT | PSR_I_BIT | UND_MODE),
"I" (offsetof(struct stack, und[0])),
PLC (PSR_F_BIT | PSR_I_BIT | FIQ_MODE),
PLC_r (PSR_F_BIT | PSR_I_BIT | FIQ_MODE),
"I" (offsetof(struct stack, fiq[0])),
PLC (PSR_F_BIT | PSR_I_BIT | SVC_MODE)
PLC_l (PSR_F_BIT | PSR_I_BIT | SVC_MODE)
: "r14");
#endif
}

View File

@@ -50,7 +50,7 @@ l_yes:
1098: nop; \
.pushsection __jump_table, "aw"; \
.long 1098b - ., LABEL - .; \
FTR_ENTRY_LONG KEY; \
FTR_ENTRY_LONG KEY - .; \
.popsection
#endif

View File

@@ -902,6 +902,10 @@ int handle_rt_signal64(struct ksignal *ksig, sigset_t *set,
unsafe_copy_to_user(&frame->uc.uc_sigmask, set, sizeof(*set), badframe_block);
user_write_access_end();
/* Save the siginfo outside of the unsafe block. */
if (copy_siginfo_to_user(&frame->info, &ksig->info))
goto badframe;
/* Make sure signal handler doesn't get spurious FP exceptions */
tsk->thread.fp_state.fpscr = 0;
@@ -915,11 +919,6 @@ int handle_rt_signal64(struct ksignal *ksig, sigset_t *set,
regs->nip = (unsigned long) &frame->tramp[0];
}
/* Save the siginfo outside of the unsafe block. */
if (copy_siginfo_to_user(&frame->info, &ksig->info))
goto badframe;
/* Allocate a dummy caller frame for the signal handler. */
newsp = ((unsigned long)frame) - __SIGNAL_FRAMESIZE;
err |= put_user(regs->gpr[1], (unsigned long __user *)newsp);

View File

@@ -20,6 +20,7 @@
#include <asm/machdep.h>
#include <asm/rtas.h>
#include <asm/kasan.h>
#include <asm/sparsemem.h>
#include <asm/svm.h>
#include <mm/mmu_decl.h>

View File

@@ -2254,7 +2254,7 @@ unsigned long perf_instruction_pointer(struct pt_regs *regs)
bool use_siar = regs_use_siar(regs);
unsigned long siar = mfspr(SPRN_SIAR);
if (ppmu->flags & PPMU_P10_DD1) {
if (ppmu && (ppmu->flags & PPMU_P10_DD1)) {
if (siar)
return siar;
else

View File

@@ -61,11 +61,11 @@ config RISCV
select GENERIC_TIME_VSYSCALL if MMU && 64BIT
select HANDLE_DOMAIN_IRQ
select HAVE_ARCH_AUDITSYSCALL
select HAVE_ARCH_JUMP_LABEL
select HAVE_ARCH_JUMP_LABEL_RELATIVE
select HAVE_ARCH_JUMP_LABEL if !XIP_KERNEL
select HAVE_ARCH_JUMP_LABEL_RELATIVE if !XIP_KERNEL
select HAVE_ARCH_KASAN if MMU && 64BIT
select HAVE_ARCH_KASAN_VMALLOC if MMU && 64BIT
select HAVE_ARCH_KGDB
select HAVE_ARCH_KGDB if !XIP_KERNEL
select HAVE_ARCH_KGDB_QXFER_PKT
select HAVE_ARCH_MMAP_RND_BITS if MMU
select HAVE_ARCH_SECCOMP_FILTER
@@ -80,9 +80,9 @@ config RISCV
select HAVE_GCC_PLUGINS
select HAVE_GENERIC_VDSO if MMU && 64BIT
select HAVE_IRQ_TIME_ACCOUNTING
select HAVE_KPROBES
select HAVE_KPROBES_ON_FTRACE
select HAVE_KRETPROBES
select HAVE_KPROBES if !XIP_KERNEL
select HAVE_KPROBES_ON_FTRACE if !XIP_KERNEL
select HAVE_KRETPROBES if !XIP_KERNEL
select HAVE_PCI
select HAVE_PERF_EVENTS
select HAVE_PERF_REGS
@@ -231,11 +231,11 @@ config ARCH_RV64I
bool "RV64I"
select 64BIT
select ARCH_SUPPORTS_INT128 if CC_HAS_INT128 && GCC_VERSION >= 50000
select HAVE_DYNAMIC_FTRACE if MMU && $(cc-option,-fpatchable-function-entry=8)
select HAVE_DYNAMIC_FTRACE if !XIP_KERNEL && MMU && $(cc-option,-fpatchable-function-entry=8)
select HAVE_DYNAMIC_FTRACE_WITH_REGS if HAVE_DYNAMIC_FTRACE
select HAVE_FTRACE_MCOUNT_RECORD
select HAVE_FTRACE_MCOUNT_RECORD if !XIP_KERNEL
select HAVE_FUNCTION_GRAPH_TRACER
select HAVE_FUNCTION_TRACER
select HAVE_FUNCTION_TRACER if !XIP_KERNEL
select SWIOTLB if MMU
endchoice

View File

@@ -14,6 +14,7 @@ config SOC_SIFIVE
select CLK_SIFIVE
select CLK_SIFIVE_PRCI
select SIFIVE_PLIC
select RISCV_ERRATA_ALTERNATIVE
select ERRATA_SIFIVE
help
This enables support for SiFive SoC platform hardware.

View File

@@ -16,7 +16,7 @@ ifeq ($(CONFIG_DYNAMIC_FTRACE),y)
CC_FLAGS_FTRACE := -fpatchable-function-entry=8
endif
ifeq ($(CONFIG_64BIT)$(CONFIG_CMODEL_MEDLOW),yy)
ifeq ($(CONFIG_CMODEL_MEDLOW),y)
KBUILD_CFLAGS_MODULE += -mcmodel=medany
endif

View File

@@ -1,2 +1,3 @@
# SPDX-License-Identifier: GPL-2.0
dtb-$(CONFIG_SOC_MICROCHIP_POLARFIRE) += microchip-mpfs-icicle-kit.dtb
obj-$(CONFIG_BUILTIN_DTB) += $(addsuffix .o, $(dtb-y))

View File

@@ -1,3 +1,4 @@
# SPDX-License-Identifier: GPL-2.0
dtb-$(CONFIG_SOC_SIFIVE) += hifive-unleashed-a00.dtb \
hifive-unmatched-a00.dtb
obj-$(CONFIG_BUILTIN_DTB) += $(addsuffix .o, $(dtb-y))

View File

@@ -273,7 +273,7 @@
cache-size = <2097152>;
cache-unified;
interrupt-parent = <&plic0>;
interrupts = <19 20 21 22>;
interrupts = <19 21 22 20>;
reg = <0x0 0x2010000 0x0 0x1000>;
};
gpio: gpio@10060000 {

View File

@@ -51,7 +51,7 @@
REG_ASM " " newlen "\n" \
".word " errata_id "\n"
#define ALT_NEW_CONSTENT(vendor_id, errata_id, enable, new_c) \
#define ALT_NEW_CONTENT(vendor_id, errata_id, enable, new_c) \
".if " __stringify(enable) " == 1\n" \
".pushsection .alternative, \"a\"\n" \
ALT_ENTRY("886b", "888f", __stringify(vendor_id), __stringify(errata_id), "889f - 888f") \
@@ -69,7 +69,7 @@
"886 :\n" \
old_c "\n" \
"887 :\n" \
ALT_NEW_CONSTENT(vendor_id, errata_id, enable, new_c)
ALT_NEW_CONTENT(vendor_id, errata_id, enable, new_c)
#define _ALTERNATIVE_CFG(old_c, new_c, vendor_id, errata_id, CONFIG_k) \
__ALTERNATIVE_CFG(old_c, new_c, vendor_id, errata_id, IS_ENABLED(CONFIG_k))

View File

@@ -30,9 +30,8 @@
#define BPF_JIT_REGION_SIZE (SZ_128M)
#ifdef CONFIG_64BIT
/* KASLR should leave at least 128MB for BPF after the kernel */
#define BPF_JIT_REGION_START PFN_ALIGN((unsigned long)&_end)
#define BPF_JIT_REGION_END (BPF_JIT_REGION_START + BPF_JIT_REGION_SIZE)
#define BPF_JIT_REGION_START (BPF_JIT_REGION_END - BPF_JIT_REGION_SIZE)
#define BPF_JIT_REGION_END (MODULES_END)
#else
#define BPF_JIT_REGION_START (PAGE_OFFSET - BPF_JIT_REGION_SIZE)
#define BPF_JIT_REGION_END (VMALLOC_END)

View File

@@ -86,8 +86,13 @@ static void do_trap_error(struct pt_regs *regs, int signo, int code,
}
}
#if defined (CONFIG_XIP_KERNEL) && defined (CONFIG_RISCV_ERRATA_ALTERNATIVE)
#define __trap_section __section(".xip.traps")
#else
#define __trap_section
#endif
#define DO_ERROR_INFO(name, signo, code, str) \
asmlinkage __visible void name(struct pt_regs *regs) \
asmlinkage __visible __trap_section void name(struct pt_regs *regs) \
{ \
do_trap_error(regs, signo, code, regs->epc, "Oops - " str); \
}
@@ -111,7 +116,7 @@ DO_ERROR_INFO(do_trap_store_misaligned,
int handle_misaligned_load(struct pt_regs *regs);
int handle_misaligned_store(struct pt_regs *regs);
asmlinkage void do_trap_load_misaligned(struct pt_regs *regs)
asmlinkage void __trap_section do_trap_load_misaligned(struct pt_regs *regs)
{
if (!handle_misaligned_load(regs))
return;
@@ -119,7 +124,7 @@ asmlinkage void do_trap_load_misaligned(struct pt_regs *regs)
"Oops - load address misaligned");
}
asmlinkage void do_trap_store_misaligned(struct pt_regs *regs)
asmlinkage void __trap_section do_trap_store_misaligned(struct pt_regs *regs)
{
if (!handle_misaligned_store(regs))
return;
@@ -146,7 +151,7 @@ static inline unsigned long get_break_insn_length(unsigned long pc)
return GET_INSN_LENGTH(insn);
}
asmlinkage __visible void do_trap_break(struct pt_regs *regs)
asmlinkage __visible __trap_section void do_trap_break(struct pt_regs *regs)
{
#ifdef CONFIG_KPROBES
if (kprobe_single_step_handler(regs))

View File

@@ -99,9 +99,22 @@ SECTIONS
}
PERCPU_SECTION(L1_CACHE_BYTES)
. = ALIGN(PAGE_SIZE);
. = ALIGN(8);
.alternative : {
__alt_start = .;
*(.alternative)
__alt_end = .;
}
__init_end = .;
. = ALIGN(16);
.xip.traps : {
__xip_traps_start = .;
*(.xip.traps)
__xip_traps_end = .;
}
. = ALIGN(PAGE_SIZE);
.sdata : {
__global_pointer$ = . + 0x800;
*(.sdata*)

View File

@@ -169,7 +169,7 @@ static void __init kasan_shallow_populate(void *start, void *end)
void __init kasan_init(void)
{
phys_addr_t _start, _end;
phys_addr_t p_start, p_end;
u64 i;
/*
@@ -189,9 +189,9 @@ void __init kasan_init(void)
(void *)kasan_mem_to_shadow((void *)VMALLOC_END));
/* Populate the linear mapping */
for_each_mem_range(i, &_start, &_end) {
void *start = (void *)__va(_start);
void *end = (void *)__va(_end);
for_each_mem_range(i, &p_start, &p_end) {
void *start = (void *)__va(p_start);
void *end = (void *)__va(p_end);
if (start >= end)
break;
@@ -201,7 +201,7 @@ void __init kasan_init(void)
/* Populate kernel, BPF, modules mapping */
kasan_populate(kasan_mem_to_shadow((const void *)MODULES_VADDR),
kasan_mem_to_shadow((const void *)BPF_JIT_REGION_END));
kasan_mem_to_shadow((const void *)MODULES_VADDR + SZ_2G));
for (i = 0; i < PTRS_PER_PTE; i++)
set_pte(&kasan_early_shadow_pte[i],

View File

@@ -91,12 +91,16 @@ struct stack_frame {
CALL_ARGS_4(arg1, arg2, arg3, arg4); \
register unsigned long r4 asm("6") = (unsigned long)(arg5)
#define CALL_FMT_0 "=&d" (r2) :
#define CALL_FMT_1 "+&d" (r2) :
#define CALL_FMT_2 CALL_FMT_1 "d" (r3),
#define CALL_FMT_3 CALL_FMT_2 "d" (r4),
#define CALL_FMT_4 CALL_FMT_3 "d" (r5),
#define CALL_FMT_5 CALL_FMT_4 "d" (r6),
/*
* To keep this simple mark register 2-6 as being changed (volatile)
* by the called function, even though register 6 is saved/nonvolatile.
*/
#define CALL_FMT_0 "=&d" (r2)
#define CALL_FMT_1 "+&d" (r2)
#define CALL_FMT_2 CALL_FMT_1, "+&d" (r3)
#define CALL_FMT_3 CALL_FMT_2, "+&d" (r4)
#define CALL_FMT_4 CALL_FMT_3, "+&d" (r5)
#define CALL_FMT_5 CALL_FMT_4, "+&d" (r6)
#define CALL_CLOBBER_5 "0", "1", "14", "cc", "memory"
#define CALL_CLOBBER_4 CALL_CLOBBER_5
@@ -118,7 +122,7 @@ struct stack_frame {
" brasl 14,%[_fn]\n" \
" la 15,0(%[_prev])\n" \
: [_prev] "=&a" (prev), CALL_FMT_##nr \
[_stack] "R" (stack), \
: [_stack] "R" (stack), \
[_bc] "i" (offsetof(struct stack_frame, back_chain)), \
[_frame] "d" (frame), \
[_fn] "X" (fn) : CALL_CLOBBER_##nr); \

View File

@@ -418,6 +418,7 @@ ENTRY(\name)
xgr %r6,%r6
xgr %r7,%r7
xgr %r10,%r10
xc __PT_FLAGS(8,%r11),__PT_FLAGS(%r11)
mvc __PT_R8(64,%r11),__LC_SAVE_AREA_ASYNC
stmg %r8,%r9,__PT_PSW(%r11)
tm %r8,0x0001 # coming from user space?
@@ -651,9 +652,9 @@ ENDPROC(stack_overflow)
.Lcleanup_sie_mcck:
larl %r13,.Lsie_entry
slgr %r9,%r13
larl %r13,.Lsie_skip
lghi %r13,.Lsie_skip - .Lsie_entry
clgr %r9,%r13
jh .Lcleanup_sie_int
jhe .Lcleanup_sie_int
oi __LC_CPU_FLAGS+7, _CIF_MCCK_GUEST
.Lcleanup_sie_int:
BPENTER __SF_SIE_FLAGS(%r15),(_TIF_ISOLATE_BP|_TIF_ISOLATE_BP_GUEST)

View File

@@ -512,7 +512,6 @@ void arch_do_signal_or_restart(struct pt_regs *regs, bool has_signal)
/* No handlers present - check for system call restart */
clear_pt_regs_flag(regs, PIF_SYSCALL);
clear_pt_regs_flag(regs, PIF_SYSCALL_RESTART);
if (current->thread.system_call) {
regs->int_code = current->thread.system_call;
switch (regs->gprs[2]) {

View File

@@ -66,7 +66,10 @@ static void cpu_group_map(cpumask_t *dst, struct mask_info *info, unsigned int c
{
static cpumask_t mask;
cpumask_copy(&mask, cpumask_of(cpu));
cpumask_clear(&mask);
if (!cpu_online(cpu))
goto out;
cpumask_set_cpu(cpu, &mask);
switch (topology_mode) {
case TOPOLOGY_MODE_HW:
while (info) {
@@ -83,10 +86,10 @@ static void cpu_group_map(cpumask_t *dst, struct mask_info *info, unsigned int c
default:
fallthrough;
case TOPOLOGY_MODE_SINGLE:
cpumask_copy(&mask, cpumask_of(cpu));
break;
}
cpumask_and(&mask, &mask, cpu_online_mask);
out:
cpumask_copy(dst, &mask);
}
@@ -95,7 +98,10 @@ static void cpu_thread_map(cpumask_t *dst, unsigned int cpu)
static cpumask_t mask;
int i;
cpumask_copy(&mask, cpumask_of(cpu));
cpumask_clear(&mask);
if (!cpu_online(cpu))
goto out;
cpumask_set_cpu(cpu, &mask);
if (topology_mode != TOPOLOGY_MODE_HW)
goto out;
cpu -= cpu % (smp_cpu_mtid + 1);

View File

@@ -140,7 +140,12 @@ static int kvm_s390_pv_alloc_vm(struct kvm *kvm)
/* Allocate variable storage */
vlen = ALIGN(virt * ((npages * PAGE_SIZE) / HPAGE_SIZE), PAGE_SIZE);
vlen += uv_info.guest_virt_base_stor_len;
kvm->arch.pv.stor_var = vzalloc(vlen);
/*
* The Create Secure Configuration Ultravisor Call does not support
* using large pages for the virtual memory area.
* This is a hardware limitation.
*/
kvm->arch.pv.stor_var = vmalloc_no_huge(vlen);
if (!kvm->arch.pv.stor_var)
goto out_err;
return 0;

View File

@@ -200,8 +200,9 @@ endif
KBUILD_LDFLAGS += -m elf_$(UTS_MACHINE)
ifdef CONFIG_LTO_CLANG
KBUILD_LDFLAGS += -plugin-opt=-code-model=kernel \
-plugin-opt=-stack-alignment=$(if $(CONFIG_X86_32),4,8)
ifeq ($(shell test $(CONFIG_LLD_VERSION) -lt 130000; echo $$?),0)
KBUILD_LDFLAGS += -plugin-opt=-stack-alignment=$(if $(CONFIG_X86_32),4,8)
endif
endif
ifdef CONFIG_X86_NEED_RELOCS

View File

@@ -130,8 +130,8 @@ static noinstr bool __do_fast_syscall_32(struct pt_regs *regs)
/* User code screwed up. */
regs->ax = -EFAULT;
instrumentation_end();
local_irq_disable();
instrumentation_end();
irqentry_exit_to_user_mode(regs);
return false;
}
@@ -269,15 +269,16 @@ __visible noinstr void xen_pv_evtchn_do_upcall(struct pt_regs *regs)
irqentry_state_t state = irqentry_enter(regs);
bool inhcall;
instrumentation_begin();
run_sysvec_on_irqstack_cond(__xen_pv_evtchn_do_upcall, regs);
inhcall = get_and_clear_inhcall();
if (inhcall && !WARN_ON_ONCE(state.exit_rcu)) {
instrumentation_begin();
irqentry_exit_cond_resched();
instrumentation_end();
restore_inhcall(inhcall);
} else {
instrumentation_end();
irqentry_exit(regs, state);
}
}

View File

@@ -731,7 +731,8 @@ void reserve_lbr_buffers(void)
if (!kmem_cache || cpuc->lbr_xsave)
continue;
cpuc->lbr_xsave = kmem_cache_alloc_node(kmem_cache, GFP_KERNEL,
cpuc->lbr_xsave = kmem_cache_alloc_node(kmem_cache,
GFP_KERNEL | __GFP_ZERO,
cpu_to_node(cpu));
}
}

View File

@@ -1406,6 +1406,8 @@ static int snbep_pci2phy_map_init(int devid, int nodeid_loc, int idmap_loc, bool
die_id = i;
else
die_id = topology_phys_to_logical_pkg(i);
if (die_id < 0)
die_id = -ENODEV;
map->pbus_to_dieid[bus] = die_id;
break;
}
@@ -1452,14 +1454,14 @@ static int snbep_pci2phy_map_init(int devid, int nodeid_loc, int idmap_loc, bool
i = -1;
if (reverse) {
for (bus = 255; bus >= 0; bus--) {
if (map->pbus_to_dieid[bus] >= 0)
if (map->pbus_to_dieid[bus] != -1)
i = map->pbus_to_dieid[bus];
else
map->pbus_to_dieid[bus] = i;
}
} else {
for (bus = 0; bus <= 255; bus++) {
if (map->pbus_to_dieid[bus] >= 0)
if (map->pbus_to_dieid[bus] != -1)
i = map->pbus_to_dieid[bus];
else
map->pbus_to_dieid[bus] = i;
@@ -5097,9 +5099,10 @@ static struct intel_uncore_type icx_uncore_m2m = {
.perf_ctr = SNR_M2M_PCI_PMON_CTR0,
.event_ctl = SNR_M2M_PCI_PMON_CTL0,
.event_mask = SNBEP_PMON_RAW_EVENT_MASK,
.event_mask_ext = SNR_M2M_PCI_PMON_UMASK_EXT,
.box_ctl = SNR_M2M_PCI_PMON_BOX_CTL,
.ops = &snr_m2m_uncore_pci_ops,
.format_group = &skx_uncore_format_group,
.format_group = &snr_m2m_uncore_format_group,
};
static struct attribute *icx_upi_uncore_formats_attr[] = {

View File

@@ -204,6 +204,14 @@ static inline void copy_fxregs_to_kernel(struct fpu *fpu)
asm volatile("fxsaveq %[fx]" : [fx] "=m" (fpu->state.fxsave));
}
static inline void fxsave(struct fxregs_state *fx)
{
if (IS_ENABLED(CONFIG_X86_32))
asm volatile( "fxsave %[fx]" : [fx] "=m" (*fx));
else
asm volatile("fxsaveq %[fx]" : [fx] "=m" (*fx));
}
/* These macros all use (%edi)/(%rdi) as the single memory argument. */
#define XSAVE ".byte " REX_PREFIX "0x0f,0xae,0x27"
#define XSAVEOPT ".byte " REX_PREFIX "0x0f,0xae,0x37"
@@ -268,28 +276,6 @@ static inline void copy_fxregs_to_kernel(struct fpu *fpu)
: "D" (st), "m" (*st), "a" (lmask), "d" (hmask) \
: "memory")
/*
* This function is called only during boot time when x86 caps are not set
* up and alternative can not be used yet.
*/
static inline void copy_xregs_to_kernel_booting(struct xregs_state *xstate)
{
u64 mask = xfeatures_mask_all;
u32 lmask = mask;
u32 hmask = mask >> 32;
int err;
WARN_ON(system_state != SYSTEM_BOOTING);
if (boot_cpu_has(X86_FEATURE_XSAVES))
XSTATE_OP(XSAVES, xstate, lmask, hmask, err);
else
XSTATE_OP(XSAVE, xstate, lmask, hmask, err);
/* We should never fault when copying to a kernel buffer: */
WARN_ON_FPU(err);
}
/*
* This function is called only during boot time when x86 caps are not set
* up and alternative can not be used yet.
@@ -578,10 +564,17 @@ static inline void switch_fpu_finish(struct fpu *new_fpu)
* PKRU state is switched eagerly because it needs to be valid before we
* return to userland e.g. for a copy_to_user() operation.
*/
if (current->mm) {
if (!(current->flags & PF_KTHREAD)) {
/*
* If the PKRU bit in xsave.header.xfeatures is not set,
* then the PKRU component was in init state, which means
* XRSTOR will set PKRU to 0. If the bit is not set then
* get_xsave_addr() will return NULL because the PKRU value
* in memory is not valid. This means pkru_val has to be
* set to 0 and not to init_pkru_value.
*/
pk = get_xsave_addr(&new_fpu->state.xsave, XFEATURE_PKRU);
if (pk)
pkru_val = pk->pkru;
pkru_val = pk ? pk->pkru : 0;
}
__write_pkru(pkru_val);
}

View File

@@ -75,7 +75,7 @@ void copy_page(void *to, void *from);
*
* With page table isolation enabled, we map the LDT in ... [stay tuned]
*/
static inline unsigned long task_size_max(void)
static __always_inline unsigned long task_size_max(void)
{
unsigned long ret;

View File

@@ -63,7 +63,7 @@ static inline unsigned int nmi_perfctr_msr_to_bit(unsigned int msr)
case 15:
return msr - MSR_P4_BPU_PERFCTR0;
}
fallthrough;
break;
case X86_VENDOR_ZHAOXIN:
case X86_VENDOR_CENTAUR:
return msr - MSR_ARCH_PERFMON_PERFCTR0;
@@ -96,7 +96,7 @@ static inline unsigned int nmi_evntsel_msr_to_bit(unsigned int msr)
case 15:
return msr - MSR_P4_BSU_ESCR0;
}
fallthrough;
break;
case X86_VENDOR_ZHAOXIN:
case X86_VENDOR_CENTAUR:
return msr - MSR_ARCH_PERFMON_EVENTSEL0;

View File

@@ -212,6 +212,7 @@ static int sgx_vepc_release(struct inode *inode, struct file *file)
list_splice_tail(&secs_pages, &zombie_secs_pages);
mutex_unlock(&zombie_secs_pages_lock);
xa_destroy(&vepc->page_array);
kfree(vepc);
return 0;

View File

@@ -221,28 +221,18 @@ sanitize_restored_user_xstate(union fpregs_state *state,
if (use_xsave()) {
/*
* Note: we don't need to zero the reserved bits in the
* xstate_header here because we either didn't copy them at all,
* or we checked earlier that they aren't set.
* Clear all feature bits which are not set in
* user_xfeatures and clear all extended features
* for fx_only mode.
*/
u64 mask = fx_only ? XFEATURE_MASK_FPSSE : user_xfeatures;
/*
* 'user_xfeatures' might have bits clear which are
* set in header->xfeatures. This represents features that
* were in init state prior to a signal delivery, and need
* to be reset back to the init state. Clear any user
* feature bits which are set in the kernel buffer to get
* them back to the init state.
*
* Supervisor state is unchanged by input from userspace.
* Ensure supervisor state bits stay set and supervisor
* state is not modified.
* Supervisor state has to be preserved. The sigframe
* restore can only modify user features, i.e. @mask
* cannot contain them.
*/
if (fx_only)
header->xfeatures = XFEATURE_MASK_FPSSE;
else
header->xfeatures &= user_xfeatures |
xfeatures_mask_supervisor();
header->xfeatures &= mask | xfeatures_mask_supervisor();
}
if (use_fxsr()) {
@@ -307,13 +297,17 @@ static int __fpu__restore_sig(void __user *buf, void __user *buf_fx, int size)
return 0;
}
if (!access_ok(buf, size))
return -EACCES;
if (!access_ok(buf, size)) {
ret = -EACCES;
goto out;
}
if (!static_cpu_has(X86_FEATURE_FPU))
return fpregs_soft_set(current, NULL,
0, sizeof(struct user_i387_ia32_struct),
NULL, buf) != 0;
if (!static_cpu_has(X86_FEATURE_FPU)) {
ret = fpregs_soft_set(current, NULL, 0,
sizeof(struct user_i387_ia32_struct),
NULL, buf);
goto out;
}
if (use_xsave()) {
struct _fpx_sw_bytes fx_sw_user;
@@ -369,6 +363,25 @@ static int __fpu__restore_sig(void __user *buf, void __user *buf_fx, int size)
fpregs_unlock();
return 0;
}
/*
* The above did an FPU restore operation, restricted to
* the user portion of the registers, and failed, but the
* microcode might have modified the FPU registers
* nevertheless.
*
* If the FPU registers do not belong to current, then
* invalidate the FPU register state otherwise the task might
* preempt current and return to user space with corrupted
* FPU registers.
*
* In case current owns the FPU registers then no further
* action is required. The fixup below will handle it
* correctly.
*/
if (test_thread_flag(TIF_NEED_FPU_LOAD))
__cpu_invalidate_fpregs_state();
fpregs_unlock();
} else {
/*
@@ -377,7 +390,7 @@ static int __fpu__restore_sig(void __user *buf, void __user *buf_fx, int size)
*/
ret = __copy_from_user(&env, buf, sizeof(env));
if (ret)
goto err_out;
goto out;
envp = &env;
}
@@ -405,16 +418,9 @@ static int __fpu__restore_sig(void __user *buf, void __user *buf_fx, int size)
if (use_xsave() && !fx_only) {
u64 init_bv = xfeatures_mask_user() & ~user_xfeatures;
if (using_compacted_format()) {
ret = copy_user_to_xstate(&fpu->state.xsave, buf_fx);
} else {
ret = __copy_from_user(&fpu->state.xsave, buf_fx, state_size);
if (!ret && state_size > offsetof(struct xregs_state, header))
ret = validate_user_xstate_header(&fpu->state.xsave.header);
}
ret = copy_user_to_xstate(&fpu->state.xsave, buf_fx);
if (ret)
goto err_out;
goto out;
sanitize_restored_user_xstate(&fpu->state, envp, user_xfeatures,
fx_only);
@@ -434,7 +440,7 @@ static int __fpu__restore_sig(void __user *buf, void __user *buf_fx, int size)
ret = __copy_from_user(&fpu->state.fxsave, buf_fx, state_size);
if (ret) {
ret = -EFAULT;
goto err_out;
goto out;
}
sanitize_restored_user_xstate(&fpu->state, envp, user_xfeatures,
@@ -452,7 +458,7 @@ static int __fpu__restore_sig(void __user *buf, void __user *buf_fx, int size)
} else {
ret = __copy_from_user(&fpu->state.fsave, buf_fx, state_size);
if (ret)
goto err_out;
goto out;
fpregs_lock();
ret = copy_kernel_to_fregs_err(&fpu->state.fsave);
@@ -463,7 +469,7 @@ static int __fpu__restore_sig(void __user *buf, void __user *buf_fx, int size)
fpregs_deactivate(fpu);
fpregs_unlock();
err_out:
out:
if (ret)
fpu__clear_user_states(fpu);
return ret;

View File

@@ -440,6 +440,25 @@ static void __init print_xstate_offset_size(void)
}
}
/*
* All supported features have either init state all zeros or are
* handled in setup_init_fpu() individually. This is an explicit
* feature list and does not use XFEATURE_MASK*SUPPORTED to catch
* newly added supported features at build time and make people
* actually look at the init state for the new feature.
*/
#define XFEATURES_INIT_FPSTATE_HANDLED \
(XFEATURE_MASK_FP | \
XFEATURE_MASK_SSE | \
XFEATURE_MASK_YMM | \
XFEATURE_MASK_OPMASK | \
XFEATURE_MASK_ZMM_Hi256 | \
XFEATURE_MASK_Hi16_ZMM | \
XFEATURE_MASK_PKRU | \
XFEATURE_MASK_BNDREGS | \
XFEATURE_MASK_BNDCSR | \
XFEATURE_MASK_PASID)
/*
* setup the xstate image representing the init state
*/
@@ -447,6 +466,10 @@ static void __init setup_init_fpu_buf(void)
{
static int on_boot_cpu __initdata = 1;
BUILD_BUG_ON((XFEATURE_MASK_USER_SUPPORTED |
XFEATURE_MASK_SUPERVISOR_SUPPORTED) !=
XFEATURES_INIT_FPSTATE_HANDLED);
WARN_ON_FPU(!on_boot_cpu);
on_boot_cpu = 0;
@@ -466,10 +489,22 @@ static void __init setup_init_fpu_buf(void)
copy_kernel_to_xregs_booting(&init_fpstate.xsave);
/*
* Dump the init state again. This is to identify the init state
* of any feature which is not represented by all zero's.
* All components are now in init state. Read the state back so
* that init_fpstate contains all non-zero init state. This only
* works with XSAVE, but not with XSAVEOPT and XSAVES because
* those use the init optimization which skips writing data for
* components in init state.
*
* XSAVE could be used, but that would require to reshuffle the
* data when XSAVES is available because XSAVES uses xstate
* compaction. But doing so is a pointless exercise because most
* components have an all zeros init state except for the legacy
* ones (FP and SSE). Those can be saved with FXSAVE into the
* legacy area. Adding new features requires to ensure that init
* state is all zeroes or if not to add the necessary handling
* here.
*/
copy_xregs_to_kernel_booting(&init_fpstate.xsave);
fxsave(&init_fpstate.fxsave);
}
static int xfeature_uncompacted_offset(int xfeature_nr)

View File

@@ -655,6 +655,7 @@ static int __do_cpuid_func_emulated(struct kvm_cpuid_array *array, u32 func)
if (kvm_cpu_cap_has(X86_FEATURE_RDTSCP))
entry->ecx = F(RDPID);
++array->nent;
break;
default:
break;
}

View File

@@ -1410,6 +1410,9 @@ int kvm_lapic_reg_read(struct kvm_lapic *apic, u32 offset, int len,
if (!apic_x2apic_mode(apic))
valid_reg_mask |= APIC_REG_MASK(APIC_ARBPRI);
if (alignment + len > 4)
return 1;
if (offset > 0x3f0 || !(valid_reg_mask & APIC_REG_MASK(offset)))
return 1;
@@ -1494,6 +1497,15 @@ static void limit_periodic_timer_frequency(struct kvm_lapic *apic)
static void cancel_hv_timer(struct kvm_lapic *apic);
static void cancel_apic_timer(struct kvm_lapic *apic)
{
hrtimer_cancel(&apic->lapic_timer.timer);
preempt_disable();
if (apic->lapic_timer.hv_timer_in_use)
cancel_hv_timer(apic);
preempt_enable();
}
static void apic_update_lvtt(struct kvm_lapic *apic)
{
u32 timer_mode = kvm_lapic_get_reg(apic, APIC_LVTT) &
@@ -1502,11 +1514,7 @@ static void apic_update_lvtt(struct kvm_lapic *apic)
if (apic->lapic_timer.timer_mode != timer_mode) {
if (apic_lvtt_tscdeadline(apic) != (timer_mode ==
APIC_LVT_TIMER_TSCDEADLINE)) {
hrtimer_cancel(&apic->lapic_timer.timer);
preempt_disable();
if (apic->lapic_timer.hv_timer_in_use)
cancel_hv_timer(apic);
preempt_enable();
cancel_apic_timer(apic);
kvm_lapic_set_reg(apic, APIC_TMICT, 0);
apic->lapic_timer.period = 0;
apic->lapic_timer.tscdeadline = 0;
@@ -2092,7 +2100,7 @@ int kvm_lapic_reg_write(struct kvm_lapic *apic, u32 reg, u32 val)
if (apic_lvtt_tscdeadline(apic))
break;
hrtimer_cancel(&apic->lapic_timer.timer);
cancel_apic_timer(apic);
kvm_lapic_set_reg(apic, APIC_TMICT, val);
start_apic_timer(apic);
break;

View File

@@ -4739,9 +4739,33 @@ static void init_kvm_softmmu(struct kvm_vcpu *vcpu)
context->inject_page_fault = kvm_inject_page_fault;
}
static union kvm_mmu_role kvm_calc_nested_mmu_role(struct kvm_vcpu *vcpu)
{
union kvm_mmu_role role = kvm_calc_shadow_root_page_role_common(vcpu, false);
/*
* Nested MMUs are used only for walking L2's gva->gpa, they never have
* shadow pages of their own and so "direct" has no meaning. Set it
* to "true" to try to detect bogus usage of the nested MMU.
*/
role.base.direct = true;
if (!is_paging(vcpu))
role.base.level = 0;
else if (is_long_mode(vcpu))
role.base.level = is_la57_mode(vcpu) ? PT64_ROOT_5LEVEL :
PT64_ROOT_4LEVEL;
else if (is_pae(vcpu))
role.base.level = PT32E_ROOT_LEVEL;
else
role.base.level = PT32_ROOT_LEVEL;
return role;
}
static void init_kvm_nested_mmu(struct kvm_vcpu *vcpu)
{
union kvm_mmu_role new_role = kvm_calc_mmu_role_common(vcpu, false);
union kvm_mmu_role new_role = kvm_calc_nested_mmu_role(vcpu);
struct kvm_mmu *g_context = &vcpu->arch.nested_mmu;
if (new_role.as_u64 == g_context->mmu_role.as_u64)

View File

@@ -90,8 +90,8 @@ struct guest_walker {
gpa_t pte_gpa[PT_MAX_FULL_LEVELS];
pt_element_t __user *ptep_user[PT_MAX_FULL_LEVELS];
bool pte_writable[PT_MAX_FULL_LEVELS];
unsigned pt_access;
unsigned pte_access;
unsigned int pt_access[PT_MAX_FULL_LEVELS];
unsigned int pte_access;
gfn_t gfn;
struct x86_exception fault;
};
@@ -418,13 +418,15 @@ retry_walk:
}
walker->ptes[walker->level - 1] = pte;
/* Convert to ACC_*_MASK flags for struct guest_walker. */
walker->pt_access[walker->level - 1] = FNAME(gpte_access)(pt_access ^ walk_nx_mask);
} while (!is_last_gpte(mmu, walker->level, pte));
pte_pkey = FNAME(gpte_pkeys)(vcpu, pte);
accessed_dirty = have_ad ? pte_access & PT_GUEST_ACCESSED_MASK : 0;
/* Convert to ACC_*_MASK flags for struct guest_walker. */
walker->pt_access = FNAME(gpte_access)(pt_access ^ walk_nx_mask);
walker->pte_access = FNAME(gpte_access)(pte_access ^ walk_nx_mask);
errcode = permission_fault(vcpu, mmu, walker->pte_access, pte_pkey, access);
if (unlikely(errcode))
@@ -463,7 +465,8 @@ retry_walk:
}
pgprintk("%s: pte %llx pte_access %x pt_access %x\n",
__func__, (u64)pte, walker->pte_access, walker->pt_access);
__func__, (u64)pte, walker->pte_access,
walker->pt_access[walker->level - 1]);
return 1;
error:
@@ -643,7 +646,7 @@ static int FNAME(fetch)(struct kvm_vcpu *vcpu, gpa_t addr,
bool huge_page_disallowed = exec && nx_huge_page_workaround_enabled;
struct kvm_mmu_page *sp = NULL;
struct kvm_shadow_walk_iterator it;
unsigned direct_access, access = gw->pt_access;
unsigned int direct_access, access;
int top_level, level, req_level, ret;
gfn_t base_gfn = gw->gfn;
@@ -675,6 +678,7 @@ static int FNAME(fetch)(struct kvm_vcpu *vcpu, gpa_t addr,
sp = NULL;
if (!is_shadow_present_pte(*it.sptep)) {
table_gfn = gw->table_gfn[it.level - 2];
access = gw->pt_access[it.level - 2];
sp = kvm_mmu_get_page(vcpu, table_gfn, addr, it.level-1,
false, access);
}

View File

@@ -221,7 +221,7 @@ static u64 *avic_get_physical_id_entry(struct kvm_vcpu *vcpu,
return &avic_physical_id_table[index];
}
/**
/*
* Note:
* AVIC hardware walks the nested page table to check permissions,
* but does not use the SPA address specified in the leaf page
@@ -764,7 +764,7 @@ out:
return ret;
}
/**
/*
* Note:
* The HW cannot support posting multicast/broadcast
* interrupts to a vCPU. So, we still use legacy interrupt
@@ -1005,7 +1005,7 @@ void avic_vcpu_put(struct kvm_vcpu *vcpu)
WRITE_ONCE(*(svm->avic_physical_id_cache), entry);
}
/**
/*
* This function is called during VCPU halt/unhalt.
*/
static void avic_set_running(struct kvm_vcpu *vcpu, bool is_run)

View File

@@ -199,9 +199,19 @@ static void sev_asid_free(struct kvm_sev_info *sev)
sev->misc_cg = NULL;
}
static void sev_unbind_asid(struct kvm *kvm, unsigned int handle)
static void sev_decommission(unsigned int handle)
{
struct sev_data_decommission decommission;
if (!handle)
return;
decommission.handle = handle;
sev_guest_decommission(&decommission, NULL);
}
static void sev_unbind_asid(struct kvm *kvm, unsigned int handle)
{
struct sev_data_deactivate deactivate;
if (!handle)
@@ -214,9 +224,7 @@ static void sev_unbind_asid(struct kvm *kvm, unsigned int handle)
sev_guest_deactivate(&deactivate, NULL);
up_read(&sev_deactivate_lock);
/* decommission handle */
decommission.handle = handle;
sev_guest_decommission(&decommission, NULL);
sev_decommission(handle);
}
static int sev_guest_init(struct kvm *kvm, struct kvm_sev_cmd *argp)
@@ -341,8 +349,10 @@ static int sev_launch_start(struct kvm *kvm, struct kvm_sev_cmd *argp)
/* Bind ASID to this guest */
ret = sev_bind_asid(kvm, start.handle, error);
if (ret)
if (ret) {
sev_decommission(start.handle);
goto e_free_session;
}
/* return handle to userspace */
params.handle = start.handle;
@@ -1103,10 +1113,9 @@ __sev_send_start_query_session_length(struct kvm *kvm, struct kvm_sev_cmd *argp,
struct sev_data_send_start data;
int ret;
memset(&data, 0, sizeof(data));
data.handle = sev->handle;
ret = sev_issue_cmd(kvm, SEV_CMD_SEND_START, &data, &argp->error);
if (ret < 0)
return ret;
params->session_len = data.session_len;
if (copy_to_user((void __user *)(uintptr_t)argp->data, params,
@@ -1215,10 +1224,9 @@ __sev_send_update_data_query_lengths(struct kvm *kvm, struct kvm_sev_cmd *argp,
struct sev_data_send_update_data data;
int ret;
memset(&data, 0, sizeof(data));
data.handle = sev->handle;
ret = sev_issue_cmd(kvm, SEV_CMD_SEND_UPDATE_DATA, &data, &argp->error);
if (ret < 0)
return ret;
params->hdr_len = data.hdr_len;
params->trans_len = data.trans_len;

View File

@@ -1550,16 +1550,16 @@ TRACE_EVENT(kvm_nested_vmenter_failed,
TP_ARGS(msg, err),
TP_STRUCT__entry(
__field(const char *, msg)
__string(msg, msg)
__field(u32, err)
),
TP_fast_assign(
__entry->msg = msg;
__assign_str(msg, msg);
__entry->err = err;
),
TP_printk("%s%s", __entry->msg, !__entry->err ? "" :
TP_printk("%s%s", __get_str(msg), !__entry->err ? "" :
__print_symbolic(__entry->err, VMX_VMENTER_INSTRUCTION_ERRORS))
);

View File

@@ -6247,6 +6247,7 @@ void vmx_set_virtual_apic_mode(struct kvm_vcpu *vcpu)
switch (kvm_get_apic_mode(vcpu)) {
case LAPIC_MODE_INVALID:
WARN_ONCE(true, "Invalid local APIC state");
break;
case LAPIC_MODE_DISABLED:
break;
case LAPIC_MODE_XAPIC:

View File

@@ -3072,6 +3072,19 @@ static void kvm_vcpu_flush_tlb_all(struct kvm_vcpu *vcpu)
static void kvm_vcpu_flush_tlb_guest(struct kvm_vcpu *vcpu)
{
++vcpu->stat.tlb_flush;
if (!tdp_enabled) {
/*
* A TLB flush on behalf of the guest is equivalent to
* INVPCID(all), toggling CR4.PGE, etc., which requires
* a forced sync of the shadow page tables. Unload the
* entire MMU here and the subsequent load will sync the
* shadow page tables, and also flush the TLB.
*/
kvm_mmu_unload(vcpu);
return;
}
static_call(kvm_x86_tlb_flush_guest)(vcpu);
}
@@ -3101,9 +3114,11 @@ static void record_steal_time(struct kvm_vcpu *vcpu)
* expensive IPIs.
*/
if (guest_pv_has(vcpu, KVM_FEATURE_PV_TLB_FLUSH)) {
u8 st_preempted = xchg(&st->preempted, 0);
trace_kvm_pv_tlb_flush(vcpu->vcpu_id,
st->preempted & KVM_VCPU_FLUSH_TLB);
if (xchg(&st->preempted, 0) & KVM_VCPU_FLUSH_TLB)
st_preempted & KVM_VCPU_FLUSH_TLB);
if (st_preempted & KVM_VCPU_FLUSH_TLB)
kvm_vcpu_flush_tlb_guest(vcpu);
} else {
st->preempted = 0;
@@ -7091,7 +7106,10 @@ static unsigned emulator_get_hflags(struct x86_emulate_ctxt *ctxt)
static void emulator_set_hflags(struct x86_emulate_ctxt *ctxt, unsigned emul_flags)
{
emul_to_vcpu(ctxt)->arch.hflags = emul_flags;
struct kvm_vcpu *vcpu = emul_to_vcpu(ctxt);
vcpu->arch.hflags = emul_flags;
kvm_mmu_reset_context(vcpu);
}
static int emulator_pre_leave_smm(struct x86_emulate_ctxt *ctxt,
@@ -8243,6 +8261,7 @@ void kvm_arch_exit(void)
kvm_x86_ops.hardware_enable = NULL;
kvm_mmu_module_exit();
free_percpu(user_return_msrs);
kmem_cache_destroy(x86_emulator_cache);
kmem_cache_destroy(x86_fpu_cache);
#ifdef CONFIG_KVM_XEN
static_key_deferred_flush(&kvm_xen_enabled);

View File

@@ -58,12 +58,16 @@ SYM_FUNC_START_NOALIGN(__x86_indirect_alt_call_\reg)
2: .skip 5-(2b-1b), 0x90
SYM_FUNC_END(__x86_indirect_alt_call_\reg)
STACK_FRAME_NON_STANDARD(__x86_indirect_alt_call_\reg)
SYM_FUNC_START_NOALIGN(__x86_indirect_alt_jmp_\reg)
ANNOTATE_RETPOLINE_SAFE
1: jmp *%\reg
2: .skip 5-(2b-1b), 0x90
SYM_FUNC_END(__x86_indirect_alt_jmp_\reg)
STACK_FRAME_NON_STANDARD(__x86_indirect_alt_jmp_\reg)
.endm
/*

View File

@@ -118,7 +118,9 @@ static void __ioremap_check_other(resource_size_t addr, struct ioremap_desc *des
if (!IS_ENABLED(CONFIG_EFI))
return;
if (efi_mem_type(addr) == EFI_RUNTIME_SERVICES_DATA)
if (efi_mem_type(addr) == EFI_RUNTIME_SERVICES_DATA ||
(efi_mem_type(addr) == EFI_BOOT_SERVICES_DATA &&
efi_mem_attributes(addr) & EFI_MEMORY_RUNTIME))
desc->flags |= IORES_MAP_ENCRYPTED;
}

View File

@@ -254,7 +254,13 @@ int __init numa_cleanup_meminfo(struct numa_meminfo *mi)
/* make sure all non-reserved blocks are inside the limits */
bi->start = max(bi->start, low);
bi->end = min(bi->end, high);
/* preserve info for non-RAM areas above 'max_pfn': */
if (bi->end > high) {
numa_add_memblk_to(bi->nid, high, bi->end,
&numa_reserved_meminfo);
bi->end = high;
}
/* and there's no empty block */
if (bi->start >= bi->end)

View File

@@ -779,4 +779,48 @@ DECLARE_PCI_FIXUP_RESUME(PCI_VENDOR_ID_AMD, 0x1571, pci_amd_enable_64bit_bar);
DECLARE_PCI_FIXUP_RESUME(PCI_VENDOR_ID_AMD, 0x15b1, pci_amd_enable_64bit_bar);
DECLARE_PCI_FIXUP_RESUME(PCI_VENDOR_ID_AMD, 0x1601, pci_amd_enable_64bit_bar);
#define RS690_LOWER_TOP_OF_DRAM2 0x30
#define RS690_LOWER_TOP_OF_DRAM2_VALID 0x1
#define RS690_UPPER_TOP_OF_DRAM2 0x31
#define RS690_HTIU_NB_INDEX 0xA8
#define RS690_HTIU_NB_INDEX_WR_ENABLE 0x100
#define RS690_HTIU_NB_DATA 0xAC
/*
* Some BIOS implementations support RAM above 4GB, but do not configure the
* PCI host to respond to bus master accesses for these addresses. These
* implementations set the TOP_OF_DRAM_SLOT1 register correctly, so PCI DMA
* works as expected for addresses below 4GB.
*
* Reference: "AMD RS690 ASIC Family Register Reference Guide" (pg. 2-57)
* https://www.amd.com/system/files/TechDocs/43372_rs690_rrg_3.00o.pdf
*/
static void rs690_fix_64bit_dma(struct pci_dev *pdev)
{
u32 val = 0;
phys_addr_t top_of_dram = __pa(high_memory - 1) + 1;
if (top_of_dram <= (1ULL << 32))
return;
pci_write_config_dword(pdev, RS690_HTIU_NB_INDEX,
RS690_LOWER_TOP_OF_DRAM2);
pci_read_config_dword(pdev, RS690_HTIU_NB_DATA, &val);
if (val)
return;
pci_info(pdev, "Adjusting top of DRAM to %pa for 64-bit DMA support\n", &top_of_dram);
pci_write_config_dword(pdev, RS690_HTIU_NB_INDEX,
RS690_UPPER_TOP_OF_DRAM2 | RS690_HTIU_NB_INDEX_WR_ENABLE);
pci_write_config_dword(pdev, RS690_HTIU_NB_DATA, top_of_dram >> 32);
pci_write_config_dword(pdev, RS690_HTIU_NB_INDEX,
RS690_LOWER_TOP_OF_DRAM2 | RS690_HTIU_NB_INDEX_WR_ENABLE);
pci_write_config_dword(pdev, RS690_HTIU_NB_DATA,
top_of_dram | RS690_LOWER_TOP_OF_DRAM2_VALID);
}
DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_ATI, 0x7910, rs690_fix_64bit_dma);
#endif

View File

@@ -592,8 +592,10 @@ DEFINE_IDTENTRY_RAW(xenpv_exc_debug)
DEFINE_IDTENTRY_RAW(exc_xen_unknown_trap)
{
/* This should never happen and there is no way to handle it. */
instrumentation_begin();
pr_err("Unknown trap in Xen PV mode.");
BUG();
instrumentation_end();
}
#ifdef CONFIG_X86_MCE

View File

@@ -233,7 +233,8 @@ async_xor_offs(struct page *dest, unsigned int offset,
if (submit->flags & ASYNC_TX_XOR_DROP_DST) {
src_cnt--;
src_list++;
src_offs++;
if (src_offs)
src_offs++;
}
/* wait for any prerequisite operations */

View File

@@ -330,32 +330,21 @@ static void acpi_bus_osc_negotiate_platform_control(void)
if (ACPI_FAILURE(acpi_run_osc(handle, &context)))
return;
capbuf_ret = context.ret.pointer;
if (context.ret.length <= OSC_SUPPORT_DWORD) {
kfree(context.ret.pointer);
return;
}
/*
* Now run _OSC again with query flag clear and with the caps
* supported by both the OS and the platform.
*/
capbuf[OSC_QUERY_DWORD] = 0;
capbuf[OSC_SUPPORT_DWORD] = capbuf_ret[OSC_SUPPORT_DWORD];
kfree(context.ret.pointer);
/* Now run _OSC again with query flag clear */
capbuf[OSC_QUERY_DWORD] = 0;
if (ACPI_FAILURE(acpi_run_osc(handle, &context)))
return;
capbuf_ret = context.ret.pointer;
if (context.ret.length > OSC_SUPPORT_DWORD) {
osc_sb_apei_support_acked =
capbuf_ret[OSC_SUPPORT_DWORD] & OSC_SB_APEI_SUPPORT;
osc_pc_lpi_support_confirmed =
capbuf_ret[OSC_SUPPORT_DWORD] & OSC_SB_PCLPI_SUPPORT;
osc_sb_native_usb4_support_confirmed =
capbuf_ret[OSC_SUPPORT_DWORD] & OSC_SB_NATIVE_USB4_SUPPORT;
}
osc_sb_apei_support_acked =
capbuf_ret[OSC_SUPPORT_DWORD] & OSC_SB_APEI_SUPPORT;
osc_pc_lpi_support_confirmed =
capbuf_ret[OSC_SUPPORT_DWORD] & OSC_SB_PCLPI_SUPPORT;
osc_sb_native_usb4_support_confirmed =
capbuf_ret[OSC_SUPPORT_DWORD] & OSC_SB_NATIVE_USB4_SUPPORT;
kfree(context.ret.pointer);
}

View File

@@ -1009,10 +1009,8 @@ static void acpi_sleep_hibernate_setup(void)
return;
acpi_get_table(ACPI_SIG_FACS, 1, (struct acpi_table_header **)&facs);
if (facs) {
if (facs)
s4_hardware_signature = facs->hardware_signature;
acpi_put_table((struct acpi_table_header *)facs);
}
}
#else /* !CONFIG_HIBERNATION */
static inline void acpi_sleep_hibernate_setup(void) {}

View File

@@ -1045,7 +1045,15 @@ int device_add_software_node(struct device *dev, const struct software_node *nod
}
set_secondary_fwnode(dev, &swnode->fwnode);
software_node_notify(dev, KOBJ_ADD);
/*
* If the device has been fully registered by the time this function is
* called, software_node_notify() must be called separately so that the
* symlinks get created and the reference count of the node is kept in
* balance.
*/
if (device_is_registered(dev))
software_node_notify(dev, KOBJ_ADD);
return 0;
}
@@ -1065,7 +1073,8 @@ void device_remove_software_node(struct device *dev)
if (!swnode)
return;
software_node_notify(dev, KOBJ_REMOVE);
if (device_is_registered(dev))
software_node_notify(dev, KOBJ_REMOVE);
set_secondary_fwnode(dev, NULL);
kobject_put(&swnode->kobj);
}
@@ -1119,8 +1128,7 @@ int software_node_notify(struct device *dev, unsigned long action)
switch (action) {
case KOBJ_ADD:
ret = sysfs_create_link_nowarn(&dev->kobj, &swnode->kobj,
"software_node");
ret = sysfs_create_link(&dev->kobj, &swnode->kobj, "software_node");
if (ret)
break;

View File

@@ -1879,29 +1879,18 @@ static int lo_compat_ioctl(struct block_device *bdev, fmode_t mode,
static int lo_open(struct block_device *bdev, fmode_t mode)
{
struct loop_device *lo;
struct loop_device *lo = bdev->bd_disk->private_data;
int err;
/*
* take loop_ctl_mutex to protect lo pointer from race with
* loop_control_ioctl(LOOP_CTL_REMOVE), however, to reduce contention
* release it prior to updating lo->lo_refcnt.
*/
err = mutex_lock_killable(&loop_ctl_mutex);
if (err)
return err;
lo = bdev->bd_disk->private_data;
if (!lo) {
mutex_unlock(&loop_ctl_mutex);
return -ENXIO;
}
err = mutex_lock_killable(&lo->lo_mutex);
mutex_unlock(&loop_ctl_mutex);
if (err)
return err;
atomic_inc(&lo->lo_refcnt);
if (lo->lo_state == Lo_deleting)
err = -ENXIO;
else
atomic_inc(&lo->lo_refcnt);
mutex_unlock(&lo->lo_mutex);
return 0;
return err;
}
static void lo_release(struct gendisk *disk, fmode_t mode)
@@ -2285,7 +2274,7 @@ static long loop_control_ioctl(struct file *file, unsigned int cmd,
mutex_unlock(&lo->lo_mutex);
break;
}
lo->lo_disk->private_data = NULL;
lo->lo_state = Lo_deleting;
mutex_unlock(&lo->lo_mutex);
idr_remove(&loop_index_idr, lo->lo_number);
loop_remove(lo);

View File

@@ -22,6 +22,7 @@ enum {
Lo_unbound,
Lo_bound,
Lo_rundown,
Lo_deleting,
};
struct loop_func_table;

View File

@@ -311,8 +311,8 @@ static const struct mhi_channel_config mhi_foxconn_sdx55_channels[] = {
MHI_CHANNEL_CONFIG_DL(5, "DIAG", 32, 1),
MHI_CHANNEL_CONFIG_UL(12, "MBIM", 32, 0),
MHI_CHANNEL_CONFIG_DL(13, "MBIM", 32, 0),
MHI_CHANNEL_CONFIG_UL(32, "AT", 32, 0),
MHI_CHANNEL_CONFIG_DL(33, "AT", 32, 0),
MHI_CHANNEL_CONFIG_UL(32, "DUN", 32, 0),
MHI_CHANNEL_CONFIG_DL(33, "DUN", 32, 0),
MHI_CHANNEL_CONFIG_HW_UL(100, "IP_HW0_MBIM", 128, 2),
MHI_CHANNEL_CONFIG_HW_DL(101, "IP_HW0_MBIM", 128, 3),
};
@@ -708,7 +708,7 @@ static void mhi_pci_remove(struct pci_dev *pdev)
struct mhi_pci_device *mhi_pdev = pci_get_drvdata(pdev);
struct mhi_controller *mhi_cntrl = &mhi_pdev->mhi_cntrl;
del_timer(&mhi_pdev->health_check_timer);
del_timer_sync(&mhi_pdev->health_check_timer);
cancel_work_sync(&mhi_pdev->recovery_work);
if (test_and_clear_bit(MHI_PCI_DEV_STARTED, &mhi_pdev->status)) {
@@ -935,9 +935,43 @@ static int __maybe_unused mhi_pci_resume(struct device *dev)
return ret;
}
static int __maybe_unused mhi_pci_freeze(struct device *dev)
{
struct mhi_pci_device *mhi_pdev = dev_get_drvdata(dev);
struct mhi_controller *mhi_cntrl = &mhi_pdev->mhi_cntrl;
/* We want to stop all operations, hibernation does not guarantee that
* device will be in the same state as before freezing, especially if
* the intermediate restore kernel reinitializes MHI device with new
* context.
*/
if (test_and_clear_bit(MHI_PCI_DEV_STARTED, &mhi_pdev->status)) {
mhi_power_down(mhi_cntrl, false);
mhi_unprepare_after_power_down(mhi_cntrl);
}
return 0;
}
static int __maybe_unused mhi_pci_restore(struct device *dev)
{
struct mhi_pci_device *mhi_pdev = dev_get_drvdata(dev);
/* Reinitialize the device */
queue_work(system_long_wq, &mhi_pdev->recovery_work);
return 0;
}
static const struct dev_pm_ops mhi_pci_pm_ops = {
SET_RUNTIME_PM_OPS(mhi_pci_runtime_suspend, mhi_pci_runtime_resume, NULL)
SET_SYSTEM_SLEEP_PM_OPS(mhi_pci_suspend, mhi_pci_resume)
#ifdef CONFIG_PM_SLEEP
.suspend = mhi_pci_suspend,
.resume = mhi_pci_resume,
.freeze = mhi_pci_freeze,
.thaw = mhi_pci_restore,
.restore = mhi_pci_restore,
#endif
};
static struct pci_driver mhi_pci_driver = {

View File

@@ -19,16 +19,6 @@ config ACPI_CPPC_CPUFREQ
If in doubt, say N.
config ACPI_CPPC_CPUFREQ_FIE
bool "Frequency Invariance support for CPPC cpufreq driver"
depends on ACPI_CPPC_CPUFREQ && GENERIC_ARCH_TOPOLOGY
default y
help
This extends frequency invariance support in the CPPC cpufreq driver,
by using CPPC delivered and reference performance counters.
If in doubt, say N.
config ARM_ALLWINNER_SUN50I_CPUFREQ_NVMEM
tristate "Allwinner nvmem based SUN50I CPUFreq driver"
depends on ARCH_SUNXI

View File

@@ -10,18 +10,14 @@
#define pr_fmt(fmt) "CPPC Cpufreq:" fmt
#include <linux/arch_topology.h>
#include <linux/kernel.h>
#include <linux/module.h>
#include <linux/delay.h>
#include <linux/cpu.h>
#include <linux/cpufreq.h>
#include <linux/dmi.h>
#include <linux/irq_work.h>
#include <linux/kthread.h>
#include <linux/time.h>
#include <linux/vmalloc.h>
#include <uapi/linux/sched/types.h>
#include <asm/unaligned.h>
@@ -61,204 +57,6 @@ static struct cppc_workaround_oem_info wa_info[] = {
}
};
#ifdef CONFIG_ACPI_CPPC_CPUFREQ_FIE
/* Frequency invariance support */
struct cppc_freq_invariance {
int cpu;
struct irq_work irq_work;
struct kthread_work work;
struct cppc_perf_fb_ctrs prev_perf_fb_ctrs;
struct cppc_cpudata *cpu_data;
};
static DEFINE_PER_CPU(struct cppc_freq_invariance, cppc_freq_inv);
static struct kthread_worker *kworker_fie;
static bool fie_disabled;
static struct cpufreq_driver cppc_cpufreq_driver;
static unsigned int hisi_cppc_cpufreq_get_rate(unsigned int cpu);
static int cppc_perf_from_fbctrs(struct cppc_cpudata *cpu_data,
struct cppc_perf_fb_ctrs fb_ctrs_t0,
struct cppc_perf_fb_ctrs fb_ctrs_t1);
/**
* cppc_scale_freq_workfn - CPPC arch_freq_scale updater for frequency invariance
* @work: The work item.
*
* The CPPC driver register itself with the topology core to provide its own
* implementation (cppc_scale_freq_tick()) of topology_scale_freq_tick() which
* gets called by the scheduler on every tick.
*
* Note that the arch specific counters have higher priority than CPPC counters,
* if available, though the CPPC driver doesn't need to have any special
* handling for that.
*
* On an invocation of cppc_scale_freq_tick(), we schedule an irq work (since we
* reach here from hard-irq context), which then schedules a normal work item
* and cppc_scale_freq_workfn() updates the per_cpu arch_freq_scale variable
* based on the counter updates since the last tick.
*/
static void cppc_scale_freq_workfn(struct kthread_work *work)
{
struct cppc_freq_invariance *cppc_fi;
struct cppc_perf_fb_ctrs fb_ctrs = {0};
struct cppc_cpudata *cpu_data;
unsigned long local_freq_scale;
u64 perf;
cppc_fi = container_of(work, struct cppc_freq_invariance, work);
cpu_data = cppc_fi->cpu_data;
if (cppc_get_perf_ctrs(cppc_fi->cpu, &fb_ctrs)) {
pr_warn("%s: failed to read perf counters\n", __func__);
return;
}
cppc_fi->prev_perf_fb_ctrs = fb_ctrs;
perf = cppc_perf_from_fbctrs(cpu_data, cppc_fi->prev_perf_fb_ctrs,
fb_ctrs);
perf <<= SCHED_CAPACITY_SHIFT;
local_freq_scale = div64_u64(perf, cpu_data->perf_caps.highest_perf);
if (WARN_ON(local_freq_scale > 1024))
local_freq_scale = 1024;
per_cpu(arch_freq_scale, cppc_fi->cpu) = local_freq_scale;
}
static void cppc_irq_work(struct irq_work *irq_work)
{
struct cppc_freq_invariance *cppc_fi;
cppc_fi = container_of(irq_work, struct cppc_freq_invariance, irq_work);
kthread_queue_work(kworker_fie, &cppc_fi->work);
}
static void cppc_scale_freq_tick(void)
{
struct cppc_freq_invariance *cppc_fi = &per_cpu(cppc_freq_inv, smp_processor_id());
/*
* cppc_get_perf_ctrs() can potentially sleep, call that from the right
* context.
*/
irq_work_queue(&cppc_fi->irq_work);
}
static struct scale_freq_data cppc_sftd = {
.source = SCALE_FREQ_SOURCE_CPPC,
.set_freq_scale = cppc_scale_freq_tick,
};
static void cppc_freq_invariance_policy_init(struct cpufreq_policy *policy,
struct cppc_cpudata *cpu_data)
{
struct cppc_perf_fb_ctrs fb_ctrs = {0};
struct cppc_freq_invariance *cppc_fi;
int i, ret;
if (cppc_cpufreq_driver.get == hisi_cppc_cpufreq_get_rate)
return;
if (fie_disabled)
return;
for_each_cpu(i, policy->cpus) {
cppc_fi = &per_cpu(cppc_freq_inv, i);
cppc_fi->cpu = i;
cppc_fi->cpu_data = cpu_data;
kthread_init_work(&cppc_fi->work, cppc_scale_freq_workfn);
init_irq_work(&cppc_fi->irq_work, cppc_irq_work);
ret = cppc_get_perf_ctrs(i, &fb_ctrs);
if (ret) {
pr_warn("%s: failed to read perf counters: %d\n",
__func__, ret);
fie_disabled = true;
} else {
cppc_fi->prev_perf_fb_ctrs = fb_ctrs;
}
}
}
static void __init cppc_freq_invariance_init(void)
{
struct sched_attr attr = {
.size = sizeof(struct sched_attr),
.sched_policy = SCHED_DEADLINE,
.sched_nice = 0,
.sched_priority = 0,
/*
* Fake (unused) bandwidth; workaround to "fix"
* priority inheritance.
*/
.sched_runtime = 1000000,
.sched_deadline = 10000000,
.sched_period = 10000000,
};
int ret;
if (cppc_cpufreq_driver.get == hisi_cppc_cpufreq_get_rate)
return;
if (fie_disabled)
return;
kworker_fie = kthread_create_worker(0, "cppc_fie");
if (IS_ERR(kworker_fie))
return;
ret = sched_setattr_nocheck(kworker_fie->task, &attr);
if (ret) {
pr_warn("%s: failed to set SCHED_DEADLINE: %d\n", __func__,
ret);
kthread_destroy_worker(kworker_fie);
return;
}
/* Register for freq-invariance */
topology_set_scale_freq_source(&cppc_sftd, cpu_present_mask);
}
static void cppc_freq_invariance_exit(void)
{
struct cppc_freq_invariance *cppc_fi;
int i;
if (cppc_cpufreq_driver.get == hisi_cppc_cpufreq_get_rate)
return;
if (fie_disabled)
return;
topology_clear_scale_freq_source(SCALE_FREQ_SOURCE_CPPC, cpu_present_mask);
for_each_possible_cpu(i) {
cppc_fi = &per_cpu(cppc_freq_inv, i);
irq_work_sync(&cppc_fi->irq_work);
}
kthread_destroy_worker(kworker_fie);
kworker_fie = NULL;
}
#else
static inline void
cppc_freq_invariance_policy_init(struct cpufreq_policy *policy,
struct cppc_cpudata *cpu_data)
{
}
static inline void cppc_freq_invariance_init(void)
{
}
static inline void cppc_freq_invariance_exit(void)
{
}
#endif /* CONFIG_ACPI_CPPC_CPUFREQ_FIE */
/* Callback function used to retrieve the max frequency from DMI */
static void cppc_find_dmi_mhz(const struct dmi_header *dm, void *private)
{
@@ -547,12 +345,9 @@ static int cppc_cpufreq_cpu_init(struct cpufreq_policy *policy)
cpu_data->perf_ctrls.desired_perf = caps->highest_perf;
ret = cppc_set_perf(cpu, &cpu_data->perf_ctrls);
if (ret) {
if (ret)
pr_debug("Err setting perf value:%d on CPU:%d. ret:%d\n",
caps->highest_perf, cpu, ret);
} else {
cppc_freq_invariance_policy_init(policy, cpu_data);
}
return ret;
}
@@ -565,12 +360,12 @@ static inline u64 get_delta(u64 t1, u64 t0)
return (u32)t1 - (u32)t0;
}
static int cppc_perf_from_fbctrs(struct cppc_cpudata *cpu_data,
struct cppc_perf_fb_ctrs fb_ctrs_t0,
struct cppc_perf_fb_ctrs fb_ctrs_t1)
static int cppc_get_rate_from_fbctrs(struct cppc_cpudata *cpu_data,
struct cppc_perf_fb_ctrs fb_ctrs_t0,
struct cppc_perf_fb_ctrs fb_ctrs_t1)
{
u64 delta_reference, delta_delivered;
u64 reference_perf;
u64 reference_perf, delivered_perf;
reference_perf = fb_ctrs_t0.reference_perf;
@@ -579,21 +374,12 @@ static int cppc_perf_from_fbctrs(struct cppc_cpudata *cpu_data,
delta_delivered = get_delta(fb_ctrs_t1.delivered,
fb_ctrs_t0.delivered);
/* Check to avoid divide-by zero and invalid delivered_perf */
if (!delta_reference || !delta_delivered)
return cpu_data->perf_ctrls.desired_perf;
return (reference_perf * delta_delivered) / delta_reference;
}
static int cppc_get_rate_from_fbctrs(struct cppc_cpudata *cpu_data,
struct cppc_perf_fb_ctrs fb_ctrs_t0,
struct cppc_perf_fb_ctrs fb_ctrs_t1)
{
u64 delivered_perf;
delivered_perf = cppc_perf_from_fbctrs(cpu_data, fb_ctrs_t0,
fb_ctrs_t1);
/* Check to avoid divide-by zero */
if (delta_reference || delta_delivered)
delivered_perf = (reference_perf * delta_delivered) /
delta_reference;
else
delivered_perf = cpu_data->perf_ctrls.desired_perf;
return cppc_cpufreq_perf_to_khz(cpu_data, delivered_perf);
}
@@ -718,8 +504,6 @@ static void cppc_check_hisi_workaround(void)
static int __init cppc_cpufreq_init(void)
{
int ret;
if ((acpi_disabled) || !acpi_cpc_valid())
return -ENODEV;
@@ -727,11 +511,7 @@ static int __init cppc_cpufreq_init(void)
cppc_check_hisi_workaround();
ret = cpufreq_register_driver(&cppc_cpufreq_driver);
if (!ret)
cppc_freq_invariance_init();
return ret;
return cpufreq_register_driver(&cppc_cpufreq_driver);
}
static inline void free_cpu_data(void)
@@ -748,7 +528,6 @@ static inline void free_cpu_data(void)
static void __exit cppc_cpufreq_exit(void)
{
cppc_freq_invariance_exit();
cpufreq_unregister_driver(&cppc_cpufreq_driver);
free_cpu_data();

View File

@@ -59,6 +59,7 @@ config DMA_OF
#devices
config ALTERA_MSGDMA
tristate "Altera / Intel mSGDMA Engine"
depends on HAS_IOMEM
select DMA_ENGINE
help
Enable support for Altera / Intel mSGDMA controller.
@@ -701,6 +702,7 @@ config XILINX_ZYNQMP_DMA
config XILINX_ZYNQMP_DPDMA
tristate "Xilinx DPDMA Engine"
depends on HAS_IOMEM && OF
select DMA_ENGINE
select DMA_VIRTUAL_CHANNELS
help

View File

@@ -332,6 +332,7 @@ static int __cold dpaa2_qdma_setup(struct fsl_mc_device *ls_dev)
}
if (priv->dpdmai_attr.version.major > DPDMAI_VER_MAJOR) {
err = -EINVAL;
dev_err(dev, "DPDMAI major version mismatch\n"
"Found %u.%u, supported version is %u.%u\n",
priv->dpdmai_attr.version.major,
@@ -341,6 +342,7 @@ static int __cold dpaa2_qdma_setup(struct fsl_mc_device *ls_dev)
}
if (priv->dpdmai_attr.version.minor > DPDMAI_VER_MINOR) {
err = -EINVAL;
dev_err(dev, "DPDMAI minor version mismatch\n"
"Found %u.%u, supported version is %u.%u\n",
priv->dpdmai_attr.version.major,
@@ -475,6 +477,7 @@ static int __cold dpaa2_qdma_dpio_setup(struct dpaa2_qdma_priv *priv)
ppriv->store =
dpaa2_io_store_create(DPAA2_QDMA_STORE_SIZE, dev);
if (!ppriv->store) {
err = -ENOMEM;
dev_err(dev, "dpaa2_io_store_create() failed\n");
goto err_store;
}

View File

@@ -110,6 +110,7 @@ static int idxd_cdev_open(struct inode *inode, struct file *filp)
pasid = iommu_sva_get_pasid(sva);
if (pasid == IOMMU_PASID_INVALID) {
iommu_sva_unbind_device(sva);
rc = -EINVAL;
goto failed;
}

View File

@@ -168,6 +168,32 @@ static int idxd_setup_interrupts(struct idxd_device *idxd)
return rc;
}
static void idxd_cleanup_interrupts(struct idxd_device *idxd)
{
struct pci_dev *pdev = idxd->pdev;
struct idxd_irq_entry *irq_entry;
int i, msixcnt;
msixcnt = pci_msix_vec_count(pdev);
if (msixcnt <= 0)
return;
irq_entry = &idxd->irq_entries[0];
free_irq(irq_entry->vector, irq_entry);
for (i = 1; i < msixcnt; i++) {
irq_entry = &idxd->irq_entries[i];
if (idxd->hw.cmd_cap & BIT(IDXD_CMD_RELEASE_INT_HANDLE))
idxd_device_release_int_handle(idxd, idxd->int_handles[i],
IDXD_IRQ_MSIX);
free_irq(irq_entry->vector, irq_entry);
}
idxd_mask_error_interrupts(idxd);
pci_free_irq_vectors(pdev);
}
static int idxd_setup_wqs(struct idxd_device *idxd)
{
struct device *dev = &idxd->pdev->dev;
@@ -242,6 +268,7 @@ static int idxd_setup_engines(struct idxd_device *idxd)
engine->idxd = idxd;
device_initialize(&engine->conf_dev);
engine->conf_dev.parent = &idxd->conf_dev;
engine->conf_dev.bus = &dsa_bus_type;
engine->conf_dev.type = &idxd_engine_device_type;
rc = dev_set_name(&engine->conf_dev, "engine%d.%d", idxd->id, engine->id);
if (rc < 0) {
@@ -303,6 +330,19 @@ static int idxd_setup_groups(struct idxd_device *idxd)
return rc;
}
static void idxd_cleanup_internals(struct idxd_device *idxd)
{
int i;
for (i = 0; i < idxd->max_groups; i++)
put_device(&idxd->groups[i]->conf_dev);
for (i = 0; i < idxd->max_engines; i++)
put_device(&idxd->engines[i]->conf_dev);
for (i = 0; i < idxd->max_wqs; i++)
put_device(&idxd->wqs[i]->conf_dev);
destroy_workqueue(idxd->wq);
}
static int idxd_setup_internals(struct idxd_device *idxd)
{
struct device *dev = &idxd->pdev->dev;
@@ -531,12 +571,12 @@ static int idxd_probe(struct idxd_device *idxd)
dev_dbg(dev, "Loading RO device config\n");
rc = idxd_device_load_config(idxd);
if (rc < 0)
goto err;
goto err_config;
}
rc = idxd_setup_interrupts(idxd);
if (rc)
goto err;
goto err_config;
dev_dbg(dev, "IDXD interrupt setup complete.\n");
@@ -549,6 +589,8 @@ static int idxd_probe(struct idxd_device *idxd)
dev_dbg(dev, "IDXD device %d probed successfully\n", idxd->id);
return 0;
err_config:
idxd_cleanup_internals(idxd);
err:
if (device_pasid_enabled(idxd))
idxd_disable_system_pasid(idxd);
@@ -556,6 +598,18 @@ static int idxd_probe(struct idxd_device *idxd)
return rc;
}
static void idxd_cleanup(struct idxd_device *idxd)
{
struct device *dev = &idxd->pdev->dev;
perfmon_pmu_remove(idxd);
idxd_cleanup_interrupts(idxd);
idxd_cleanup_internals(idxd);
if (device_pasid_enabled(idxd))
idxd_disable_system_pasid(idxd);
iommu_dev_disable_feature(dev, IOMMU_DEV_FEAT_SVA);
}
static int idxd_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
{
struct device *dev = &pdev->dev;
@@ -608,7 +662,7 @@ static int idxd_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
rc = idxd_register_devices(idxd);
if (rc) {
dev_err(dev, "IDXD sysfs setup failed\n");
goto err;
goto err_dev_register;
}
idxd->state = IDXD_DEV_CONF_READY;
@@ -618,6 +672,8 @@ static int idxd_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
return 0;
err_dev_register:
idxd_cleanup(idxd);
err:
pci_iounmap(pdev, idxd->reg_base);
err_iomap:
@@ -787,6 +843,7 @@ module_init(idxd_init_module);
static void __exit idxd_exit_module(void)
{
idxd_unregister_driver();
pci_unregister_driver(&idxd_pci_driver);
idxd_cdev_remove();
idxd_unregister_bus_type();

View File

@@ -230,7 +230,7 @@ out:
}
/**
* ipu_irq_map() - map an IPU interrupt source to an IRQ number
* ipu_irq_unmap() - unmap an IPU interrupt source
* @source: interrupt source bit position (see ipu_irq_map())
* @return: 0 or negative error code
*/

View File

@@ -131,10 +131,7 @@ static unsigned int mtk_uart_apdma_read(struct mtk_chan *c, unsigned int reg)
static void mtk_uart_apdma_desc_free(struct virt_dma_desc *vd)
{
struct dma_chan *chan = vd->tx.chan;
struct mtk_chan *c = to_mtk_uart_apdma_chan(chan);
kfree(c->desc);
kfree(container_of(vd, struct mtk_uart_apdma_desc, vd));
}
static void mtk_uart_apdma_start_tx(struct mtk_chan *c)
@@ -207,14 +204,9 @@ static void mtk_uart_apdma_start_rx(struct mtk_chan *c)
static void mtk_uart_apdma_tx_handler(struct mtk_chan *c)
{
struct mtk_uart_apdma_desc *d = c->desc;
mtk_uart_apdma_write(c, VFF_INT_FLAG, VFF_TX_INT_CLR_B);
mtk_uart_apdma_write(c, VFF_INT_EN, VFF_INT_EN_CLR_B);
mtk_uart_apdma_write(c, VFF_EN, VFF_EN_CLR_B);
list_del(&d->vd.node);
vchan_cookie_complete(&d->vd);
}
static void mtk_uart_apdma_rx_handler(struct mtk_chan *c)
@@ -245,9 +237,17 @@ static void mtk_uart_apdma_rx_handler(struct mtk_chan *c)
c->rx_status = d->avail_len - cnt;
mtk_uart_apdma_write(c, VFF_RPT, wg);
}
list_del(&d->vd.node);
vchan_cookie_complete(&d->vd);
static void mtk_uart_apdma_chan_complete_handler(struct mtk_chan *c)
{
struct mtk_uart_apdma_desc *d = c->desc;
if (d) {
list_del(&d->vd.node);
vchan_cookie_complete(&d->vd);
c->desc = NULL;
}
}
static irqreturn_t mtk_uart_apdma_irq_handler(int irq, void *dev_id)
@@ -261,6 +261,7 @@ static irqreturn_t mtk_uart_apdma_irq_handler(int irq, void *dev_id)
mtk_uart_apdma_rx_handler(c);
else if (c->dir == DMA_MEM_TO_DEV)
mtk_uart_apdma_tx_handler(c);
mtk_uart_apdma_chan_complete_handler(c);
spin_unlock_irqrestore(&c->vc.lock, flags);
return IRQ_HANDLED;
@@ -348,7 +349,7 @@ static struct dma_async_tx_descriptor *mtk_uart_apdma_prep_slave_sg
return NULL;
/* Now allocate and setup the descriptor */
d = kzalloc(sizeof(*d), GFP_ATOMIC);
d = kzalloc(sizeof(*d), GFP_NOWAIT);
if (!d)
return NULL;
@@ -366,7 +367,7 @@ static void mtk_uart_apdma_issue_pending(struct dma_chan *chan)
unsigned long flags;
spin_lock_irqsave(&c->vc.lock, flags);
if (vchan_issue_pending(&c->vc)) {
if (vchan_issue_pending(&c->vc) && !c->desc) {
vd = vchan_next_desc(&c->vc);
c->desc = to_mtk_uart_apdma_desc(&vd->tx);

View File

@@ -2694,13 +2694,15 @@ static struct dma_async_tx_descriptor *pl330_prep_dma_cyclic(
for (i = 0; i < len / period_len; i++) {
desc = pl330_get_desc(pch);
if (!desc) {
unsigned long iflags;
dev_err(pch->dmac->ddma.dev, "%s:%d Unable to fetch desc\n",
__func__, __LINE__);
if (!first)
return NULL;
spin_lock_irqsave(&pl330->pool_lock, flags);
spin_lock_irqsave(&pl330->pool_lock, iflags);
while (!list_empty(&first->node)) {
desc = list_entry(first->node.next,
@@ -2710,7 +2712,7 @@ static struct dma_async_tx_descriptor *pl330_prep_dma_cyclic(
list_move_tail(&first->node, &pl330->desc_pool);
spin_unlock_irqrestore(&pl330->pool_lock, flags);
spin_unlock_irqrestore(&pl330->pool_lock, iflags);
return NULL;
}

View File

@@ -33,6 +33,7 @@ config QCOM_GPI_DMA
config QCOM_HIDMA_MGMT
tristate "Qualcomm Technologies HIDMA Management support"
depends on HAS_IOMEM
select DMA_ENGINE
help
Enable support for the Qualcomm Technologies HIDMA Management.

View File

@@ -1,5 +1,6 @@
config SF_PDMA
tristate "Sifive PDMA controller driver"
depends on HAS_IOMEM
select DMA_ENGINE
select DMA_VIRTUAL_CHANNELS
help

View File

@@ -1913,7 +1913,7 @@ static int rcar_dmac_probe(struct platform_device *pdev)
/* Enable runtime PM and initialize the device. */
pm_runtime_enable(&pdev->dev);
ret = pm_runtime_get_sync(&pdev->dev);
ret = pm_runtime_resume_and_get(&pdev->dev);
if (ret < 0) {
dev_err(&pdev->dev, "runtime PM get sync failed (%d)\n", ret);
return ret;

View File

@@ -3675,6 +3675,9 @@ static int __init d40_probe(struct platform_device *pdev)
kfree(base->lcla_pool.base_unaligned);
if (base->lcpa_base)
iounmap(base->lcpa_base);
if (base->phy_lcpa)
release_mem_region(base->phy_lcpa,
base->lcpa_size);

View File

@@ -1452,7 +1452,7 @@ static int stm32_mdma_alloc_chan_resources(struct dma_chan *c)
return -ENOMEM;
}
ret = pm_runtime_get_sync(dmadev->ddev.dev);
ret = pm_runtime_resume_and_get(dmadev->ddev.dev);
if (ret < 0)
return ret;
@@ -1718,7 +1718,7 @@ static int stm32_mdma_pm_suspend(struct device *dev)
u32 ccr, id;
int ret;
ret = pm_runtime_get_sync(dev);
ret = pm_runtime_resume_and_get(dev);
if (ret < 0)
return ret;

View File

@@ -113,6 +113,7 @@
#define XILINX_DPDMA_CH_VDO 0x020
#define XILINX_DPDMA_CH_PYLD_SZ 0x024
#define XILINX_DPDMA_CH_DESC_ID 0x028
#define XILINX_DPDMA_CH_DESC_ID_MASK GENMASK(15, 0)
/* DPDMA descriptor fields */
#define XILINX_DPDMA_DESC_CONTROL_PREEMBLE 0xa5
@@ -866,7 +867,8 @@ static void xilinx_dpdma_chan_queue_transfer(struct xilinx_dpdma_chan *chan)
* will be used, but it should be enough.
*/
list_for_each_entry(sw_desc, &desc->descriptors, node)
sw_desc->hw.desc_id = desc->vdesc.tx.cookie;
sw_desc->hw.desc_id = desc->vdesc.tx.cookie
& XILINX_DPDMA_CH_DESC_ID_MASK;
sw_desc = list_first_entry(&desc->descriptors,
struct xilinx_dpdma_sw_desc, node);
@@ -1086,7 +1088,8 @@ static void xilinx_dpdma_chan_vsync_irq(struct xilinx_dpdma_chan *chan)
if (!chan->running || !pending)
goto out;
desc_id = dpdma_read(chan->reg, XILINX_DPDMA_CH_DESC_ID);
desc_id = dpdma_read(chan->reg, XILINX_DPDMA_CH_DESC_ID)
& XILINX_DPDMA_CH_DESC_ID_MASK;
/* If the retrigger raced with vsync, retry at the next frame. */
sw_desc = list_first_entry(&pending->descriptors,
@@ -1459,7 +1462,7 @@ static void xilinx_dpdma_enable_irq(struct xilinx_dpdma_device *xdev)
*/
static void xilinx_dpdma_disable_irq(struct xilinx_dpdma_device *xdev)
{
dpdma_write(xdev->reg, XILINX_DPDMA_IDS, XILINX_DPDMA_INTR_ERR_ALL);
dpdma_write(xdev->reg, XILINX_DPDMA_IDS, XILINX_DPDMA_INTR_ALL);
dpdma_write(xdev->reg, XILINX_DPDMA_EIDS, XILINX_DPDMA_EINTR_ALL);
}
@@ -1596,6 +1599,26 @@ static struct dma_chan *of_dma_xilinx_xlate(struct of_phandle_args *dma_spec,
return dma_get_slave_channel(&xdev->chan[chan_id]->vchan.chan);
}
static void dpdma_hw_init(struct xilinx_dpdma_device *xdev)
{
unsigned int i;
void __iomem *reg;
/* Disable all interrupts */
xilinx_dpdma_disable_irq(xdev);
/* Stop all channels */
for (i = 0; i < ARRAY_SIZE(xdev->chan); i++) {
reg = xdev->reg + XILINX_DPDMA_CH_BASE
+ XILINX_DPDMA_CH_OFFSET * i;
dpdma_clr(reg, XILINX_DPDMA_CH_CNTL, XILINX_DPDMA_CH_CNTL_ENABLE);
}
/* Clear the interrupt status registers */
dpdma_write(xdev->reg, XILINX_DPDMA_ISR, XILINX_DPDMA_INTR_ALL);
dpdma_write(xdev->reg, XILINX_DPDMA_EISR, XILINX_DPDMA_EINTR_ALL);
}
static int xilinx_dpdma_probe(struct platform_device *pdev)
{
struct xilinx_dpdma_device *xdev;
@@ -1622,6 +1645,8 @@ static int xilinx_dpdma_probe(struct platform_device *pdev)
if (IS_ERR(xdev->reg))
return PTR_ERR(xdev->reg);
dpdma_hw_init(xdev);
xdev->irq = platform_get_irq(pdev, 0);
if (xdev->irq < 0) {
dev_err(xdev->dev, "failed to get platform irq\n");

View File

@@ -468,7 +468,7 @@ static int zynqmp_dma_alloc_chan_resources(struct dma_chan *dchan)
struct zynqmp_dma_desc_sw *desc;
int i, ret;
ret = pm_runtime_get_sync(chan->dev);
ret = pm_runtime_resume_and_get(chan->dev);
if (ret < 0)
return ret;

View File

@@ -1383,6 +1383,7 @@ config GPIO_TPS68470
config GPIO_TQMX86
tristate "TQ-Systems QTMX86 GPIO"
depends on MFD_TQMX86 || COMPILE_TEST
depends on HAS_IOPORT_MAP
select GPIOLIB_IRQCHIP
help
This driver supports GPIO on the TQMX86 IO controller.
@@ -1450,6 +1451,7 @@ menu "PCI GPIO expanders"
config GPIO_AMD8111
tristate "AMD 8111 GPIO driver"
depends on X86 || COMPILE_TEST
depends on HAS_IOPORT_MAP
help
The AMD 8111 south bridge contains 32 GPIO pins which can be used.

View File

@@ -334,7 +334,7 @@ static int mxc_gpio_init_gc(struct mxc_gpio_port *port, int irq_base)
ct->chip.irq_unmask = irq_gc_mask_set_bit;
ct->chip.irq_set_type = gpio_set_irq_type;
ct->chip.irq_set_wake = gpio_set_wake_irq;
ct->chip.flags = IRQCHIP_MASK_ON_SUSPEND;
ct->chip.flags = IRQCHIP_MASK_ON_SUSPEND | IRQCHIP_ENABLE_WAKEUP_ON_SUSPEND;
ct->regs.ack = GPIO_ISR;
ct->regs.mask = GPIO_IMR;

View File

@@ -7,7 +7,7 @@
#include <linux/slab.h>
#include <linux/of_device.h>
#define WCD_PIN_MASK(p) BIT(p - 1)
#define WCD_PIN_MASK(p) BIT(p)
#define WCD_REG_DIR_CTL_OFFSET 0x42
#define WCD_REG_VAL_CTL_OFFSET 0x43
#define WCD934X_NPINS 5

View File

@@ -1880,6 +1880,7 @@ static void gpio_v2_line_info_changed_to_v1(
struct gpio_v2_line_info_changed *lic_v2,
struct gpioline_info_changed *lic_v1)
{
memset(lic_v1, 0, sizeof(*lic_v1));
gpio_v2_line_info_to_v1(&lic_v2->info, &lic_v1->info);
lic_v1->timestamp = lic_v2->timestamp_ns;
lic_v1->event_type = lic_v2->event_type;

View File

@@ -1047,17 +1047,18 @@ int amdgpu_display_gem_fb_init(struct drm_device *dev,
rfb->base.obj[0] = obj;
drm_helper_mode_fill_fb_struct(dev, &rfb->base, mode_cmd);
ret = drm_framebuffer_init(dev, &rfb->base, &amdgpu_fb_funcs);
if (ret)
goto err;
ret = amdgpu_display_framebuffer_init(dev, rfb, mode_cmd, obj);
if (ret)
goto err;
ret = drm_framebuffer_init(dev, &rfb->base, &amdgpu_fb_funcs);
if (ret)
goto err;
return 0;
err:
drm_err(dev, "Failed to init gem fb: %d\n", ret);
drm_dbg_kms(dev, "Failed to init gem fb: %d\n", ret);
rfb->base.obj[0] = NULL;
return ret;
}
@@ -1071,9 +1072,6 @@ int amdgpu_display_gem_fb_verify_and_init(
rfb->base.obj[0] = obj;
drm_helper_mode_fill_fb_struct(dev, &rfb->base, mode_cmd);
ret = drm_framebuffer_init(dev, &rfb->base, &amdgpu_fb_funcs);
if (ret)
goto err;
/* Verify that the modifier is supported. */
if (!drm_any_plane_has_format(dev, mode_cmd->pixel_format,
mode_cmd->modifier[0])) {
@@ -1092,9 +1090,13 @@ int amdgpu_display_gem_fb_verify_and_init(
if (ret)
goto err;
ret = drm_framebuffer_init(dev, &rfb->base, &amdgpu_fb_funcs);
if (ret)
goto err;
return 0;
err:
drm_err(dev, "Failed to verify and init gem fb: %d\n", ret);
drm_dbg_kms(dev, "Failed to verify and init gem fb: %d\n", ret);
rfb->base.obj[0] = NULL;
return ret;
}

View File

@@ -214,9 +214,21 @@ static int amdgpu_dma_buf_pin(struct dma_buf_attachment *attach)
{
struct drm_gem_object *obj = attach->dmabuf->priv;
struct amdgpu_bo *bo = gem_to_amdgpu_bo(obj);
int r;
/* pin buffer into GTT */
return amdgpu_bo_pin(bo, AMDGPU_GEM_DOMAIN_GTT);
r = amdgpu_bo_pin(bo, AMDGPU_GEM_DOMAIN_GTT);
if (r)
return r;
if (bo->tbo.moving) {
r = dma_fence_wait(bo->tbo.moving, true);
if (r) {
amdgpu_bo_unpin(bo);
return r;
}
}
return 0;
}
/**

View File

@@ -100,7 +100,7 @@ static void amdgpu_bo_destroy(struct ttm_buffer_object *tbo)
kfree(ubo->metadata);
}
kfree(bo);
kvfree(bo);
}
/**
@@ -552,7 +552,7 @@ static int amdgpu_bo_do_create(struct amdgpu_device *adev,
BUG_ON(bp->bo_ptr_size < sizeof(struct amdgpu_bo));
*bo_ptr = NULL;
bo = kzalloc(bp->bo_ptr_size, GFP_KERNEL);
bo = kvzalloc(bp->bo_ptr_size, GFP_KERNEL);
if (bo == NULL)
return -ENOMEM;
drm_gem_private_object_init(adev_to_drm(adev), &bo->tbo.base, size);

View File

@@ -173,6 +173,9 @@
#define mmGC_THROTTLE_CTRL_Sienna_Cichlid 0x2030
#define mmGC_THROTTLE_CTRL_Sienna_Cichlid_BASE_IDX 0
#define mmRLC_SPARE_INT_0_Sienna_Cichlid 0x4ca5
#define mmRLC_SPARE_INT_0_Sienna_Cichlid_BASE_IDX 1
#define GFX_RLCG_GC_WRITE_OLD (0x8 << 28)
#define GFX_RLCG_GC_WRITE (0x0 << 28)
#define GFX_RLCG_GC_READ (0x1 << 28)
@@ -1480,8 +1483,15 @@ static u32 gfx_v10_rlcg_rw(struct amdgpu_device *adev, u32 offset, u32 v, uint32
(adev->reg_offset[GC_HWIP][0][mmSCRATCH_REG0_BASE_IDX] + mmSCRATCH_REG2) * 4;
scratch_reg3 = adev->rmmio +
(adev->reg_offset[GC_HWIP][0][mmSCRATCH_REG1_BASE_IDX] + mmSCRATCH_REG3) * 4;
spare_int = adev->rmmio +
(adev->reg_offset[GC_HWIP][0][mmRLC_SPARE_INT_BASE_IDX] + mmRLC_SPARE_INT) * 4;
if (adev->asic_type >= CHIP_SIENNA_CICHLID) {
spare_int = adev->rmmio +
(adev->reg_offset[GC_HWIP][0][mmRLC_SPARE_INT_0_Sienna_Cichlid_BASE_IDX]
+ mmRLC_SPARE_INT_0_Sienna_Cichlid) * 4;
} else {
spare_int = adev->rmmio +
(adev->reg_offset[GC_HWIP][0][mmRLC_SPARE_INT_BASE_IDX] + mmRLC_SPARE_INT) * 4;
}
grbm_cntl = adev->reg_offset[GC_HWIP][0][mmGRBM_GFX_CNTL_BASE_IDX] + mmGRBM_GFX_CNTL;
grbm_idx = adev->reg_offset[GC_HWIP][0][mmGRBM_GFX_INDEX_BASE_IDX] + mmGRBM_GFX_INDEX;
@@ -7349,9 +7359,15 @@ static int gfx_v10_0_hw_fini(void *handle)
if (amdgpu_sriov_vf(adev)) {
gfx_v10_0_cp_gfx_enable(adev, false);
/* Program KIQ position of RLC_CP_SCHEDULERS during destroy */
tmp = RREG32_SOC15(GC, 0, mmRLC_CP_SCHEDULERS);
tmp &= 0xffffff00;
WREG32_SOC15(GC, 0, mmRLC_CP_SCHEDULERS, tmp);
if (adev->asic_type >= CHIP_SIENNA_CICHLID) {
tmp = RREG32_SOC15(GC, 0, mmRLC_CP_SCHEDULERS_Sienna_Cichlid);
tmp &= 0xffffff00;
WREG32_SOC15(GC, 0, mmRLC_CP_SCHEDULERS_Sienna_Cichlid, tmp);
} else {
tmp = RREG32_SOC15(GC, 0, mmRLC_CP_SCHEDULERS);
tmp &= 0xffffff00;
WREG32_SOC15(GC, 0, mmRLC_CP_SCHEDULERS, tmp);
}
return 0;
}

View File

@@ -810,6 +810,7 @@ static int smu10_dpm_force_dpm_level(struct pp_hwmgr *hwmgr,
break;
case AMD_DPM_FORCED_LEVEL_MANUAL:
data->fine_grain_enabled = 1;
break;
case AMD_DPM_FORCED_LEVEL_PROFILE_EXIT:
default:
break;

View File

@@ -232,7 +232,6 @@ static void atmel_hlcdc_crtc_atomic_enable(struct drm_crtc *c,
pm_runtime_put_sync(dev->dev);
drm_crtc_vblank_on(c);
}
#define ATMEL_HLCDC_RGB444_OUTPUT BIT(0)
@@ -343,8 +342,17 @@ static int atmel_hlcdc_crtc_atomic_check(struct drm_crtc *c,
static void atmel_hlcdc_crtc_atomic_begin(struct drm_crtc *c,
struct drm_atomic_state *state)
{
drm_crtc_vblank_on(c);
}
static void atmel_hlcdc_crtc_atomic_flush(struct drm_crtc *c,
struct drm_atomic_state *state)
{
struct atmel_hlcdc_crtc *crtc = drm_crtc_to_atmel_hlcdc_crtc(c);
unsigned long flags;
spin_lock_irqsave(&c->dev->event_lock, flags);
if (c->state->event) {
c->state->event->pipe = drm_crtc_index(c);
@@ -354,12 +362,7 @@ static void atmel_hlcdc_crtc_atomic_begin(struct drm_crtc *c,
crtc->event = c->state->event;
c->state->event = NULL;
}
}
static void atmel_hlcdc_crtc_atomic_flush(struct drm_crtc *crtc,
struct drm_atomic_state *state)
{
/* TODO: write common plane control register if available */
spin_unlock_irqrestore(&c->dev->event_lock, flags);
}
static const struct drm_crtc_helper_funcs lcdc_crtc_helper_funcs = {

View File

@@ -593,6 +593,7 @@ static int atmel_hlcdc_dc_modeset_init(struct drm_device *dev)
dev->mode_config.max_width = dc->desc->max_width;
dev->mode_config.max_height = dc->desc->max_height;
dev->mode_config.funcs = &mode_config_funcs;
dev->mode_config.async_page_flip = true;
return 0;
}

View File

@@ -314,9 +314,10 @@ int drm_master_open(struct drm_file *file_priv)
void drm_master_release(struct drm_file *file_priv)
{
struct drm_device *dev = file_priv->minor->dev;
struct drm_master *master = file_priv->master;
struct drm_master *master;
mutex_lock(&dev->master_mutex);
master = file_priv->master;
if (file_priv->magic)
idr_remove(&file_priv->master->magic_map, file_priv->magic);

View File

@@ -118,17 +118,18 @@ int drm_getunique(struct drm_device *dev, void *data,
struct drm_file *file_priv)
{
struct drm_unique *u = data;
struct drm_master *master = file_priv->master;
struct drm_master *master;
mutex_lock(&master->dev->master_mutex);
mutex_lock(&dev->master_mutex);
master = file_priv->master;
if (u->unique_len >= master->unique_len) {
if (copy_to_user(u->unique, master->unique, master->unique_len)) {
mutex_unlock(&master->dev->master_mutex);
mutex_unlock(&dev->master_mutex);
return -EFAULT;
}
}
u->unique_len = master->unique_len;
mutex_unlock(&master->dev->master_mutex);
mutex_unlock(&dev->master_mutex);
return 0;
}

View File

@@ -137,6 +137,7 @@ static int kmb_hw_init(struct drm_device *drm, unsigned long flags)
/* Allocate LCD interrupt resources */
irq_lcd = platform_get_irq(pdev, 0);
if (irq_lcd < 0) {
ret = irq_lcd;
drm_err(&kmb->drm, "irq_lcd not found");
goto setup_fail;
}

View File

@@ -577,7 +577,7 @@ static void mcde_dsi_setup_video_mode(struct mcde_dsi *d,
* porches and sync.
*/
/* (ps/s) / (pixels/s) = ps/pixels */
pclk = DIV_ROUND_UP_ULL(1000000000000, mode->clock);
pclk = DIV_ROUND_UP_ULL(1000000000000, (mode->clock * 1000));
dev_dbg(d->dev, "picoseconds between two pixels: %llu\n",
pclk);

View File

@@ -157,7 +157,7 @@ static void a6xx_submit(struct msm_gpu *gpu, struct msm_gem_submit *submit)
* GPU registers so we need to add 0x1a800 to the register value on A630
* to get the right value from PM4.
*/
get_stats_counter(ring, REG_A6XX_GMU_ALWAYS_ON_COUNTER_L + 0x1a800,
get_stats_counter(ring, REG_A6XX_CP_ALWAYS_ON_COUNTER_LO,
rbmemptr_stats(ring, index, alwayson_start));
/* Invalidate CCU depth and color */
@@ -187,7 +187,7 @@ static void a6xx_submit(struct msm_gpu *gpu, struct msm_gem_submit *submit)
get_stats_counter(ring, REG_A6XX_RBBM_PERFCTR_CP_0_LO,
rbmemptr_stats(ring, index, cpcycles_end));
get_stats_counter(ring, REG_A6XX_GMU_ALWAYS_ON_COUNTER_L + 0x1a800,
get_stats_counter(ring, REG_A6XX_CP_ALWAYS_ON_COUNTER_LO,
rbmemptr_stats(ring, index, alwayson_end));
/* Write the fence to the scratch register */
@@ -206,8 +206,8 @@ static void a6xx_submit(struct msm_gpu *gpu, struct msm_gem_submit *submit)
OUT_RING(ring, submit->seqno);
trace_msm_gpu_submit_flush(submit,
gmu_read64(&a6xx_gpu->gmu, REG_A6XX_GMU_ALWAYS_ON_COUNTER_L,
REG_A6XX_GMU_ALWAYS_ON_COUNTER_H));
gpu_read64(gpu, REG_A6XX_CP_ALWAYS_ON_COUNTER_LO,
REG_A6XX_CP_ALWAYS_ON_COUNTER_HI));
a6xx_flush(gpu, ring);
}
@@ -462,6 +462,113 @@ static void a6xx_set_hwcg(struct msm_gpu *gpu, bool state)
gpu_write(gpu, REG_A6XX_RBBM_CLOCK_CNTL, state ? clock_cntl_on : 0);
}
/* For a615, a616, a618, A619, a630, a640 and a680 */
static const u32 a6xx_protect[] = {
A6XX_PROTECT_RDONLY(0x00000, 0x04ff),
A6XX_PROTECT_RDONLY(0x00501, 0x0005),
A6XX_PROTECT_RDONLY(0x0050b, 0x02f4),
A6XX_PROTECT_NORDWR(0x0050e, 0x0000),
A6XX_PROTECT_NORDWR(0x00510, 0x0000),
A6XX_PROTECT_NORDWR(0x00534, 0x0000),
A6XX_PROTECT_NORDWR(0x00800, 0x0082),
A6XX_PROTECT_NORDWR(0x008a0, 0x0008),
A6XX_PROTECT_NORDWR(0x008ab, 0x0024),
A6XX_PROTECT_RDONLY(0x008de, 0x00ae),
A6XX_PROTECT_NORDWR(0x00900, 0x004d),
A6XX_PROTECT_NORDWR(0x0098d, 0x0272),
A6XX_PROTECT_NORDWR(0x00e00, 0x0001),
A6XX_PROTECT_NORDWR(0x00e03, 0x000c),
A6XX_PROTECT_NORDWR(0x03c00, 0x00c3),
A6XX_PROTECT_RDONLY(0x03cc4, 0x1fff),
A6XX_PROTECT_NORDWR(0x08630, 0x01cf),
A6XX_PROTECT_NORDWR(0x08e00, 0x0000),
A6XX_PROTECT_NORDWR(0x08e08, 0x0000),
A6XX_PROTECT_NORDWR(0x08e50, 0x001f),
A6XX_PROTECT_NORDWR(0x09624, 0x01db),
A6XX_PROTECT_NORDWR(0x09e70, 0x0001),
A6XX_PROTECT_NORDWR(0x09e78, 0x0187),
A6XX_PROTECT_NORDWR(0x0a630, 0x01cf),
A6XX_PROTECT_NORDWR(0x0ae02, 0x0000),
A6XX_PROTECT_NORDWR(0x0ae50, 0x032f),
A6XX_PROTECT_NORDWR(0x0b604, 0x0000),
A6XX_PROTECT_NORDWR(0x0be02, 0x0001),
A6XX_PROTECT_NORDWR(0x0be20, 0x17df),
A6XX_PROTECT_NORDWR(0x0f000, 0x0bff),
A6XX_PROTECT_RDONLY(0x0fc00, 0x1fff),
A6XX_PROTECT_NORDWR(0x11c00, 0x0000), /* note: infinite range */
};
/* These are for a620 and a650 */
static const u32 a650_protect[] = {
A6XX_PROTECT_RDONLY(0x00000, 0x04ff),
A6XX_PROTECT_RDONLY(0x00501, 0x0005),
A6XX_PROTECT_RDONLY(0x0050b, 0x02f4),
A6XX_PROTECT_NORDWR(0x0050e, 0x0000),
A6XX_PROTECT_NORDWR(0x00510, 0x0000),
A6XX_PROTECT_NORDWR(0x00534, 0x0000),
A6XX_PROTECT_NORDWR(0x00800, 0x0082),
A6XX_PROTECT_NORDWR(0x008a0, 0x0008),
A6XX_PROTECT_NORDWR(0x008ab, 0x0024),
A6XX_PROTECT_RDONLY(0x008de, 0x00ae),
A6XX_PROTECT_NORDWR(0x00900, 0x004d),
A6XX_PROTECT_NORDWR(0x0098d, 0x0272),
A6XX_PROTECT_NORDWR(0x00e00, 0x0001),
A6XX_PROTECT_NORDWR(0x00e03, 0x000c),
A6XX_PROTECT_NORDWR(0x03c00, 0x00c3),
A6XX_PROTECT_RDONLY(0x03cc4, 0x1fff),
A6XX_PROTECT_NORDWR(0x08630, 0x01cf),
A6XX_PROTECT_NORDWR(0x08e00, 0x0000),
A6XX_PROTECT_NORDWR(0x08e08, 0x0000),
A6XX_PROTECT_NORDWR(0x08e50, 0x001f),
A6XX_PROTECT_NORDWR(0x08e80, 0x027f),
A6XX_PROTECT_NORDWR(0x09624, 0x01db),
A6XX_PROTECT_NORDWR(0x09e60, 0x0011),
A6XX_PROTECT_NORDWR(0x09e78, 0x0187),
A6XX_PROTECT_NORDWR(0x0a630, 0x01cf),
A6XX_PROTECT_NORDWR(0x0ae02, 0x0000),
A6XX_PROTECT_NORDWR(0x0ae50, 0x032f),
A6XX_PROTECT_NORDWR(0x0b604, 0x0000),
A6XX_PROTECT_NORDWR(0x0b608, 0x0007),
A6XX_PROTECT_NORDWR(0x0be02, 0x0001),
A6XX_PROTECT_NORDWR(0x0be20, 0x17df),
A6XX_PROTECT_NORDWR(0x0f000, 0x0bff),
A6XX_PROTECT_RDONLY(0x0fc00, 0x1fff),
A6XX_PROTECT_NORDWR(0x18400, 0x1fff),
A6XX_PROTECT_NORDWR(0x1a800, 0x1fff),
A6XX_PROTECT_NORDWR(0x1f400, 0x0443),
A6XX_PROTECT_RDONLY(0x1f844, 0x007b),
A6XX_PROTECT_NORDWR(0x1f887, 0x001b),
A6XX_PROTECT_NORDWR(0x1f8c0, 0x0000), /* note: infinite range */
};
static void a6xx_set_cp_protect(struct msm_gpu *gpu)
{
struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu);
const u32 *regs = a6xx_protect;
unsigned i, count = ARRAY_SIZE(a6xx_protect), count_max = 32;
BUILD_BUG_ON(ARRAY_SIZE(a6xx_protect) > 32);
BUILD_BUG_ON(ARRAY_SIZE(a650_protect) > 48);
if (adreno_is_a650(adreno_gpu)) {
regs = a650_protect;
count = ARRAY_SIZE(a650_protect);
count_max = 48;
}
/*
* Enable access protection to privileged registers, fault on an access
* protect violation and select the last span to protect from the start
* address all the way to the end of the register address space
*/
gpu_write(gpu, REG_A6XX_CP_PROTECT_CNTL, BIT(0) | BIT(1) | BIT(3));
for (i = 0; i < count - 1; i++)
gpu_write(gpu, REG_A6XX_CP_PROTECT(i), regs[i]);
/* last CP_PROTECT to have "infinite" length on the last entry */
gpu_write(gpu, REG_A6XX_CP_PROTECT(count_max - 1), regs[i]);
}
static void a6xx_set_ubwc_config(struct msm_gpu *gpu)
{
struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu);
@@ -489,7 +596,7 @@ static void a6xx_set_ubwc_config(struct msm_gpu *gpu)
rgb565_predicator << 11 | amsbc << 4 | lower_bit << 1);
gpu_write(gpu, REG_A6XX_TPL1_NC_MODE_CNTL, lower_bit << 1);
gpu_write(gpu, REG_A6XX_SP_NC_MODE_CNTL,
uavflagprd_inv >> 4 | lower_bit << 1);
uavflagprd_inv << 4 | lower_bit << 1);
gpu_write(gpu, REG_A6XX_UCHE_MODE_CNTL, lower_bit << 21);
}
@@ -776,41 +883,7 @@ static int a6xx_hw_init(struct msm_gpu *gpu)
}
/* Protect registers from the CP */
gpu_write(gpu, REG_A6XX_CP_PROTECT_CNTL, 0x00000003);
gpu_write(gpu, REG_A6XX_CP_PROTECT(0),
A6XX_PROTECT_RDONLY(0x600, 0x51));
gpu_write(gpu, REG_A6XX_CP_PROTECT(1), A6XX_PROTECT_RW(0xae50, 0x2));
gpu_write(gpu, REG_A6XX_CP_PROTECT(2), A6XX_PROTECT_RW(0x9624, 0x13));
gpu_write(gpu, REG_A6XX_CP_PROTECT(3), A6XX_PROTECT_RW(0x8630, 0x8));
gpu_write(gpu, REG_A6XX_CP_PROTECT(4), A6XX_PROTECT_RW(0x9e70, 0x1));
gpu_write(gpu, REG_A6XX_CP_PROTECT(5), A6XX_PROTECT_RW(0x9e78, 0x187));
gpu_write(gpu, REG_A6XX_CP_PROTECT(6), A6XX_PROTECT_RW(0xf000, 0x810));
gpu_write(gpu, REG_A6XX_CP_PROTECT(7),
A6XX_PROTECT_RDONLY(0xfc00, 0x3));
gpu_write(gpu, REG_A6XX_CP_PROTECT(8), A6XX_PROTECT_RW(0x50e, 0x0));
gpu_write(gpu, REG_A6XX_CP_PROTECT(9), A6XX_PROTECT_RDONLY(0x50f, 0x0));
gpu_write(gpu, REG_A6XX_CP_PROTECT(10), A6XX_PROTECT_RW(0x510, 0x0));
gpu_write(gpu, REG_A6XX_CP_PROTECT(11),
A6XX_PROTECT_RDONLY(0x0, 0x4f9));
gpu_write(gpu, REG_A6XX_CP_PROTECT(12),
A6XX_PROTECT_RDONLY(0x501, 0xa));
gpu_write(gpu, REG_A6XX_CP_PROTECT(13),
A6XX_PROTECT_RDONLY(0x511, 0x44));
gpu_write(gpu, REG_A6XX_CP_PROTECT(14), A6XX_PROTECT_RW(0xe00, 0xe));
gpu_write(gpu, REG_A6XX_CP_PROTECT(15), A6XX_PROTECT_RW(0x8e00, 0x0));
gpu_write(gpu, REG_A6XX_CP_PROTECT(16), A6XX_PROTECT_RW(0x8e50, 0xf));
gpu_write(gpu, REG_A6XX_CP_PROTECT(17), A6XX_PROTECT_RW(0xbe02, 0x0));
gpu_write(gpu, REG_A6XX_CP_PROTECT(18),
A6XX_PROTECT_RW(0xbe20, 0x11f3));
gpu_write(gpu, REG_A6XX_CP_PROTECT(19), A6XX_PROTECT_RW(0x800, 0x82));
gpu_write(gpu, REG_A6XX_CP_PROTECT(20), A6XX_PROTECT_RW(0x8a0, 0x8));
gpu_write(gpu, REG_A6XX_CP_PROTECT(21), A6XX_PROTECT_RW(0x8ab, 0x19));
gpu_write(gpu, REG_A6XX_CP_PROTECT(22), A6XX_PROTECT_RW(0x900, 0x4d));
gpu_write(gpu, REG_A6XX_CP_PROTECT(23), A6XX_PROTECT_RW(0x98d, 0x76));
gpu_write(gpu, REG_A6XX_CP_PROTECT(24),
A6XX_PROTECT_RDONLY(0x980, 0x4));
gpu_write(gpu, REG_A6XX_CP_PROTECT(25), A6XX_PROTECT_RW(0xa630, 0x0));
a6xx_set_cp_protect(gpu);
/* Enable expanded apriv for targets that support it */
if (gpu->hw_apriv) {
@@ -1211,7 +1284,7 @@ static int a6xx_pm_suspend(struct msm_gpu *gpu)
if (ret)
return ret;
if (adreno_gpu->base.hw_apriv || a6xx_gpu->has_whereami)
if (a6xx_gpu->shadow_bo)
for (i = 0; i < gpu->nr_rings; i++)
a6xx_gpu->shadow[i] = 0;

View File

@@ -44,7 +44,7 @@ struct a6xx_gpu {
* REG_CP_PROTECT_REG(n) - this will block both reads and writes for _len
* registers starting at _reg.
*/
#define A6XX_PROTECT_RW(_reg, _len) \
#define A6XX_PROTECT_NORDWR(_reg, _len) \
((1 << 31) | \
(((_len) & 0x3FFF) << 18) | ((_reg) & 0x3FFFF))

Some files were not shown because too many files have changed in this diff Show More