Compare commits

...

206 Commits

Author SHA1 Message Date
Linus Torvalds
e93c9c99a6 Linux 5.1 2019-05-05 17:42:58 -07:00
Linus Torvalds
7178fb0b23 Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf fixes from Ingo Molnar:
 "I'd like to apologize for this very late pull request: I was dithering
  through the week whether to send the fixes, and then yesterday Jiri's
  crash fix for a regression introduced in this cycle clearly marked
  perf/urgent as 'must merge now'.

  Most of the commits are tooling fixes, plus there's three kernel fixes
  via four commits:

    - race fix in the Intel PEBS code

    - fix an AUX bug and roll back a previous attempt

    - fix AMD family 17h generic HW cache-event perf counters

  The largest diffstat contribution comes from the AMD fix - a new event
  table is introduced, which is a fairly low risk change but has a large
  linecount"

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf/x86/intel: Fix race in intel_pmu_disable_event()
  perf/x86/intel/pt: Remove software double buffering PMU capability
  perf/ring_buffer: Fix AUX software double buffering
  perf tools: Remove needless asm/unistd.h include fixing build in some places
  tools arch uapi: Copy missing unistd.h headers for arc, hexagon and riscv
  tools build: Add -ldl to the disassembler-four-args feature test
  perf cs-etm: Always allocate memory for cs_etm_queue::prev_packet
  perf cs-etm: Don't check cs_etm_queue::prev_packet validity
  perf report: Report OOM in status line in the GTK UI
  perf bench numa: Add define for RUSAGE_THREAD if not present
  tools lib traceevent: Change tag string for error
  perf annotate: Fix build on 32 bit for BPF annotation
  tools uapi x86: Sync vmx.h with the kernel
  perf bpf: Return value with unlocking in perf_env__find_btf()
  MAINTAINERS: Include vendor specific files under arch/*/events/*
  perf/x86/amd: Update generic hardware cache events for Family 17h
2019-05-05 14:37:25 -07:00
Linus Torvalds
70c9fb570b Merge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler fix from Ingo Molnar:
 "Fix a kobject memory leak in the cpufreq code"

* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  sched/cpufreq: Fix kobject memleak
2019-05-05 14:28:48 -07:00
Linus Torvalds
13369e8311 Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fix from Ingo Molnar:
 "Disable function tracing during early SME setup to fix a boot crash on
  SME-enabled kernels running distro kernels (some of which have
  function tracing enabled)"

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/mm/mem_encrypt: Disable all instrumentation for early SME setup
2019-05-05 14:26:11 -07:00
Linus Torvalds
51987affd6 Merge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull vfs fixes from Al Viro:

 - a couple of ->i_link use-after-free fixes

 - regression fix for wrong errno on absent device name in mount(2)
   (this cycle stuff)

 - ancient UFS braino in large GID handling on Solaris UFS images (bogus
   cut'n'paste from large UID handling; wrong field checked to decide
   whether we should look at old (16bit) or new (32bit) field)

* 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  ufs: fix braino in ufs_get_inode_gid() for solaris UFS flavour
  Abort file_remove_privs() for non-reg. files
  [fix] get rid of checking for absent device name in vfs_get_tree()
  apparmorfs: fix use-after-free on symlink traversal
  securityfs: fix use-after-free on symlink traversal
2019-05-05 09:28:45 -07:00
Jiri Olsa
6f55967ad9 perf/x86/intel: Fix race in intel_pmu_disable_event()
New race in x86_pmu_stop() was introduced by replacing the
atomic __test_and_clear_bit() of cpuc->active_mask by separate
test_bit() and __clear_bit() calls in the following commit:

  3966c3feca ("x86/perf/amd: Remove need to check "running" bit in NMI handler")

The race causes panic for PEBS events with enabled callchains:

  BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
  ...
  RIP: 0010:perf_prepare_sample+0x8c/0x530
  Call Trace:
   <NMI>
   perf_event_output_forward+0x2a/0x80
   __perf_event_overflow+0x51/0xe0
   handle_pmi_common+0x19e/0x240
   intel_pmu_handle_irq+0xad/0x170
   perf_event_nmi_handler+0x2e/0x50
   nmi_handle+0x69/0x110
   default_do_nmi+0x3e/0x100
   do_nmi+0x11a/0x180
   end_repeat_nmi+0x16/0x1a
  RIP: 0010:native_write_msr+0x6/0x20
  ...
   </NMI>
   intel_pmu_disable_event+0x98/0xf0
   x86_pmu_stop+0x6e/0xb0
   x86_pmu_del+0x46/0x140
   event_sched_out.isra.97+0x7e/0x160
  ...

The event is configured to make samples from PEBS drain code,
but when it's disabled, we'll go through NMI path instead,
where data->callchain will not get allocated and we'll crash:

          x86_pmu_stop
            test_bit(hwc->idx, cpuc->active_mask)
            intel_pmu_disable_event(event)
            {
              ...
              intel_pmu_pebs_disable(event);
              ...

EVENT OVERFLOW ->  <NMI>
                     intel_pmu_handle_irq
                       handle_pmi_common
   TEST PASSES ->        test_bit(bit, cpuc->active_mask))
                           perf_event_overflow
                             perf_prepare_sample
                             {
                               ...
                               if (!(sample_type & __PERF_SAMPLE_CALLCHAIN_EARLY))
                                     data->callchain = perf_callchain(event, regs);

         CRASH ->              size += data->callchain->nr;
                             }
                   </NMI>
              ...
              x86_pmu_disable_event(event)
            }

            __clear_bit(hwc->idx, cpuc->active_mask);

Fixing this by disabling the event itself before setting
off the PEBS bit.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Arcari <darcari@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Lendacky Thomas <Thomas.Lendacky@amd.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Fixes: 3966c3feca ("x86/perf/amd: Remove need to check "running" bit in NMI handler")
Link: http://lkml.kernel.org/r/20190504151556.31031-1-jolsa@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-05-05 13:00:48 +02:00
Linus Torvalds
6203838dec Merge tag 'powerpc-5.1-7' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc fix from Michael Ellerman:
 "One regression fix.

  Changes we merged to STRICT_KERNEL_RWX on 32-bit were causing crashes
  under load on some machines depending on memory layout.

  Thanks to Christophe Leroy"

* tag 'powerpc-5.1-7' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
  powerpc/32s: Fix BATs setting with CONFIG_STRICT_KERNEL_RWX
2019-05-04 12:24:05 -07:00
Linus Torvalds
aa1be08f52 Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull KVM fixes from Paolo Bonzini:

 - PPC and ARM bugfixes from submaintainers

 - Fix old Windows versions on AMD (recent regression)

 - Fix old Linux versions on processors without EPT

 - Fixes for LAPIC timer optimizations

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (21 commits)
  KVM: nVMX: Fix size checks in vmx_set_nested_state
  KVM: selftests: make hyperv_cpuid test pass on AMD
  KVM: lapic: Check for in-kernel LAPIC before deferencing apic pointer
  KVM: fix KVM_CLEAR_DIRTY_LOG for memory slots of unaligned size
  x86/kvm/mmu: reset MMU context when 32-bit guest switches PAE
  KVM: x86: Whitelist port 0x7e for pre-incrementing %rip
  Documentation: kvm: fix dirty log ioctl arch lists
  KVM: VMX: Move RSB stuffing to before the first RET after VM-Exit
  KVM: arm/arm64: Don't emulate virtual timers on userspace ioctls
  kvm: arm: Skip stage2 huge mappings for unaligned ipa backed by THP
  KVM: arm/arm64: Ensure vcpu target is unset on reset failure
  KVM: lapic: Convert guest TSC to host time domain if necessary
  KVM: lapic: Allow user to disable adaptive tuning of timer advancement
  KVM: lapic: Track lapic timer advance per vCPU
  KVM: lapic: Disable timer advancement if adaptive tuning goes haywire
  x86: kvm: hyper-v: deal with buggy TLB flush requests from WS2012
  KVM: x86: Consider LAPIC TSC-Deadline timer expired if deadline too short
  KVM: PPC: Book3S: Protect memslots while validating user address
  KVM: PPC: Book3S HV: Perserve PSSCR FAKE_SUSPEND bit on guest exit
  KVM: arm/arm64: vgic-v3: Retire pending interrupts on disabling LPIs
  ...
2019-05-03 16:49:46 -07:00
Linus Torvalds
82463436a7 Merge branch 'i2c/for-current-fixed' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/wsa/linux
Pull i2c fixes from Wolfram Sang:
 "I2C driver bugfixes and a MAINTAINERS update for you"

* 'i2c/for-current-fixed' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
  i2c: Prevent runtime suspend of adapter when Host Notify is required
  i2c: synquacer: fix enumeration of slave devices
  MAINTAINERS: friendly takeover of i2c-gpio driver
  i2c: designware: ratelimit 'transfer when suspended' errors
  i2c: imx: correct the method of getting private data in notifier_call
2019-05-03 11:42:01 -07:00
Linus Torvalds
a4ccb5f9dc Merge tag 'drm-fixes-2019-05-03' of git://anongit.freedesktop.org/drm/drm
Pull drm fix from Dave Airlie:
 "Just a single qxl revert"

* tag 'drm-fixes-2019-05-03' of git://anongit.freedesktop.org/drm/drm:
  Revert "drm/qxl: drop prime import/export callbacks"
2019-05-03 09:14:07 -07:00
Linus Torvalds
8f76216c80 Merge tag 'clk-fixes-for-linus' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/clk/linux
Pull clk fixes from Stephen Boyd:
 "Two fixes for the NKMP clks on Allwinner SoCs, a locking fix for
  clkdev where we forgot to hold a lock while iterating a list that can
  change, and finally a build fix that adds some stubs for clk APIs that
  are used by devfreq drivers on platforms without the clk APIs"

* tag 'clk-fixes-for-linus' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/clk/linux:
  clk: Add missing stubs for a few functions
  clkdev: Hold clocks_mutex while iterating clocks list
  clk: sunxi-ng: nkmp: Explain why zero width check is needed
  clk: sunxi-ng: nkmp: Avoid GENMASK(-1, 0)
2019-05-03 08:55:06 -07:00
Linus Torvalds
46572f785f Merge tag 'sound-5.1' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
 "A few stable fixes at this round.

  The USB Line6 audio fixes are a bit large, but they are rather trivial
  and pretty much device-specific, so should be safe to apply at this
  late stage. Ditto for other HD-audio quirks"

* tag 'sound-5.1' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
  ALSA: hda/realtek - Apply the fixup for ASUS Q325UAR
  ALSA: line6: use dynamic buffers
  ALSA: hda/realtek - Fixed Dell AIO speaker noise
  ALSA: hda/realtek - Add new Dell platform for headset mode
2019-05-03 08:42:03 -07:00
Alexander Shishkin
72e830f684 perf/x86/intel/pt: Remove software double buffering PMU capability
Now that all AUX allocations are high-order by default, the software
double buffering PMU capability doesn't make sense any more, get rid
of it. In case some PMUs choose to opt out, we can re-introduce it.

Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: adrian.hunter@intel.com
Link: http://lkml.kernel.org/r/20190503085536.24119-3-alexander.shishkin@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-05-03 12:46:20 +02:00
Alexander Shishkin
26ae4f4406 perf/ring_buffer: Fix AUX software double buffering
This recent commit:

  5768402fd9 ("perf/ring_buffer: Use high order allocations for AUX buffers optimistically")

overlooked the fact that the previous one page granularity of the AUX buffer
provided an implicit double buffering capability to the PMU driver, which
went away when the entire buffer became one high-order page.

Always make the full-trace mode AUX allocation at least two-part to preserve
the previous behavior and allow the implicit double buffering to continue.

Reported-by: Ammy Yi <ammy.yi@intel.com>
Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: adrian.hunter@intel.com
Fixes: 5768402fd9 ("perf/ring_buffer: Use high order allocations for AUX buffers optimistically")
Link: http://lkml.kernel.org/r/20190503085536.24119-2-alexander.shishkin@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-05-03 12:46:10 +02:00
Ingo Molnar
221856b16e Merge tag 'perf-urgent-for-mingo-5.1-20190502' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent
Pull perf/urgent fixes from Arnaldo Carvalho de Melo:

tools UAPI:

  Arnaldo Carvalho de Melo:

  - Sync x86's vmx.h with the kernel.

  - Copy missing unistd.h headers for arc, hexagon and riscv, fixing
    a reported build regression on the ARC 32-bit architecture.

perf bench numa:

  Arnaldo Carvalho de Melo:

  - Add define for RUSAGE_THREAD if not present, fixing the build on the
    ARC architecture when only zlib and libnuma are present.

perf BPF:

  Arnaldo Carvalho de Melo:

  - The disassembler-four-args feature test needs -ldl on distros such as
    Mageia 7.

  Bo YU:

  - Fix unlocking on success in perf_env__find_btf(), detected with
    the coverity tool.

libtraceevent:

  Leo Yan:

  - Change misleading hard coded 'trace-cmd' string in error messages.

ARM hardware tracing:

  Leo Yan:

  - Always allocate memory for cs_etm_queue::prev_packet, fixing a segfault
    when processing CoreSight perf data.

perf annotate:

  Thadeu Lima de Souza Cascardo:

  - Fix build on 32 bit for BPF.

perf report:

  Thomas Richter:

  - Report OOM in status line in the GTK UI.

core libs:

  - Remove needless asm/unistd.h that, used with sys/syscall.h ended
    up redefining the syscalls defines in environments such as the
    ARC arch when using uClibc.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-05-03 07:48:18 +02:00
Dave Airlie
1daa0449d2 Merge tag 'drm-misc-fixes-2019-05-02' of git://anongit.freedesktop.org/drm/drm-misc into drm-fixes
- One revert for QXL for a DRI3 breakage

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Maxime Ripard <maxime.ripard@bootlin.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190502122529.hguztj3kncaixe3d@flea
2019-05-03 09:36:31 +10:00
Arnaldo Carvalho de Melo
7e221b811f perf tools: Remove needless asm/unistd.h include fixing build in some places
We were including sys/syscall.h and asm/unistd.h, since sys/syscall.h
includes asm/unistd.h, sometimes this leads to the redefinition of
defines, breaking the build.

Noticed on ARC with uCLibc.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Arnaldo Carvalho de Melo <arnaldo.melo@gmail.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Rich Felker <dalias@libc.org>
Cc: Vineet Gupta <Vineet.Gupta1@synopsys.com>
Link: https://lkml.kernel.org/n/tip-xjpf80o64i2ko74aj2jih0qg@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-05-02 16:00:20 -04:00
Arnaldo Carvalho de Melo
18f90d372c tools arch uapi: Copy missing unistd.h headers for arc, hexagon and riscv
Since those were introduced in:

  c8ce48f065 ("asm-generic: Make time32 syscall numbers optional")

But when the asm-generic/unistd.h was sync'ed with tools/ in:

  1a787fc5ba ("tools headers uapi: Sync copy of asm-generic/unistd.h with the kernel sources")

I forgot to copy the files for the architectures that define
__ARCH_WANT_TIME32_SYSCALLS, so the perf build was breaking there, as
reported by Vineet Gupta for the ARC architecture.

After updating my ARC container to use the glibc based toolchain + cross
building libnuma, zlib and elfutils, I finally managed to reproduce the
problem and verify that this now is fixed and will not regress as will
be tested before each pull req sent upstream.

Reported-by: Vineet Gupta <Vineet.Gupta1@synopsys.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Jiri Olsa <jolsa@kernel.org>
CC: linux-snps-arc@lists.infradead.org
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/r/20190426193531.GC28586@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-05-02 16:00:20 -04:00
Arnaldo Carvalho de Melo
c638417e1a tools build: Add -ldl to the disassembler-four-args feature test
Thomas Backlund reported that the perf build was failing on the Mageia 7
distro, that is because it uses:

  cat /tmp/build/perf/feature/test-disassembler-four-args.make.output
  /usr/bin/ld: /usr/lib64/libbfd.a(plugin.o): in function `try_load_plugin':
  /home/iurt/rpmbuild/BUILD/binutils-2.32/objs/bfd/../../bfd/plugin.c:243:
  undefined reference to `dlopen'
  /usr/bin/ld:
  /home/iurt/rpmbuild/BUILD/binutils-2.32/objs/bfd/../../bfd/plugin.c:271:
  undefined reference to `dlsym'
  /usr/bin/ld:
  /home/iurt/rpmbuild/BUILD/binutils-2.32/objs/bfd/../../bfd/plugin.c:256:
  undefined reference to `dlclose'
  /usr/bin/ld:
  /home/iurt/rpmbuild/BUILD/binutils-2.32/objs/bfd/../../bfd/plugin.c:246:
  undefined reference to `dlerror'
  as we allow dynamic linking and loading

Mageia 7 uses these linker flags:
  $ rpm --eval %ldflags
    -Wl,--as-needed -Wl,--no-undefined -Wl,-z,relro -Wl,-O1 -Wl,--build-id -Wl,--enable-new-dtags

So add -ldl to this feature LDFLAGS.

Reported-by: Thomas Backlund <tmb@mageia.org>
Tested-by: Thomas Backlund <tmb@mageia.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Song Liu <songliubraving@fb.com>
Link: https://lkml.kernel.org/r/20190501173158.GC21436@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-05-02 16:00:20 -04:00
Leo Yan
35bb59c10a perf cs-etm: Always allocate memory for cs_etm_queue::prev_packet
Robert Walker reported a segmentation fault is observed when process
CoreSight trace data; this issue can be easily reproduced by the command
'perf report --itrace=i1000i' for decoding tracing data.

If neither the 'b' flag (synthesize branches events) nor 'l' flag
(synthesize last branch entries) are specified to option '--itrace',
cs_etm_queue::prev_packet will not been initialised.  After merging the
code to support exception packets and sample flags, there introduced a
number of uses of cs_etm_queue::prev_packet without checking whether it
is valid, for these cases any accessing to uninitialised prev_packet
will cause crash.

As cs_etm_queue::prev_packet is used more widely now and it's already
hard to follow which functions have been called in a context where the
validity of cs_etm_queue::prev_packet has been checked, this patch
always allocates memory for cs_etm_queue::prev_packet.

Reported-by: Robert Walker <robert.walker@arm.com>
Suggested-by: Robert Walker <robert.walker@arm.com>
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Tested-by: Robert Walker <robert.walker@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Suzuki K Poulouse <suzuki.poulose@arm.com>
Cc: linux-arm-kernel@lists.infradead.org
Fixes: 7100b12cf4 ("perf cs-etm: Generate branch sample for exception packet")
Fixes: 24fff5eb2b ("perf cs-etm: Avoid stale branch samples when flush packet")
Link: http://lkml.kernel.org/r/20190428083228.20246-1-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-05-02 16:00:20 -04:00
Leo Yan
cf0c37b6db perf cs-etm: Don't check cs_etm_queue::prev_packet validity
Since cs_etm_queue::prev_packet is allocated for all cases, it will
never be NULL pointer; now validity checking prev_packet is pointless,
remove all of them.

Signed-off-by: Leo Yan <leo.yan@linaro.org>
Tested-by: Robert Walker <robert.walker@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Suzuki K Poulouse <suzuki.poulose@arm.com>
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lkml.kernel.org/r/20190428083228.20246-2-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-05-02 16:00:20 -04:00
Thomas Richter
167e418fa0 perf report: Report OOM in status line in the GTK UI
An -ENOMEM error is not reported in the GTK GUI.  Instead this error
message pops up on the screen:

[root@m35lp76 perf]# ./perf  report -i perf.data.error68-1

	Processing events... [974K/3M]
	Error:failed to process sample

	0xf4198 [0x8]: failed to process type: 68

However when I use the same perf.data file with --stdio it works:

[root@m35lp76 perf]# ./perf  report -i perf.data.error68-1 --stdio \
		| head -12

  # Total Lost Samples: 0
  #
  # Samples: 76K of event 'cycles'
  # Event count (approx.): 99056160000
  #
  # Overhead  Command          Shared Object      Symbol
  # ........  ...............  .................  .........
  #
     8.81%  find             [kernel.kallsyms]  [k] ftrace_likely_update
     8.74%  swapper          [kernel.kallsyms]  [k] ftrace_likely_update
     8.34%  sshd             [kernel.kallsyms]  [k] ftrace_likely_update
     2.19%  kworker/u512:1-  [kernel.kallsyms]  [k] ftrace_likely_update

The sample precentage is a bit low.....

The GUI always fails in the FINISHED_ROUND event (68) and does not
indicate the reason why.

When happened is the following. Perf report calls a lot of functions and
down deep when a FINISHED_ROUND event is processed, these functions are
called:

  perf_session__process_event()
  + perf_session__process_user_event()
    + process_finished_round()
      + ordered_events__flush()
        + __ordered_events__flush()
	  + do_flush()
	    + ordered_events__deliver_event()
	      + perf_session__deliver_event()
	        + machine__deliver_event()
	          + perf_evlist__deliver_event()
	            + process_sample_event()
	              + hist_entry_iter_add() --> only called in GUI case!!!
	                + hist_iter__report__callback()
	                  + symbol__inc_addr_sample()

	                    Now this functions runs out of memory and
			    returns -ENOMEM. This is reported all the way up
			    until function

perf_session__process_event() returns to its caller, where -ENOMEM is
changed to -EINVAL and processing stops:

 if ((skip = perf_session__process_event(session, event, head)) < 0) {
      pr_err("%#" PRIx64 " [%#x]: failed to process type: %d\n",
	     head, event->header.size, event->header.type);
      err = -EINVAL;
      goto out_err;
 }

This occurred in the FINISHED_ROUND event when it has to process some
10000 entries and ran out of memory.

This patch indicates the root cause and displays it in the status line
of ther perf report GUI.

Output before (on GUI status line):

  0xf4198 [0x8]: failed to process type: 68

Output after:

  0xf4198 [0x8]: failed to process type: 68 [not enough memory]

Committer notes:

the 'skip' variable needs to be initialized to -EINVAL, so that when the
size is less than sizeof(struct perf_event_attr) we avoid this valid
compiler warning:

  util/session.c: In function ‘perf_session__process_events’:
  util/session.c:1936:7: error: ‘skip’ may be used uninitialized in this function [-Werror=maybe-uninitialized]
     err = skip;
     ~~~~^~~~~~
  util/session.c:1874:6: note: ‘skip’ was declared here
    s64 skip;
        ^~~~
  cc1: all warnings being treated as errors

Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Reviewed-by: Hendrik Brueckner <brueckner@linux.ibm.com>
Reviewed-by: Jiri Olsa <jolsa@redhat.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Link: http://lkml.kernel.org/r/20190423105303.61683-1-tmricht@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-05-02 16:00:20 -04:00
Arnaldo Carvalho de Melo
bf561d3c13 perf bench numa: Add define for RUSAGE_THREAD if not present
While cross building perf to the ARC architecture on a fedora 30 host,
we were failing with:

      CC       /tmp/build/perf/bench/numa.o
  bench/numa.c: In function ‘worker_thread’:
  bench/numa.c:1261:12: error: ‘RUSAGE_THREAD’ undeclared (first use in this function); did you mean ‘SIGEV_THREAD’?
    getrusage(RUSAGE_THREAD, &rusage);
              ^~~~~~~~~~~~~
              SIGEV_THREAD
  bench/numa.c:1261:12: note: each undeclared identifier is reported only once for each function it appears in

[perfbuilder@60d5802468f6 perf]$ /arc_gnu_2019.03-rc1_prebuilt_uclibc_le_archs_linux_install/bin/arc-linux-gcc --version | head -1
arc-linux-gcc (ARCv2 ISA Linux uClibc toolchain 2019.03-rc1) 8.3.1 20190225
[perfbuilder@60d5802468f6 perf]$

Trying to reproduce a report by Vineet, I noticed that, with just
cross-built zlib and numactl libraries, I ended up with the above
failure.

So, since RUSAGE_THREAD is available as a define, check for that and
numactl libraries, I ended up with the above failure.

So, since RUSAGE_THREAD is available as a define in the system headers,
check if it is defined in the 'perf bench numa' sources and define it if
not.

Now it builds and I have to figure out if the problem reported by Vineet
only takes place if we have libelf or some other library available.

Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: linux-snps-arc@lists.infradead.org
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Vineet Gupta <Vineet.Gupta1@synopsys.com>
Link: https://lkml.kernel.org/n/tip-2wb4r1gir9xrevbpq7qp0amk@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-05-02 16:00:20 -04:00
Leo Yan
5f05182fab tools lib traceevent: Change tag string for error
The traceevent lib is used by the perf tool, and when executing

  perf test -v 6

it outputs error log on the ARM64 platform:

  running test 33 '*:*'trace-cmd: No such file or directory

  [...]

  trace-cmd: Invalid argument

The trace event parsing code originally came from trace-cmd so it keeps
the tag string "trace-cmd" for errors, this easily introduces the
impression that the perf tool launches trace-cmd command for trace event
parsing, but in fact the related parsing is accomplished by the
traceevent lib.

This patch changes the tag string to "libtraceevent" so that we can
avoid confusion and let users to more easily connect the error with
traceevent lib.

Signed-off-by: Leo Yan <leo.yan@linaro.org>
Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Link: http://lkml.kernel.org/r/20190424013802.27569-1-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-05-02 16:00:19 -04:00
Thadeu Lima de Souza Cascardo
01e985e900 perf annotate: Fix build on 32 bit for BPF annotation
Commit 6987561c9e ("perf annotate: Enable annotation of BPF programs") adds
support for BPF programs annotations but the new code does not build on 32-bit.

Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
Acked-by: Song Liu <songliubraving@fb.com>
Fixes: 6987561c9e ("perf annotate: Enable annotation of BPF programs")
Link: http://lkml.kernel.org/r/20190403194452.10845-1-cascardo@canonical.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-05-02 16:00:19 -04:00
Arnaldo Carvalho de Melo
24e45b49ee tools uapi x86: Sync vmx.h with the kernel
To pick up the changes from:

  2b27924bb1 ("KVM: nVMX: always use early vmcs check when EPT is disabled")

That causes this object in the tools/perf build process to be rebuilt:

  CC       /tmp/build/perf/arch/x86/util/kvm-stat.o

But it isn't using VMX_ABORT_ prefixed constants, so no change in
behaviour.

This silences this perf build warning:

  Warning: Kernel ABI header at 'tools/arch/x86/include/uapi/asm/vmx.h' differs from latest version at 'arch/x86/include/uapi/asm/vmx.h'
  diff -u tools/arch/x86/include/uapi/asm/vmx.h arch/x86/include/uapi/asm/vmx.h

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Link: https://lkml.kernel.org/n/tip-bjbo3zc0r8i8oa0udpvftya6@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-05-02 16:00:19 -04:00
Bo YU
2e712675ff perf bpf: Return value with unlocking in perf_env__find_btf()
In perf_env__find_btf(), we're returning without unlocking
"env->bpf_progs.lock". There may be cause lockdep issue.

Detected by CoversityScan, CID# 1444762:(program hangs(LOCK))

Signed-off-by: Bo YU <tsu.yubo@gmail.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Martin KaFai Lau <kafai@fb.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: Yonghong Song <yhs@fb.com>
Cc: bpf@vger.kernel.org
Cc: netdev@vger.kernel.org
Fixes: 2db7b1e0bd: (perf bpf: Return NULL when RB tree lookup fails in perf_env__find_btf())
Link: http://lkml.kernel.org/r/20190422080138.10088-1-tsu.yubo@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-05-02 16:00:19 -04:00
Linus Torvalds
ea9866793d Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking fixes from David Miller:

 1) Out of bounds access in xfrm IPSEC policy unlink, from Yue Haibing.

 2) Missing length check for esp4 UDP encap, from Sabrina Dubroca.

 3) Fix byte order of RX STBC access in mac80211, from Johannes Berg.

 4) Inifnite loop in bpftool map create, from Alban Crequy.

 5) Register mark fix in ebpf verifier after pkt/null checks, from Paul
    Chaignon.

 6) Properly use rcu_dereference_sk_user_data in L2TP code, from Eric
    Dumazet.

 7) Buffer overrun in marvell phy driver, from Andrew Lunn.

 8) Several crash and statistics handling fixes to bnxt_en driver, from
    Michael Chan and Vasundhara Volam.

 9) Several fixes to the TLS layer from Jakub Kicinski (copying negative
    amounts of data in reencrypt, reencrypt frag copying, blind nskb->sk
    NULL deref, etc).

10) Several UDP GRO fixes, from Paolo Abeni and Eric Dumazet.

11) PID/UID checks on ipv6 flow labels are inverted, from Willem de
    Bruijn.

12) Use after free in l2tp, from Eric Dumazet.

13) IPV6 route destroy races, also from Eric Dumazet.

14) SCTP state machine can erroneously run recursively, fix from Xin
    Long.

15) Adjust AF_PACKET msg_name length checks, add padding bytes if
    necessary. From Willem de Bruijn.

16) Preserve skb_iif, so that forwarded packets have consistent values
    even if fragmentation is involved. From Shmulik Ladkani.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (69 commits)
  udp: fix GRO packet of death
  ipv6: A few fixes on dereferencing rt->from
  rds: ib: force endiannes annotation
  selftests: fib_rule_tests: print the result and return 1 if any tests failed
  ipv4: ip_do_fragment: Preserve skb_iif during fragmentation
  net/tls: avoid NULL pointer deref on nskb->sk in fallback
  selftests: fib_rule_tests: Fix icmp proto with ipv6
  packet: validate msg_namelen in send directly
  packet: in recvmsg msg_name return at least sizeof sockaddr_ll
  sctp: avoid running the sctp state machine recursively
  stmmac: pci: Fix typo in IOT2000 comment
  Documentation: fix netdev-FAQ.rst markup warning
  ipv6: fix races in ip6_dst_destroy()
  l2ip: fix possible use-after-free
  appletalk: Set error code if register_snap_client failed
  net: dsa: bcm_sf2: fix buffer overflow doing set_rxnfc
  rxrpc: Fix net namespace cleanup
  ipv6/flowlabel: wait rcu grace period before put_pid()
  vrf: Use orig netdev to count Ip6InNoRoutes and a fresh route lookup when sending dest unreach
  tcp: add sanity tests in tcp_add_backlog()
  ...
2019-05-02 11:03:34 -07:00
Linus Torvalds
5ce3307b6d Merge tag 'for-linus-20190502' of git://git.kernel.dk/linux-block
Pull io_uring fixes from Jens Axboe:
 "This is mostly io_uring fixes/tweaks. Most of these were actually done
  in time for the last -rc, but I wanted to ensure that everything
  tested out great before including them. The code delta looks larger
  than it really is, as it's mostly just comment additions/changes.

  Outside of the comment additions/changes, this is mostly removal of
  unnecessary barriers. In all, this pull request contains:

   - Tweak to how we handle errors at submission time. We now post a
     completion event if the error occurs on behalf of an sqe, instead
     of returning it through the system call. If the error happens
     outside of a specific sqe, we return the error through the system
     call. This makes it nicer to use and makes the "normal" use case
     behave the same as the offload cases. (me)

   - Fix for a missing req reference drop from async context (me)

   - If an sqe is submitted with RWF_NOWAIT, don't punt it to async
     context. Return -EAGAIN directly, instead of using it as a hint to
     do async punt. (Stefan)

   - Fix notes on barriers (Stefan)

   - Remove unnecessary barriers (Stefan)

   - Fix potential double free of memory in setup error (Mark)

   - Further improve sq poll CPU validation (Mark)

   - Fix page allocation warning and leak on buffer registration error
     (Mark)

   - Fix iov_iter_type() for new no-ref flag (Ming)

   - Fix a case where dio doesn't honor bio no-page-ref (Ming)"

* tag 'for-linus-20190502' of git://git.kernel.dk/linux-block:
  io_uring: avoid page allocation warnings
  iov_iter: fix iov_iter_type
  block: fix handling for BIO_NO_PAGE_REF
  io_uring: drop req submit reference always in async punt
  io_uring: free allocated io_memory once
  io_uring: fix SQPOLL cpu validation
  io_uring: have submission side sqe errors post a cqe
  io_uring: remove unnecessary barrier after unsetting IORING_SQ_NEED_WAKEUP
  io_uring: remove unnecessary barrier after incrementing dropped counter
  io_uring: remove unnecessary barrier before reading SQ tail
  io_uring: remove unnecessary barrier after updating SQ head
  io_uring: remove unnecessary barrier before reading cq head
  io_uring: remove unnecessary barrier before wq_has_sleeper
  io_uring: fix notes on barriers
  io_uring: fix handling SQEs requesting NOWAIT
2019-05-02 09:55:04 -07:00
Jarkko Nikula
72bfcee11c i2c: Prevent runtime suspend of adapter when Host Notify is required
Multiple users have reported their Synaptics touchpad has stopped
working between v4.20.1 and v4.20.2 when using SMBus interface.

The culprit for this appeared to be commit c5eb119007 ("PCI / PM: Allow
runtime PM without callback functions") that fixed the runtime PM for
i2c-i801 SMBus adapter. Those Synaptics touchpad are using i2c-i801
for SMBus communication and testing showed they are able to get back
working by preventing the runtime suspend of adapter.

Normally when i2c-i801 SMBus adapter transmits with the client it resumes
before operation and autosuspends after.

However, if client requires SMBus Host Notify protocol, what those
Synaptics touchpads do, then the host adapter must not go to runtime
suspend since then it cannot process incoming SMBus Host Notify commands
the client may send.

Fix this by keeping I2C/SMBus adapter active in case client requires
Host Notify.

Reported-by: Keijo Vaara <ferdasyn@rocketmail.com>
Link: https://bugzilla.kernel.org/show_bug.cgi?id=203297
Fixes: c5eb119007 ("PCI / PM: Allow runtime PM without callback functions")
Cc: stable@vger.kernel.org # v4.20+
Signed-off-by: Jarkko Nikula <jarkko.nikula@linux.intel.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Tested-by: Keijo Vaara <ferdasyn@rocketmail.com>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
2019-05-02 18:42:15 +02:00
Ard Biesheuvel
95e0cf3cae i2c: synquacer: fix enumeration of slave devices
The I2C host driver for SynQuacer fails to populate the of_node and
ACPI companion fields of the struct i2c_adapter it instantiates,
resulting in enumeration of the subordinate I2C bus to fail.

Fixes: 0d676a6c43 ("i2c: add support for Socionext SynQuacer I2C controller")
Cc: <stable@vger.kernel.org> # v4.19+
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
2019-05-02 18:38:53 +02:00
Wolfram Sang
fb31fbef9c MAINTAINERS: friendly takeover of i2c-gpio driver
I haven't heard from Haavard in years despite putting him to the CC list for
i2c-gpio related mails. Since I was doing the work on this driver for a while
now, let me take official maintainership, so it will be more clear to users.

Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Acked-by: Haavard Skinnemoen <hskinnemoen@gmail.com>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
2019-05-02 18:34:06 +02:00
Kim Phillips
1804569d87 MAINTAINERS: Include vendor specific files under arch/*/events/*
Add an explicit subdirectory specification for arch/x86/events/amd to
the MAINTAINERS file, to distinguish it from its parent. This will
produce the correct set of maintainers for the files found therein.

Signed-off-by: Kim Phillips <kim.phillips@amd.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Gary Hook <Gary.Hook@amd.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Janakarajan Natarajan <Janakarajan.Natarajan@amd.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Martin Liška <mliska@suse.cz>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Pu Wen <puwen@hygon.cn>
Cc: Stephane Eranian <eranian@google.com>
Cc: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Thomas Lendacky <Thomas.Lendacky@amd.com>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: linux-kernel@vger.kernel.org
Fixes: 39b0332a21 ("perf/x86: Move perf_event_amd.c ........... => x86/events/amd/core.c")
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-05-02 18:28:12 +02:00
Kim Phillips
0e3b74e262 perf/x86/amd: Update generic hardware cache events for Family 17h
Add a new amd_hw_cache_event_ids_f17h assignment structure set
for AMD families 17h and above, since a lot has changed.  Specifically:

L1 Data Cache

The data cache access counter remains the same on Family 17h.

For DC misses, PMCx041's definition changes with Family 17h,
so instead we use the L2 cache accesses from L1 data cache
misses counter (PMCx060,umask=0xc8).

For DC hardware prefetch events, Family 17h breaks compatibility
for PMCx067 "Data Prefetcher", so instead, we use PMCx05a "Hardware
Prefetch DC Fills."

L1 Instruction Cache

PMCs 0x80 and 0x81 (32-byte IC fetches and misses) are backward
compatible on Family 17h.

For prefetches, we remove the erroneous PMCx04B assignment which
counts how many software data cache prefetch load instructions were
dispatched.

LL - Last Level Cache

Removing PMCs 7D, 7E, and 7F assignments, as they do not exist
on Family 17h, where the last level cache is L3.  L3 counters
can be accessed using the existing AMD Uncore driver.

Data TLB

On Intel machines, data TLB accesses ("dTLB-loads") are assigned
to counters that count load/store instructions retired.  This
is inconsistent with instruction TLB accesses, where Intel
implementations report iTLB misses that hit in the STLB.

Ideally, dTLB-loads would count higher level dTLB misses that hit
in lower level TLBs, and dTLB-load-misses would report those
that also missed in those lower-level TLBs, therefore causing
a page table walk.  That would be consistent with instruction
TLB operation, remove the redundancy between dTLB-loads and
L1-dcache-loads, and prevent perf from producing artificially
low percentage ratios, i.e. the "0.01%" below:

        42,550,869      L1-dcache-loads
        41,591,860      dTLB-loads
             4,802      dTLB-load-misses          #    0.01% of all dTLB cache hits
         7,283,682      L1-dcache-stores
         7,912,392      dTLB-stores
               310      dTLB-store-misses

On AMD Families prior to 17h, the "Data Cache Accesses" counter is
used, which is slightly better than load/store instructions retired,
but still counts in terms of individual load/store operations
instead of TLB operations.

So, for AMD Families 17h and higher, this patch assigns "dTLB-loads"
to a counter for L1 dTLB misses that hit in the L2 dTLB, and
"dTLB-load-misses" to a counter for L1 DTLB misses that caused
L2 DTLB misses and therefore also caused page table walks.  This
results in a much more accurate view of data TLB performance:

        60,961,781      L1-dcache-loads
             4,601      dTLB-loads
               963      dTLB-load-misses          #   20.93% of all dTLB cache hits

Note that for all AMD families, data loads and stores are combined
in a single accesses counter, so no 'L1-dcache-stores' are reported
separately, and stores are counted with loads in 'L1-dcache-loads'.

Also note that the "% of all dTLB cache hits" string is misleading
because (a) "dTLB cache": although TLBs can be considered caches for
page tables, in this context, it can be misinterpreted as data cache
hits because the figures are similar (at least on Intel), and (b) not
all those loads (technically accesses) technically "hit" at that
hardware level.  "% of all dTLB accesses" would be more clear/accurate.

Instruction TLB

On Intel machines, 'iTLB-loads' measure iTLB misses that hit in the
STLB, and 'iTLB-load-misses' measure iTLB misses that also missed in
the STLB and completed a page table walk.

For AMD Family 17h and above, for 'iTLB-loads' we replace the
erroneous instruction cache fetches counter with PMCx084
"L1 ITLB Miss, L2 ITLB Hit".

For 'iTLB-load-misses' we still use PMCx085 "L1 ITLB Miss,
L2 ITLB Miss", but set a 0xff umask because without it the event
does not get counted.

Branch Predictor (BPU)

PMCs 0xc2 and 0xc3 continue to be valid across all AMD Families.

Node Level Events

Family 17h does not have a PMCx0e9 counter, and corresponding counters
have not been made available publicly, so for now, we mark them as
unsupported for Families 17h and above.

Reference:

  "Open-Source Register Reference For AMD Family 17h Processors Models 00h-2Fh"
  Released 7/17/2018, Publication #56255, Revision 3.03:
  https://www.amd.com/system/files/TechDocs/56255_OSRR.pdf

[ mingo: tidied up the line breaks. ]
Signed-off-by: Kim Phillips <kim.phillips@amd.com>
Cc: <stable@vger.kernel.org> # v4.9+
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Janakarajan Natarajan <Janakarajan.Natarajan@amd.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Martin Liška <mliska@suse.cz>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Pu Wen <puwen@hygon.cn>
Cc: Stephane Eranian <eranian@google.com>
Cc: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Thomas Lendacky <Thomas.Lendacky@amd.com>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: linux-kernel@vger.kernel.org
Cc: linux-perf-users@vger.kernel.org
Fixes: e40ed1542d ("perf/x86: Add perf support for AMD family-17h processors")
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-05-02 18:28:12 +02:00
Wolfram Sang
6bac9bc273 i2c: designware: ratelimit 'transfer when suspended' errors
There are two problems with dev_err() here. One: It is not ratelimited.
Two: We don't see which driver tried to transfer something with a
suspended adapter. Switch to dev_WARN_ONCE to fix both issues. Drawback
is that we don't see if multiple drivers are trying to transfer while
suspended. They need to be discovered one after the other now. This is
better than a high CPU load because a really broken driver might try to
resend endlessly.

Link: https://bugs.archlinux.org/task/62391
Fixes: 2751541555 ("i2c: designware: Do not allow i2c_dw_xfer() calls while suspended")
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Reported-by: skidnik <skidnik@gmail.com>
Acked-by: Jarkko Nikula <jarkko.nikula@linux.intel.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Tested-by: skidnik <skidnik@gmail.com>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
2019-05-02 18:26:48 +02:00
Linus Torvalds
b7a5b22b05 Merge tag 'pci-v5.1-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci
Pull PCI fixes from Bjorn Helgaas:
 "I apologize for sending these so late in the cycle. We went back and
  forth about how to deal with the unexpected logging of intentional
  link state changes and finally decided to just config them off by
  default.

  PCI fixes:

   - Stop ignoring "pci=disable_acs_redir" parameter (Logan Gunthorpe)

   - Use shared MSI/MSI-X vector for Link Bandwidth Management (Alex
     Williamson)

   - Add Kconfig option for Link Bandwidth notification messages (Keith
     Busch)"

* tag 'pci-v5.1-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
  PCI/LINK: Add Kconfig option (default off)
  PCI/portdrv: Use shared MSI/MSI-X vector for Bandwidth Management
  PCI: Fix issue with "pci=disable_acs_redir" parameter being ignored
2019-05-02 08:29:24 -07:00
Linus Torvalds
e2a4b102d4 Merge tag 'mtd/fixes-for-5.1-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux
Pull MTD fix from Richard Weinberger:
 "A single regression fix for the marvell nand driver"

* tag 'mtd/fixes-for-5.1-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux:
  mtd: rawnand: marvell: Clean the controller state before each operation
2019-05-02 08:27:39 -07:00
Keith Busch
2078e1e7f7 PCI/LINK: Add Kconfig option (default off)
e8303bb7a7 ("PCI/LINK: Report degraded links via link bandwidth
notification") added dmesg logging whenever a link changes speed or width
to a state that is considered degraded.  Unfortunately, it cannot
differentiate signal integrity-related link changes from those
intentionally initiated by an endpoint driver, including drivers that may
live in userspace or VMs when making use of vfio-pci.  Some GPU drivers
actively manage the link state to save power, which generates a stream of
messages like this:

  vfio-pci 0000:07:00.0: 32.000 Gb/s available PCIe bandwidth, limited by 2.5 GT/s x16 link at 0000:00:02.0 (capable of 64.000 Gb/s with 5 GT/s x16 link)

Since we can't distinguish the intentional changes from the signal
integrity issues, leave the reporting turned off by default.  Add a Kconfig
option to turn it on if desired.

Fixes: e8303bb7a7 ("PCI/LINK: Report degraded links via link bandwidth notification")
Link: https://lore.kernel.org/linux-pci/20190501142942.26972-1-keith.busch@intel.com
Signed-off-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2019-05-02 08:34:32 -05:00
Al Viro
4e9036042f ufs: fix braino in ufs_get_inode_gid() for solaris UFS flavour
To choose whether to pick the GID from the old (16bit) or new (32bit)
field, we should check if the old gid field is set to 0xffff.  Mainline
checks the old *UID* field instead - cut'n'paste from the corresponding
code in ufs_get_inode_uid().

Fixes: 252e211e90
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2019-05-02 02:24:50 -04:00
Christophe Leroy
12f363511d powerpc/32s: Fix BATs setting with CONFIG_STRICT_KERNEL_RWX
Serge reported some crashes with CONFIG_STRICT_KERNEL_RWX enabled
on a book3s32 machine.

Analysis shows two issues:
  - BATs addresses and sizes are not properly aligned.
  - There is a gap between the last address covered by BATs and the
    first address covered by pages.

Memory mapped with DBATs:
0: 0xc0000000-0xc07fffff 0x00000000 Kernel RO coherent
1: 0xc0800000-0xc0bfffff 0x00800000 Kernel RO coherent
2: 0xc0c00000-0xc13fffff 0x00c00000 Kernel RW coherent
3: 0xc1400000-0xc23fffff 0x01400000 Kernel RW coherent
4: 0xc2400000-0xc43fffff 0x02400000 Kernel RW coherent
5: 0xc4400000-0xc83fffff 0x04400000 Kernel RW coherent
6: 0xc8400000-0xd03fffff 0x08400000 Kernel RW coherent
7: 0xd0400000-0xe03fffff 0x10400000 Kernel RW coherent

Memory mapped with pages:
0xe1000000-0xefffffff  0x21000000       240M        rw       present           dirty  accessed

This patch fixes both issues. With the patch, we get the following
which is as expected:

Memory mapped with DBATs:
0: 0xc0000000-0xc07fffff 0x00000000 Kernel RO coherent
1: 0xc0800000-0xc0bfffff 0x00800000 Kernel RO coherent
2: 0xc0c00000-0xc0ffffff 0x00c00000 Kernel RW coherent
3: 0xc1000000-0xc1ffffff 0x01000000 Kernel RW coherent
4: 0xc2000000-0xc3ffffff 0x02000000 Kernel RW coherent
5: 0xc4000000-0xc7ffffff 0x04000000 Kernel RW coherent
6: 0xc8000000-0xcfffffff 0x08000000 Kernel RW coherent
7: 0xd0000000-0xdfffffff 0x10000000 Kernel RW coherent

Memory mapped with pages:
0xe0000000-0xefffffff  0x20000000       256M        rw       present           dirty  accessed

Fixes: 63b2bc6195 ("powerpc/mm/32s: Use BATs for STRICT_KERNEL_RWX")
Reported-by: Serge Belyshev <belyshev@depni.sinp.msu.ru>
Acked-by: Segher Boessenkool <segher@kernel.crashing.org>
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-05-02 15:33:46 +10:00
Eric Dumazet
4dd2b82d5a udp: fix GRO packet of death
syzbot was able to crash host by sending UDP packets with a 0 payload.

TCP does not have this issue since we do not aggregate packets without
payload.

Since dev_gro_receive() sets gso_size based on skb_gro_len(skb)
it seems not worth trying to cope with padded packets.

BUG: KASAN: slab-out-of-bounds in skb_gro_receive+0xf5f/0x10e0 net/core/skbuff.c:3826
Read of size 16 at addr ffff88808893fff0 by task syz-executor612/7889

CPU: 0 PID: 7889 Comm: syz-executor612 Not tainted 5.1.0-rc7+ #96
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0x172/0x1f0 lib/dump_stack.c:113
 print_address_description.cold+0x7c/0x20d mm/kasan/report.c:187
 kasan_report.cold+0x1b/0x40 mm/kasan/report.c:317
 __asan_report_load16_noabort+0x14/0x20 mm/kasan/generic_report.c:133
 skb_gro_receive+0xf5f/0x10e0 net/core/skbuff.c:3826
 udp_gro_receive_segment net/ipv4/udp_offload.c:382 [inline]
 call_gro_receive include/linux/netdevice.h:2349 [inline]
 udp_gro_receive+0xb61/0xfd0 net/ipv4/udp_offload.c:414
 udp4_gro_receive+0x763/0xeb0 net/ipv4/udp_offload.c:478
 inet_gro_receive+0xe72/0x1110 net/ipv4/af_inet.c:1510
 dev_gro_receive+0x1cd0/0x23c0 net/core/dev.c:5581
 napi_gro_frags+0x36b/0xd10 net/core/dev.c:5843
 tun_get_user+0x2f24/0x3fb0 drivers/net/tun.c:1981
 tun_chr_write_iter+0xbd/0x156 drivers/net/tun.c:2027
 call_write_iter include/linux/fs.h:1866 [inline]
 do_iter_readv_writev+0x5e1/0x8e0 fs/read_write.c:681
 do_iter_write fs/read_write.c:957 [inline]
 do_iter_write+0x184/0x610 fs/read_write.c:938
 vfs_writev+0x1b3/0x2f0 fs/read_write.c:1002
 do_writev+0x15e/0x370 fs/read_write.c:1037
 __do_sys_writev fs/read_write.c:1110 [inline]
 __se_sys_writev fs/read_write.c:1107 [inline]
 __x64_sys_writev+0x75/0xb0 fs/read_write.c:1107
 do_syscall_64+0x103/0x610 arch/x86/entry/common.c:290
 entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x441cc0
Code: 05 48 3d 01 f0 ff ff 0f 83 9d 09 fc ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 83 3d 51 93 29 00 00 75 14 b8 14 00 00 00 0f 05 <48> 3d 01 f0 ff ff 0f 83 74 09 fc ff c3 48 83 ec 08 e8 ba 2b 00 00
RSP: 002b:00007ffe8c716118 EFLAGS: 00000246 ORIG_RAX: 0000000000000014
RAX: ffffffffffffffda RBX: 00007ffe8c716150 RCX: 0000000000441cc0
RDX: 0000000000000001 RSI: 00007ffe8c716170 RDI: 00000000000000f0
RBP: 0000000000000000 R08: 000000000000ffff R09: 0000000000a64668
R10: 0000000020000040 R11: 0000000000000246 R12: 000000000000c2d9
R13: 0000000000402b50 R14: 0000000000000000 R15: 0000000000000000

Allocated by task 5143:
 save_stack+0x45/0xd0 mm/kasan/common.c:75
 set_track mm/kasan/common.c:87 [inline]
 __kasan_kmalloc mm/kasan/common.c:497 [inline]
 __kasan_kmalloc.constprop.0+0xcf/0xe0 mm/kasan/common.c:470
 kasan_slab_alloc+0xf/0x20 mm/kasan/common.c:505
 slab_post_alloc_hook mm/slab.h:437 [inline]
 slab_alloc mm/slab.c:3393 [inline]
 kmem_cache_alloc+0x11a/0x6f0 mm/slab.c:3555
 mm_alloc+0x1d/0xd0 kernel/fork.c:1030
 bprm_mm_init fs/exec.c:363 [inline]
 __do_execve_file.isra.0+0xaa3/0x23f0 fs/exec.c:1791
 do_execveat_common fs/exec.c:1865 [inline]
 do_execve fs/exec.c:1882 [inline]
 __do_sys_execve fs/exec.c:1958 [inline]
 __se_sys_execve fs/exec.c:1953 [inline]
 __x64_sys_execve+0x8f/0xc0 fs/exec.c:1953
 do_syscall_64+0x103/0x610 arch/x86/entry/common.c:290
 entry_SYSCALL_64_after_hwframe+0x49/0xbe

Freed by task 5351:
 save_stack+0x45/0xd0 mm/kasan/common.c:75
 set_track mm/kasan/common.c:87 [inline]
 __kasan_slab_free+0x102/0x150 mm/kasan/common.c:459
 kasan_slab_free+0xe/0x10 mm/kasan/common.c:467
 __cache_free mm/slab.c:3499 [inline]
 kmem_cache_free+0x86/0x260 mm/slab.c:3765
 __mmdrop+0x238/0x320 kernel/fork.c:677
 mmdrop include/linux/sched/mm.h:49 [inline]
 finish_task_switch+0x47b/0x780 kernel/sched/core.c:2746
 context_switch kernel/sched/core.c:2880 [inline]
 __schedule+0x81b/0x1cc0 kernel/sched/core.c:3518
 preempt_schedule_irq+0xb5/0x140 kernel/sched/core.c:3745
 retint_kernel+0x1b/0x2d
 arch_local_irq_restore arch/x86/include/asm/paravirt.h:767 [inline]
 kmem_cache_free+0xab/0x260 mm/slab.c:3766
 anon_vma_chain_free mm/rmap.c:134 [inline]
 unlink_anon_vmas+0x2ba/0x870 mm/rmap.c:401
 free_pgtables+0x1af/0x2f0 mm/memory.c:394
 exit_mmap+0x2d1/0x530 mm/mmap.c:3144
 __mmput kernel/fork.c:1046 [inline]
 mmput+0x15f/0x4c0 kernel/fork.c:1067
 exec_mmap fs/exec.c:1046 [inline]
 flush_old_exec+0x8d9/0x1c20 fs/exec.c:1279
 load_elf_binary+0x9bc/0x53f0 fs/binfmt_elf.c:864
 search_binary_handler fs/exec.c:1656 [inline]
 search_binary_handler+0x17f/0x570 fs/exec.c:1634
 exec_binprm fs/exec.c:1698 [inline]
 __do_execve_file.isra.0+0x1394/0x23f0 fs/exec.c:1818
 do_execveat_common fs/exec.c:1865 [inline]
 do_execve fs/exec.c:1882 [inline]
 __do_sys_execve fs/exec.c:1958 [inline]
 __se_sys_execve fs/exec.c:1953 [inline]
 __x64_sys_execve+0x8f/0xc0 fs/exec.c:1953
 do_syscall_64+0x103/0x610 arch/x86/entry/common.c:290
 entry_SYSCALL_64_after_hwframe+0x49/0xbe

The buggy address belongs to the object at ffff88808893f7c0
 which belongs to the cache mm_struct of size 1496
The buggy address is located 600 bytes to the right of
 1496-byte region [ffff88808893f7c0, ffff88808893fd98)
The buggy address belongs to the page:
page:ffffea0002224f80 count:1 mapcount:0 mapping:ffff88821bc40ac0 index:0xffff88808893f7c0 compound_mapcount: 0
flags: 0x1fffc0000010200(slab|head)
raw: 01fffc0000010200 ffffea00025b4f08 ffffea00027b9d08 ffff88821bc40ac0
raw: ffff88808893f7c0 ffff88808893e440 0000000100000001 0000000000000000
page dumped because: kasan: bad access detected

Memory state around the buggy address:
 ffff88808893fe80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
 ffff88808893ff00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
>ffff88808893ff80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
                                                             ^
 ffff888088940000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
 ffff888088940080: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00

Fixes: e20cf8d3f1 ("udp: implement GRO for plain UDP sockets.")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Paolo Abeni <pabeni@redhat.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-01 22:29:56 -04:00
Linus Torvalds
600d725831 Merge tag 'for-v5.1-rc' of git://git.kernel.org/pub/scm/linux/kernel/git/sre/linux-power-supply
Pull power supply fixes from Sebastian Reichel:
 "Two more fixes for the 5.1 cycle.

  One division by zero fix in a specific driver and one core workaround
  for bad userspace behaviour from systemd regarding uevents. IMHO this
  can be considered to be a userspace bug, but the debug messages are
  useless anyways

   - cpcap-battery: fix a division by zero

   - core: fix systemd issue due to log messages produced by uevent"

* tag 'for-v5.1-rc' of git://git.kernel.org/pub/scm/linux/kernel/git/sre/linux-power-supply:
  power: supply: sysfs: prevent endless uevent loop with CONFIG_POWER_SUPPLY_DEBUG
  power: supply: cpcap-battery: Fix division by zero
2019-05-01 14:57:23 -07:00
Martin KaFai Lau
886b7a5010 ipv6: A few fixes on dereferencing rt->from
It is a followup after the fix in
commit 9c69a13205 ("route: Avoid crash from dereferencing NULL rt->from")

rt6_do_redirect():
1. NULL checking is needed on rt->from because a parallel
   fib6_info delete could happen that sets rt->from to NULL.
   (e.g. rt6_remove_exception() and fib6_drop_pcpu_from()).

2. fib6_info_hold() is not enough.  Same reason as (1).
   Meaning, holding dst->__refcnt cannot ensure
   rt->from is not NULL or rt->from->fib6_ref is not 0.

   Instead of using fib6_info_hold_safe() which ip6_rt_cache_alloc()
   is already doing, this patch chooses to extend the rcu section
   to keep "from" dereference-able after checking for NULL.

inet6_rtm_getroute():
1. NULL checking is also needed on rt->from for a similar reason.
   Note that inet6_rtm_getroute() is using RTNL_FLAG_DOIT_UNLOCKED.

Fixes: a68886a691 ("net/ipv6: Make from in rt6_info rcu protected")
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Acked-by: Wei Wang <weiwan@google.com>
Reviewed-by: David Ahern <dsahern@gmail.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-01 17:17:54 -04:00
Nicholas Mc Guire
f3505745c0 rds: ib: force endiannes annotation
While the endiannes is being handled correctly as indicated by the comment
above the offending line - sparse was unhappy with the missing annotation
as be64_to_cpu() expects a __be64 argument. To mitigate this annotation
all involved variables are changed to a consistent __le64 and the
 conversion to uint64_t delayed to the call to rds_cong_map_updated().

Signed-off-by: Nicholas Mc Guire <hofrat@osadl.org>
Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-01 17:15:36 -04:00
Linus Torvalds
65beea4c3a Merge tag 'arc-5.1-final' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc
Pull ARC fixes from Vineet Gupta:
 "A few minor fixes for ARC.

   - regression in memset if line size !64

   - avoid panic if PAE and IOC"

* tag 'arc-5.1-final' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc:
  ARC: memset: fix build with L1_CACHE_SHIFT != 6
  ARC: [hsdk] Make it easier to add PAE40 region to DTB
  ARC: PAE40: don't panic and instead turn off hw ioc
2019-05-01 13:40:30 -07:00
Alex Williamson
15d2aba7c6 PCI/portdrv: Use shared MSI/MSI-X vector for Bandwidth Management
The Interrupt Message Number in the PCIe Capabilities register (PCIe r4.0,
sec 7.5.3.2) indicates which MSI/MSI-X vector is shared by interrupts
related to the PCIe Capability, including Link Bandwidth Management and
Link Autonomous Bandwidth Interrupts (Link Control, 7.5.3.7), Command
Completed and Hot-Plug Interrupts (Slot Control, 7.5.3.10), and the PME
Interrupt (Root Control, 7.5.3.12).

pcie_message_numbers() checked whether we want to enable PME or Hot-Plug
interrupts but neglected to check for Link Bandwidth Management, so if we
only wanted the Bandwidth Management interrupts, it decided we didn't need
any vectors at all.  Then pcie_port_enable_irq_vec() tried to reallocate
zero vectors, which failed, resulting in fallback to INTx.

On some systems, e.g., an X79-based workstation, that INTx seems broken or
not handled correctly, so we got spurious IRQ16 interrupts for Bandwidth
Management events.

Change pcie_message_numbers() so that if we want Link Bandwidth Management
interrupts, we use the shared MSI/MSI-X vector from the PCIe Capabilities
register.

Fixes: e8303bb7a7 ("PCI/LINK: Report degraded links via link bandwidth notification")
Link: https://lore.kernel.org/lkml/155597243666.19387.1205950870601742062.stgit@gimli.home
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
[bhelgaas: changelog]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2019-05-01 15:34:01 -05:00
Linus Torvalds
fb0af61d3a Merge tag 'acpi-5.1-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull ACPI fix from Rafael Wysocki:
 "Revert a recent ACPICA change that caused initialization to fail on
  systems with Thunderbolt docking stations connected at the init time"

* tag 'acpi-5.1-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  Revert "ACPICA: Clear status of GPEs before enabling them"
2019-05-01 13:03:39 -07:00
Linus Torvalds
7e74e235bb gcc-9: don't warn about uninitialized btrfs extent_type variable
The 'extent_type' variable does seem to be reliably initialized, but
it's _very_ non-obvious, since there's a "goto next" case that jumps
over the normal initialization.  That will then always trigger the
"start >= extent_end" test, which will end up never falling through to
the use of that variable.

But the code is certainly not obvious, and the compiler warning looks
reasonable.  Make 'extent_type' an int, and initialize it to an invalid
negative value, which seems to be the common pattern in other places.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-05-01 12:19:20 -07:00
Hangbin Liu
f68d7c44e7 selftests: fib_rule_tests: print the result and return 1 if any tests failed
Fixes: 65b2b4939a ("selftests: net: initial fib rule tests")
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Reviewed-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-01 14:30:23 -04:00
Linus Torvalds
459e3a2153 gcc-9: properly declare the {pv,hv}clock_page storage
The pvlock_page and hvclock_page variables are (as the name implies)
addresses to pages, created by the linker script.

But we declared them as just "extern u8" variables, which _works_, but
now that gcc does some more bounds checking, it causes warnings like

    warning: array subscript 1 is outside array bounds of ‘u8[1]’

when we then access more than one byte from those variables.

Fix this by simply making the declaration of the variables match
reality, which makes the compiler happy too.

Signed-off-by: Linus Torvalds <torvalds@-linux-foundation.org>
2019-05-01 11:20:53 -07:00
Linus Torvalds
cf67690884 gcc-9: don't warn about uninitialized variable
I'm not sure what made gcc warn about this code now.  The 'ret' variable
does end up initialized in all cases, but it's definitely not obvious,
so the compiler is quite reasonable to warn about this.

So just add initialization to make it all much more obvious both to
compilers and to humans.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-05-01 11:07:40 -07:00
Linus Torvalds
6f303d6053 gcc-9: silence 'address-of-packed-member' warning
We already did this for clang, but now gcc has that warning too.  Yes,
yes, the address may be unaligned.  And that's kind of the point.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-05-01 11:05:41 -07:00
Shmulik Ladkani
d2f0c96114 ipv4: ip_do_fragment: Preserve skb_iif during fragmentation
Previously, during fragmentation after forwarding, skb->skb_iif isn't
preserved, i.e. 'ip_copy_metadata' does not copy skb_iif from given
'from' skb.

As a result, ip_do_fragment's creates fragments with zero skb_iif,
leading to inconsistent behavior.

Assume for example an eBPF program attached at tc egress (post
forwarding) that examines __sk_buff->ingress_ifindex:
 - the correct iif is observed if forwarding path does not involve
   fragmentation/refragmentation
 - a bogus iif is observed if forwarding path involves
   fragmentation/refragmentatiom

Fix, by preserving skb_iif during 'ip_copy_metadata'.

Signed-off-by: Shmulik Ladkani <shmulik.ladkani@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-01 13:28:34 -04:00
Mark Rutland
d4ef647510 io_uring: avoid page allocation warnings
In io_sqe_buffer_register() we allocate a number of arrays based on the
iov_len from the user-provided iov. While we limit iov_len to SZ_1G,
we can still attempt to allocate arrays exceeding MAX_ORDER.

On a 64-bit system with 4KiB pages, for an iov where iov_base = 0x10 and
iov_len = SZ_1G, we'll calculate that nr_pages = 262145. When we try to
allocate a corresponding array of (16-byte) bio_vecs, requiring 4194320
bytes, which is greater than 4MiB. This results in SLUB warning that
we're trying to allocate greater than MAX_ORDER, and failing the
allocation.

Avoid this by using kvmalloc() for allocations dependent on the
user-provided iov_len. At the same time, fix a leak of imu->bvec when
registration fails.

Full splat from before this patch:

WARNING: CPU: 1 PID: 2314 at mm/page_alloc.c:4595 __alloc_pages_nodemask+0x7ac/0x2938 mm/page_alloc.c:4595
Kernel panic - not syncing: panic_on_warn set ...
CPU: 1 PID: 2314 Comm: syz-executor326 Not tainted 5.1.0-rc7-dirty #4
Hardware name: linux,dummy-virt (DT)
Call trace:
 dump_backtrace+0x0/0x2f0 include/linux/compiler.h:193
 show_stack+0x20/0x30 arch/arm64/kernel/traps.c:158
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0x110/0x190 lib/dump_stack.c:113
 panic+0x384/0x68c kernel/panic.c:214
 __warn+0x2bc/0x2c0 kernel/panic.c:571
 report_bug+0x228/0x2d8 lib/bug.c:186
 bug_handler+0xa0/0x1a0 arch/arm64/kernel/traps.c:956
 call_break_hook arch/arm64/kernel/debug-monitors.c:301 [inline]
 brk_handler+0x1d4/0x388 arch/arm64/kernel/debug-monitors.c:316
 do_debug_exception+0x1a0/0x468 arch/arm64/mm/fault.c:831
 el1_dbg+0x18/0x8c
 __alloc_pages_nodemask+0x7ac/0x2938 mm/page_alloc.c:4595
 alloc_pages_current+0x164/0x278 mm/mempolicy.c:2132
 alloc_pages include/linux/gfp.h:509 [inline]
 kmalloc_order+0x20/0x50 mm/slab_common.c:1231
 kmalloc_order_trace+0x30/0x2b0 mm/slab_common.c:1243
 kmalloc_large include/linux/slab.h:480 [inline]
 __kmalloc+0x3dc/0x4f0 mm/slub.c:3791
 kmalloc_array include/linux/slab.h:670 [inline]
 io_sqe_buffer_register fs/io_uring.c:2472 [inline]
 __io_uring_register fs/io_uring.c:2962 [inline]
 __do_sys_io_uring_register fs/io_uring.c:3008 [inline]
 __se_sys_io_uring_register fs/io_uring.c:2990 [inline]
 __arm64_sys_io_uring_register+0x9e0/0x1bc8 fs/io_uring.c:2990
 __invoke_syscall arch/arm64/kernel/syscall.c:35 [inline]
 invoke_syscall arch/arm64/kernel/syscall.c:47 [inline]
 el0_svc_common.constprop.0+0x148/0x2e0 arch/arm64/kernel/syscall.c:83
 el0_svc_handler+0xdc/0x100 arch/arm64/kernel/syscall.c:129
 el0_svc+0x8/0xc arch/arm64/kernel/entry.S:948
SMP: stopping secondary CPUs
Dumping ftrace buffer:
   (ftrace buffer empty)
Kernel Offset: disabled
CPU features: 0x002,23000438
Memory Limit: none
Rebooting in 1 seconds..

Fixes: edafccee56 ("io_uring: add support for pre-mapped user IO buffers")
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: linux-fsdevel@vger.kernel.org
Cc: linux-block@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-05-01 10:00:25 -06:00
Jakub Kicinski
2dcb003314 net/tls: avoid NULL pointer deref on nskb->sk in fallback
update_chksum() accesses nskb->sk before it has been set
by complete_skb(), move the init up.

Fixes: e8f6979981 ("net/tls: Add generic NIC offload infrastructure")
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-01 11:37:56 -04:00
David Ahern
15d55bae4e selftests: fib_rule_tests: Fix icmp proto with ipv6
A recent commit returns an error if icmp is used as the ip-proto for
IPv6 fib rules. Update fib_rule_tests to send ipv6-icmp instead of icmp.

Fixes: 5e1a99eae8 ("ipv4: Add ICMPv6 support when parse route ipproto")
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-01 11:30:17 -04:00
Willem de Bruijn
486efdc8f6 packet: validate msg_namelen in send directly
Packet sockets in datagram mode take a destination address. Verify its
length before passing to dev_hard_header.

Prior to 2.6.14-rc3, the send code ignored sll_halen. This is
established behavior. Directly compare msg_namelen to dev->addr_len.

Change v1->v2: initialize addr in all paths

Fixes: 6b8d95f179 ("packet: validate address length if non-zero")
Suggested-by: David Laight <David.Laight@aculab.com>
Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-01 11:28:35 -04:00
Willem de Bruijn
b2cf86e156 packet: in recvmsg msg_name return at least sizeof sockaddr_ll
Packet send checks that msg_name is at least sizeof sockaddr_ll.
Packet recv must return at least this length, so that its output
can be passed unmodified to packet send.

This ceased to be true since adding support for lladdr longer than
sll_addr. Since, the return value uses true address length.

Always return at least sizeof sockaddr_ll, even if address length
is shorter. Zero the padding bytes.

Change v1->v2: do not overwrite zeroed padding again. use copy_len.

Fixes: 0fb375fb9b ("[AF_PACKET]: Allow for > 8 byte hardware addresses.")
Suggested-by: David Laight <David.Laight@aculab.com>
Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-01 11:27:55 -04:00
Ming Lei
f5eb4d3b92 iov_iter: fix iov_iter_type
Commit 875f1d0769 ("iov_iter: add ITER_BVEC_FLAG_NO_REF flag")
introduces one extra flag of ITER_BVEC_FLAG_NO_REF, and this flag
is stored into iter->type.

However, iov_iter_type() doesn't consider the new added flag, fix
it by masking this flag in iov_iter_type().

Fixes: 875f1d0769 ("iov_iter: add ITER_BVEC_FLAG_NO_REF flag")
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-05-01 08:38:47 -06:00
Ming Lei
60a27b906d block: fix handling for BIO_NO_PAGE_REF
Commit 399254aaf4 ("block: add BIO_NO_PAGE_REF flag") introduces
BIO_NO_PAGE_REF, and once this flag is set for one bio, all pages
in the bio won't be get/put during IO.

However, if one bio is submitted via __blkdev_direct_IO_simple(),
even though BIO_NO_PAGE_REF is set, pages still may be put.

Fixes this issue by avoiding to put pages if BIO_NO_PAGE_REF is
set.

Fixes: 399254aaf4 ("block: add BIO_NO_PAGE_REF flag")
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-05-01 08:38:47 -06:00
Jens Axboe
817869d251 io_uring: drop req submit reference always in async punt
If we don't end up actually calling submit in io_sq_wq_submit_work(),
we still need to drop the submit reference to the request. If we
don't, then we can leak the request. This can happen if we race
with ring shutdown while flushing the workqueue for requests that
require use of the mm_struct.

Fixes: e65ef56db4 ("io_uring: use regular request ref counts")
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-05-01 08:38:47 -06:00
Mark Rutland
52e04ef4c9 io_uring: free allocated io_memory once
If io_allocate_scq_urings() fails to allocate an sq_* region, it will
call io_mem_free() for any previously allocated regions, but leave
dangling pointers to these regions in the ctx. Any regions which have
not yet been allocated are left NULL. Note that when returning
-EOVERFLOW, the previously allocated sq_ring is not freed, which appears
to be an unintentional leak.

When io_allocate_scq_urings() fails, io_uring_create() will call
io_ring_ctx_wait_and_kill(), which calls io_mem_free() on all the sq_*
regions, assuming the pointers are valid and not NULL.

This can result in pages being freed multiple times, which has been
observed to corrupt the page state, leading to subsequent fun. This can
also result in virt_to_page() on NULL, resulting in the use of bogus
page addresses, and yet more subsequent fun. The latter can be detected
with CONFIG_DEBUG_VIRTUAL on arm64.

Adding a cleanup path to io_allocate_scq_urings() complicates the logic,
so let's leave it to io_ring_ctx_free() to consistently free these
pointers, and simplify the io_allocate_scq_urings() error paths.

Full splats from before this patch below. Note that the pointer logged
by the DEBUG_VIRTUAL "non-linear address" warning has been hashed, and
is actually NULL.

[   26.098129] page:ffff80000e949a00 count:0 mapcount:-128 mapping:0000000000000000 index:0x0
[   26.102976] flags: 0x63fffc000000()
[   26.104373] raw: 000063fffc000000 ffff80000e86c188 ffff80000ea3df08 0000000000000000
[   26.108917] raw: 0000000000000000 0000000000000001 00000000ffffff7f 0000000000000000
[   26.137235] page dumped because: VM_BUG_ON_PAGE(page_ref_count(page) == 0)
[   26.143960] ------------[ cut here ]------------
[   26.146020] kernel BUG at include/linux/mm.h:547!
[   26.147586] Internal error: Oops - BUG: 0 [#1] PREEMPT SMP
[   26.149163] Modules linked in:
[   26.150287] Process syz-executor.21 (pid: 20204, stack limit = 0x000000000e9cefeb)
[   26.153307] CPU: 2 PID: 20204 Comm: syz-executor.21 Not tainted 5.1.0-rc7-00004-g7d30b2ea43d6 #18
[   26.156566] Hardware name: linux,dummy-virt (DT)
[   26.158089] pstate: 40400005 (nZcv daif +PAN -UAO)
[   26.159869] pc : io_mem_free+0x9c/0xa8
[   26.161436] lr : io_mem_free+0x9c/0xa8
[   26.162720] sp : ffff000013003d60
[   26.164048] x29: ffff000013003d60 x28: ffff800025048040
[   26.165804] x27: 0000000000000000 x26: ffff800025048040
[   26.167352] x25: 00000000000000c0 x24: ffff0000112c2820
[   26.169682] x23: 0000000000000000 x22: 0000000020000080
[   26.171899] x21: ffff80002143b418 x20: ffff80002143b400
[   26.174236] x19: ffff80002143b280 x18: 0000000000000000
[   26.176607] x17: 0000000000000000 x16: 0000000000000000
[   26.178997] x15: 0000000000000000 x14: 0000000000000000
[   26.181508] x13: 00009178a5e077b2 x12: 0000000000000001
[   26.183863] x11: 0000000000000000 x10: 0000000000000980
[   26.186437] x9 : ffff000013003a80 x8 : ffff800025048a20
[   26.189006] x7 : ffff8000250481c0 x6 : ffff80002ffe9118
[   26.191359] x5 : ffff80002ffe9118 x4 : 0000000000000000
[   26.193863] x3 : ffff80002ffefe98 x2 : 44c06ddd107d1f00
[   26.196642] x1 : 0000000000000000 x0 : 000000000000003e
[   26.198892] Call trace:
[   26.199893]  io_mem_free+0x9c/0xa8
[   26.201155]  io_ring_ctx_wait_and_kill+0xec/0x180
[   26.202688]  io_uring_setup+0x6c4/0x6f0
[   26.204091]  __arm64_sys_io_uring_setup+0x18/0x20
[   26.205576]  el0_svc_common.constprop.0+0x7c/0xe8
[   26.207186]  el0_svc_handler+0x28/0x78
[   26.208389]  el0_svc+0x8/0xc
[   26.209408] Code: aa0203e0 d0006861 9133a021 97fcdc3c (d4210000)
[   26.211995] ---[ end trace bdb81cd43a21e50d ]---

[   81.770626] ------------[ cut here ]------------
[   81.825015] virt_to_phys used for non-linear address: 000000000d42f2c7 (          (null))
[   81.827860] WARNING: CPU: 1 PID: 30171 at arch/arm64/mm/physaddr.c:15 __virt_to_phys+0x48/0x68
[   81.831202] Modules linked in:
[   81.832212] CPU: 1 PID: 30171 Comm: syz-executor.20 Not tainted 5.1.0-rc7-00004-g7d30b2ea43d6 #19
[   81.835616] Hardware name: linux,dummy-virt (DT)
[   81.836863] pstate: 60400005 (nZCv daif +PAN -UAO)
[   81.838727] pc : __virt_to_phys+0x48/0x68
[   81.840572] lr : __virt_to_phys+0x48/0x68
[   81.842264] sp : ffff80002cf67c70
[   81.843858] x29: ffff80002cf67c70 x28: ffff800014358e18
[   81.846463] x27: 0000000000000000 x26: 0000000020000080
[   81.849148] x25: 0000000000000000 x24: ffff80001bb01f40
[   81.851986] x23: ffff200011db06c8 x22: ffff2000127e3c60
[   81.854351] x21: ffff800014358cc0 x20: ffff800014358d98
[   81.856711] x19: 0000000000000000 x18: 0000000000000000
[   81.859132] x17: 0000000000000000 x16: 0000000000000000
[   81.861586] x15: 0000000000000000 x14: 0000000000000000
[   81.863905] x13: 0000000000000000 x12: ffff1000037603e9
[   81.866226] x11: 1ffff000037603e8 x10: 0000000000000980
[   81.868776] x9 : ffff80002cf67840 x8 : ffff80001bb02920
[   81.873272] x7 : ffff1000037603e9 x6 : ffff80001bb01f47
[   81.875266] x5 : ffff1000037603e9 x4 : dfff200000000000
[   81.876875] x3 : ffff200010087528 x2 : ffff1000059ecf58
[   81.878751] x1 : 44c06ddd107d1f00 x0 : 0000000000000000
[   81.880453] Call trace:
[   81.881164]  __virt_to_phys+0x48/0x68
[   81.882919]  io_mem_free+0x18/0x110
[   81.886585]  io_ring_ctx_wait_and_kill+0x13c/0x1f0
[   81.891212]  io_uring_setup+0xa60/0xad0
[   81.892881]  __arm64_sys_io_uring_setup+0x2c/0x38
[   81.894398]  el0_svc_common.constprop.0+0xac/0x150
[   81.896306]  el0_svc_handler+0x34/0x88
[   81.897744]  el0_svc+0x8/0xc
[   81.898715] ---[ end trace b4a703802243cbba ]---

Fixes: 2b188cc1bb ("Add io_uring IO interface")
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: linux-block@vger.kernel.org
Cc: linux-fsdevel@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-05-01 08:38:47 -06:00
Mark Rutland
975554b03e io_uring: fix SQPOLL cpu validation
In io_sq_offload_start(), we call cpu_possible() on an unbounded cpu
value from userspace. On v5.1-rc7 on arm64 with
CONFIG_DEBUG_PER_CPU_MAPS, this results in a splat:

  WARNING: CPU: 1 PID: 27601 at include/linux/cpumask.h:121 cpu_max_bits_warn include/linux/cpumask.h:121 [inline]

There was an attempt to fix this in commit:

  917257daa0 ("io_uring: only test SQPOLL cpu after we've verified it")

... by adding a check after the cpu value had been limited to NR_CPU_IDS
using array_index_nospec(). However, this left an unbound check at the
start of the function, for which the warning still fires.

Let's fix this correctly by checking that the cpu value is bound by
nr_cpu_ids before passing it to cpu_possible(). Note that only
nr_cpu_ids of a cpumask are guaranteed to exist at runtime, and
nr_cpu_ids can be significantly smaller than NR_CPUs. For example, an
arm64 defconfig has NR_CPUS=256, while my test VM has 4 vCPUs.

Following the intent from the commit message for 917257daa0, the
check is moved under the SQ_AFF branch, which is the only branch where
the cpu values is consumed. The check is performed before bounding the
value with array_index_nospec() so that we don't silently accept bogus
cpu values from userspace, where array_index_nospec() would force these
values to 0.

I suspect we can remove the array_index_nospec() call entirely, but I've
conservatively left that in place, updated to use nr_cpu_ids to match
the prior check.

Tested on arm64 with the Syzkaller reproducer:

  https://syzkaller.appspot.com/bug?extid=cd714a07c6de2bc34293
  https://syzkaller.appspot.com/x/repro.syz?x=15d8b397200000

Full splat from before this patch:

WARNING: CPU: 1 PID: 27601 at include/linux/cpumask.h:121 cpu_max_bits_warn include/linux/cpumask.h:121 [inline]
WARNING: CPU: 1 PID: 27601 at include/linux/cpumask.h:121 cpumask_check include/linux/cpumask.h:128 [inline]
WARNING: CPU: 1 PID: 27601 at include/linux/cpumask.h:121 cpumask_test_cpu include/linux/cpumask.h:344 [inline]
WARNING: CPU: 1 PID: 27601 at include/linux/cpumask.h:121 io_sq_offload_start fs/io_uring.c:2244 [inline]
WARNING: CPU: 1 PID: 27601 at include/linux/cpumask.h:121 io_uring_create fs/io_uring.c:2864 [inline]
WARNING: CPU: 1 PID: 27601 at include/linux/cpumask.h:121 io_uring_setup+0x1108/0x15a0 fs/io_uring.c:2916
Kernel panic - not syncing: panic_on_warn set ...
CPU: 1 PID: 27601 Comm: syz-executor.0 Not tainted 5.1.0-rc7 #3
Hardware name: linux,dummy-virt (DT)
Call trace:
 dump_backtrace+0x0/0x2f0 include/linux/compiler.h:193
 show_stack+0x20/0x30 arch/arm64/kernel/traps.c:158
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0x110/0x190 lib/dump_stack.c:113
 panic+0x384/0x68c kernel/panic.c:214
 __warn+0x2bc/0x2c0 kernel/panic.c:571
 report_bug+0x228/0x2d8 lib/bug.c:186
 bug_handler+0xa0/0x1a0 arch/arm64/kernel/traps.c:956
 call_break_hook arch/arm64/kernel/debug-monitors.c:301 [inline]
 brk_handler+0x1d4/0x388 arch/arm64/kernel/debug-monitors.c:316
 do_debug_exception+0x1a0/0x468 arch/arm64/mm/fault.c:831
 el1_dbg+0x18/0x8c
 cpu_max_bits_warn include/linux/cpumask.h:121 [inline]
 cpumask_check include/linux/cpumask.h:128 [inline]
 cpumask_test_cpu include/linux/cpumask.h:344 [inline]
 io_sq_offload_start fs/io_uring.c:2244 [inline]
 io_uring_create fs/io_uring.c:2864 [inline]
 io_uring_setup+0x1108/0x15a0 fs/io_uring.c:2916
 __do_sys_io_uring_setup fs/io_uring.c:2929 [inline]
 __se_sys_io_uring_setup fs/io_uring.c:2926 [inline]
 __arm64_sys_io_uring_setup+0x50/0x70 fs/io_uring.c:2926
 __invoke_syscall arch/arm64/kernel/syscall.c:35 [inline]
 invoke_syscall arch/arm64/kernel/syscall.c:47 [inline]
 el0_svc_common.constprop.0+0x148/0x2e0 arch/arm64/kernel/syscall.c:83
 el0_svc_handler+0xdc/0x100 arch/arm64/kernel/syscall.c:129
 el0_svc+0x8/0xc arch/arm64/kernel/entry.S:948
SMP: stopping secondary CPUs
Dumping ftrace buffer:
   (ftrace buffer empty)
Kernel Offset: disabled
CPU features: 0x002,23000438
Memory Limit: none
Rebooting in 1 seconds..

Fixes: 917257daa0 ("io_uring: only test SQPOLL cpu after we've verified it")
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: linux-block@vger.kernel.org
Cc: linux-fsdevel@vger.kernel.org
Cc: linux-kernel@vger.kernel.org

Simplied the logic

Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-05-01 08:38:37 -06:00
Xin Long
fbd019737d sctp: avoid running the sctp state machine recursively
Ying triggered a call trace when doing an asconf testing:

  BUG: scheduling while atomic: swapper/12/0/0x10000100
  Call Trace:
   <IRQ>  [<ffffffffa4375904>] dump_stack+0x19/0x1b
   [<ffffffffa436fcaf>] __schedule_bug+0x64/0x72
   [<ffffffffa437b93a>] __schedule+0x9ba/0xa00
   [<ffffffffa3cd5326>] __cond_resched+0x26/0x30
   [<ffffffffa437bc4a>] _cond_resched+0x3a/0x50
   [<ffffffffa3e22be8>] kmem_cache_alloc_node+0x38/0x200
   [<ffffffffa423512d>] __alloc_skb+0x5d/0x2d0
   [<ffffffffc0995320>] sctp_packet_transmit+0x610/0xa20 [sctp]
   [<ffffffffc098510e>] sctp_outq_flush+0x2ce/0xc00 [sctp]
   [<ffffffffc098646c>] sctp_outq_uncork+0x1c/0x20 [sctp]
   [<ffffffffc0977338>] sctp_cmd_interpreter.isra.22+0xc8/0x1460 [sctp]
   [<ffffffffc0976ad1>] sctp_do_sm+0xe1/0x350 [sctp]
   [<ffffffffc099443d>] sctp_primitive_ASCONF+0x3d/0x50 [sctp]
   [<ffffffffc0977384>] sctp_cmd_interpreter.isra.22+0x114/0x1460 [sctp]
   [<ffffffffc0976ad1>] sctp_do_sm+0xe1/0x350 [sctp]
   [<ffffffffc097b3a4>] sctp_assoc_bh_rcv+0xf4/0x1b0 [sctp]
   [<ffffffffc09840f1>] sctp_inq_push+0x51/0x70 [sctp]
   [<ffffffffc099732b>] sctp_rcv+0xa8b/0xbd0 [sctp]

As it shows, the first sctp_do_sm() running under atomic context (NET_RX
softirq) invoked sctp_primitive_ASCONF() that uses GFP_KERNEL flag later,
and this flag is supposed to be used in non-atomic context only. Besides,
sctp_do_sm() was called recursively, which is not expected.

Vlad tried to fix this recursive call in Commit c078669340 ("sctp: Fix
oops when sending queued ASCONF chunks") by introducing a new command
SCTP_CMD_SEND_NEXT_ASCONF. But it didn't work as this command is still
used in the first sctp_do_sm() call, and sctp_primitive_ASCONF() will
be called in this command again.

To avoid calling sctp_do_sm() recursively, we send the next queued ASCONF
not by sctp_primitive_ASCONF(), but by sctp_sf_do_prm_asconf() in the 1st
sctp_do_sm() directly.

Reported-by: Ying Xu <yinxu@redhat.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-01 09:18:57 -04:00
Jan Kiszka
37e9c087c8 stmmac: pci: Fix typo in IOT2000 comment
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-01 09:15:52 -04:00
Randy Dunlap
799381e49b Documentation: fix netdev-FAQ.rst markup warning
Fix ReST underline warning:

./Documentation/networking/netdev-FAQ.rst:135: WARNING: Title underline too short.

Q: I made changes to only a few patches in a patch series should I resend only those changed?
--------------------------------------------------------------------------------------------

Fixes: ffa9125373 ("Documentation: networking: Update netdev-FAQ regarding patches")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-01 09:12:51 -04:00
Jens Axboe
5c8b0b54db io_uring: have submission side sqe errors post a cqe
Currently we only post a cqe if we get an error OUTSIDE of submission.
For submission, we return the error directly through io_uring_enter().
This is a bit awkward for applications, and it makes more sense to
always post a cqe with an error, if the error happens on behalf of an
sqe.

This changes submission behavior a bit. io_uring_enter() returns -ERROR
for an error, and > 0 for number of sqes submitted. Before this change,
if you wanted to submit 8 entries and had an error on the 5th entry,
io_uring_enter() would return 4 (for number of entries successfully
submitted) and rewind the sqring. The application would then have to
peek at the sqring and figure out what was wrong with the head sqe, and
then skip it itself. With this change, we'll return 5 since we did
consume 5 sqes, and the last sqe (with the error) will result in a cqe
being posted with the error.

This makes the logic easier to handle in the application, and it cleans
up the submission part.

Suggested-by: Stefan Bühler <source@stbuehler.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-05-01 06:37:55 -06:00
Eric Dumazet
0e23387491 ipv6: fix races in ip6_dst_destroy()
We had many syzbot reports that seem to be caused by use-after-free
of struct fib6_info.

ip6_dst_destroy(), fib6_drop_pcpu_from() and rt6_remove_exception()
are writers vs rt->from, and use non consistent synchronization among
themselves.

Switching to xchg() will solve the issues with no possible
lockdep issues.

BUG: KASAN: user-memory-access in atomic_dec_and_test include/asm-generic/atomic-instrumented.h:747 [inline]
BUG: KASAN: user-memory-access in fib6_info_release include/net/ip6_fib.h:294 [inline]
BUG: KASAN: user-memory-access in fib6_info_release include/net/ip6_fib.h:292 [inline]
BUG: KASAN: user-memory-access in fib6_drop_pcpu_from net/ipv6/ip6_fib.c:927 [inline]
BUG: KASAN: user-memory-access in fib6_purge_rt+0x4f6/0x670 net/ipv6/ip6_fib.c:960
Write of size 4 at addr 0000000000ffffb4 by task syz-executor.1/7649

CPU: 0 PID: 7649 Comm: syz-executor.1 Not tainted 5.1.0-rc6+ #183
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0x172/0x1f0 lib/dump_stack.c:113
 kasan_report.cold+0x5/0x40 mm/kasan/report.c:321
 check_memory_region_inline mm/kasan/generic.c:185 [inline]
 check_memory_region+0x123/0x190 mm/kasan/generic.c:191
 kasan_check_write+0x14/0x20 mm/kasan/common.c:108
 atomic_dec_and_test include/asm-generic/atomic-instrumented.h:747 [inline]
 fib6_info_release include/net/ip6_fib.h:294 [inline]
 fib6_info_release include/net/ip6_fib.h:292 [inline]
 fib6_drop_pcpu_from net/ipv6/ip6_fib.c:927 [inline]
 fib6_purge_rt+0x4f6/0x670 net/ipv6/ip6_fib.c:960
 fib6_del_route net/ipv6/ip6_fib.c:1813 [inline]
 fib6_del+0xac2/0x10a0 net/ipv6/ip6_fib.c:1844
 fib6_clean_node+0x3a8/0x590 net/ipv6/ip6_fib.c:2006
 fib6_walk_continue+0x495/0x900 net/ipv6/ip6_fib.c:1928
 fib6_walk+0x9d/0x100 net/ipv6/ip6_fib.c:1976
 fib6_clean_tree+0xe0/0x120 net/ipv6/ip6_fib.c:2055
 __fib6_clean_all+0x118/0x2a0 net/ipv6/ip6_fib.c:2071
 fib6_clean_all+0x2b/0x40 net/ipv6/ip6_fib.c:2082
 rt6_sync_down_dev+0x134/0x150 net/ipv6/route.c:4057
 rt6_disable_ip+0x27/0x5f0 net/ipv6/route.c:4062
 addrconf_ifdown+0xa2/0x1220 net/ipv6/addrconf.c:3705
 addrconf_notify+0x19a/0x2260 net/ipv6/addrconf.c:3630
 notifier_call_chain+0xc7/0x240 kernel/notifier.c:93
 __raw_notifier_call_chain kernel/notifier.c:394 [inline]
 raw_notifier_call_chain+0x2e/0x40 kernel/notifier.c:401
 call_netdevice_notifiers_info+0x3f/0x90 net/core/dev.c:1753
 call_netdevice_notifiers_extack net/core/dev.c:1765 [inline]
 call_netdevice_notifiers net/core/dev.c:1779 [inline]
 dev_close_many+0x33f/0x6f0 net/core/dev.c:1522
 rollback_registered_many+0x43b/0xfd0 net/core/dev.c:8177
 rollback_registered+0x109/0x1d0 net/core/dev.c:8242
 unregister_netdevice_queue net/core/dev.c:9289 [inline]
 unregister_netdevice_queue+0x1ee/0x2c0 net/core/dev.c:9282
 unregister_netdevice include/linux/netdevice.h:2658 [inline]
 __tun_detach+0xd5b/0x1000 drivers/net/tun.c:727
 tun_detach drivers/net/tun.c:744 [inline]
 tun_chr_close+0xe0/0x180 drivers/net/tun.c:3443
 __fput+0x2e5/0x8d0 fs/file_table.c:278
 ____fput+0x16/0x20 fs/file_table.c:309
 task_work_run+0x14a/0x1c0 kernel/task_work.c:113
 exit_task_work include/linux/task_work.h:22 [inline]
 do_exit+0x90a/0x2fa0 kernel/exit.c:876
 do_group_exit+0x135/0x370 kernel/exit.c:980
 __do_sys_exit_group kernel/exit.c:991 [inline]
 __se_sys_exit_group kernel/exit.c:989 [inline]
 __x64_sys_exit_group+0x44/0x50 kernel/exit.c:989
 do_syscall_64+0x103/0x610 arch/x86/entry/common.c:290
 entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x458da9
Code: ad b8 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 7b b8 fb ff c3 66 2e 0f 1f 84 00 00 00 00
RSP: 002b:00007ffeafc2a6a8 EFLAGS: 00000246 ORIG_RAX: 00000000000000e7
RAX: ffffffffffffffda RBX: 000000000000001c RCX: 0000000000458da9
RDX: 0000000000412a80 RSI: 0000000000a54ef0 RDI: 0000000000000043
RBP: 00000000004be552 R08: 000000000000000c R09: 000000000004c0d1
R10: 0000000002341940 R11: 0000000000000246 R12: 00000000ffffffff
R13: 00007ffeafc2a7f0 R14: 000000000004c065 R15: 00007ffeafc2a800

Fixes: a68886a691 ("net/ipv6: Make from in rt6_info rcu protected")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Cc: David Ahern <dsahern@gmail.com>
Reviewed-by: David Ahern <dsahern@gmail.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Acked-by: Wei Wang <weiwan@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-30 23:38:38 -04:00
Jim Mattson
e8ab8d24b4 KVM: nVMX: Fix size checks in vmx_set_nested_state
The size checks in vmx_nested_state are wrong because the calculations
are made based on the size of a pointer to a struct kvm_nested_state
rather than the size of a struct kvm_nested_state.

Reported-by: Felix Wilhelm  <fwilhelm@google.com>
Signed-off-by: Jim Mattson <jmattson@google.com>
Reviewed-by: Drew Schmitt <dasch@google.com>
Reviewed-by: Marc Orr <marcorr@google.com>
Reviewed-by: Peter Shier <pshier@google.com>
Reviewed-by: Krish Sadhukhan <krish.sadhukhan@oracle.com>
Fixes: 8fcc4b5923
Cc: stable@ver.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-05-01 00:43:44 +02:00
Linus Torvalds
f2bc9c908d Merge tag 'fsnotify_for_v5.1-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs
Pull fsnotify fix from Jan Kara:
 "A fix of user trigerable NULL pointer dereference syzbot has recently
  spotted.

  The problem was introduced in this merge window so no CC stable is
  needed"

* tag 'fsnotify_for_v5.1-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
  fsnotify: Fix NULL ptr deref in fanotify_get_fsid()
2019-04-30 15:03:00 -07:00
Paolo Bonzini
6245242d91 Merge tag 'kvmarm-fixes-for-5.1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into kvm-master
KVM/ARM fixes for 5.1, take #2:

- Don't try to emulate timers on userspace access
- Fix unaligned huge mappings, again
- Properly reset a vcpu that fails to reset(!)
- Properly retire pending LPIs on reset
- Fix computation of emulated CNTP_TVAL
2019-04-30 21:23:06 +02:00
Vitaly Kuznetsov
eba3afde1c KVM: selftests: make hyperv_cpuid test pass on AMD
Enlightened VMCS is only supported on Intel CPUs but the test shouldn't
fail completely.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-04-30 21:22:16 +02:00
Sean Christopherson
b904cb8dff KVM: lapic: Check for in-kernel LAPIC before deferencing apic pointer
...to avoid dereferencing a null pointer when querying the per-vCPU
timer advance.

Fixes: 39497d7660 ("KVM: lapic: Track lapic timer advance per vCPU")
Reported-by: syzbot+f7e65445a40d3e0e4ebf@syzkaller.appspotmail.com
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-04-30 21:22:15 +02:00
Paolo Bonzini
76d58e0f07 KVM: fix KVM_CLEAR_DIRTY_LOG for memory slots of unaligned size
If a memory slot's size is not a multiple of 64 pages (256K), then
the KVM_CLEAR_DIRTY_LOG API is unusable: clearing the final 64 pages
either requires the requested page range to go beyond memslot->npages,
or requires log->num_pages to be unaligned, and kvm_clear_dirty_log_protect
requires log->num_pages to be both in range and aligned.

To allow this case, allow log->num_pages not to be a multiple of 64 if
it ends exactly on the last page of the slot.

Reported-by: Peter Xu <peterx@redhat.com>
Fixes: 98938aa8ed ("KVM: validate userspace input in kvm_clear_dirty_log_protect()", 2019-01-02)
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-04-30 21:22:15 +02:00
Vitaly Kuznetsov
0699c64a4b x86/kvm/mmu: reset MMU context when 32-bit guest switches PAE
Commit 47c42e6b41 ("KVM: x86: fix handling of role.cr4_pae and rename it
to 'gpte_size'") introduced a regression: 32-bit PAE guests stopped
working. The issue appears to be: when guest switches (enables) PAE we need
to re-initialize MMU context (set context->root_level, do
reset_rsvds_bits_mask(), ...) but init_kvm_tdp_mmu() doesn't do that
because we threw away is_pae(vcpu) flag from mmu role. Restore it to
kvm_mmu_extended_role (as we now don't need it in base role) to fix
the issue.

Fixes: 47c42e6b41 ("KVM: x86: fix handling of role.cr4_pae and rename it to 'gpte_size'")
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-04-30 21:03:58 +02:00
Sean Christopherson
8764ed55c9 KVM: x86: Whitelist port 0x7e for pre-incrementing %rip
KVM's recent bug fix to update %rip after emulating I/O broke userspace
that relied on the previous behavior of incrementing %rip prior to
exiting to userspace.  When running a Windows XP guest on AMD hardware,
Qemu may patch "OUT 0x7E" instructions in reaction to the OUT itself.
Because KVM's old behavior was to increment %rip before exiting to
userspace to handle the I/O, Qemu manually adjusted %rip to account for
the OUT instruction.

Arguably this is a userspace bug as KVM requires userspace to re-enter
the kernel to complete instruction emulation before taking any other
actions.  That being said, this is a bit of a grey area and breaking
userspace that has worked for many years is bad.

Pre-increment %rip on OUT to port 0x7e before exiting to userspace to
hack around the issue.

Fixes: 45def77ebf ("KVM: x86: update %rip after emulating IO")
Reported-by: Simon Becherer <simon@becherer.de>
Reported-and-tested-by: Iakov Karpov <srid@rkmail.ru>
Reported-by: Gabriele Balducci <balducci@units.it>
Reported-by: Antti Antinoja <reader@fennosys.fi>
Cc: stable@vger.kernel.org
Cc: Takashi Iwai <tiwai@suse.com>
Cc: Jiri Slaby <jslaby@suse.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-04-30 21:03:42 +02:00
Rafael J. Wysocki
2c2a2fb1e2 Revert "ACPICA: Clear status of GPEs before enabling them"
Revert commit c8b1917c89 ("ACPICA: Clear status of GPEs before
enabling them") that causes problems with Thunderbolt controllers
to occur if a dock device is connected at init time (the xhci_hcd
and thunderbolt modules crash which prevents peripherals connected
through them from working).

Commit c8b1917c89 effectively causes commit ecc1165b8b ("ACPICA:
Dispatch active GPEs at init time") to get undone, so the problem
addressed by commit ecc1165b8b appears again as a result of it.

Fixes: c8b1917c89 ("ACPICA: Clear status of GPEs before enabling them")
Link: https://lore.kernel.org/lkml/s5hy33siofw.wl-tiwai@suse.de/T/#u
Link: https://bugzilla.opensuse.org/show_bug.cgi?id=1132943
Reported-by: Michael Hirmke <opensuse@mike.franken.de>
Reported-by: Takashi Iwai <tiwai@suse.de>
Cc: 4.17+ <stable@vger.kernel.org> # 4.17+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2019-04-30 20:03:44 +02:00
Gary Hook
b51ce3744f x86/mm/mem_encrypt: Disable all instrumentation for early SME setup
Enablement of AMD's Secure Memory Encryption feature is determined very
early after start_kernel() is entered. Part of this procedure involves
scanning the command line for the parameter 'mem_encrypt'.

To determine intended state, the function sme_enable() uses library
functions cmdline_find_option() and strncmp(). Their use occurs early
enough such that it cannot be assumed that any instrumentation subsystem
is initialized.

For example, making calls to a KASAN-instrumented function before KASAN
is set up will result in the use of uninitialized memory and a boot
failure.

When AMD's SME support is enabled, conditionally disable instrumentation
of these dependent functions in lib/string.c and arch/x86/lib/cmdline.c.

 [ bp: Get rid of intermediary nostackp var and cleanup whitespace. ]

Fixes: aca20d5462 ("x86/mm: Add support to make use of Secure Memory Encryption")
Reported-by: Li RongQing <lirongqing@baidu.com>
Signed-off-by: Gary R Hook <gary.hook@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Boris Brezillon <bbrezillon@kernel.org>
Cc: Coly Li <colyli@suse.de>
Cc: "dave.hansen@linux.intel.com" <dave.hansen@linux.intel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Kent Overstreet <kent.overstreet@gmail.com>
Cc: "luto@kernel.org" <luto@kernel.org>
Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: "mingo@redhat.com" <mingo@redhat.com>
Cc: "peterz@infradead.org" <peterz@infradead.org>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: x86-ml <x86@kernel.org>
Link: https://lkml.kernel.org/r/155657657552.7116.18363762932464011367.stgit@sosrh3.amd.com
2019-04-30 17:59:08 +02:00
David S. Miller
34259977f2 Merge tag 'wireless-drivers-for-davem-2019-04-30' of git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers
Kalle Valo says:

====================
wireless-drivers fixes for 5.1

Third set of fixes for 5.1.

iwlwifi

* fix an oops when creating debugfs entries

* fix bug when trying to capture debugging info while in rfkill

* prevent potential uninitialized memory dumps into debugging logs

* fix some initialization parameters for AX210 devices

* fix an oops with non-MSIX devices

* fix an oops when we receive a packet with bogus lengths

* fix a bug that prevented 5350 devices from working

* fix a small merge damage from the previous series

mwifiex

* fig regression with resume on SDIO

ath10k

* fix locking problem with crashdump

* fix warnings during suspend and resume

Also note that this pull conflicts with net-next. And I want to emphasie
that it's really net-next, so when you pull this to net tree it should
go without conflicts. Stephen reported the conflict here:

https://lkml.kernel.org/r/20190429115338.5decb50b@canb.auug.org.au

In iwlwifi oddly commit 154d4899e4 adds the IS_ERR_OR_NULL() in
wireless-drivers but commit c9af7528c3 removes the whole check in
wireless-drivers-next. The fix is easy, just drop the whole check for
mvmvif->dbgfs_dir in iwlwifi/mvm/debugfs-vif.c, it's unneeded anyway.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-30 11:52:17 -04:00
Linus Torvalds
bf3bd966df Merge tag 'usb-5.1-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
Pull USB fixes from Greg KH:
 "Here are some small USB fixes for a bunch of warnings/errors that the
  syzbot has been finding with it's new-found ability to stress-test the
  USB layer.

  All of these are tiny, but fix real issues, and are marked for stable
  as well. All of these have had lots of testing in linux-next as well"

* tag 'usb-5.1-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
  USB: w1 ds2490: Fix bug caused by improper use of altsetting array
  USB: yurex: Fix protection fault after device removal
  usb: usbip: fix isoc packet num validation in get_pipe
  USB: core: Fix bug caused by duplicate interface PM usage counter
  USB: dummy-hcd: Fix failure to give back unlinked URBs
  USB: core: Fix unterminated string returned by usb_string()
2019-04-30 08:41:22 -07:00
Stefan Bühler
62977281a6 io_uring: remove unnecessary barrier after unsetting IORING_SQ_NEED_WAKEUP
There is no operation to order with afterwards, and removing the flag is
not critical in any way.

There will always be a "race condition" where the application will
trigger IORING_ENTER_SQ_WAKEUP when it isn't actually needed.

Signed-off-by: Stefan Bühler <source@stbuehler.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-04-30 09:40:02 -06:00
Stefan Bühler
b841f19524 io_uring: remove unnecessary barrier after incrementing dropped counter
smp_store_release in io_commit_sqring already orders the store to
dropped before the update to SQ head.

Signed-off-by: Stefan Bühler <source@stbuehler.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-04-30 09:40:02 -06:00
Stefan Bühler
82ab082c0e io_uring: remove unnecessary barrier before reading SQ tail
There is no operation before to order with.

Signed-off-by: Stefan Bühler <source@stbuehler.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-04-30 09:40:02 -06:00
Stefan Bühler
9e4c15a393 io_uring: remove unnecessary barrier after updating SQ head
There is no operation afterwards to order with.

Signed-off-by: Stefan Bühler <source@stbuehler.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-04-30 09:40:02 -06:00
Stefan Bühler
115e12e58d io_uring: remove unnecessary barrier before reading cq head
The memory operations before reading cq head are unrelated and we
don't care about their order.

Document that the control dependency in combination with READ_ONCE and
WRITE_ONCE forms a barrier we need.

Signed-off-by: Stefan Bühler <source@stbuehler.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-04-30 09:40:02 -06:00
Stefan Bühler
4f7067c3fb io_uring: remove unnecessary barrier before wq_has_sleeper
wq_has_sleeper has a full barrier internally. The smp_rmb barrier in
io_uring_poll synchronizes with it.

Signed-off-by: Stefan Bühler <source@stbuehler.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-04-30 09:40:02 -06:00
Stefan Bühler
1e84b97b73 io_uring: fix notes on barriers
The application reading the CQ ring needs a barrier to pair with the
smp_store_release in io_commit_cqring, not the barrier after it.

Also a write barrier *after* writing something (but not *before*
writing anything interesting) doesn't order anything, so an smp_wmb()
after writing SQ tail is not needed.

Additionally consider reading SQ head and writing CQ tail in the notes.

Also add some clarifications how the various other fields in the ring
buffers are used.

Signed-off-by: Stefan Bühler <source@stbuehler.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-04-30 09:40:02 -06:00
Stefan Bühler
8449eedaa1 io_uring: fix handling SQEs requesting NOWAIT
Not all request types set REQ_F_FORCE_NONBLOCK when they needed async
punting; reverse logic instead and set REQ_F_NOWAIT if request mustn't
be punted.

Signed-off-by: Stefan Bühler <source@stbuehler.de>

Merged with my previous patch for this.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-04-30 09:40:02 -06:00
Linus Torvalds
fea27bc7ff Merge tag 'selinux-pr-20190429' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux
Pull selinux fix from Paul Moore:
 "One small patch for the stable folks to fix a problem when building
  against the latest glibc.

  I'll be honest and say that I'm not really thrilled with the idea of
  sending this up right now, but Greg is a little annoyed so here I
  figured I would at least send this"

* tag 'selinux-pr-20190429' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux:
  selinux: use kernel linux/socket.h for genheaders and mdp
2019-04-30 08:38:02 -07:00
Eric Dumazet
a622b40035 l2ip: fix possible use-after-free
Before taking a refcount on a rcu protected structure,
we need to make sure the refcount is not zero.

syzbot reported :

refcount_t: increment on 0; use-after-free.
WARNING: CPU: 1 PID: 23533 at lib/refcount.c:156 refcount_inc_checked lib/refcount.c:156 [inline]
WARNING: CPU: 1 PID: 23533 at lib/refcount.c:156 refcount_inc_checked+0x61/0x70 lib/refcount.c:154
Kernel panic - not syncing: panic_on_warn set ...
CPU: 1 PID: 23533 Comm: syz-executor.2 Not tainted 5.1.0-rc7+ #93
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0x172/0x1f0 lib/dump_stack.c:113
 panic+0x2cb/0x65c kernel/panic.c:214
 __warn.cold+0x20/0x45 kernel/panic.c:571
 report_bug+0x263/0x2b0 lib/bug.c:186
 fixup_bug arch/x86/kernel/traps.c:179 [inline]
 fixup_bug arch/x86/kernel/traps.c:174 [inline]
 do_error_trap+0x11b/0x200 arch/x86/kernel/traps.c:272
 do_invalid_op+0x37/0x50 arch/x86/kernel/traps.c:291
 invalid_op+0x14/0x20 arch/x86/entry/entry_64.S:973
RIP: 0010:refcount_inc_checked lib/refcount.c:156 [inline]
RIP: 0010:refcount_inc_checked+0x61/0x70 lib/refcount.c:154
Code: 1d 98 2b 2a 06 31 ff 89 de e8 db 2c 40 fe 84 db 75 dd e8 92 2b 40 fe 48 c7 c7 20 7a a1 87 c6 05 78 2b 2a 06 01 e8 7d d9 12 fe <0f> 0b eb c1 90 90 90 90 90 90 90 90 90 90 90 55 48 89 e5 41 57 41
RSP: 0018:ffff888069f0fba8 EFLAGS: 00010286
RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
RDX: 000000000000f353 RSI: ffffffff815afcb6 RDI: ffffed100d3e1f67
RBP: ffff888069f0fbb8 R08: ffff88809b1845c0 R09: ffffed1015d23ef1
R10: ffffed1015d23ef0 R11: ffff8880ae91f787 R12: ffff8880a8f26968
R13: 0000000000000004 R14: dffffc0000000000 R15: ffff8880a49a6440
 l2tp_tunnel_inc_refcount net/l2tp/l2tp_core.h:240 [inline]
 l2tp_tunnel_get+0x250/0x580 net/l2tp/l2tp_core.c:173
 pppol2tp_connect+0xc00/0x1c70 net/l2tp/l2tp_ppp.c:702
 __sys_connect+0x266/0x330 net/socket.c:1808
 __do_sys_connect net/socket.c:1819 [inline]
 __se_sys_connect net/socket.c:1816 [inline]
 __x64_sys_connect+0x73/0xb0 net/socket.c:1816

Fixes: 54652eb12c ("l2tp: hold tunnel while looking up sessions in l2tp_netlink")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Cc: Guillaume Nault <g.nault@alphalink.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-30 11:35:48 -04:00
YueHaibing
c93ad1337a appletalk: Set error code if register_snap_client failed
If register_snap_client fails in atalk_init,
error code should be set, otherwise it will
triggers NULL pointer dereference while unloading
module.

Fixes: 9804501fa1 ("appletalk: Fix potential NULL pointer dereference in unregister_snap_client")
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-30 11:09:28 -04:00
Dan Carpenter
f949a12fd6 net: dsa: bcm_sf2: fix buffer overflow doing set_rxnfc
The "fs->location" is a u32 that comes from the user in ethtool_set_rxnfc().
We can't pass unclamped values to test_bit() or it results in an out of
bounds access beyond the end of the bitmap.

Fixes: 7318166cac ("net: dsa: bcm_sf2: Add support for ethtool::rxnfc")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-30 11:02:37 -04:00
David Howells
b13023421b rxrpc: Fix net namespace cleanup
In rxrpc_destroy_all_calls(), there are two phases: (1) make sure the
->calls list is empty, emitting error messages if not, and (2) wait for the
RCU cleanup to happen on outstanding calls (ie. ->nr_calls becomes 0).

To avoid taking the call_lock, the function prechecks ->calls and if empty,
it returns to avoid taking the lock - this is wrong, however: it still
needs to go and do the second phase and wait for ->nr_calls to become 0.

Without this, the rxrpc_net struct may get deallocated before we get to the
RCU cleanup for the last calls.  This can lead to:

  Slab corruption (Not tainted): kmalloc-16k start=ffff88802b178000, len=16384
  050: 6b 6b 6b 6b 6b 6b 6b 6b 61 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkakkkkkkk

Note the "61" at offset 0x58.  This corresponds to the ->nr_calls member of
struct rxrpc_net (which is >9k in size, and thus allocated out of the 16k
slab).

Fix this by flipping the condition on the if-statement, putting the locked
section inside the if-body and dropping the return from there.  The
function will then always go on to wait for the RCU cleanup on outstanding
calls.

Fixes: 2baec2c3f8 ("rxrpc: Support network namespacing")
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-30 10:50:50 -04:00
Takashi Iwai
3887c26c0e ALSA: hda/realtek - Apply the fixup for ASUS Q325UAR
Some ASUS models like Q325UAR with ALC295 codec requires the same
fixup that has been applied to ALC294 codec.  Just copy the entry with
the pin matching to cover ALC295 too.

BugLink: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1784485
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2019-04-30 15:13:15 +02:00
David S. Miller
b145745fc8 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec
Steffen Klassert says:

====================
pull request (net): ipsec 2019-04-30

1) Fix an out-of-bound array accesses in __xfrm_policy_unlink.
   From YueHaibing.

2) Reset the secpath on failure in the ESP GRO handlers
   to avoid dereferencing an invalid pointer on error.
   From Myungho Jung.

3) Add and revert a patch that tried to add rcu annotations
   to netns_xfrm. From Su Yanjun.

4) Wait for rcu callbacks before freeing xfrm6_tunnel_spi_kmem.
   From Su Yanjun.

5) Fix forgotten vti4 ipip tunnel deregistration.
   From Jeremy Sowden:

6) Remove some duplicated log messages in vti4.
   From Jeremy Sowden.

7) Don't use IPSEC_PROTO_ANY when flushing states because
   this will flush only IPsec portocol speciffic states.
   IPPROTO_ROUTING states may remain in the lists when
   doing net exit. Fix this by replacing IPSEC_PROTO_ANY
   with zero. From Cong Wang.

8) Add length check for UDP encapsulation to fix "Oversized IP packet"
   warnings on receive side. From Sabrina Dubroca.

9) Fix xfrm interface lookup when the interface is associated to
   a vrf layer 3 master device. From Martin Willi.

10) Reload header pointers after pskb_may_pull() in _decode_session4(),
    otherwise we may read from uninitialized memory.

11) Update the documentation about xfrm[46]_gc_thresh, it
    is not used anymore after the flowcache removal.
    From Nicolas Dichtel.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-30 09:11:10 -04:00
Gerd Hoffmann
ab042b824c Revert "drm/qxl: drop prime import/export callbacks"
This reverts commit f4c34b1e2a.

Simliar to commit a0cecc23cf Revert "drm/virtio: drop prime
import/export callbacks".  We have to do the same with qxl,
for the same reasons (it breaks DRI3).

Drop the WARN_ON_ONCE().

Fixes: f4c34b1e2a ("drm/qxl: drop prime import/export callbacks")
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20190426053324.26443-1-kraxel@redhat.com
Acked-by: Daniel Vetter <daniel@ffwll.ch>
2019-04-30 14:08:48 +02:00
Tobin C. Harding
9a4f26cc98 sched/cpufreq: Fix kobject memleak
Currently the error return path from kobject_init_and_add() is not
followed by a call to kobject_put() - which means we are leaking
the kobject.

Fix it by adding a call to kobject_put() in the error path of
kobject_init_and_add().

Signed-off-by: Tobin C. Harding <tobin@kernel.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tobin C. Harding <tobin@kernel.org>
Cc: Vincent Guittot <vincent.guittot@linaro.org>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Link: http://lkml.kernel.org/r/20190430001144.24890-1-tobin@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-04-30 07:57:23 +02:00
Eric Dumazet
6c0afef5fb ipv6/flowlabel: wait rcu grace period before put_pid()
syzbot was able to catch a use-after-free read in pid_nr_ns() [1]

ip6fl_seq_show() seems to use RCU protection, dereferencing fl->owner.pid
but fl_free() releases fl->owner.pid before rcu grace period is started.

[1]

BUG: KASAN: use-after-free in pid_nr_ns+0x128/0x140 kernel/pid.c:407
Read of size 4 at addr ffff888094012a04 by task syz-executor.0/18087

CPU: 0 PID: 18087 Comm: syz-executor.0 Not tainted 5.1.0-rc6+ #89
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0x172/0x1f0 lib/dump_stack.c:113
 print_address_description.cold+0x7c/0x20d mm/kasan/report.c:187
 kasan_report.cold+0x1b/0x40 mm/kasan/report.c:317
 __asan_report_load4_noabort+0x14/0x20 mm/kasan/generic_report.c:131
 pid_nr_ns+0x128/0x140 kernel/pid.c:407
 ip6fl_seq_show+0x2f8/0x4f0 net/ipv6/ip6_flowlabel.c:794
 seq_read+0xad3/0x1130 fs/seq_file.c:268
 proc_reg_read+0x1fe/0x2c0 fs/proc/inode.c:227
 do_loop_readv_writev fs/read_write.c:701 [inline]
 do_loop_readv_writev fs/read_write.c:688 [inline]
 do_iter_read+0x4a9/0x660 fs/read_write.c:922
 vfs_readv+0xf0/0x160 fs/read_write.c:984
 kernel_readv fs/splice.c:358 [inline]
 default_file_splice_read+0x475/0x890 fs/splice.c:413
 do_splice_to+0x12a/0x190 fs/splice.c:876
 splice_direct_to_actor+0x2d2/0x970 fs/splice.c:953
 do_splice_direct+0x1da/0x2a0 fs/splice.c:1062
 do_sendfile+0x597/0xd00 fs/read_write.c:1443
 __do_sys_sendfile64 fs/read_write.c:1498 [inline]
 __se_sys_sendfile64 fs/read_write.c:1490 [inline]
 __x64_sys_sendfile64+0x15a/0x220 fs/read_write.c:1490
 do_syscall_64+0x103/0x610 arch/x86/entry/common.c:290
 entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x458da9
Code: ad b8 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 7b b8 fb ff c3 66 2e 0f 1f 84 00 00 00 00
RSP: 002b:00007f300d24bc78 EFLAGS: 00000246 ORIG_RAX: 0000000000000028
RAX: ffffffffffffffda RBX: 0000000000000004 RCX: 0000000000458da9
RDX: 00000000200000c0 RSI: 0000000000000008 RDI: 0000000000000007
RBP: 000000000073bf00 R08: 0000000000000000 R09: 0000000000000000
R10: 000000000000005a R11: 0000000000000246 R12: 00007f300d24c6d4
R13: 00000000004c5fa3 R14: 00000000004da748 R15: 00000000ffffffff

Allocated by task 17543:
 save_stack+0x45/0xd0 mm/kasan/common.c:75
 set_track mm/kasan/common.c:87 [inline]
 __kasan_kmalloc mm/kasan/common.c:497 [inline]
 __kasan_kmalloc.constprop.0+0xcf/0xe0 mm/kasan/common.c:470
 kasan_slab_alloc+0xf/0x20 mm/kasan/common.c:505
 slab_post_alloc_hook mm/slab.h:437 [inline]
 slab_alloc mm/slab.c:3393 [inline]
 kmem_cache_alloc+0x11a/0x6f0 mm/slab.c:3555
 alloc_pid+0x55/0x8f0 kernel/pid.c:168
 copy_process.part.0+0x3b08/0x7980 kernel/fork.c:1932
 copy_process kernel/fork.c:1709 [inline]
 _do_fork+0x257/0xfd0 kernel/fork.c:2226
 __do_sys_clone kernel/fork.c:2333 [inline]
 __se_sys_clone kernel/fork.c:2327 [inline]
 __x64_sys_clone+0xbf/0x150 kernel/fork.c:2327
 do_syscall_64+0x103/0x610 arch/x86/entry/common.c:290
 entry_SYSCALL_64_after_hwframe+0x49/0xbe

Freed by task 7789:
 save_stack+0x45/0xd0 mm/kasan/common.c:75
 set_track mm/kasan/common.c:87 [inline]
 __kasan_slab_free+0x102/0x150 mm/kasan/common.c:459
 kasan_slab_free+0xe/0x10 mm/kasan/common.c:467
 __cache_free mm/slab.c:3499 [inline]
 kmem_cache_free+0x86/0x260 mm/slab.c:3765
 put_pid.part.0+0x111/0x150 kernel/pid.c:111
 put_pid+0x20/0x30 kernel/pid.c:105
 fl_free+0xbe/0xe0 net/ipv6/ip6_flowlabel.c:102
 ip6_fl_gc+0x295/0x3e0 net/ipv6/ip6_flowlabel.c:152
 call_timer_fn+0x190/0x720 kernel/time/timer.c:1325
 expire_timers kernel/time/timer.c:1362 [inline]
 __run_timers kernel/time/timer.c:1681 [inline]
 __run_timers kernel/time/timer.c:1649 [inline]
 run_timer_softirq+0x652/0x1700 kernel/time/timer.c:1694
 __do_softirq+0x266/0x95a kernel/softirq.c:293

The buggy address belongs to the object at ffff888094012a00
 which belongs to the cache pid_2 of size 88
The buggy address is located 4 bytes inside of
 88-byte region [ffff888094012a00, ffff888094012a58)
The buggy address belongs to the page:
page:ffffea0002500480 count:1 mapcount:0 mapping:ffff88809a483080 index:0xffff888094012980
flags: 0x1fffc0000000200(slab)
raw: 01fffc0000000200 ffffea00018a3508 ffffea0002524a88 ffff88809a483080
raw: ffff888094012980 ffff888094012000 000000010000001b 0000000000000000
page dumped because: kasan: bad access detected

Memory state around the buggy address:
 ffff888094012900: fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc fc
 ffff888094012980: fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc fc
>ffff888094012a00: fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc fc
                   ^
 ffff888094012a80: fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc fc
 ffff888094012b00: fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc fc

Fixes: 4f82f45730 ("net ip6 flowlabel: Make owner a union of struct pid * and kuid_t")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-29 23:30:13 -04:00
Stephen Suryaputra
1d3fd8a10b vrf: Use orig netdev to count Ip6InNoRoutes and a fresh route lookup when sending dest unreach
When there is no route to an IPv6 dest addr, skb_dst(skb) points
to loopback dev in the case of that the IP6CB(skb)->iif is
enslaved to a vrf. This causes Ip6InNoRoutes to be incremented on the
loopback dev. This also causes the lookup to fail on icmpv6_send() and
the dest unreachable to not sent and Ip6OutNoRoutes gets incremented on
the loopback dev.

To reproduce:
* Gateway configuration:
        ip link add dev vrf_258 type vrf table 258
        ip link set dev enp0s9 master vrf_258
        ip addr add 66:1/64 dev enp0s9
        ip -6 route add unreachable default metric 8192 table 258
        sysctl -w net.ipv6.conf.all.forwarding=1
        sysctl -w net.ipv6.conf.enp0s9.forwarding=1
* Sender configuration:
        ip addr add 66::2/64 dev enp0s9
        ip -6 route add default via 66::1
and ping 67::1 for example from the sender.

Fix this by counting on the original netdev and reset the skb dst to
force a fresh lookup.

v2: Fix typo of destination address in the repro steps.
v3: Simplify the loopback check (per David Ahern) and use reverse
    Christmas tree format (per David Miller).

Signed-off-by: Stephen Suryaputra <ssuryaextr@gmail.com>
Reviewed-by: David Ahern <dsahern@gmail.com>
Tested-by: David Ahern <dsahern@gmail.com>
Reviewed-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-29 23:28:52 -04:00
Eric Dumazet
ca2fe2956a tcp: add sanity tests in tcp_add_backlog()
Richard and Bruno both reported that my commit added a bug,
and Bruno was able to determine the problem came when a segment
wih a FIN packet was coalesced to a prior one in tcp backlog queue.

It turns out the header prediction in tcp_rcv_established()
looks back to TCP headers in the packet, not in the metadata
(aka TCP_SKB_CB(skb)->tcp_flags)

The fast path in tcp_rcv_established() is not supposed to
handle a FIN flag (it does not call tcp_fin())

Therefore we need to make sure to propagate the FIN flag,
so that the coalesced packet does not go through the fast path,
the same than a GRO packet carrying a FIN flag.

While we are at it, make sure we do not coalesce packets with
RST or SYN, or if they do not have ACK set.

Many thanks to Richard and Bruno for pinpointing the bad commit,
and to Richard for providing a first version of the fix.

Fixes: 4f693b55c3 ("tcp: implement coalescing on backlog queue")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Richard Purdie <richard.purdie@linuxfoundation.org>
Reported-by: Bruno Prémont <bonbons@sysophe.eu>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-29 23:20:37 -04:00
Willem de Bruijn
95c169251b ipv6: invert flowlabel sharing check in process and user mode
A request for a flowlabel fails in process or user exclusive mode must
fail if the caller pid or uid does not match. Invert the test.

Previously, the test was unsafe wrt PID recycling, but indeed tested
for inequality: fl1->owner != fl->owner

Fixes: 4f82f45730 ("net ip6 flowlabel: Make owner a union of struct pid* and kuid_t")
Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-29 18:22:56 -04:00
David S. Miller
6ee12b7b15 Merge branch 'ieee802154-for-davem-2019-04-25' of git://git.kernel.org/pub/scm/linux/kernel/git/sschmidt/wpan
Stefan Schmidt says:

====================
ieee802154 for net 2019-04-25

An update from ieee802154 for your *net* tree.

Another fix from Kangjie Lu to ensure better checking regmap updates in the
mcr20a driver. Nothing else I have pending for the final release.

If there are any problems let me know.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-29 18:18:09 -04:00
Linus Torvalds
83a50840e7 Merge tag 'seccomp-v5.1-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux
Pull seccomp fixes from Kees Cook:
 "Syzbot found a use-after-free bug in seccomp due to flags that should
  not be allowed to be used together.

  Tycho fixed this, I updated the self-tests, and the syzkaller PoC has
  been running for several days without triggering KASan (before this
  fix, it would reproduce). These patches have also been in -next for
  almost a week, just to be sure.

   - Add logic for making some seccomp flags exclusive (Tycho)

   - Update selftests for exclusivity testing (Kees)"

* tag 'seccomp-v5.1-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
  seccomp: Make NEW_LISTENER and TSYNC flags exclusive
  selftests/seccomp: Prepare for exclusive seccomp flags
2019-04-29 13:24:34 -07:00
Linus Torvalds
80871482fd x86: make ZERO_PAGE() at least parse its argument
This doesn't really do anything, but at least we now parse teh
ZERO_PAGE() address argument so that we'll catch the most obvious errors
in usage next time they'll happen.

See commit 6a5c5d26c4 ("rdma: fix build errors on s390 and MIPS due to
bad ZERO_PAGE use") what happens when we don't have any use of the macro
argument at all.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-04-29 09:51:29 -07:00
Linus Torvalds
6a5c5d26c4 rdma: fix build errors on s390 and MIPS due to bad ZERO_PAGE use
The parameter to ZERO_PAGE() was wrong, but since all architectures
except for MIPS and s390 ignore it, it wasn't noticed until 0-day
reported the build error.

Fixes: 67f269b37f ("RDMA/ucontext: Fix regression with disassociate")
Cc: stable@vger.kernel.org
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Leon Romanovsky <leonro@mellanox.com>
Cc: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-04-29 09:48:53 -07:00
Kalle Valo
7a0f8ad5ff Merge ath-current from git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/ath.git
ath.git fixes for 5.1. Major changes:

ath10k

* fix locking problem with crashdump

* fix warnings during suspend and resume
2019-04-29 19:33:33 +03:00
Paulo Alcantara
dfbd199a7c selinux: use kernel linux/socket.h for genheaders and mdp
When compiling genheaders and mdp from a newer host kernel, the
following error happens:

    In file included from scripts/selinux/genheaders/genheaders.c:18:
    ./security/selinux/include/classmap.h:238:2: error: #error New
    address family defined, please update secclass_map.  #error New
    address family defined, please update secclass_map.  ^~~~~
    make[3]: *** [scripts/Makefile.host:107:
    scripts/selinux/genheaders/genheaders] Error 1 make[2]: ***
    [scripts/Makefile.build:599: scripts/selinux/genheaders] Error 2
    make[1]: *** [scripts/Makefile.build:599: scripts/selinux] Error 2
    make[1]: *** Waiting for unfinished jobs....

Instead of relying on the host definition, include linux/socket.h in
classmap.h to have PF_MAX.

Cc: stable@vger.kernel.org
Signed-off-by: Paulo Alcantara <paulo@paulo.ac>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
[PM: manually merge in mdp.c, subject line tweaks]
Signed-off-by: Paul Moore <paul@paul-moore.com>
2019-04-29 11:34:58 -04:00
David S. Miller
2ae7a39770 Merge tag 'mac80211-for-davem-2019-04-26' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211
Johannes Berg says:

====================
 * fix use-after-free in mac80211 TXQs
 * fix RX STBC byte order
 * fix debugfs rename crashing due to ERR_PTR()
 * fix missing regulatory notification
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-29 11:00:58 -04:00
Rafael J. Wysocki
9e80ad37f6 ath10k: Drop WARN_ON()s that always trigger during system resume
ath10k_mac_vif_chan() always returns an error for the given vif
during system-wide resume which reliably triggers two WARN_ON()s
in ath10k_bss_info_changed() and they are not particularly
useful in that code path, so drop them.

Tested: QCA6174 hw3.2 PCI with WLAN.RM.2.0-00180-QCARMSWPZ-1
Tested: QCA6174 hw3.2 SDIO with WLAN.RMH.4.4.1-00007-QCARMSWP-1

Fixes: cd93b83ad9 ("ath10k: support for multicast rate control")
Fixes: f279294e9e ("ath10k: add support for configuring management packet rate")
Cc: stable@vger.kernel.org
Reviewed-by: Brian Norris <briannorris@chromium.org>
Tested-by: Brian Norris <briannorris@chromium.org>
Tested-by: Claire Chang <tientzu@chromium.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2019-04-29 17:26:14 +03:00
Brian Norris
38faed1504 ath10k: perform crash dump collection in workqueue
Commit 25733c4e67 ("ath10k: pci: use mutex for diagnostic window CE
polling") introduced a regression where we try to sleep (grab a mutex)
in an atomic context:

[  233.602619] BUG: sleeping function called from invalid context at kernel/locking/mutex.c:254
[  233.602626] in_atomic(): 1, irqs_disabled(): 0, pid: 0, name: swapper/0
[  233.602636] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G        W         5.1.0-rc2 #4
[  233.602642] Hardware name: Google Scarlet (DT)
[  233.602647] Call trace:
[  233.602663]  dump_backtrace+0x0/0x11c
[  233.602672]  show_stack+0x20/0x28
[  233.602681]  dump_stack+0x98/0xbc
[  233.602690]  ___might_sleep+0x154/0x16c
[  233.602696]  __might_sleep+0x78/0x88
[  233.602704]  mutex_lock+0x2c/0x5c
[  233.602717]  ath10k_pci_diag_read_mem+0x68/0x21c [ath10k_pci]
[  233.602725]  ath10k_pci_diag_read32+0x48/0x74 [ath10k_pci]
[  233.602733]  ath10k_pci_dump_registers+0x5c/0x16c [ath10k_pci]
[  233.602741]  ath10k_pci_fw_crashed_dump+0xb8/0x548 [ath10k_pci]
[  233.602749]  ath10k_pci_napi_poll+0x60/0x128 [ath10k_pci]
[  233.602757]  net_rx_action+0x140/0x388
[  233.602766]  __do_softirq+0x1b0/0x35c
[...]

ath10k_pci_fw_crashed_dump() is called from NAPI contexts, and firmware
memory dumps are retrieved using the diag memory interface.

A simple reproduction case is to run this on QCA6174A /
WLAN.RM.4.4.1-00132-QCARMSWP-1, which happens to be a way to b0rk the
firmware:

  dd if=/sys/kernel/debug/ieee80211/phy0/ath10k/mem_value bs=4K count=1
of=/dev/null

(NB: simulated firmware crashes, via debugfs, don't trigger firmware
dumps.)

The fix is to move the crash-dump into a workqueue context, and avoid
relying on 'data_lock' for most mutual exclusion. We only keep using it
here for protecting 'fw_crash_counter', while the rest of the coredump
buffers are protected by a new 'dump_mutex'.

I've tested the above with simulated firmware crashes (debugfs 'reset'
file), real firmware crashes (the 'dd' command above), and a variety of
reboot and suspend/resume configurations on QCA6174A.

Reported here:
http://lkml.kernel.org/linux-wireless/20190325202706.GA68720@google.com

Fixes: 25733c4e67 ("ath10k: pci: use mutex for diagnostic window CE polling")
Signed-off-by: Brian Norris <briannorris@chromium.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2019-04-29 17:24:37 +03:00
Andrew Jones
dbcdae185a Documentation: kvm: fix dirty log ioctl arch lists
KVM_GET_DIRTY_LOG is implemented by all architectures, not just x86,
and KVM_CAP_MANUAL_DIRTY_LOG_PROTECT is additionally implemented by
arm, arm64, and mips.

Signed-off-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-04-29 14:06:04 +02:00
Alexander Lochmann
f69e749a49 Abort file_remove_privs() for non-reg. files
file_remove_privs() might be called for non-regular files, e.g.
blkdev inode. There is no reason to do its job on things
like blkdev inodes, pipes, or cdevs. Hence, abort if
file does not refer to a regular inode.

AV: more to the point, for devices there might be any number of
inodes refering to given device.  Which one to strip the permissions
from, even if that made any sense in the first place?  All of them
will be observed with contents modified, after all.

Found by LockDoc (Alexander Lochmann, Horst Schirmeier and Olaf
Spinczyk)

Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Alexander Lochmann <alexander.lochmann@tu-dortmund.de>
Signed-off-by: Horst Schirmeier <horst.schirmeier@tu-dortmund.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2019-04-28 21:46:57 -04:00
Al Viro
ee948837d7 [fix] get rid of checking for absent device name in vfs_get_tree()
It has no business being there, it's checked by relevant ->get_tree()
as it is *and* it returns the wrong error for no reason whatsoever.

Fixes: f3a09c9201 "introduce fs_context methods"
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2019-04-28 21:34:21 -04:00
Jan Kara
b1da6a5187 fsnotify: Fix NULL ptr deref in fanotify_get_fsid()
fanotify_get_fsid() is reading mark->connector->fsid under srcu. It can
happen that it sees mark not fully initialized or mark that is already
detached from the object list. In these cases mark->connector
can be NULL leading to NULL ptr dereference. Fix the problem by
being careful when reading mark->connector and check it for being NULL.
Also use WRITE_ONCE when writing the mark just to prevent compiler from
doing something stupid.

Reported-by: syzbot+15927486a4f1bfcbaf91@syzkaller.appspotmail.com
Fixes: 77115225ac ("fanotify: cache fsid in fsnotify_mark_connector")
Signed-off-by: Jan Kara <jack@suse.cz>
2019-04-28 22:14:50 +02:00
Greg Kroah-Hartman
e5c812e84f ALSA: line6: use dynamic buffers
The line6 driver uses a lot of USB buffers off of the stack, which is
not allowed on many systems, causing the driver to crash on some of
them.  Fix this up by dynamically allocating the buffers with kmalloc()
which allows for proper DMA-able memory.

Reported-by: Christo Gouws <gouws.christo@gmail.com>
Reported-by: Alan Stern <stern@rowland.harvard.edu>
Tested-by: Christo Gouws <gouws.christo@gmail.com>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2019-04-28 18:40:26 +02:00
Kalle Valo
5c403533fb Merge tag 'iwlwifi-for-kalle-2019-04-28' of git://git.kernel.org/pub/scm/linux/kernel/git/iwlwifi/iwlwifi-fixes
Fourth batch of patches intended for v5.1

* Fix an oops when we receive a packet with bogus lengths;
* Fix a bug that prevented 5350 devices from working;
* Fix a small merge damage from the previous series;
2019-04-28 14:25:33 +03:00
Luca Coelho
d156e67d3f iwlwifi: mvm: fix merge damage in iwl_mvm_vif_dbgfs_register()
When I rebased Greg's patch, I accidentally left the old if block that
was already there.  Remove it.

Fixes: 154d4899e4 ("iwlwifi: mvm: properly check debugfs dentry before using it")
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
2019-04-28 09:59:59 +03:00
Emmanuel Grumbach
5c9adef978 iwlwifi: fix driver operation for 5350
We introduced a bug that prevented this old device from
working. The driver would simply not be able to complete
the INIT flow while spewing this warning:

 CSR addresses aren't configured
 WARNING: CPU: 0 PID: 819 at drivers/net/wireless/intel/iwlwifi/pcie/drv.c:917
 iwl_pci_probe+0x160/0x1e0 [iwlwifi]

Cc: stable@vger.kernel.org # v4.18+
Fixes: a8cbb46f83 ("iwlwifi: allow different csr flags for different device families")
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Fixes: c8f1b51e50 ("iwlwifi: allow different csr flags for different device families")
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
2019-04-28 09:59:59 +03:00
Luca Coelho
de1887c064 iwlwifi: mvm: check for length correctness in iwl_mvm_create_skb()
We don't check for the validity of the lengths in the packet received
from the firmware.  If the MPDU length received in the rx descriptor
is too short to contain the header length and the crypt length
together, we may end up trying to copy a negative number of bytes
(headlen - hdrlen < 0) which will underflow and cause us to try to
copy a huge amount of data.  This causes oopses such as this one:

BUG: unable to handle kernel paging request at ffff896be2970000
PGD 5e201067 P4D 5e201067 PUD 5e205067 PMD 16110d063 PTE 8000000162970161
Oops: 0003 [#1] PREEMPT SMP NOPTI
CPU: 2 PID: 1824 Comm: irq/134-iwlwifi Not tainted 4.19.33-04308-geea41cf4930f #1
Hardware name: [...]
RIP: 0010:memcpy_erms+0x6/0x10
Code: 90 90 90 90 eb 1e 0f 1f 00 48 89 f8 48 89 d1 48 c1 e9 03 83 e2 07 f3 48 a5 89 d1 f3 a4 c3 66 0f 1f 44 00 00 48 89 f8 48 89 d1 <f3> a4 c3
 0f 1f 80 00 00 00 00 48 89 f8 48 83 fa 20 72 7e 40 38 fe
RSP: 0018:ffffa4630196fc60 EFLAGS: 00010287
RAX: ffff896be2924618 RBX: ffff896bc8ecc600 RCX: 00000000fffb4610
RDX: 00000000fffffff8 RSI: ffff896a835e2a38 RDI: ffff896be2970000
RBP: ffffa4630196fd30 R08: ffff896bc8ecc600 R09: ffff896a83597000
R10: ffff896bd6998400 R11: 000000000200407f R12: ffff896a83597050
R13: 00000000fffffff8 R14: 0000000000000010 R15: ffff896a83597038
FS:  0000000000000000(0000) GS:ffff896be8280000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffff896be2970000 CR3: 000000005dc12002 CR4: 00000000003606e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 iwl_mvm_rx_mpdu_mq+0xb51/0x121b [iwlmvm]
 iwl_pcie_rx_handle+0x58c/0xa89 [iwlwifi]
 iwl_pcie_irq_rx_msix_handler+0xd9/0x12a [iwlwifi]
 irq_thread_fn+0x24/0x49
 irq_thread+0xb0/0x122
 kthread+0x138/0x140
 ret_from_fork+0x1f/0x40

Fix that by checking the lengths for correctness and trigger a warning
to show that we have received wrong data.

Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
2019-04-28 09:59:59 +03:00
Kailang Yang
0700d3d117 ALSA: hda/realtek - Fixed Dell AIO speaker noise
Fixed Dell AIO speaker noise.
spec->gen.auto_mute_via_amp = 1, this option was solved speaker white
noise at boot.
codec->power_save_node = 0, this option was solved speaker noise at
resume back.

Fixes: 9226665159 ("ALSA: hda/realtek - Fix Dell AIO LineOut issue")
Signed-off-by: Kailang Yang <kailang@realtek.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2019-04-28 08:20:14 +02:00
Paolo Abeni
21f1b8a663 udp: fix GRO reception in case of length mismatch
Currently, the UDP GRO code path does bad things on some edge
conditions - Aggregation can happen even on packet with different
lengths.

Fix the above by rewriting the 'complete' condition for GRO
packets. While at it, note explicitly that we allow merging the
first packet per burst below gso_size.

Reported-by: Sean Tong <seantong114@gmail.com>
Fixes: e20cf8d3f1 ("udp: implement GRO for plain UDP sockets.")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-27 22:07:24 -04:00
David S. Miller
fbef9478ff Merge branch 'tls-data-copies'
Jakub Kicinski says:

====================
net/tls: fix data copies in tls_device_reencrypt()

This series fixes the tls_device_reencrypt() which is broken
if record starts in the frags of the message skb.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-27 20:17:20 -04:00
Jakub Kicinski
eb3d38d5ad net/tls: fix copy to fragments in reencrypt
Fragments may contain data from other records so we have to account
for that when we calculate the destination and max length of copy we
can perform.  Note that 'offset' is the offset within the message,
so it can't be passed as offset within the frag..

Here skb_store_bits() would have realised the call is wrong and
simply not copy data.

Fixes: 4799ac81e5 ("tls: Add rx inline crypto offload")
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: John Hurley <john.hurley@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-27 20:17:19 -04:00
Jakub Kicinski
97e1caa517 net/tls: don't copy negative amounts of data in reencrypt
There is no guarantee the record starts before the skb frags.
If we don't check for this condition copy amount will get
negative, leading to reads and writes to random memory locations.
Familiar hilarity ensues.

Fixes: 4799ac81e5 ("tls: Add rx inline crypto offload")
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: John Hurley <john.hurley@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-27 20:17:19 -04:00
David S. Miller
b2a20fd072 Merge branch 'bnxt_en-Misc-bug-fixes'
Michael Chan says:

====================
bnxt_en: Misc. bug fixes.

6 miscellaneous bug fixes covering several issues in error code paths,
a setup issue for statistics DMA, and an improvement for setting up
multicast address filters.

Please queue these for stable as well.
Patch #5 (bnxt_en: Fix statistics context reservation logic) is for the
most recent 5.0 stable only.  Thanks.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-27 17:00:19 -04:00
Michael Chan
0b397b17a4 bnxt_en: Fix uninitialized variable usage in bnxt_rx_pkt().
In bnxt_rx_pkt(), if the driver encounters BD errors, it will recycle
the buffers and jump to the end where the uninitailized variable "len"
is referenced.  Fix it by adding a new jump label that will skip
the length update.  This is the most correct fix since the length
may not be valid when we get this type of error.

Fixes: 6a8788f256 ("bnxt_en: add support for software dynamic interrupt moderation")
Reported-by: Nathan Chancellor <natechancellor@gmail.com>
Cc: Nathan Chancellor <natechancellor@gmail.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Reviewed-by: Nathan Chancellor <natechancellor@gmail.com>
Tested-by: Nathan Chancellor <natechancellor@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-27 17:00:19 -04:00
Michael Chan
3f93cd3f09 bnxt_en: Fix statistics context reservation logic.
In an earlier commit that fixes the number of stats contexts to
reserve for the RDMA driver, we added a function parameter to pass in
the number of stats contexts to all the relevant functions.  The passed
in parameter should have been used to set the enables field of the
firmware message.

Fixes: 780baad44f ("bnxt_en: Reserve 1 stat_ctx for RDMA driver.")
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-27 17:00:19 -04:00
Michael Chan
ad361adf0d bnxt_en: Pass correct extended TX port statistics size to firmware.
If driver determines that extended TX port statistics are not supported
or allocation of the data structure fails, make sure to pass 0 TX stats
size to firmware to disable it.  The firmware returned TX stats size should
also be set to 0 for consistency.  This will prevent
bnxt_get_ethtool_stats() from accessing the NULL TX stats pointer in
case there is mismatch between firmware and driver.

Fixes: 36e53349b6 ("bnxt_en: Add additional extended port statistics.")
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-27 17:00:19 -04:00
Michael Chan
1f83391bd6 bnxt_en: Fix possible crash in bnxt_hwrm_ring_free() under error conditions.
If we encounter errors during open and proceed to clean up,
bnxt_hwrm_ring_free() may crash if the rings we try to free have never
been allocated.  bnxt_cp_ring_for_rx() or bnxt_cp_ring_for_tx()
may reference pointers that have not been allocated.

Fix it by checking for valid fw_ring_id first before calling
bnxt_cp_ring_for_rx() or bnxt_cp_ring_for_tx().

Fixes: 2c61d2117e ("bnxt_en: Add helper functions to get firmware CP ring ID.")
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-27 17:00:18 -04:00
Vasundhara Volam
f9099d6114 bnxt_en: Free short FW command HWRM memory in error path in bnxt_init_one()
In the bnxt_init_one() error path, short FW command request memory
is not freed. This patch fixes it.

Fixes: e605db801b ("bnxt_en: Support for Short Firmware Message")
Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-27 17:00:18 -04:00
Michael Chan
b4e30e8e7e bnxt_en: Improve multicast address setup logic.
The driver builds a list of multicast addresses and sends it to the
firmware when the driver's ndo_set_rx_mode() is called.  In rare
cases, the firmware can fail this call if internal resources to
add multicast addresses are exhausted.  In that case, we should
try the call again by setting the ALL_MCAST flag which is more
guaranteed to succeed.

Fixes: c0c050c58d ("bnxt_en: New Broadcom ethernet driver.")
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-27 17:00:18 -04:00
Rick Edgecombe
f2fde6a5bc KVM: VMX: Move RSB stuffing to before the first RET after VM-Exit
The not-so-recent change to move VMX's VM-Exit handing to a dedicated
"function" unintentionally exposed KVM to a speculative attack from the
guest by executing a RET prior to stuffing the RSB.  Make RSB stuffing
happen immediately after VM-Exit, before any unpaired returns.

Alternatively, the VM-Exit path could postpone full RSB stuffing until
its current location by stuffing the RSB only as needed, or by avoiding
returns in the VM-Exit path entirely, but both alternatives are beyond
ugly since vmx_vmexit() has multiple indirect callers (by way of
vmx_vmenter()).  And putting the RSB stuffing immediately after VM-Exit
makes it much less likely to be re-broken in the future.

Note, the cost of PUSH/POP could be avoided in the normal flow by
pairing the PUSH RAX with the POP RAX in __vmx_vcpu_run() and adding an
a POP to nested_vmx_check_vmentry_hw(), but such a weird/subtle
dependency is likely to cause problems in the long run, and PUSH/POP
will take all of a few cycles, which is peanuts compared to the number
of cycles required to fill the RSB.

Fixes: 453eafbe65 ("KVM: VMX: Move VM-Enter + VM-Exit handling to non-inline sub-routines")
Reported-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
Signed-off-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
Co-developed-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-04-27 09:48:52 +02:00
Andrew Lunn
fdfdf86720 net: phy: marvell: Fix buffer overrun with stats counters
marvell_get_sset_count() returns how many statistics counters there
are. If the PHY supports fibre, there are 3, otherwise two.

marvell_get_strings() does not make this distinction, and always
returns 3 strings. This then often results in writing past the end
of the buffer for the strings.

Fixes: 2170fef78a ("Marvell phy: add field to get errors from fiber link.")
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-26 12:06:14 -04:00
Marcel Holtmann
4e43df38a2 genetlink: use idr_alloc_cyclic for family->id assignment
When allocating the next family->id it makes more sense to use
idr_alloc_cyclic to avoid re-using a previously used family->id as much
as possible.

Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-26 11:59:58 -04:00
Bjørn Mork
88ef66a283 qmi_wwan: new Wistron, ZTE and D-Link devices
Adding device entries found in vendor modified versions of this
driver.  Function maps for some of the devices follow:

WNC D16Q1, D16Q5, D18Q1 LTE CAT3 module (1435:0918)

MI_00 Qualcomm HS-USB Diagnostics
MI_01 Android Debug interface
MI_02 Qualcomm HS-USB Modem
MI_03 Qualcomm Wireless HS-USB Ethernet Adapter
MI_04 Qualcomm Wireless HS-USB Ethernet Adapter
MI_05 Qualcomm Wireless HS-USB Ethernet Adapter
MI_06 USB Mass Storage Device

 T:  Bus=02 Lev=01 Prnt=01 Port=00 Cnt=01 Dev#=  2 Spd=480 MxCh= 0
 D:  Ver= 2.00 Cls=00(>ifc ) Sub=00 Prot=00 MxPS=64 #Cfgs=  1
 P:  Vendor=1435 ProdID=0918 Rev= 2.32
 S:  Manufacturer=Android
 S:  Product=Android
 S:  SerialNumber=0123456789ABCDEF
 C:* #Ifs= 7 Cfg#= 1 Atr=80 MxPwr=500mA
 I:* If#= 0 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=ff Driver=option
 E:  Ad=81(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
 E:  Ad=01(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
 I:* If#= 1 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=42 Prot=01 Driver=(none)
 E:  Ad=82(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
 E:  Ad=02(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
 I:* If#= 2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option
 E:  Ad=84(I) Atr=03(Int.) MxPS=  64 Ivl=32ms
 E:  Ad=83(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
 E:  Ad=03(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
 I:* If#= 3 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=ff Driver=qmi_wwan
 E:  Ad=86(I) Atr=03(Int.) MxPS=  64 Ivl=32ms
 E:  Ad=85(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
 E:  Ad=04(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
 I:* If#= 4 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=ff Driver=qmi_wwan
 E:  Ad=88(I) Atr=03(Int.) MxPS=  64 Ivl=32ms
 E:  Ad=87(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
 I:* If#= 5 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=ff Driver=qmi_wwan
 E:  Ad=8a(I) Atr=03(Int.) MxPS=  64 Ivl=32ms
 E:  Ad=89(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
 E:  Ad=06(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms

WNC D18 LTE CAT3 module (1435:d182)

MI_00 Qualcomm HS-USB Diagnostics
MI_01 Androd Debug interface
MI_02 Qualcomm HS-USB Modem
MI_03 Qualcomm HS-USB NMEA
MI_04 Qualcomm Wireless HS-USB Ethernet Adapter
MI_05 Qualcomm Wireless HS-USB Ethernet Adapter
MI_06 USB Mass Storage Device

ZM8510/ZM8620/ME3960 (19d2:0396)

MI_00 ZTE Mobile Broadband Diagnostics Port
MI_01 ZTE Mobile Broadband AT Port
MI_02 ZTE Mobile Broadband Modem
MI_03 ZTE Mobile Broadband NDIS Port (qmi_wwan)
MI_04 ZTE Mobile Broadband ADB Port

ME3620_X (19d2:1432)

MI_00 ZTE Diagnostics Device
MI_01 ZTE UI AT Interface
MI_02 ZTE Modem Device
MI_03 ZTE Mobile Broadband Network Adapter
MI_04 ZTE Composite ADB Interface

Reported-by: Lars Melin <larsm17@gmail.com>
Signed-off-by: Bjørn Mork <bjorn@mork.no>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-26 11:30:44 -04:00
Fabien Dessenne
56c5bc1849 net: ethernet: stmmac: manage the get_irq probe defer case
Manage the -EPROBE_DEFER error case for "stm32_pwr_wakeup"  IRQ.

Signed-off-by: Fabien Dessenne <fabien.dessenne@st.com>
Acked-by: Alexandre TORGUE <alexandre.torgue@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-26 11:25:39 -04:00
Eric Dumazet
c1c4772178 l2tp: use rcu_dereference_sk_user_data() in l2tp_udp_encap_recv()
Canonical way to fetch sk_user_data from an encap_rcv() handler called
from UDP stack in rcu protected section is to use rcu_dereference_sk_user_data(),
otherwise compiler might read it multiple times.

Fixes: d00fa9adc5 ("il2tp: fix races with tunnel socket close")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: James Chapman <jchapman@katalix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-26 11:18:23 -04:00
David S. Miller
ad759c9069 Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf
Alexei Starovoitov says:

====================
pull-request: bpf 2019-04-25

The following pull-request contains BPF updates for your *net* tree.

The main changes are:

1) the bpf verifier fix to properly mark registers in all stack frames, from Paul.

2) preempt_enable_no_resched->preempt_enable fix, from Peter.

3) other misc fixes.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-26 01:54:42 -04:00
Peter Zijlstra
0edd6b64d1 bpf: Fix preempt_enable_no_resched() abuse
Unless the very next line is schedule(), or implies it, one must not use
preempt_enable_no_resched(). It can cause a preemption to go missing and
thereby cause arbitrary delays, breaking the PREEMPT=y invariant.

Cc: Roman Gushchin <guro@fb.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-04-25 17:20:06 -07:00
Paul Chaignon
6dd7f14080 selftests/bpf: test cases for pkt/null checks in subprogs
The first test case, for pointer null checks, is equivalent to the
following pseudo-code.  It checks that the verifier does not complain on
line 6 and recognizes that ptr isn't null.

1: ptr = bpf_map_lookup_elem(map, &key);
2: ret = subprog(ptr) {
3:   return ptr != NULL;
4: }
5: if (ret)
6:   value = *ptr;

The second test case, for packet bound checks, is equivalent to the
following pseudo-code.  It checks that the verifier does not complain on
line 7 and recognizes that the packet is at least 1 byte long.

1: pkt_end = ctx.pkt_end;
2: ptr = ctx.pkt + 8;
3: ret = subprog(ptr, pkt_end) {
4:   return ptr <= pkt_end;
5: }
6: if (ret)
7:   value = *(u8 *)ctx.pkt;

Signed-off-by: Paul Chaignon <paul.chaignon@orange.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-04-25 17:20:06 -07:00
Paul Chaignon
c6a9efa1d8 bpf: mark registers in all frames after pkt/null checks
In case of a null check on a pointer inside a subprog, we should mark all
registers with this pointer as either safe or unknown, in both the current
and previous frames.  Currently, only spilled registers and registers in
the current frame are marked.  Packet bound checks in subprogs have the
same issue.  This patch fixes it to mark registers in previous frames as
well.

A good reproducer for null checks looks as follow:

1: ptr = bpf_map_lookup_elem(map, &key);
2: ret = subprog(ptr) {
3:   return ptr != NULL;
4: }
5: if (ret)
6:   value = *ptr;

With the above, the verifier will complain on line 6 because it sees ptr
as map_value_or_null despite the null check in subprog 1.

Note that this patch fixes another resulting bug when using
bpf_sk_release():

1: sk = bpf_sk_lookup_tcp(...);
2: subprog(sk) {
3:   if (sk)
4:     bpf_sk_release(sk);
5: }
6: if (!sk)
7:   return 0;
8: return 1;

In the above, mark_ptr_or_null_regs will warn on line 6 because it will
try to free the reference state, even though it was already freed on
line 3.

Fixes: f4d7e40a5b ("bpf: introduce function calls (verification)")
Signed-off-by: Paul Chaignon <paul.chaignon@orange.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-04-25 17:20:06 -07:00
Matteo Croce
39391377f8 libbpf: add binary to gitignore
Some binaries are generated when building libbpf from tools/lib/bpf/,
namely libbpf.so.0.0.2 and libbpf.so.0.
Add them to the local .gitignore.

Signed-off-by: Matteo Croce <mcroce@redhat.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Acked-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-04-25 17:20:06 -07:00
Alban Crequy
8694d8c1f8 tools: bpftool: fix infinite loop in map create
"bpftool map create" has an infinite loop on "while (argc)". The error
case is missing.

Symptoms: when forgetting to type the keyword 'type' in front of 'hash':
$ sudo bpftool map create /sys/fs/bpf/dir/foobar hash key 8 value 8 entries 128
(infinite loop, taking all the CPU)
^C

After the patch:
$ sudo bpftool map create /sys/fs/bpf/dir/foobar hash key 8 value 8 entries 128
Error: unknown arg hash

Fixes: 0b592b5a01 ("tools: bpftool: add map create command")
Signed-off-by: Alban Crequy <alban@kinvolk.io>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Acked-by: Song Liu <songliubraving@fb.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-04-25 17:20:06 -07:00
YueHaibing
ecfc3fcabb MIPS: eBPF: Make ebpf_to_mips_reg() static
Fix sparse warning:

arch/mips/net/ebpf_jit.c:196:5: warning:
 symbol 'ebpf_to_mips_reg' was not declared. Should it be static?

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Acked-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-04-25 17:20:06 -07:00
Tycho Andersen
7a0df7fbc1 seccomp: Make NEW_LISTENER and TSYNC flags exclusive
As the comment notes, the return codes for TSYNC and NEW_LISTENER
conflict, because they both return positive values, one in the case of
success and one in the case of error. So, let's disallow both of these
flags together.

While this is technically a userspace break, all the users I know
of are still waiting on me to land this feature in libseccomp, so I
think it'll be safe. Also, at present my use case doesn't require
TSYNC at all, so this isn't a big deal to disallow. If someone
wanted to support this, a path forward would be to add a new flag like
TSYNC_AND_LISTENER_YES_I_UNDERSTAND_THAT_TSYNC_WILL_JUST_RETURN_EAGAIN,
but the use cases are so different I don't see it really happening.

Finally, it's worth noting that this does actually fix a UAF issue: at the
end of seccomp_set_mode_filter(), we have:

        if (flags & SECCOMP_FILTER_FLAG_NEW_LISTENER) {
                if (ret < 0) {
                        listener_f->private_data = NULL;
                        fput(listener_f);
                        put_unused_fd(listener);
                } else {
                        fd_install(listener, listener_f);
                        ret = listener;
                }
        }
out_free:
        seccomp_filter_free(prepared);

But if ret > 0 because TSYNC raced, we'll install the listener fd and then
free the filter out from underneath it, causing a UAF when the task closes
it or dies. This patch also switches the condition to be simply if (ret),
so that if someone does add the flag mentioned above, they won't have to
remember to fix this too.

Reported-by: syzbot+b562969adb2e04af3442@syzkaller.appspotmail.com
Fixes: 6a21cc50f0 ("seccomp: add a return code to trap to userspace")
CC: stable@vger.kernel.org # v5.0+
Signed-off-by: Tycho Andersen <tycho@tycho.ws>
Signed-off-by: Kees Cook <keescook@chromium.org>
Acked-by: James Morris <jamorris@linux.microsoft.com>
2019-04-25 15:55:58 -07:00
Kees Cook
4ee0776760 selftests/seccomp: Prepare for exclusive seccomp flags
Some seccomp flags will become exclusive, so the selftest needs to
be adjusted to mask those out and test them individually for the "all
flags" tests.

Cc: stable@vger.kernel.org # v5.0+
Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Tycho Andersen <tycho@tycho.ws>
Acked-by: James Morris <jamorris@linux.microsoft.com>
2019-04-25 15:55:48 -07:00
Andrey Smirnov
349ced9984 power: supply: sysfs: prevent endless uevent loop with CONFIG_POWER_SUPPLY_DEBUG
Fix a similar endless event loop as was done in commit
8dcf32175b ("i2c: prevent endless uevent loop with
CONFIG_I2C_DEBUG_CORE"):

  The culprit is the dev_dbg printk in the i2c uevent handler. If
  this is activated (for instance by CONFIG_I2C_DEBUG_CORE) it results
  in an endless loop with systemd-journald.

  This happens if user-space scans the system log and reads the uevent
  file to get information about a newly created device, which seems
  fair use to me. Unfortunately reading the "uevent" file uses the
  same function that runs for creating the uevent for a new device,
  generating the next syslog entry

Both CONFIG_I2C_DEBUG_CORE and CONFIG_POWER_SUPPLY_DEBUG were reported
in https://bugs.freedesktop.org/show_bug.cgi?id=76886 but only former
seems to have been fixed. Drop debug prints as it was done in I2C
subsystem to resolve the issue.

Signed-off-by: Andrey Smirnov <andrew.smirnov@gmail.com>
Cc: Chris Healy <cphealy@gmail.com>
Cc: linux-pm@vger.kernel.org
Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com>
2019-04-26 00:06:56 +02:00
Miquel Raynal
9a8f612ca0 mtd: rawnand: marvell: Clean the controller state before each operation
Since the migration of the driver to stop using the legacy
->select_chip() hook, there is nothing deselecting the target anymore,
thus the selection is not forced at the next access. Ensure the ND_RUN
bit and the interrupts are always in a clean state.

Cc: Daniel Mack <daniel@zonque.org>
Cc: stable@vger.kernel.org
Fixes: b25251414f ("mtd: rawnand: marvell: Stop implementing ->select_chip()")
Suggested-by: Boris Brezillon <boris.brezillon@collabora.com>
Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
Tested-by: Daniel Mack <daniel@zonque.org>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
2019-04-25 23:21:51 +02:00
Dmitry Osipenko
b88c9f4129 clk: Add missing stubs for a few functions
Compilation fails if any of undeclared clk_set_*() functions are in use
and CONFIG_HAVE_CLK=n.

Reported-by: kbuild test robot <lkp@intel.com>
Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
2019-04-25 08:19:15 -07:00
Christoffer Dall
6bc210003d KVM: arm/arm64: Don't emulate virtual timers on userspace ioctls
When a VCPU never runs before a guest exists, but we set timer registers
up via ioctls, the associated hrtimer might never get cancelled.

Since we moved vcpu_load/put into the arch-specific implementations and
only have load/put for KVM_RUN, we won't ever have a scheduled hrtimer
for emulating a timer when modifying the timer state via an ioctl from
user space.  All we need to do is make sure that we pick up the right
state when we load the timer state next time userspace calls KVM_RUN
again.

We also do not need to worry about this interacting with the bg_timer,
because if we were in WFI from the guest, and somehow ended up in a
kvm_arm_timer_set_reg, it means that:

 1. the VCPU thread has received a signal,
 2. we have called vcpu_load when being scheduled in again,
 3. we have called vcpu_put when we returned to userspace for it to issue
    another ioctl

And therefore will not have a bg_timer programmed and the event is
treated as a spurious wakeup from WFI if userspace decides to run the
vcpu again even if there are not virtual interrupts.

This fixes stray virtual timer interrupts triggered by an expiring
hrtimer, which happens after a failed live migration, for instance.

Fixes: bee038a674 ("KVM: arm/arm64: Rework the timer code to use a timer_map")
Signed-off-by: Christoffer Dall <christoffer.dall@arm.com>
Reported-by: Andre Przywara <andre.przywara@arm.com>
Tested-by: Andre Przywara <andre.przywara@arm.com>
Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2019-04-25 14:13:31 +01:00
Douglas Anderson
b82d6c1f8f mwifiex: Make resume actually do something useful again on SDIO cards
The commit fc3a2fcaa1 ("mwifiex: use atomic bitops to represent
adapter status variables") had a fairly straightforward bug in it.  It
contained this bit of diff:

 - if (!adapter->is_suspended) {
 + if (test_bit(MWIFIEX_IS_SUSPENDED, &adapter->work_flags)) {

As you can see the patch missed the "!" when converting to the atomic
bitops.  This meant that the resume hasn't done anything at all since
that commit landed and suspend/resume for mwifiex SDIO cards has been
totally broken.

After fixing this mwifiex suspend/resume appears to work again, at
least with the simple testing I've done.

Fixes: fc3a2fcaa1 ("mwifiex: use atomic bitops to represent adapter status variables")
Cc: <stable@vger.kernel.org>
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Reviewed-by: Brian Norris <briannorris@chromium.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2019-04-25 14:05:14 +03:00
Suzuki K Poulose
2e8010bb71 kvm: arm: Skip stage2 huge mappings for unaligned ipa backed by THP
With commit a80868f398, we no longer ensure that the
THP page is properly aligned in the guest IPA. Skip the stage2
huge mapping for unaligned IPA backed by transparent hugepages.

Fixes: a80868f398 ("KVM: arm/arm64: Enforce PTE mappings at stage2 when needed")
Reported-by: Eric Auger <eric.auger@redhat.com>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Chirstoffer Dall <christoffer.dall@arm.com>
Cc: Zenghui Yu <yuzenghui@huawei.com>
Cc: Zheng Xiang <zhengxiang9@huawei.com>
Cc: Andrew Murray <andrew.murray@arm.com>
Cc: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2019-04-25 11:50:31 +01:00
Andrew Jones
811328fc32 KVM: arm/arm64: Ensure vcpu target is unset on reset failure
A failed KVM_ARM_VCPU_INIT should not set the vcpu target,
as the vcpu target is used by kvm_vcpu_initialized() to
determine if other vcpu ioctls may proceed. We need to set
the target before calling kvm_reset_vcpu(), but if that call
fails, we should then unset it and clear the feature bitmap
while we're at it.

Signed-off-by: Andrew Jones <drjones@redhat.com>
[maz: Simplified patch, completed commit message]
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2019-04-25 11:50:31 +01:00
Alan Stern
c114944d7d USB: w1 ds2490: Fix bug caused by improper use of altsetting array
The syzkaller USB fuzzer spotted a slab-out-of-bounds bug in the
ds2490 driver.  This bug is caused by improper use of the altsetting
array in the usb_interface structure (the array's entries are not
always stored in numerical order), combined with a naive assumption
that all interfaces probed by the driver will have the expected number
of altsettings.

The bug can be fixed by replacing references to the possibly
non-existent intf->altsetting[alt] entry with the guaranteed-to-exist
intf->cur_altsetting entry.

Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Reported-and-tested-by: syzbot+d65f673b847a1a96cdba@syzkaller.appspotmail.com
CC: <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-04-25 11:11:41 +02:00
Alan Stern
ef61eb43ad USB: yurex: Fix protection fault after device removal
The syzkaller USB fuzzer found a general-protection-fault bug in the
yurex driver.  The fault occurs when a device has been unplugged; the
driver's interrupt-URB handler logs an error message referring to the
device by name, after the device has been unregistered and its name
deallocated.

This problem is caused by the fact that the interrupt URB isn't
cancelled until the driver's private data structure is released, which
can happen long after the device is gone.  The cure is to make sure
that the interrupt URB is killed before yurex_disconnect() returns;
this is exactly the sort of thing that usb_poison_urb() was meant for.

Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Reported-and-tested-by: syzbot+2eb9121678bdb36e6d57@syzkaller.appspotmail.com
CC: <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-04-25 11:11:41 +02:00
Malte Leip
c409ca3be3 usb: usbip: fix isoc packet num validation in get_pipe
Change the validation of number_of_packets in get_pipe to compare the
number of packets to a fixed maximum number of packets allowed, set to
be 1024. This number was chosen due to it being used by other drivers as
well, for example drivers/usb/host/uhci-q.c

Background/reason:
The get_pipe function in stub_rx.c validates the number of packets in
isochronous mode and aborts with an error if that number is too large,
in order to prevent malicious input from possibly triggering large
memory allocations. This was previously done by checking whether
pdu->u.cmd_submit.number_of_packets is bigger than the number of packets
that would be needed for pdu->u.cmd_submit.transfer_buffer_length bytes
if all except possibly the last packet had maximum length, given by
usb_endpoint_maxp(epd) *  usb_endpoint_maxp_mult(epd). This leads to an
error if URBs with packets shorter than the maximum possible length are
submitted, which is allowed according to
Documentation/driver-api/usb/URB.rst and occurs for example with the
snd-usb-audio driver.

Fixes: c6688ef9f2 ("usbip: fix stub_rx: harden CMD_SUBMIT path to handle malicious input")
Signed-off-by: Malte Leip <malte@leip.net>
Cc: stable <stable@vger.kernel.org>
Acked-by: Shuah Khan <skhan@linuxfoundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-04-25 11:11:41 +02:00
Kangjie Lu
22e8860cf8 net: ieee802154: fix missing checks for regmap_update_bits
regmap_update_bits could fail and deserves a check.

The patch adds the checks and if it fails, returns its error
code upstream.

Signed-off-by: Kangjie Lu <kjlu@umn.edu>
Reviewed-by: Mukesh Ojha <mojha@codeaurora.org>
Signed-off-by: Stefan Schmidt <stefan@datenfreihafen.org>
2019-04-24 20:15:15 +02:00
Kalle Valo
792a2fdcee Merge tag 'iwlwifi-for-kalle-2019-04-19' of git://git.kernel.org/pub/scm/linux/kernel/git/iwlwifi/iwlwifi-fixes
Third batch of iwlwifi fixes intended for v5.1

* Fix an oops when creating debugfs entries;
* Fix bug when trying to capture debugging info while in rfkill;
* Prevent potential uninitialized memory dumps into debugging logs;
* Fix some initialization parameters for AX210 devices;
* Fix an oops with non-MSIX devices;
2019-04-24 18:14:43 +03:00
Kailang Yang
0a29c57b76 ALSA: hda/realtek - Add new Dell platform for headset mode
Add two Dell platform for headset mode.

[ Note: this is a further correction / addition of the previous
  pin-based quirks for Dell machines; another entry for ALC236 with
  the d-mic pin 0x12 and an entry for ALC295 -- tiwai ]

Fixes: b26e36b7ef ("ALSA: hda/realtek - add two more pin configuration sets to quirk table")
Signed-off-by: Kailang Yang <kailang@realtek.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2019-04-24 12:22:20 +02:00
Anson Huang
d386bb9042 i2c: imx: correct the method of getting private data in notifier_call
The way of getting private imx_i2c_struct in i2c_imx_clk_notifier_call()
is incorrect, should use clk_change_nb element to get correct address
and avoid below kernel dump during POST_RATE_CHANGE notify by clk
framework:

Unable to handle kernel paging request at virtual address 03ef1488
pgd = (ptrval)
[03ef1488] *pgd=00000000
Internal error: Oops: 5 [#1] PREEMPT SMP ARM
Hardware name: Freescale i.MX6 Quad/DualLite (Device Tree)
Workqueue: events reduce_bus_freq_handler
PC is at i2c_imx_set_clk+0x10/0xb8
LR is at i2c_imx_clk_notifier_call+0x20/0x28
pc : [<806a893c>]    lr : [<806a8a04>]    psr: a0080013
sp : bf399dd8  ip : bf3432ac  fp : bf7c1dc0
r10: 00000002  r9 : 00000000  r8 : 00000000
r7 : 03ef1480  r6 : bf399e50  r5 : ffffffff  r4 : 00000000
r3 : bf025300  r2 : bf399e50  r1 : 00b71b00  r0 : bf399be8
Flags: NzCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment none
Control: 10c5387d  Table: 4e03004a  DAC: 00000051
Process kworker/2:1 (pid: 38, stack limit = 0x(ptrval))
Stack: (0xbf399dd8 to 0xbf39a000)
9dc0:                                                       806a89e4 00000000
9de0: ffffffff bf399e50 00000002 806a8a04 806a89e4 80142900 ffffffff 00000000
9e00: bf34ef18 bf34ef04 00000000 ffffffff bf399e50 80142d84 00000000 bf399e6c
9e20: bf34ef00 80f214c4 bf025300 00000002 80f08d08 bf017480 00000000 80142df0
9e40: 00000000 80166ed8 80c27638 8045de58 bf352340 03ef1480 00b71b00 0f82e242
9e60: bf025300 00000002 03ef1480 80f60e5c 00000001 8045edf0 00000002 8045eb08
9e80: bf025300 00000002 03ef1480 8045ee10 03ef1480 8045eb08 bf01be40 00000002
9ea0: 03ef1480 8045ee10 07de2900 8045eb08 bf01b780 00000002 07de2900 8045ee10
9ec0: 80c27898 bf399ee4 bf020a80 00000002 1f78a400 8045ee10 80f60e5c 80460514
9ee0: 80f60e5c bf01b600 bf01b480 80460460 0f82e242 bf383a80 bf383a00 80f60e5c
9f00: 00000000 bf7c1dc0 80f60e70 80460564 80f60df0 80f60d24 80f60df0 8011e72c
9f20: 00000000 80f60df0 80f60e6c bf7c4f00 00000000 8011e7ac bf274000 8013bd84
9f40: bf7c1dd8 80f03d00 bf274000 bf7c1dc0 bf274014 bf7c1dd8 80f03d00 bf398000
9f60: 00000008 8013bfb4 00000000 bf25d100 bf25d0c0 00000000 bf274000 8013bf88
9f80: bf25d11c bf0cfebc 00000000 8014140c bf25d0c0 801412ec 00000000 00000000
9fa0: 00000000 00000000 00000000 801010e8 00000000 00000000 00000000 00000000
9fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
9fe0: 00000000 00000000 00000000 00000000 00000013 00000000 00000000 00000000
[<806a893c>] (i2c_imx_set_clk) from [<806a8a04>] (i2c_imx_clk_notifier_call+0x20/0x28)
[<806a8a04>] (i2c_imx_clk_notifier_call) from [<80142900>] (notifier_call_chain+0x44/0x84)
[<80142900>] (notifier_call_chain) from [<80142d84>] (__srcu_notifier_call_chain+0x44/0x98)
[<80142d84>] (__srcu_notifier_call_chain) from [<80142df0>] (srcu_notifier_call_chain+0x18/0x20)
[<80142df0>] (srcu_notifier_call_chain) from [<8045de58>] (__clk_notify+0x78/0xa4)
[<8045de58>] (__clk_notify) from [<8045edf0>] (__clk_recalc_rates+0x60/0xb4)
[<8045edf0>] (__clk_recalc_rates) from [<8045ee10>] (__clk_recalc_rates+0x80/0xb4)
Code: e92d40f8 e5903298 e59072a0 e1530001 (e5975008)
---[ end trace fc7f5514b97b6cbb ]---

Fixes: 90ad2cbe88 ("i2c: imx: use clk notifier for rate changes")
Signed-off-by: Anson Huang <Anson.Huang@nxp.com>
Reviewed-by: Dong Aisheng <aisheng.dong@nxp.com>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
Cc: stable@kernel.org
2019-04-24 00:14:57 +02:00
Johannes Berg
5178791474 mac80211: don't attempt to rename ERR_PTR() debugfs dirs
We need to dereference the directory to get its parent to
be able to rename it, so it's clearly not safe to try to
do this with ERR_PTR() pointers. Skip in this case.

It seems that this is most likely what was causing the
report by syzbot, but I'm not entirely sure as it didn't
come with a reproducer this time.

Cc: stable@vger.kernel.org
Reported-by: syzbot+4ece1a28b8f4730547c9@syzkaller.appspotmail.com
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2019-04-23 13:47:05 +02:00
Sriram R
8772eed9a9 cfg80211: Notify previous user request during self managed wiphy registration
Commit c82c06ce43d3("cfg80211: Notify all User Hints To self managed wiphys")
notified all new user hints to self managed wiphy's after device registration.
But it didn't do this for anything other than cell base hints done before
registration.

This needs to be done during wiphy registration of a self managed device also,
so that the previous user settings are retained.

Fixes: c82c06ce43 ("cfg80211: Notify all User Hints To self managed wiphys")
Signed-off-by: Sriram R <srirrama@codeaurora.org>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2019-04-23 13:45:30 +02:00
Bhagavathi Perumal S
f1267cf3c0 mac80211: Fix kernel panic due to use of txq after free
The txq of vif is added to active_txqs list for ATF TXQ scheduling
in the function ieee80211_queue_skb(), but it was not properly removed
before freeing the txq object. It was causing use after free of the txq
objects from the active_txqs list, result was kernel panic
due to invalid memory access.

Fix kernel invalid memory access by properly removing txq object
from active_txqs list before free the object.

Signed-off-by: Bhagavathi Perumal S <bperumal@codeaurora.org>
Acked-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2019-04-23 13:45:30 +02:00
Stephen Boyd
ac71e68746 Merge tag 'clk-fixes-for-5.1' of https://git.kernel.org/pub/scm/linux/kernel/git/sunxi/linux into clk-fixes
Pull Allwinner clk fixes from Maxime Ripard:

 - Some fixes for odd cases of the NKMP clocks

* tag 'clk-fixes-for-5.1' of https://git.kernel.org/pub/scm/linux/kernel/git/sunxi/linux:
  clk: sunxi-ng: nkmp: Explain why zero width check is needed
  clk: sunxi-ng: nkmp: Avoid GENMASK(-1, 0)
2019-04-19 13:06:58 -07:00
Alan Stern
c2b71462d2 USB: core: Fix bug caused by duplicate interface PM usage counter
The syzkaller fuzzer reported a bug in the USB hub driver which turned
out to be caused by a negative runtime-PM usage counter.  This allowed
a hub to be runtime suspended at a time when the driver did not expect
it.  The symptom is a WARNING issued because the hub's status URB is
submitted while it is already active:

	URB 0000000031fb463e submitted while active
	WARNING: CPU: 0 PID: 2917 at drivers/usb/core/urb.c:363

The negative runtime-PM usage count was caused by an unfortunate
design decision made when runtime PM was first implemented for USB.
At that time, USB class drivers were allowed to unbind from their
interfaces without balancing the usage counter (i.e., leaving it with
a positive count).  The core code would take care of setting the
counter back to 0 before allowing another driver to bind to the
interface.

Later on when runtime PM was implemented for the entire kernel, the
opposite decision was made: Drivers were required to balance their
runtime-PM get and put calls.  In order to maintain backward
compatibility, however, the USB subsystem adapted to the new
implementation by keeping an independent usage counter for each
interface and using it to automatically adjust the normal usage
counter back to 0 whenever a driver was unbound.

This approach involves duplicating information, but what is worse, it
doesn't work properly in cases where a USB class driver delays
decrementing the usage counter until after the driver's disconnect()
routine has returned and the counter has been adjusted back to 0.
Doing so would cause the usage counter to become negative.  There's
even a warning about this in the USB power management documentation!

As it happens, this is exactly what the hub driver does.  The
kick_hub_wq() routine increments the runtime-PM usage counter, and the
corresponding decrement is carried out by hub_event() in the context
of the hub_wq work-queue thread.  This work routine may sometimes run
after the driver has been unbound from its interface, and when it does
it causes the usage counter to go negative.

It is not possible for hub_disconnect() to wait for a pending
hub_event() call to finish, because hub_disconnect() is called with
the device lock held and hub_event() acquires that lock.  The only
feasible fix is to reverse the original design decision: remove the
duplicate interface-specific usage counter and require USB drivers to
balance their runtime PM gets and puts.  As far as I know, all
existing drivers currently do this.

Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Reported-and-tested-by: syzbot+7634edaea4d0b341c625@syzkaller.appspotmail.com
CC: <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-04-19 21:15:13 +02:00
Alan Stern
fc834e607a USB: dummy-hcd: Fix failure to give back unlinked URBs
The syzkaller USB fuzzer identified a failure mode in which dummy-hcd
would never give back an unlinked URB.  This causes usb_kill_urb() to
hang, leading to WARNINGs and unkillable threads.

In dummy-hcd, all URBs are given back by the dummy_timer() routine as
it scans through the list of pending URBS.  Failure to give back URBs
can be caused by failure to start or early exit from the scanning
loop.  The code currently has two such pathways: One is triggered when
an unsupported bus transfer speed is encountered, and the other by
exhausting the simulated bandwidth for USB transfers during a frame.

This patch removes those two paths, thereby allowing all unlinked URBs
to be given back in a timely manner.  It adds a check for the bus
speed when the gadget first starts running, so that dummy_timer() will
never thereafter encounter an unsupported speed.  And it prevents the
loop from exiting as soon as the total bandwidth has been used up (the
scanning loop continues, giving back unlinked URBs as they are found,
but not transferring any more data).

Thanks to Andrey Konovalov for manually running the syzkaller fuzzer
to help track down the source of the bug.

Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Reported-and-tested-by: syzbot+d919b0f29d7b5a4994b9@syzkaller.appspotmail.com
CC: <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-04-19 14:15:26 +02:00
Stephen Boyd
5a7efdacb9 clkdev: Hold clocks_mutex while iterating clocks list
We recently introduced a change to support devm clk lookups. That change
introduced a code-path that used clk_find() without holding the
'clocks_mutex'. Unfortunately, clk_find() iterates over the 'clocks'
list and so we need to prevent the list from being modified at the same
time. Do this by holding the mutex and checking to make sure it's held
while iterating the list.

Note, we don't really care if the lookup is freed after we find it with
clk_find() because we're just doing a pointer comparison, but if we did
care we would need to keep holding the mutex while we dereference the
clk_lookup pointer.

Fixes: 3eee6c7d11 ("clkdev: add managed clkdev lookup registration")
Cc: Miquel Raynal <miquel.raynal@bootlin.com>
Cc: Jerome Brunet <jbrunet@baylibre.com>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Michael Turquette <mturquette@baylibre.com>
Cc: Jeffrey Hugo <jhugo@codeaurora.org>
Cc: Chen-Yu Tsai <wens@csie.org>
Cc: Matti Vaittinen <matti.vaittinen@fi.rohmeurope.com>
Acked-by: Matti Vaittinen <matti.vaittinen@fi.rohmeurope.com>
Tested-by: Jeffrey Hugo <jhugo@codeaurora.org>
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
2019-04-18 13:33:13 -07:00
Sean Christopherson
b6aa57c69c KVM: lapic: Convert guest TSC to host time domain if necessary
To minimize the latency of timer interrupts as observed by the guest,
KVM adjusts the values it programs into the host timers to account for
the host's overhead of programming and handling the timer event.  In
the event that the adjustments are too aggressive, i.e. the timer fires
earlier than the guest expects, KVM busy waits immediately prior to
entering the guest.

Currently, KVM manually converts the delay from nanoseconds to clock
cycles.  But, the conversion is done in the guest's time domain, while
the delay occurs in the host's time domain.  This is perfectly ok when
the guest and host are using the same TSC ratio, but if the guest is
using a different ratio then the delay may not be accurate and could
wait too little or too long.

When the guest is not using the host's ratio, convert the delay from
guest clock cycles to host nanoseconds and use ndelay() instead of
__delay() to provide more accurate timing.  Because converting to
nanoseconds is relatively expensive, e.g. requires division and more
multiplication ops, continue using __delay() directly when guest and
host TSCs are running at the same ratio.

Cc: Liran Alon <liran.alon@oracle.com>
Cc: Wanpeng Li <wanpengli@tencent.com>
Cc: stable@vger.kernel.org
Fixes: 3b8a5df6c4 ("KVM: LAPIC: Tune lapic_timer_advance_ns automatically")
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-04-18 18:56:30 +02:00
Sean Christopherson
c3941d9e0c KVM: lapic: Allow user to disable adaptive tuning of timer advancement
The introduction of adaptive tuning of lapic timer advancement did not
allow for the scenario where userspace would want to disable adaptive
tuning but still employ timer advancement, e.g. for testing purposes or
to handle a use case where adaptive tuning is unable to settle on a
suitable time.  This is epecially pertinent now that KVM places a hard
threshold on the maximum advancment time.

Rework the timer semantics to accept signed values, with a value of '-1'
being interpreted as "use adaptive tuning with KVM's internal default",
and any other value being used as an explicit advancement time, e.g. a
time of '0' effectively disables advancement.

Note, this does not completely restore the original behavior of
lapic_timer_advance_ns.  Prior to tracking the advancement per vCPU,
which is necessary to support autotuning, userspace could adjust
lapic_timer_advance_ns for *running* vCPU.  With per-vCPU tracking, the
module params are snapshotted at vCPU creation, i.e. applying a new
advancement effectively requires restarting a VM.

Dynamically updating a running vCPU is possible, e.g. a helper could be
added to retrieve the desired delay, choosing between the global module
param and the per-VCPU value depending on whether or not auto-tuning is
(globally) enabled, but introduces a great deal of complexity.  The
wrapper itself is not complex, but understanding and documenting the
effects of dynamically toggling auto-tuning and/or adjusting the timer
advancement is nigh impossible since the behavior would be dependent on
KVM's implementation as well as compiler optimizations.  In other words,
providing stable behavior would require extremely careful consideration
now and in the future.

Given that the expected use of a manually-tuned timer advancement is to
"tune once, run many", use the vastly simpler approach of recognizing
changes to the module params only when creating a new vCPU.

Cc: Liran Alon <liran.alon@oracle.com>
Cc: Wanpeng Li <wanpengli@tencent.com>
Reviewed-by: Liran Alon <liran.alon@oracle.com>
Cc: stable@vger.kernel.org
Fixes: 3b8a5df6c4 ("KVM: LAPIC: Tune lapic_timer_advance_ns automatically")
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-04-18 18:56:15 +02:00
Sean Christopherson
39497d7660 KVM: lapic: Track lapic timer advance per vCPU
Automatically adjusting the globally-shared timer advancement could
corrupt the timer, e.g. if multiple vCPUs are concurrently adjusting
the advancement value.  That could be partially fixed by using a local
variable for the arithmetic, but it would still be susceptible to a
race when setting timer_advance_adjust_done.

And because virtual_tsc_khz and tsc_scaling_ratio are per-vCPU, the
correct calibration for a given vCPU may not apply to all vCPUs.

Furthermore, lapic_timer_advance_ns is marked __read_mostly, which is
effectively violated when finding a stable advancement takes an extended
amount of timer.

Opportunistically change the definition of lapic_timer_advance_ns to
a u32 so that it matches the style of struct kvm_timer.  Explicitly
pass the param to kvm_create_lapic() so that it doesn't have to be
exposed to lapic.c, thus reducing the probability of unintentionally
using the global value instead of the per-vCPU value.

Cc: Liran Alon <liran.alon@oracle.com>
Cc: Wanpeng Li <wanpengli@tencent.com>
Reviewed-by: Liran Alon <liran.alon@oracle.com>
Cc: stable@vger.kernel.org
Fixes: 3b8a5df6c4 ("KVM: LAPIC: Tune lapic_timer_advance_ns automatically")
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-04-18 18:55:41 +02:00
Sean Christopherson
57bf67e73c KVM: lapic: Disable timer advancement if adaptive tuning goes haywire
To minimize the latency of timer interrupts as observed by the guest,
KVM adjusts the values it programs into the host timers to account for
the host's overhead of programming and handling the timer event.  Now
that the timer advancement is automatically tuned during runtime, it's
effectively unbounded by default, e.g. if KVM is running as L1 the
advancement can measure in hundreds of milliseconds.

Disable timer advancement if adaptive tuning yields an advancement of
more than 5000ns, as large advancements can break reasonable assumptions
of the guest, e.g. that a timer configured to fire after 1ms won't
arrive on the next instruction.  Although KVM busy waits to mitigate the
case of a timer event arriving too early, complications can arise when
shifting the interrupt too far, e.g. kvm-unit-test's vmx.interrupt test
will fail when its "host" exits on interrupts as KVM may inject the INTR
before the guest executes STI+HLT.   Arguably the unit test is "broken"
in the sense that delaying a timer interrupt by 1ms doesn't technically
guarantee the interrupt will arrive after STI+HLT, but it's a reasonable
assumption that KVM should support.

Furthermore, an unbounded advancement also effectively unbounds the time
spent busy waiting, e.g. if the guest programs a timer with a very large
delay.

5000ns is a somewhat arbitrary threshold.  When running on bare metal,
which is the intended use case, timer advancement is expected to be in
the general vicinity of 1000ns.  5000ns is high enough that false
positives are unlikely, while not being so high as to negatively affect
the host's performance/stability.

Note, a future patch will enable userspace to disable KVM's adaptive
tuning, which will allow priveleged userspace will to specifying an
advancement value in excess of this arbitrary threshold in order to
satisfy an abnormal use case.

Cc: Liran Alon <liran.alon@oracle.com>
Cc: Wanpeng Li <wanpengli@tencent.com>
Cc: stable@vger.kernel.org
Fixes: 3b8a5df6c4 ("KVM: LAPIC: Tune lapic_timer_advance_ns automatically")
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-04-18 18:53:18 +02:00
Vitaly Kuznetsov
da66761c2d x86: kvm: hyper-v: deal with buggy TLB flush requests from WS2012
It was reported that with some special Multi Processor Group configuration,
e.g:
 bcdedit.exe /set groupsize 1
 bcdedit.exe /set maxgroup on
 bcdedit.exe /set groupaware on
for a 16-vCPU guest WS2012 shows BSOD on boot when PV TLB flush mechanism
is in use.

Tracing kvm_hv_flush_tlb immediately reveals the issue:

 kvm_hv_flush_tlb: processor_mask 0x0 address_space 0x0 flags 0x2

The only flag set in this request is HV_FLUSH_ALL_VIRTUAL_ADDRESS_SPACES,
however, processor_mask is 0x0 and no HV_FLUSH_ALL_PROCESSORS is specified.
We don't flush anything and apparently it's not what Windows expects.

TLFS doesn't say anything about such requests and newer Windows versions
seem to be unaffected. This all feels like a WS2012 bug, which is, however,
easy to workaround in KVM: let's flush everything when we see an empty
flush request, over-flushing doesn't hurt.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-04-18 18:53:18 +02:00
Liran Alon
c09d65d9ea KVM: x86: Consider LAPIC TSC-Deadline timer expired if deadline too short
If guest sets MSR_IA32_TSCDEADLINE to value such that in host
time-domain it's shorter than lapic_timer_advance_ns, we can
reach a case that we call hrtimer_start() with expiration time set at
the past.

Because lapic_timer.timer is init with HRTIMER_MODE_ABS_PINNED, it
is not allowed to run in softirq and therefore will never expire.

To avoid such a scenario, verify that deadline expiration time is set on
host time-domain further than (now + lapic_timer_advance_ns).

A future patch can also consider adding a min_timer_deadline_ns module parameter,
similar to min_timer_period_us to avoid races that amount of ns it takes
to run logic could still call hrtimer_start() with expiration timer set
at the past.

Reviewed-by: Joao Martins <joao.m.martins@oracle.com>
Signed-off-by: Liran Alon <liran.alon@oracle.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-04-18 18:53:17 +02:00
Paolo Bonzini
78671ab4c9 Merge tag 'kvm-ppc-fixes-5.1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc into HEAD
KVM/PPC fixes for 5.1

- Fix host hang in the HTM assist code for POWER9
- Take srcu read lock around memslot lookup
2019-04-18 18:53:12 +02:00
Shaul Triebitz
c537e07b00 iwlwifi: cfg: use family 22560 based_params for AX210 family
Specifically, max_tfd_queue_size should be 0x10000 like in
22560 family and not 0x100 like in 22000 family.

Signed-off-by: Shaul Triebitz <shaul.triebitz@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
2019-04-18 15:06:44 +03:00
Greg Kroah-Hartman
154d4899e4 iwlwifi: mvm: properly check debugfs dentry before using it
debugfs can now report an error code if something went wrong instead of
just NULL.  So if the return value is to be used as a "real" dentry, it
needs to be checked if it is an error before dereferencing it.

This is now happening because of ff9fb72bc0 ("debugfs: return error
values, not NULL").  If multiple iwlwifi devices are in the system, this
can cause problems when the driver attempts to create the main debugfs
directory again.  Later on in the code we fail horribly by trying to
dereference a pointer that is an error value.

Reported-by: Laura Abbott <labbott@redhat.com>
Reported-by: Gabriel Ramirez <gabriello.ramirez@gmail.com>
Cc: Johannes Berg <johannes.berg@intel.com>
Cc: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Cc: Luca Coelho <luciano.coelho@intel.com>
Cc: Intel Linux Wireless <linuxwifi@intel.com>
Cc: Kalle Valo <kvalo@codeaurora.org>
Cc: stable <stable@vger.kernel.org> # 5.0
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
2019-04-18 15:06:44 +03:00
Shahar S Matityahu
b35f63972c iwlwifi: dbg_ini: check debug TLV type explicitly
In ini debug TLVs bit 24 is set. The driver relies on it in the memory
allocation for the debug configuration. This implementation is
problematic in case of a new debug TLV that is not supported yet is added
and uses bit 24. In such a scenario the driver allocate space without
using it which causes errors in the apply point enabling flow.

Solve it by explicitly checking if a given TLV is part of the list of
the supported ini debug TLVs.

Signed-off-by: Shahar S Matityahu <shahar.s.matityahu@intel.com>
Fixes: f14cda6f3b ("iwlwifi: trans: parse and store debug ini TLVs")
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
2019-04-18 14:07:39 +03:00
Johannes Berg
72d3c7bbc9 iwlwifi: mvm: don't attempt debug collection in rfkill
If we fail to initialize because rfkill is enabled, then trying
to do debug collection currently just fails. Prevent that in the
high-level code, although we should probably also fix the lower
level code to do things more carefully.

It's not 100% clear that it fixes this commit, as the original
dump code at the time might've been more careful. In any case,
we don't really need to dump anything in this expected scenario.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Fixes: 7125648074 ("iwlwifi: add fw dump upon RT ucode start failure")
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
2019-04-18 14:07:39 +03:00
Shahar S Matityahu
1c6bca6d75 iwlwifi: don't panic in error path on non-msix systems
The driver uses msix causes-register to handle both msix and non msix
interrupts when performing sync nmi.  On devices that do not support
msix this register is unmapped and accessing it causes a kernel panic.

Solve this by differentiating the two cases and accessing the proper
causes-register in each case.

Reported-by: Michal Hocko <mhocko@kernel.org>
Signed-off-by: Shahar S Matityahu <shahar.s.matityahu@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
2019-04-18 14:07:39 +03:00
Alan Stern
c01c348ecd USB: core: Fix unterminated string returned by usb_string()
Some drivers (such as the vub300 MMC driver) expect usb_string() to
return a properly NUL-terminated string, even when an error occurs.
(In fact, vub300's probe routine doesn't bother to check the return
code from usb_string().)  When the driver goes on to use an
unterminated string, it leads to kernel errors such as
stack-out-of-bounds, as found by the syzkaller USB fuzzer.

An out-of-range string index argument is not at all unlikely, given
that some devices don't provide string descriptors and therefore list
0 as the value for their string indexes.  This patch makes
usb_string() return a properly terminated empty string along with the
-EINVAL error code when an out-of-range index is encountered.

And since a USB string index is a single-byte value, indexes >= 256
are just as invalid as values of 0 or below.

Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Reported-by: syzbot+b75b85111c10b8d680f1@syzkaller.appspotmail.com
CC: <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-04-16 12:23:01 +02:00
Logan Gunthorpe
d5bc73f34c PCI: Fix issue with "pci=disable_acs_redir" parameter being ignored
In most cases, kmalloc() will not be available early in boot when
pci_setup() is called.  Thus, the kstrdup() call that was added to fix the
__initdata bug with the disable_acs_redir parameter usually returns NULL,
so the parameter is discarded and has no effect.

To fix this, store the string that's in initdata until an initcall function
can allocate the memory appropriately.  This way we don't need any
additional static memory.

Fixes: d2fd6e8191 ("PCI: Fix __initdata issue with "pci=disable_acs_redir" parameter")
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2019-04-12 15:32:08 -05:00
Nicolas Dichtel
837f741165 xfrm: update doc about xfrm[46]_gc_thresh
Those entries are not used anymore.

CC: Florian Westphal <fw@strlen.de>
Fixes: 09c7570480 ("xfrm: remove flow cache")
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2019-04-12 09:38:23 +02:00
Al Viro
f51dcd0f62 apparmorfs: fix use-after-free on symlink traversal
symlink body shouldn't be freed without an RCU delay.  Switch apparmorfs
to ->destroy_inode() and use of call_rcu(); free both the inode and symlink
body in the callback.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2019-04-10 14:04:34 -04:00
Al Viro
46c8744196 securityfs: fix use-after-free on symlink traversal
symlink body shouldn't be freed without an RCU delay.  Switch securityfs
to ->destroy_inode() and use of call_rcu(); free both the inode and symlink
body in the callback.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2019-04-10 14:03:45 -04:00
Johannes Berg
e9f33a8fee mac80211: fix RX STBC override byte order
The original patch neglected to take byte order conversions
into account, fix that.

Fixes: d9bb410888 ("mac80211: allow overriding HT STBC capabilities")
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Reviewed-by: Sergey Matyukevich <sergey.matyukevich.os@quantenna.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2019-04-10 14:13:54 +02:00
Tony Lindgren
dbe7208c6c power: supply: cpcap-battery: Fix division by zero
If called fast enough so samples do not increment, we can get
division by zero in kernel:

__div0
cpcap_battery_cc_raw_div
cpcap_battery_get_property
power_supply_get_property.part.1
power_supply_get_property
power_supply_show_property
power_supply_uevent

Fixes: 874b2adbed ("power: supply: cpcap-battery: Add a battery driver")
Signed-off-by: Tony Lindgren <tony@atomide.com>
Acked-by: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com>
2019-04-10 00:53:20 +02:00
Eugeniy Paltsev
55c0c4c793 ARC: memset: fix build with L1_CACHE_SHIFT != 6
In case of 'L1_CACHE_SHIFT != 6' we define dummy assembly macroses
PREALLOC_INSTR and PREFETCHW_INSTR without arguments. However
we pass arguments to them in code which cause build errors.
Fix that.

Signed-off-by: Eugeniy Paltsev <Eugeniy.Paltsev@synopsys.com>
Cc: <stable@vger.kernel.org>    [5.0]
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2019-04-08 08:41:44 -07:00
Alexey Kardashevskiy
345077c8e1 KVM: PPC: Book3S: Protect memslots while validating user address
Guest physical to user address translation uses KVM memslots and reading
these requires holding the kvm->srcu lock. However recently introduced
kvmppc_tce_validate() broke the rule (see the lockdep warning below).

This moves srcu_read_lock(&vcpu->kvm->srcu) earlier to protect
kvmppc_tce_validate() as well.

=============================
WARNING: suspicious RCU usage
5.1.0-rc2-le_nv2_aikATfstn1-p1 #380 Not tainted
-----------------------------
include/linux/kvm_host.h:605 suspicious rcu_dereference_check() usage!

other info that might help us debug this:

rcu_scheduler_active = 2, debug_locks = 1
1 lock held by qemu-system-ppc/8020:
 #0: 0000000094972fe9 (&vcpu->mutex){+.+.}, at: kvm_vcpu_ioctl+0xdc/0x850 [kvm]

stack backtrace:
CPU: 44 PID: 8020 Comm: qemu-system-ppc Not tainted 5.1.0-rc2-le_nv2_aikATfstn1-p1 #380
Call Trace:
[c000003fece8f740] [c000000000bcc134] dump_stack+0xe8/0x164 (unreliable)
[c000003fece8f790] [c000000000181be0] lockdep_rcu_suspicious+0x130/0x170
[c000003fece8f810] [c0000000000d5f50] kvmppc_tce_to_ua+0x280/0x290
[c000003fece8f870] [c00800001a7e2c78] kvmppc_tce_validate+0x80/0x1b0 [kvm]
[c000003fece8f8e0] [c00800001a7e3fac] kvmppc_h_put_tce+0x94/0x3e4 [kvm]
[c000003fece8f9a0] [c00800001a8baac4] kvmppc_pseries_do_hcall+0x30c/0xce0 [kvm_hv]
[c000003fece8fa10] [c00800001a8bd89c] kvmppc_vcpu_run_hv+0x694/0xec0 [kvm_hv]
[c000003fece8fae0] [c00800001a7d95dc] kvmppc_vcpu_run+0x34/0x48 [kvm]
[c000003fece8fb00] [c00800001a7d56bc] kvm_arch_vcpu_ioctl_run+0x2f4/0x400 [kvm]
[c000003fece8fb90] [c00800001a7c3618] kvm_vcpu_ioctl+0x460/0x850 [kvm]
[c000003fece8fd00] [c00000000041c4f4] do_vfs_ioctl+0xe4/0x930
[c000003fece8fdb0] [c00000000041ce04] ksys_ioctl+0xc4/0x110
[c000003fece8fe00] [c00000000041ce78] sys_ioctl+0x28/0x80
[c000003fece8fe20] [c00000000000b5a4] system_call+0x5c/0x70

Fixes: 42de7b9e21 ("KVM: PPC: Validate TCEs against preregistered memory page sizes", 2018-09-10)
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
2019-04-05 14:37:24 +11:00
Suraj Jitindar Singh
7cb9eb106d KVM: PPC: Book3S HV: Perserve PSSCR FAKE_SUSPEND bit on guest exit
There is a hardware bug in some POWER9 processors where a treclaim in
fake suspend mode can cause an inconsistency in the XER[SO] bit across
the threads of a core, the workaround being to force the core into SMT4
when doing the treclaim.

The FAKE_SUSPEND bit (bit 10) in the PSSCR is used to control whether a
thread is in fake suspend or real suspend. The important difference here
being that thread reconfiguration is blocked in real suspend but not
fake suspend mode.

When we exit a guest which was in fake suspend mode, we force the core
into SMT4 while we do the treclaim in kvmppc_save_tm_hv().
However on the new exit path introduced with the function
kvmhv_run_single_vcpu() we restore the host PSSCR before calling
kvmppc_save_tm_hv() which means that if we were in fake suspend mode we
put the thread into real suspend mode when we clear the
PSSCR[FAKE_SUSPEND] bit. This means that we block thread reconfiguration
and the thread which is trying to get the core into SMT4 before it can
do the treclaim spins forever since it itself is blocking thread
reconfiguration. The result is that that core is essentially lost.

This results in a trace such as:
[   93.512904] CPU: 7 PID: 13352 Comm: qemu-system-ppc Not tainted 5.0.0 #4
[   93.512905] NIP:  c000000000098a04 LR: c0000000000cc59c CTR: 0000000000000000
[   93.512908] REGS: c000003fffd2bd70 TRAP: 0100   Not tainted  (5.0.0)
[   93.512908] MSR:  9000000302883033 <SF,HV,VEC,VSX,FP,ME,IR,DR,RI,LE,TM[SE]>  CR: 22222444  XER: 00000000
[   93.512914] CFAR: c000000000098a5c IRQMASK: 3
[   93.512915] PACATMSCRATCH: 0000000000000001
[   93.512916] GPR00: 0000000000000001 c000003f6cc1b830 c000000001033100 0000000000000004
[   93.512928] GPR04: 0000000000000004 0000000000000002 0000000000000004 0000000000000007
[   93.512930] GPR08: 0000000000000000 0000000000000004 0000000000000000 0000000000000004
[   93.512932] GPR12: c000203fff7fc000 c000003fffff9500 0000000000000000 0000000000000000
[   93.512935] GPR16: 2000000000300375 000000000000059f 0000000000000000 0000000000000000
[   93.512951] GPR20: 0000000000000000 0000000000080053 004000000256f41f c000003f6aa88ef0
[   93.512953] GPR24: c000003f6aa89100 0000000000000010 0000000000000000 0000000000000000
[   93.512956] GPR28: c000003f9e9a0800 0000000000000000 0000000000000001 c000203fff7fc000
[   93.512959] NIP [c000000000098a04] pnv_power9_force_smt4_catch+0x1b4/0x2c0
[   93.512960] LR [c0000000000cc59c] kvmppc_save_tm_hv+0x40/0x88
[   93.512960] Call Trace:
[   93.512961] [c000003f6cc1b830] [0000000000080053] 0x80053 (unreliable)
[   93.512965] [c000003f6cc1b8a0] [c00800001e9cb030] kvmhv_p9_guest_entry+0x508/0x6b0 [kvm_hv]
[   93.512967] [c000003f6cc1b940] [c00800001e9cba44] kvmhv_run_single_vcpu+0x2dc/0xb90 [kvm_hv]
[   93.512968] [c000003f6cc1ba10] [c00800001e9cc948] kvmppc_vcpu_run_hv+0x650/0xb90 [kvm_hv]
[   93.512969] [c000003f6cc1bae0] [c00800001e8f620c] kvmppc_vcpu_run+0x34/0x48 [kvm]
[   93.512971] [c000003f6cc1bb00] [c00800001e8f2d4c] kvm_arch_vcpu_ioctl_run+0x2f4/0x400 [kvm]
[   93.512972] [c000003f6cc1bb90] [c00800001e8e3918] kvm_vcpu_ioctl+0x460/0x7d0 [kvm]
[   93.512974] [c000003f6cc1bd00] [c0000000003ae2c0] do_vfs_ioctl+0xe0/0x8e0
[   93.512975] [c000003f6cc1bdb0] [c0000000003aeb24] ksys_ioctl+0x64/0xe0
[   93.512978] [c000003f6cc1be00] [c0000000003aebc8] sys_ioctl+0x28/0x80
[   93.512981] [c000003f6cc1be20] [c00000000000b3a4] system_call+0x5c/0x70
[   93.512983] Instruction dump:
[   93.512986] 419dffbc e98c0000 2e8b0000 38000001 60000000 60000000 60000000 40950068
[   93.512993] 392bffff 39400000 79290020 39290001 <7d2903a6> 60000000 60000000 7d235214

To fix this we preserve the PSSCR[FAKE_SUSPEND] bit until we call
kvmppc_save_tm_hv() which will mean the core can get into SMT4 and
perform the treclaim. Note kvmppc_save_tm_hv() clears the
PSSCR[FAKE_SUSPEND] bit again so there is no need to explicitly do that.

Fixes: 95a6432ce9 ("KVM: PPC: Book3S HV: Streamlined guest entry/exit path on P9 for radix guests")

Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
2019-04-05 14:37:24 +11:00
Jernej Skrabec
1054e4dd1c clk: sunxi-ng: nkmp: Explain why zero width check is needed
Add an explanation why zero width check is needed when generating factor
mask using GENMASK() macro.

Signed-off-by: Jernej Skrabec <jernej.skrabec@siol.net>
Signed-off-by: Maxime Ripard <maxime.ripard@bootlin.com>
2019-04-04 09:32:32 +02:00
Jernej Skrabec
2abc330c51 clk: sunxi-ng: nkmp: Avoid GENMASK(-1, 0)
Sometimes one of the nkmp factors is unused. This means that one of the
factors shift and width values are set to 0. Current nkmp clock code
generates a mask for each factor with GENMASK(width + shift - 1, shift).
For unused factor this translates to GENMASK(-1, 0). This code is
further expanded by C preprocessor to final version:
(((~0UL) - (1UL << (0)) + 1) & (~0UL >> (BITS_PER_LONG - 1 - (-1))))
or a bit simplified:
(~0UL & (~0UL >> BITS_PER_LONG))

It turns out that result of the second part (~0UL >> BITS_PER_LONG) is
actually undefined by C standard, which clearly specifies:

"If the value of the right operand is negative or is greater than or
equal to the width of the promoted left operand, the behavior is
undefined."

Additionally, compiling kernel with aarch64-linux-gnu-gcc 8.3.0 gave
different results whether literals or variables with same values as
literals were used. GENMASK with literals -1 and 0 gives zero and with
variables gives 0xFFFFFFFFFFFFFFF (~0UL). Because nkmp driver uses
GENMASK with variables as parameter, expression calculates mask as ~0UL
instead of 0. This has further consequences that LSB in register is
always set to 1 (1 is neutral value for a factor and shift is 0).

For example, H6 pll-de clock is set to 600 MHz by sun4i-drm driver, but
due to this bug ends up being 300 MHz. Additionally, 300 MHz seems to be
too low because following warning can be found in dmesg:

[    1.752763] WARNING: CPU: 2 PID: 41 at drivers/clk/sunxi-ng/ccu_common.c:41 ccu_helper_wait_for_lock.part.0+0x6c/0x90
[    1.763378] Modules linked in:
[    1.766441] CPU: 2 PID: 41 Comm: kworker/2:1 Not tainted 5.1.0-rc2-next-20190401 #138
[    1.774269] Hardware name: Pine H64 (DT)
[    1.778200] Workqueue: events deferred_probe_work_func
[    1.783341] pstate: 40000005 (nZcv daif -PAN -UAO)
[    1.788135] pc : ccu_helper_wait_for_lock.part.0+0x6c/0x90
[    1.793623] lr : ccu_helper_wait_for_lock.part.0+0x48/0x90
[    1.799107] sp : ffff000010f93840
[    1.802422] x29: ffff000010f93840 x28: 0000000000000000
[    1.807735] x27: ffff800073ce9d80 x26: ffff000010afd1b8
[    1.813049] x25: ffffffffffffffff x24: 00000000ffffffff
[    1.818362] x23: 0000000000000001 x22: ffff000010abd5c8
[    1.823675] x21: 0000000010000000 x20: 00000000685f367e
[    1.828987] x19: 0000000000001801 x18: 0000000000000001
[    1.834300] x17: 0000000000000001 x16: 0000000000000000
[    1.839613] x15: 0000000000000000 x14: ffff000010789858
[    1.844926] x13: 0000000000000000 x12: 0000000000000001
[    1.850239] x11: 0000000000000000 x10: 0000000000000970
[    1.855551] x9 : ffff000010f936c0 x8 : ffff800074cec0d0
[    1.860864] x7 : 0000800067117000 x6 : 0000000115c30b41
[    1.866177] x5 : 00ffffffffffffff x4 : 002c959300bfe500
[    1.871490] x3 : 0000000000000018 x2 : 0000000029aaaaab
[    1.876802] x1 : 00000000000002e6 x0 : 00000000686072bc
[    1.882114] Call trace:
[    1.884565]  ccu_helper_wait_for_lock.part.0+0x6c/0x90
[    1.889705]  ccu_helper_wait_for_lock+0x10/0x20
[    1.894236]  ccu_nkmp_set_rate+0x244/0x2a8
[    1.898334]  clk_change_rate+0x144/0x290
[    1.902258]  clk_core_set_rate_nolock+0x180/0x1b8
[    1.906963]  clk_set_rate+0x34/0xa0
[    1.910455]  sun8i_mixer_bind+0x484/0x558
[    1.914466]  component_bind_all+0x10c/0x230
[    1.918651]  sun4i_drv_bind+0xc4/0x1a0
[    1.922401]  try_to_bring_up_master+0x164/0x1c0
[    1.926932]  __component_add+0xa0/0x168
[    1.930769]  component_add+0x10/0x18
[    1.934346]  sun8i_dw_hdmi_probe+0x18/0x20
[    1.938443]  platform_drv_probe+0x50/0xa0
[    1.942455]  really_probe+0xcc/0x280
[    1.946032]  driver_probe_device+0x54/0xe8
[    1.950130]  __device_attach_driver+0x80/0xb8
[    1.954488]  bus_for_each_drv+0x78/0xc8
[    1.958326]  __device_attach+0xd4/0x130
[    1.962163]  device_initial_probe+0x10/0x18
[    1.966348]  bus_probe_device+0x90/0x98
[    1.970185]  deferred_probe_work_func+0x6c/0xa0
[    1.974720]  process_one_work+0x1e0/0x320
[    1.978732]  worker_thread+0x228/0x428
[    1.982484]  kthread+0x120/0x128
[    1.985714]  ret_from_fork+0x10/0x18
[    1.989290] ---[ end trace 9babd42e1ca4b84f ]---

This commit solves the issue by first checking value of the factor
width. If it is equal to 0 (unused factor), mask is set to 0, otherwise
GENMASK() macro is used as before.

Fixes: d897ef56fa ("clk: sunxi-ng: Mask nkmp factors when setting register")
Signed-off-by: Jernej Skrabec <jernej.skrabec@siol.net>
Signed-off-by: Maxime Ripard <maxime.ripard@bootlin.com>
2019-04-03 09:53:04 +02:00
Marc Zyngier
96085b9496 KVM: arm/arm64: vgic-v3: Retire pending interrupts on disabling LPIs
When disabling LPIs (for example on reset) at the redistributor
level, it is expected that LPIs that was pending in the CPU
interface are eventually retired.

Currently, this is not what is happening, and these LPIs will
stay in the ap_list, eventually being acknowledged by the vcpu
(which didn't quite expect this behaviour).

The fix is thus to retire these LPIs from the list of pending
interrupts as we disable LPIs.

Reported-by: Heyi Guo <guoheyi@huawei.com>
Tested-by: Heyi Guo <guoheyi@huawei.com>
Fixes: 0e4e82f154 ("KVM: arm64: vgic-its: Enable ITS emulation as a virtual MSI controller")
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2019-04-03 02:18:43 +01:00
Vineet Gupta
21cee1bd15 ARC: [hsdk] Make it easier to add PAE40 region to DTB
1. Bump top level address-cells/size-cells nodes to 2 (to ensure all
   down stream addresses are 64-bits, unless explicitly specified
   otherwise (in "soc" bus with all peripherals)

2. "memory" also specified with address/size 2

3. Add a commented reference for PAE40 region beyond 4GB physical
   address space

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2019-04-02 12:11:01 -07:00
Vineet Gupta
99bd5fcc50 ARC: PAE40: don't panic and instead turn off hw ioc
HSDK currently panics when built for HIGHMEM/ARC_HAS_PAE40 because ioc
is enabled with default which doesn't work for the 2 non contiguous
memory nodes. So get PAE working by disabling ioc instead.

Tested with !PAE40 by forcing @ioc_enable=0 and running the glibc
testsuite over ssh

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2019-04-02 12:03:07 -07:00
Steffen Klassert
8742dc86d0 xfrm4: Fix uninitialized memory read in _decode_session4
We currently don't reload pointers pointing into skb header
after doing pskb_may_pull() in _decode_session4(). So in case
pskb_may_pull() changed the pointers, we read from random
memory. Fix this by putting all the needed infos on the
stack, so that we don't need to access the header pointers
after doing pskb_may_pull().

Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2019-04-02 08:18:39 +02:00
Wei Huang
8fa7616248 KVM: arm/arm64: arch_timer: Fix CNTP_TVAL calculation
Recently the generic timer test of kvm-unit-tests failed to complete
(stalled) when a physical timer is being used. This issue is caused
by incorrect update of CNTP_CVAL when CNTP_TVAL is being accessed,
introduced by 'Commit 84135d3d18 ("KVM: arm/arm64: consolidate arch
timer trap handlers")'. According to Arm ARM, the read/write behavior
of accesses to the TVAL registers is expected to be:

  * READ: TimerValue = (CompareValue – (Counter - Offset)
  * WRITE: CompareValue = ((Counter - Offset) + Sign(TimerValue)

This patch fixes the TVAL read/write code path according to the
specification.

Fixes: 84135d3d18 ("KVM: arm/arm64: consolidate arch timer trap handlers")
Signed-off-by: Wei Huang <wei@redhat.com>
[maz: commit message tidy-up]
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2019-03-30 10:06:00 +00:00
Martin Willi
025c65e119 xfrm: Honor original L3 slave device in xfrmi policy lookup
If an xfrmi is associated to a vrf layer 3 master device,
xfrm_policy_check() fails after traffic decapsulation. The input
interface is replaced by the layer 3 master device, and hence
xfrmi_decode_session() can't match the xfrmi anymore to satisfy
policy checking.

Extend ingress xfrmi lookup to honor the original layer 3 slave
device, allowing xfrm interfaces to operate within a vrf domain.

Fixes: f203b76d78 ("xfrm: Add virtual xfrm interfaces")
Signed-off-by: Martin Willi <martin@strongswan.org>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2019-03-27 16:14:05 +01:00
Sabrina Dubroca
8dfb4eba41 esp4: add length check for UDP encapsulation
esp_output_udp_encap can produce a length that doesn't fit in the 16
bits of a UDP header's length field. In that case, we'll send a
fragmented packet whose length is larger than IP_MAX_MTU (resulting in
"Oversized IP packet" warnings on receive) and with a bogus UDP
length.

To prevent this, add a length check to esp_output_udp_encap and return
 -EMSGSIZE on failure.

This seems to be older than git history.

Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2019-03-26 08:39:30 +01:00
Cong Wang
dbb2483b2a xfrm: clean up xfrm protocol checks
In commit 6a53b75932 ("xfrm: check id proto in validate_tmpl()")
I introduced a check for xfrm protocol, but according to Herbert
IPSEC_PROTO_ANY should only be used as a wildcard for lookup, so
it should be removed from validate_tmpl().

And, IPSEC_PROTO_ANY is expected to only match 3 IPSec-specific
protocols, this is why xfrm_state_flush() could still miss
IPPROTO_ROUTING, which leads that those entries are left in
net->xfrm.state_all before exit net. Fix this by replacing
IPSEC_PROTO_ANY with zero.

This patch also extracts the check from validate_tmpl() to
xfrm_id_proto_valid() and uses it in parse_ipsecrequest().
With this, no other protocols should be added into xfrm.

Fixes: 6a53b75932 ("xfrm: check id proto in validate_tmpl()")
Reported-by: syzbot+0bf0519d6e0de15914fe@syzkaller.appspotmail.com
Cc: Steffen Klassert <steffen.klassert@secunet.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2019-03-26 08:35:36 +01:00
Jeremy Sowden
01ce31c57b vti4: removed duplicate log message.
Removed info log-message if ipip tunnel registration fails during
module-initialization: it adds nothing to the error message that is
written on all failures.

Fixes: dd9ee34440 ("vti4: Fix a ipip packet processing bug in 'IPCOMP' virtual tunnel")
Signed-off-by: Jeremy Sowden <jeremy@azazel.net>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2019-03-21 12:54:01 +01:00
Jeremy Sowden
5483844c3f vti4: ipip tunnel deregistration fixes.
If tunnel registration failed during module initialization, the module
would fail to deregister the IPPROTO_COMP protocol and would attempt to
deregister the tunnel.

The tunnel was not deregistered during module-exit.

Fixes: dd9ee34440 ("vti4: Fix a ipip packet processing bug in 'IPCOMP' virtual tunnel")
Signed-off-by: Jeremy Sowden <jeremy@azazel.net>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2019-03-21 12:54:01 +01:00
Steffen Klassert
bfc01ddff2 Revert "net: xfrm: Add '_rcu' tag for rcu protected pointer in netns_xfrm"
This reverts commit f10e0010fa.

This commit was just wrong. It caused a lot of
syzbot warnings, so just revert it.

Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2019-03-20 17:56:13 +01:00
Su Yanjun
6ee02a54ef xfrm6_tunnel: Fix potential panic when unloading xfrm6_tunnel module
When unloading xfrm6_tunnel module, xfrm6_tunnel_fini directly
frees the xfrm6_tunnel_spi_kmem. Maybe someone has gotten the
xfrm6_tunnel_spi, so need to wait it.

Fixes: 91cc3bb0b04ff("xfrm6_tunnel: RCU conversion")
Signed-off-by: Su Yanjun <suyj.fnst@cn.fujitsu.com>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2019-03-15 11:11:42 +01:00
Su Yanjun
f10e0010fa net: xfrm: Add '_rcu' tag for rcu protected pointer in netns_xfrm
For rcu protected pointers, we'd better add '__rcu' for them.

Once added '__rcu' tag for rcu protected pointer, the sparse tool reports
warnings.

net/xfrm/xfrm_user.c:1198:39: sparse:    expected struct sock *sk
net/xfrm/xfrm_user.c:1198:39: sparse:    got struct sock [noderef] <asn:4> *nlsk
[...]

So introduce a new wrapper function of nlmsg_unicast  to handle type
conversions.

This patch also fixes a direct access of a rcu protected socket.

Fixes: be33690d8fcf("[XFRM]: Fix aevent related crash")
Signed-off-by: Su Yanjun <suyj.fnst@cn.fujitsu.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2019-03-08 13:17:31 +01:00
Myungho Jung
6ed69184ed xfrm: Reset secpath in xfrm failure
In esp4_gro_receive() and esp6_gro_receive(), secpath can be allocated
without adding xfrm state to xvec. Then, sp->xvec[sp->len - 1] would
fail and result in dereferencing invalid pointer in esp4_gso_segment()
and esp6_gso_segment(). Reset secpath if xfrm function returns error.

Fixes: 7785bba299 ("esp: Add a software GRO codepath")
Reported-by: syzbot+b69368fd933c6c592f4c@syzkaller.appspotmail.com
Signed-off-by: Myungho Jung <mhjungk@gmail.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2019-03-08 07:32:16 +01:00
YueHaibing
b805d78d30 xfrm: policy: Fix out-of-bound array accesses in __xfrm_policy_unlink
UBSAN report this:

UBSAN: Undefined behaviour in net/xfrm/xfrm_policy.c:1289:24
index 6 is out of range for type 'unsigned int [6]'
CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.4.162-514.55.6.9.x86_64+ #13
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1 04/01/2014
 0000000000000000 1466cf39b41b23c9 ffff8801f6b07a58 ffffffff81cb35f4
 0000000041b58ab3 ffffffff83230f9c ffffffff81cb34e0 ffff8801f6b07a80
 ffff8801f6b07a20 1466cf39b41b23c9 ffffffff851706e0 ffff8801f6b07ae8
Call Trace:
 <IRQ>  [<ffffffff81cb35f4>] __dump_stack lib/dump_stack.c:15 [inline]
 <IRQ>  [<ffffffff81cb35f4>] dump_stack+0x114/0x1a0 lib/dump_stack.c:51
 [<ffffffff81d94225>] ubsan_epilogue+0x12/0x8f lib/ubsan.c:164
 [<ffffffff81d954db>] __ubsan_handle_out_of_bounds+0x16e/0x1b2 lib/ubsan.c:382
 [<ffffffff82a25acd>] __xfrm_policy_unlink+0x3dd/0x5b0 net/xfrm/xfrm_policy.c:1289
 [<ffffffff82a2e572>] xfrm_policy_delete+0x52/0xb0 net/xfrm/xfrm_policy.c:1309
 [<ffffffff82a3319b>] xfrm_policy_timer+0x30b/0x590 net/xfrm/xfrm_policy.c:243
 [<ffffffff813d3927>] call_timer_fn+0x237/0x990 kernel/time/timer.c:1144
 [<ffffffff813d8e7e>] __run_timers kernel/time/timer.c:1218 [inline]
 [<ffffffff813d8e7e>] run_timer_softirq+0x6ce/0xb80 kernel/time/timer.c:1401
 [<ffffffff8120d6f9>] __do_softirq+0x299/0xe10 kernel/softirq.c:273
 [<ffffffff8120e676>] invoke_softirq kernel/softirq.c:350 [inline]
 [<ffffffff8120e676>] irq_exit+0x216/0x2c0 kernel/softirq.c:391
 [<ffffffff82c5edab>] exiting_irq arch/x86/include/asm/apic.h:652 [inline]
 [<ffffffff82c5edab>] smp_apic_timer_interrupt+0x8b/0xc0 arch/x86/kernel/apic/apic.c:926
 [<ffffffff82c5c985>] apic_timer_interrupt+0xa5/0xb0 arch/x86/entry/entry_64.S:735
 <EOI>  [<ffffffff81188096>] ? native_safe_halt+0x6/0x10 arch/x86/include/asm/irqflags.h:52
 [<ffffffff810834d7>] arch_safe_halt arch/x86/include/asm/paravirt.h:111 [inline]
 [<ffffffff810834d7>] default_idle+0x27/0x430 arch/x86/kernel/process.c:446
 [<ffffffff81085f05>] arch_cpu_idle+0x15/0x20 arch/x86/kernel/process.c:437
 [<ffffffff8132abc3>] default_idle_call+0x53/0x90 kernel/sched/idle.c:92
 [<ffffffff8132b32d>] cpuidle_idle_call kernel/sched/idle.c:156 [inline]
 [<ffffffff8132b32d>] cpu_idle_loop kernel/sched/idle.c:251 [inline]
 [<ffffffff8132b32d>] cpu_startup_entry+0x60d/0x9a0 kernel/sched/idle.c:299
 [<ffffffff8113e119>] start_secondary+0x3c9/0x560 arch/x86/kernel/smpboot.c:245

The issue is triggered as this:

xfrm_add_policy
    -->verify_newpolicy_info  //check the index provided by user with XFRM_POLICY_MAX
			      //In my case, the index is 0x6E6BB6, so it pass the check.
    -->xfrm_policy_construct  //copy the user's policy and set xfrm_policy_timer
    -->xfrm_policy_insert
	--> __xfrm_policy_link //use the orgin dir, in my case is 2
	--> xfrm_gen_index   //generate policy index, there is 0x6E6BB6

then xfrm_policy_timer be fired

xfrm_policy_timer
   --> xfrm_policy_id2dir  //get dir from (policy index & 7), in my case is 6
   --> xfrm_policy_delete
      --> __xfrm_policy_unlink //access policy_count[dir], trigger out of range access

Add xfrm_policy_id2dir check in verify_newpolicy_info, make sure the computed dir is
valid, to fix the issue.

Reported-by: Hulk Robot <hulkci@huawei.com>
Fixes: e682adf021 ("xfrm: Try to honor policy index if it's supplied by user")
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2019-03-01 12:17:41 +01:00
167 changed files with 1561 additions and 671 deletions

View File

@@ -370,11 +370,15 @@ autosuspend the interface's device. When the usage counter is = 0
then the interface is considered to be idle, and the kernel may
autosuspend the device.
Drivers need not be concerned about balancing changes to the usage
counter; the USB core will undo any remaining "get"s when a driver
is unbound from its interface. As a corollary, drivers must not call
any of the ``usb_autopm_*`` functions after their ``disconnect``
routine has returned.
Drivers must be careful to balance their overall changes to the usage
counter. Unbalanced "get"s will remain in effect when a driver is
unbound from its interface, preventing the device from going into
runtime suspend should the interface be bound to a driver again. On
the other hand, drivers are allowed to achieve this balance by calling
the ``usb_autopm_*`` functions even after their ``disconnect`` routine
has returned -- say from within a work-queue routine -- provided they
retain an active reference to the interface (via ``usb_get_intf`` and
``usb_put_intf``).
Drivers using the async routines are responsible for their own
synchronization and mutual exclusion.

View File

@@ -1337,6 +1337,7 @@ tag - INTEGER
Default value is 0.
xfrm4_gc_thresh - INTEGER
(Obsolete since linux-4.14)
The threshold at which we will start garbage collecting for IPv4
destination cache entries. At twice this value the system will
refuse new allocations.
@@ -1920,6 +1921,7 @@ echo_ignore_all - BOOLEAN
Default: 0
xfrm6_gc_thresh - INTEGER
(Obsolete since linux-4.14)
The threshold at which we will start garbage collecting for IPv6
destination cache entries. At twice this value the system will
refuse new allocations.

View File

@@ -132,7 +132,7 @@ version that should be applied. If there is any doubt, the maintainer
will reply and ask what should be done.
Q: I made changes to only a few patches in a patch series should I resend only those changed?
--------------------------------------------------------------------------------------------
---------------------------------------------------------------------------------------------
A: No, please resend the entire patch series and make sure you do number your
patches such that it is clear this is the latest and greatest set of patches
that can be applied.

View File

@@ -321,7 +321,7 @@ cpu's hardware control block.
4.8 KVM_GET_DIRTY_LOG (vm ioctl)
Capability: basic
Architectures: x86
Architectures: all
Type: vm ioctl
Parameters: struct kvm_dirty_log (in/out)
Returns: 0 on success, -1 on error
@@ -3810,7 +3810,7 @@ to I/O ports.
4.117 KVM_CLEAR_DIRTY_LOG (vm ioctl)
Capability: KVM_CAP_MANUAL_DIRTY_LOG_PROTECT
Architectures: x86
Architectures: x86, arm, arm64, mips
Type: vm ioctl
Parameters: struct kvm_dirty_log (in)
Returns: 0 on success, -1 on error
@@ -3830,8 +3830,9 @@ The ioctl clears the dirty status of pages in a memory slot, according to
the bitmap that is passed in struct kvm_clear_dirty_log's dirty_bitmap
field. Bit 0 of the bitmap corresponds to page "first_page" in the
memory slot, and num_pages is the size in bits of the input bitmap.
Both first_page and num_pages must be a multiple of 64. For each bit
that is set in the input bitmap, the corresponding page is marked "clean"
first_page must be a multiple of 64; num_pages must also be a multiple of
64 unless first_page + num_pages is the size of the memory slot. For each
bit that is set in the input bitmap, the corresponding page is marked "clean"
in KVM's dirty bitmap, and dirty tracking is re-enabled for that page
(for example via write-protection, or by clearing the dirty bit in
a page table entry).
@@ -4799,7 +4800,7 @@ and injected exceptions.
7.18 KVM_CAP_MANUAL_DIRTY_LOG_PROTECT
Architectures: all
Architectures: x86, arm, arm64, mips
Parameters: args[0] whether feature should be enabled or not
With this capability enabled, KVM_GET_DIRTY_LOG will not automatically

View File

@@ -6462,7 +6462,7 @@ S: Maintained
F: drivers/media/radio/radio-gemtek*
GENERIC GPIO I2C DRIVER
M: Haavard Skinnemoen <hskinnemoen@gmail.com>
M: Wolfram Sang <wsa+renesas@sang-engineering.com>
S: Supported
F: drivers/i2c/busses/i2c-gpio.c
F: include/linux/platform_data/i2c-gpio.h
@@ -12176,6 +12176,7 @@ F: arch/*/kernel/*/*/perf_event*.c
F: arch/*/include/asm/perf_event.h
F: arch/*/kernel/perf_callchain.c
F: arch/*/events/*
F: arch/*/events/*/*
F: tools/perf/
PERSONALITY HANDLING

View File

@@ -2,7 +2,7 @@
VERSION = 5
PATCHLEVEL = 1
SUBLEVEL = 0
EXTRAVERSION = -rc7
EXTRAVERSION =
NAME = Shy Crocodile
# *DOCUMENTATION*
@@ -678,6 +678,7 @@ KBUILD_CFLAGS += $(call cc-disable-warning,frame-address,)
KBUILD_CFLAGS += $(call cc-disable-warning, format-truncation)
KBUILD_CFLAGS += $(call cc-disable-warning, format-overflow)
KBUILD_CFLAGS += $(call cc-disable-warning, int-in-bool-context)
KBUILD_CFLAGS += $(call cc-disable-warning, address-of-packed-member)
ifdef CONFIG_CC_OPTIMIZE_FOR_SIZE
KBUILD_CFLAGS += -Os
@@ -719,7 +720,6 @@ ifdef CONFIG_CC_IS_CLANG
KBUILD_CPPFLAGS += $(call cc-option,-Qunused-arguments,)
KBUILD_CFLAGS += $(call cc-disable-warning, format-invalid-specifier)
KBUILD_CFLAGS += $(call cc-disable-warning, gnu)
KBUILD_CFLAGS += $(call cc-disable-warning, address-of-packed-member)
# Quiet clang warning: comparison of unsigned expression < 0 is always false
KBUILD_CFLAGS += $(call cc-disable-warning, tautological-compare)
# CLANG uses a _MergedGlobals as optimization, but this breaks modpost, as the

View File

@@ -18,8 +18,8 @@
model = "snps,hsdk";
compatible = "snps,hsdk";
#address-cells = <1>;
#size-cells = <1>;
#address-cells = <2>;
#size-cells = <2>;
chosen {
bootargs = "earlycon=uart8250,mmio32,0xf0005000,115200n8 console=ttyS0,115200n8 debug print-fatal-signals=1";
@@ -105,7 +105,7 @@
#size-cells = <1>;
interrupt-parent = <&idu_intc>;
ranges = <0x00000000 0xf0000000 0x10000000>;
ranges = <0x00000000 0x0 0xf0000000 0x10000000>;
cgu_rst: reset-controller@8a0 {
compatible = "snps,hsdk-reset";
@@ -269,9 +269,10 @@
};
memory@80000000 {
#address-cells = <1>;
#size-cells = <1>;
#address-cells = <2>;
#size-cells = <2>;
device_type = "memory";
reg = <0x80000000 0x40000000>; /* 1 GiB */
reg = <0x0 0x80000000 0x0 0x40000000>; /* 1 GB lowmem */
/* 0x1 0x00000000 0x0 0x40000000>; 1 GB highmem */
};
};

View File

@@ -30,10 +30,10 @@
#else
.macro PREALLOC_INSTR
.macro PREALLOC_INSTR reg, off
.endm
.macro PREFETCHW_INSTR
.macro PREFETCHW_INSTR reg, off
.endm
#endif

View File

@@ -113,10 +113,24 @@ static void read_decode_cache_bcr_arcv2(int cpu)
}
READ_BCR(ARC_REG_CLUSTER_BCR, cbcr);
if (cbcr.c)
if (cbcr.c) {
ioc_exists = 1;
else
/*
* As for today we don't support both IOC and ZONE_HIGHMEM enabled
* simultaneously. This happens because as of today IOC aperture covers
* only ZONE_NORMAL (low mem) and any dma transactions outside this
* region won't be HW coherent.
* If we want to use both IOC and ZONE_HIGHMEM we can use
* bounce_buffer to handle dma transactions to HIGHMEM.
* Also it is possible to modify dma_direct cache ops or increase IOC
* aperture size if we are planning to use HIGHMEM without PAE.
*/
if (IS_ENABLED(CONFIG_HIGHMEM) || is_pae40_enabled())
ioc_enable = 0;
} else {
ioc_enable = 0;
}
/* HS 2.0 didn't have AUX_VOL */
if (cpuinfo_arc700[cpu].core.family > 0x51) {
@@ -1158,19 +1172,6 @@ noinline void __init arc_ioc_setup(void)
if (!ioc_enable)
return;
/*
* As for today we don't support both IOC and ZONE_HIGHMEM enabled
* simultaneously. This happens because as of today IOC aperture covers
* only ZONE_NORMAL (low mem) and any dma transactions outside this
* region won't be HW coherent.
* If we want to use both IOC and ZONE_HIGHMEM we can use
* bounce_buffer to handle dma transactions to HIGHMEM.
* Also it is possible to modify dma_direct cache ops or increase IOC
* aperture size if we are planning to use HIGHMEM without PAE.
*/
if (IS_ENABLED(CONFIG_HIGHMEM))
panic("IOC and HIGHMEM can't be used simultaneously");
/* Flush + invalidate + disable L1 dcache */
__dc_disable();

View File

@@ -186,8 +186,9 @@ enum which_ebpf_reg {
* separate frame pointer, so BPF_REG_10 relative accesses are
* adjusted to be $sp relative.
*/
int ebpf_to_mips_reg(struct jit_ctx *ctx, const struct bpf_insn *insn,
enum which_ebpf_reg w)
static int ebpf_to_mips_reg(struct jit_ctx *ctx,
const struct bpf_insn *insn,
enum which_ebpf_reg w)
{
int ebpf_reg = (w == src_reg || w == src_reg_no_fp) ?
insn->src_reg : insn->dst_reg;

View File

@@ -543,14 +543,14 @@ long kvmppc_h_put_tce(struct kvm_vcpu *vcpu, unsigned long liobn,
if (ret != H_SUCCESS)
return ret;
idx = srcu_read_lock(&vcpu->kvm->srcu);
ret = kvmppc_tce_validate(stt, tce);
if (ret != H_SUCCESS)
return ret;
goto unlock_exit;
dir = iommu_tce_direction(tce);
idx = srcu_read_lock(&vcpu->kvm->srcu);
if ((dir != DMA_NONE) && kvmppc_tce_to_ua(vcpu->kvm, tce, &ua, NULL)) {
ret = H_PARAMETER;
goto unlock_exit;

View File

@@ -3423,7 +3423,9 @@ static int kvmhv_load_hv_regs_and_go(struct kvm_vcpu *vcpu, u64 time_limit,
vcpu->arch.shregs.sprg2 = mfspr(SPRN_SPRG2);
vcpu->arch.shregs.sprg3 = mfspr(SPRN_SPRG3);
mtspr(SPRN_PSSCR, host_psscr);
/* Preserve PSSCR[FAKE_SUSPEND] until we've called kvmppc_save_tm_hv */
mtspr(SPRN_PSSCR, host_psscr |
(local_paca->kvm_hstate.fake_suspend << PSSCR_FAKE_SUSPEND_LG));
mtspr(SPRN_HFSCR, host_hfscr);
mtspr(SPRN_CIABR, host_ciabr);
mtspr(SPRN_DAWR, host_dawr);

View File

@@ -98,10 +98,20 @@ static int find_free_bat(void)
return -1;
}
/*
* This function calculates the size of the larger block usable to map the
* beginning of an area based on the start address and size of that area:
* - max block size is 8M on 601 and 256 on other 6xx.
* - base address must be aligned to the block size. So the maximum block size
* is identified by the lowest bit set to 1 in the base address (for instance
* if base is 0x16000000, max size is 0x02000000).
* - block size has to be a power of two. This is calculated by finding the
* highest bit set to 1.
*/
static unsigned int block_size(unsigned long base, unsigned long top)
{
unsigned int max_size = (cpu_has_feature(CPU_FTR_601) ? 8 : 256) << 20;
unsigned int base_shift = (fls(base) - 1) & 31;
unsigned int base_shift = (ffs(base) - 1) & 31;
unsigned int block_shift = (fls(top - base) - 1) & 31;
return min3(max_size, 1U << base_shift, 1U << block_shift);
@@ -157,7 +167,7 @@ static unsigned long __init __mmu_mapin_ram(unsigned long base, unsigned long to
unsigned long __init mmu_mapin_ram(unsigned long base, unsigned long top)
{
int done;
unsigned long done;
unsigned long border = (unsigned long)__init_begin - PAGE_OFFSET;
if (__map_without_bats) {
@@ -169,10 +179,10 @@ unsigned long __init mmu_mapin_ram(unsigned long base, unsigned long top)
return __mmu_mapin_ram(base, top);
done = __mmu_mapin_ram(base, border);
if (done != border - base)
if (done != border)
return done;
return done + __mmu_mapin_ram(border, top);
return __mmu_mapin_ram(border, top);
}
void mmu_mark_initmem_nx(void)

View File

@@ -29,12 +29,12 @@ extern int __vdso_gettimeofday(struct timeval *tv, struct timezone *tz);
extern time_t __vdso_time(time_t *t);
#ifdef CONFIG_PARAVIRT_CLOCK
extern u8 pvclock_page
extern u8 pvclock_page[PAGE_SIZE]
__attribute__((visibility("hidden")));
#endif
#ifdef CONFIG_HYPERV_TSCPAGE
extern u8 hvclock_page
extern u8 hvclock_page[PAGE_SIZE]
__attribute__((visibility("hidden")));
#endif

View File

@@ -116,6 +116,110 @@ static __initconst const u64 amd_hw_cache_event_ids
},
};
static __initconst const u64 amd_hw_cache_event_ids_f17h
[PERF_COUNT_HW_CACHE_MAX]
[PERF_COUNT_HW_CACHE_OP_MAX]
[PERF_COUNT_HW_CACHE_RESULT_MAX] = {
[C(L1D)] = {
[C(OP_READ)] = {
[C(RESULT_ACCESS)] = 0x0040, /* Data Cache Accesses */
[C(RESULT_MISS)] = 0xc860, /* L2$ access from DC Miss */
},
[C(OP_WRITE)] = {
[C(RESULT_ACCESS)] = 0,
[C(RESULT_MISS)] = 0,
},
[C(OP_PREFETCH)] = {
[C(RESULT_ACCESS)] = 0xff5a, /* h/w prefetch DC Fills */
[C(RESULT_MISS)] = 0,
},
},
[C(L1I)] = {
[C(OP_READ)] = {
[C(RESULT_ACCESS)] = 0x0080, /* Instruction cache fetches */
[C(RESULT_MISS)] = 0x0081, /* Instruction cache misses */
},
[C(OP_WRITE)] = {
[C(RESULT_ACCESS)] = -1,
[C(RESULT_MISS)] = -1,
},
[C(OP_PREFETCH)] = {
[C(RESULT_ACCESS)] = 0,
[C(RESULT_MISS)] = 0,
},
},
[C(LL)] = {
[C(OP_READ)] = {
[C(RESULT_ACCESS)] = 0,
[C(RESULT_MISS)] = 0,
},
[C(OP_WRITE)] = {
[C(RESULT_ACCESS)] = 0,
[C(RESULT_MISS)] = 0,
},
[C(OP_PREFETCH)] = {
[C(RESULT_ACCESS)] = 0,
[C(RESULT_MISS)] = 0,
},
},
[C(DTLB)] = {
[C(OP_READ)] = {
[C(RESULT_ACCESS)] = 0xff45, /* All L2 DTLB accesses */
[C(RESULT_MISS)] = 0xf045, /* L2 DTLB misses (PT walks) */
},
[C(OP_WRITE)] = {
[C(RESULT_ACCESS)] = 0,
[C(RESULT_MISS)] = 0,
},
[C(OP_PREFETCH)] = {
[C(RESULT_ACCESS)] = 0,
[C(RESULT_MISS)] = 0,
},
},
[C(ITLB)] = {
[C(OP_READ)] = {
[C(RESULT_ACCESS)] = 0x0084, /* L1 ITLB misses, L2 ITLB hits */
[C(RESULT_MISS)] = 0xff85, /* L1 ITLB misses, L2 misses */
},
[C(OP_WRITE)] = {
[C(RESULT_ACCESS)] = -1,
[C(RESULT_MISS)] = -1,
},
[C(OP_PREFETCH)] = {
[C(RESULT_ACCESS)] = -1,
[C(RESULT_MISS)] = -1,
},
},
[C(BPU)] = {
[C(OP_READ)] = {
[C(RESULT_ACCESS)] = 0x00c2, /* Retired Branch Instr. */
[C(RESULT_MISS)] = 0x00c3, /* Retired Mispredicted BI */
},
[C(OP_WRITE)] = {
[C(RESULT_ACCESS)] = -1,
[C(RESULT_MISS)] = -1,
},
[C(OP_PREFETCH)] = {
[C(RESULT_ACCESS)] = -1,
[C(RESULT_MISS)] = -1,
},
},
[C(NODE)] = {
[C(OP_READ)] = {
[C(RESULT_ACCESS)] = 0,
[C(RESULT_MISS)] = 0,
},
[C(OP_WRITE)] = {
[C(RESULT_ACCESS)] = -1,
[C(RESULT_MISS)] = -1,
},
[C(OP_PREFETCH)] = {
[C(RESULT_ACCESS)] = -1,
[C(RESULT_MISS)] = -1,
},
},
};
/*
* AMD Performance Monitor K7 and later, up to and including Family 16h:
*/
@@ -865,9 +969,10 @@ __init int amd_pmu_init(void)
x86_pmu.amd_nb_constraints = 0;
}
/* Events are common for all AMDs */
memcpy(hw_cache_event_ids, amd_hw_cache_event_ids,
sizeof(hw_cache_event_ids));
if (boot_cpu_data.x86 >= 0x17)
memcpy(hw_cache_event_ids, amd_hw_cache_event_ids_f17h, sizeof(hw_cache_event_ids));
else
memcpy(hw_cache_event_ids, amd_hw_cache_event_ids, sizeof(hw_cache_event_ids));
return 0;
}

View File

@@ -2091,15 +2091,19 @@ static void intel_pmu_disable_event(struct perf_event *event)
cpuc->intel_ctrl_host_mask &= ~(1ull << hwc->idx);
cpuc->intel_cp_status &= ~(1ull << hwc->idx);
if (unlikely(event->attr.precise_ip))
intel_pmu_pebs_disable(event);
if (unlikely(hwc->config_base == MSR_ARCH_PERFMON_FIXED_CTR_CTRL)) {
intel_pmu_disable_fixed(hwc);
return;
}
x86_pmu_disable_event(event);
/*
* Needs to be called after x86_pmu_disable_event,
* so we don't trigger the event without PEBS bit set.
*/
if (unlikely(event->attr.precise_ip))
intel_pmu_pebs_disable(event);
}
static void intel_pmu_del_event(struct perf_event *event)

View File

@@ -1525,8 +1525,7 @@ static __init int pt_init(void)
}
if (!intel_pt_validate_hw_cap(PT_CAP_topa_multiple_entries))
pt_pmu.pmu.capabilities =
PERF_PMU_CAP_AUX_NO_SG | PERF_PMU_CAP_AUX_SW_DOUBLEBUF;
pt_pmu.pmu.capabilities = PERF_PMU_CAP_AUX_NO_SG;
pt_pmu.pmu.capabilities |= PERF_PMU_CAP_EXCLUSIVE | PERF_PMU_CAP_ITRACE;
pt_pmu.pmu.attr_groups = pt_attr_groups;

View File

@@ -295,6 +295,7 @@ union kvm_mmu_extended_role {
unsigned int valid:1;
unsigned int execonly:1;
unsigned int cr0_pg:1;
unsigned int cr4_pae:1;
unsigned int cr4_pse:1;
unsigned int cr4_pke:1;
unsigned int cr4_smap:1;

View File

@@ -46,7 +46,7 @@ void ptdump_walk_user_pgd_level_checkwx(void);
*/
extern unsigned long empty_zero_page[PAGE_SIZE / sizeof(unsigned long)]
__visible;
#define ZERO_PAGE(vaddr) (virt_to_page(empty_zero_page))
#define ZERO_PAGE(vaddr) ((void)(vaddr),virt_to_page(empty_zero_page))
extern spinlock_t pgd_lock;
extern struct list_head pgd_list;

View File

@@ -381,6 +381,7 @@ struct kvm_sync_regs {
#define KVM_X86_QUIRK_LINT0_REENABLED (1 << 0)
#define KVM_X86_QUIRK_CD_NW_CLEARED (1 << 1)
#define KVM_X86_QUIRK_LAPIC_MMIO_HOLE (1 << 2)
#define KVM_X86_QUIRK_OUT_7E_INC_RIP (1 << 3)
#define KVM_STATE_NESTED_GUEST_MODE 0x00000001
#define KVM_STATE_NESTED_RUN_PENDING 0x00000002

View File

@@ -1371,7 +1371,16 @@ static u64 kvm_hv_flush_tlb(struct kvm_vcpu *current_vcpu, u64 ingpa,
valid_bank_mask = BIT_ULL(0);
sparse_banks[0] = flush.processor_mask;
all_cpus = flush.flags & HV_FLUSH_ALL_PROCESSORS;
/*
* Work around possible WS2012 bug: it sends hypercalls
* with processor_mask = 0x0 and HV_FLUSH_ALL_PROCESSORS clear,
* while also expecting us to flush something and crashing if
* we don't. Let's treat processor_mask == 0 same as
* HV_FLUSH_ALL_PROCESSORS.
*/
all_cpus = (flush.flags & HV_FLUSH_ALL_PROCESSORS) ||
flush.processor_mask == 0;
} else {
if (unlikely(kvm_read_guest(kvm, ingpa, &flush_ex,
sizeof(flush_ex))))

View File

@@ -70,7 +70,6 @@
#define APIC_BROADCAST 0xFF
#define X2APIC_BROADCAST 0xFFFFFFFFul
static bool lapic_timer_advance_adjust_done = false;
#define LAPIC_TIMER_ADVANCE_ADJUST_DONE 100
/* step-by-step approximation to mitigate fluctuation */
#define LAPIC_TIMER_ADVANCE_ADJUST_STEP 8
@@ -1482,14 +1481,32 @@ static bool lapic_timer_int_injected(struct kvm_vcpu *vcpu)
return false;
}
static inline void __wait_lapic_expire(struct kvm_vcpu *vcpu, u64 guest_cycles)
{
u64 timer_advance_ns = vcpu->arch.apic->lapic_timer.timer_advance_ns;
/*
* If the guest TSC is running at a different ratio than the host, then
* convert the delay to nanoseconds to achieve an accurate delay. Note
* that __delay() uses delay_tsc whenever the hardware has TSC, thus
* always for VMX enabled hardware.
*/
if (vcpu->arch.tsc_scaling_ratio == kvm_default_tsc_scaling_ratio) {
__delay(min(guest_cycles,
nsec_to_cycles(vcpu, timer_advance_ns)));
} else {
u64 delay_ns = guest_cycles * 1000000ULL;
do_div(delay_ns, vcpu->arch.virtual_tsc_khz);
ndelay(min_t(u32, delay_ns, timer_advance_ns));
}
}
void wait_lapic_expire(struct kvm_vcpu *vcpu)
{
struct kvm_lapic *apic = vcpu->arch.apic;
u32 timer_advance_ns = apic->lapic_timer.timer_advance_ns;
u64 guest_tsc, tsc_deadline, ns;
if (!lapic_in_kernel(vcpu))
return;
if (apic->lapic_timer.expired_tscdeadline == 0)
return;
@@ -1501,33 +1518,37 @@ void wait_lapic_expire(struct kvm_vcpu *vcpu)
guest_tsc = kvm_read_l1_tsc(vcpu, rdtsc());
trace_kvm_wait_lapic_expire(vcpu->vcpu_id, guest_tsc - tsc_deadline);
/* __delay is delay_tsc whenever the hardware has TSC, thus always. */
if (guest_tsc < tsc_deadline)
__delay(min(tsc_deadline - guest_tsc,
nsec_to_cycles(vcpu, lapic_timer_advance_ns)));
__wait_lapic_expire(vcpu, tsc_deadline - guest_tsc);
if (!lapic_timer_advance_adjust_done) {
if (!apic->lapic_timer.timer_advance_adjust_done) {
/* too early */
if (guest_tsc < tsc_deadline) {
ns = (tsc_deadline - guest_tsc) * 1000000ULL;
do_div(ns, vcpu->arch.virtual_tsc_khz);
lapic_timer_advance_ns -= min((unsigned int)ns,
lapic_timer_advance_ns / LAPIC_TIMER_ADVANCE_ADJUST_STEP);
timer_advance_ns -= min((u32)ns,
timer_advance_ns / LAPIC_TIMER_ADVANCE_ADJUST_STEP);
} else {
/* too late */
ns = (guest_tsc - tsc_deadline) * 1000000ULL;
do_div(ns, vcpu->arch.virtual_tsc_khz);
lapic_timer_advance_ns += min((unsigned int)ns,
lapic_timer_advance_ns / LAPIC_TIMER_ADVANCE_ADJUST_STEP);
timer_advance_ns += min((u32)ns,
timer_advance_ns / LAPIC_TIMER_ADVANCE_ADJUST_STEP);
}
if (abs(guest_tsc - tsc_deadline) < LAPIC_TIMER_ADVANCE_ADJUST_DONE)
lapic_timer_advance_adjust_done = true;
apic->lapic_timer.timer_advance_adjust_done = true;
if (unlikely(timer_advance_ns > 5000)) {
timer_advance_ns = 0;
apic->lapic_timer.timer_advance_adjust_done = true;
}
apic->lapic_timer.timer_advance_ns = timer_advance_ns;
}
}
static void start_sw_tscdeadline(struct kvm_lapic *apic)
{
u64 guest_tsc, tscdeadline = apic->lapic_timer.tscdeadline;
struct kvm_timer *ktimer = &apic->lapic_timer;
u64 guest_tsc, tscdeadline = ktimer->tscdeadline;
u64 ns = 0;
ktime_t expire;
struct kvm_vcpu *vcpu = apic->vcpu;
@@ -1542,13 +1563,15 @@ static void start_sw_tscdeadline(struct kvm_lapic *apic)
now = ktime_get();
guest_tsc = kvm_read_l1_tsc(vcpu, rdtsc());
if (likely(tscdeadline > guest_tsc)) {
ns = (tscdeadline - guest_tsc) * 1000000ULL;
do_div(ns, this_tsc_khz);
ns = (tscdeadline - guest_tsc) * 1000000ULL;
do_div(ns, this_tsc_khz);
if (likely(tscdeadline > guest_tsc) &&
likely(ns > apic->lapic_timer.timer_advance_ns)) {
expire = ktime_add_ns(now, ns);
expire = ktime_sub_ns(expire, lapic_timer_advance_ns);
hrtimer_start(&apic->lapic_timer.timer,
expire, HRTIMER_MODE_ABS_PINNED);
expire = ktime_sub_ns(expire, ktimer->timer_advance_ns);
hrtimer_start(&ktimer->timer, expire, HRTIMER_MODE_ABS_PINNED);
} else
apic_timer_expired(apic);
@@ -2255,7 +2278,7 @@ static enum hrtimer_restart apic_timer_fn(struct hrtimer *data)
return HRTIMER_NORESTART;
}
int kvm_create_lapic(struct kvm_vcpu *vcpu)
int kvm_create_lapic(struct kvm_vcpu *vcpu, int timer_advance_ns)
{
struct kvm_lapic *apic;
@@ -2279,6 +2302,14 @@ int kvm_create_lapic(struct kvm_vcpu *vcpu)
hrtimer_init(&apic->lapic_timer.timer, CLOCK_MONOTONIC,
HRTIMER_MODE_ABS_PINNED);
apic->lapic_timer.timer.function = apic_timer_fn;
if (timer_advance_ns == -1) {
apic->lapic_timer.timer_advance_ns = 1000;
apic->lapic_timer.timer_advance_adjust_done = false;
} else {
apic->lapic_timer.timer_advance_ns = timer_advance_ns;
apic->lapic_timer.timer_advance_adjust_done = true;
}
/*
* APIC is created enabled. This will prevent kvm_lapic_set_base from

View File

@@ -31,8 +31,10 @@ struct kvm_timer {
u32 timer_mode_mask;
u64 tscdeadline;
u64 expired_tscdeadline;
u32 timer_advance_ns;
atomic_t pending; /* accumulated triggered timers */
bool hv_timer_in_use;
bool timer_advance_adjust_done;
};
struct kvm_lapic {
@@ -62,7 +64,7 @@ struct kvm_lapic {
struct dest_map;
int kvm_create_lapic(struct kvm_vcpu *vcpu);
int kvm_create_lapic(struct kvm_vcpu *vcpu, int timer_advance_ns);
void kvm_free_lapic(struct kvm_vcpu *vcpu);
int kvm_apic_has_interrupt(struct kvm_vcpu *vcpu);

View File

@@ -4781,6 +4781,7 @@ static union kvm_mmu_extended_role kvm_calc_mmu_role_ext(struct kvm_vcpu *vcpu)
union kvm_mmu_extended_role ext = {0};
ext.cr0_pg = !!is_paging(vcpu);
ext.cr4_pae = !!is_pae(vcpu);
ext.cr4_smep = !!kvm_read_cr4_bits(vcpu, X86_CR4_SMEP);
ext.cr4_smap = !!kvm_read_cr4_bits(vcpu, X86_CR4_SMAP);
ext.cr4_pse = !!is_pse(vcpu);

View File

@@ -5423,7 +5423,7 @@ static int vmx_set_nested_state(struct kvm_vcpu *vcpu,
return ret;
/* Empty 'VMXON' state is permitted */
if (kvm_state->size < sizeof(kvm_state) + sizeof(*vmcs12))
if (kvm_state->size < sizeof(*kvm_state) + sizeof(*vmcs12))
return 0;
if (kvm_state->vmx.vmcs_pa != -1ull) {
@@ -5467,7 +5467,7 @@ static int vmx_set_nested_state(struct kvm_vcpu *vcpu,
vmcs12->vmcs_link_pointer != -1ull) {
struct vmcs12 *shadow_vmcs12 = get_shadow_vmcs12(vcpu);
if (kvm_state->size < sizeof(kvm_state) + 2 * sizeof(*vmcs12))
if (kvm_state->size < sizeof(*kvm_state) + 2 * sizeof(*vmcs12))
return -EINVAL;
if (copy_from_user(shadow_vmcs12,

View File

@@ -3,6 +3,7 @@
#include <asm/asm.h>
#include <asm/bitsperlong.h>
#include <asm/kvm_vcpu_regs.h>
#include <asm/nospec-branch.h>
#define WORD_SIZE (BITS_PER_LONG / 8)
@@ -77,6 +78,17 @@ ENDPROC(vmx_vmenter)
* referred to by VMCS.HOST_RIP.
*/
ENTRY(vmx_vmexit)
#ifdef CONFIG_RETPOLINE
ALTERNATIVE "jmp .Lvmexit_skip_rsb", "", X86_FEATURE_RETPOLINE
/* Preserve guest's RAX, it's used to stuff the RSB. */
push %_ASM_AX
/* IMPORTANT: Stuff the RSB immediately after VM-Exit, before RET! */
FILL_RETURN_BUFFER %_ASM_AX, RSB_CLEAR_LOOPS, X86_FEATURE_RETPOLINE
pop %_ASM_AX
.Lvmexit_skip_rsb:
#endif
ret
ENDPROC(vmx_vmexit)

View File

@@ -6462,9 +6462,6 @@ static void vmx_vcpu_run(struct kvm_vcpu *vcpu)
x86_spec_ctrl_restore_host(vmx->spec_ctrl, 0);
/* Eliminate branch target predictions from guest mode */
vmexit_fill_RSB();
/* All fields are clean at this point */
if (static_branch_unlikely(&enable_evmcs))
current_evmcs->hv_clean_fields |=
@@ -7032,6 +7029,7 @@ static int vmx_set_hv_timer(struct kvm_vcpu *vcpu, u64 guest_deadline_tsc)
{
struct vcpu_vmx *vmx;
u64 tscl, guest_tscl, delta_tsc, lapic_timer_advance_cycles;
struct kvm_timer *ktimer = &vcpu->arch.apic->lapic_timer;
if (kvm_mwait_in_guest(vcpu->kvm))
return -EOPNOTSUPP;
@@ -7040,7 +7038,8 @@ static int vmx_set_hv_timer(struct kvm_vcpu *vcpu, u64 guest_deadline_tsc)
tscl = rdtsc();
guest_tscl = kvm_read_l1_tsc(vcpu, tscl);
delta_tsc = max(guest_deadline_tsc, guest_tscl) - guest_tscl;
lapic_timer_advance_cycles = nsec_to_cycles(vcpu, lapic_timer_advance_ns);
lapic_timer_advance_cycles = nsec_to_cycles(vcpu,
ktimer->timer_advance_ns);
if (delta_tsc > lapic_timer_advance_cycles)
delta_tsc -= lapic_timer_advance_cycles;

View File

@@ -136,10 +136,14 @@ EXPORT_SYMBOL_GPL(kvm_default_tsc_scaling_ratio);
static u32 __read_mostly tsc_tolerance_ppm = 250;
module_param(tsc_tolerance_ppm, uint, S_IRUGO | S_IWUSR);
/* lapic timer advance (tscdeadline mode only) in nanoseconds */
unsigned int __read_mostly lapic_timer_advance_ns = 1000;
/*
* lapic timer advance (tscdeadline mode only) in nanoseconds. '-1' enables
* adaptive tuning starting from default advancment of 1000ns. '0' disables
* advancement entirely. Any other value is used as-is and disables adaptive
* tuning, i.e. allows priveleged userspace to set an exact advancement time.
*/
static int __read_mostly lapic_timer_advance_ns = -1;
module_param(lapic_timer_advance_ns, uint, S_IRUGO | S_IWUSR);
EXPORT_SYMBOL_GPL(lapic_timer_advance_ns);
static bool __read_mostly vector_hashing = true;
module_param(vector_hashing, bool, S_IRUGO);
@@ -6535,6 +6539,12 @@ int kvm_emulate_instruction_from_buffer(struct kvm_vcpu *vcpu,
}
EXPORT_SYMBOL_GPL(kvm_emulate_instruction_from_buffer);
static int complete_fast_pio_out_port_0x7e(struct kvm_vcpu *vcpu)
{
vcpu->arch.pio.count = 0;
return 1;
}
static int complete_fast_pio_out(struct kvm_vcpu *vcpu)
{
vcpu->arch.pio.count = 0;
@@ -6551,12 +6561,23 @@ static int kvm_fast_pio_out(struct kvm_vcpu *vcpu, int size,
unsigned long val = kvm_register_read(vcpu, VCPU_REGS_RAX);
int ret = emulator_pio_out_emulated(&vcpu->arch.emulate_ctxt,
size, port, &val, 1);
if (ret)
return ret;
if (!ret) {
/*
* Workaround userspace that relies on old KVM behavior of %rip being
* incremented prior to exiting to userspace to handle "OUT 0x7e".
*/
if (port == 0x7e &&
kvm_check_has_quirk(vcpu->kvm, KVM_X86_QUIRK_OUT_7E_INC_RIP)) {
vcpu->arch.complete_userspace_io =
complete_fast_pio_out_port_0x7e;
kvm_skip_emulated_instruction(vcpu);
} else {
vcpu->arch.pio.linear_rip = kvm_get_linear_rip(vcpu);
vcpu->arch.complete_userspace_io = complete_fast_pio_out;
}
return ret;
return 0;
}
static int complete_fast_pio_in(struct kvm_vcpu *vcpu)
@@ -7873,7 +7894,8 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
}
trace_kvm_entry(vcpu->vcpu_id);
if (lapic_timer_advance_ns)
if (lapic_in_kernel(vcpu) &&
vcpu->arch.apic->lapic_timer.timer_advance_ns)
wait_lapic_expire(vcpu);
guest_enter_irqoff();
@@ -9061,7 +9083,7 @@ int kvm_arch_vcpu_init(struct kvm_vcpu *vcpu)
if (irqchip_in_kernel(vcpu->kvm)) {
vcpu->arch.apicv_active = kvm_x86_ops->get_enable_apicv(vcpu);
r = kvm_create_lapic(vcpu);
r = kvm_create_lapic(vcpu, lapic_timer_advance_ns);
if (r < 0)
goto fail_mmu_destroy;
} else

View File

@@ -294,8 +294,6 @@ extern u64 kvm_supported_xcr0(void);
extern unsigned int min_timer_period_us;
extern unsigned int lapic_timer_advance_ns;
extern bool enable_vmware_backdoor;
extern struct static_key kvm_no_apic_vcpu;

View File

@@ -6,6 +6,18 @@
# Produces uninteresting flaky coverage.
KCOV_INSTRUMENT_delay.o := n
# Early boot use of cmdline; don't instrument it
ifdef CONFIG_AMD_MEM_ENCRYPT
KCOV_INSTRUMENT_cmdline.o := n
KASAN_SANITIZE_cmdline.o := n
ifdef CONFIG_FUNCTION_TRACER
CFLAGS_REMOVE_cmdline.o = -pg
endif
CFLAGS_cmdline.o := $(call cc-option, -fno-stack-protector)
endif
inat_tables_script = $(srctree)/arch/x86/tools/gen-insn-attr-x86.awk
inat_tables_maps = $(srctree)/arch/x86/lib/x86-opcode-map.txt
quiet_cmd_inat_tables = GEN $@

View File

@@ -81,12 +81,8 @@ acpi_status acpi_ev_enable_gpe(struct acpi_gpe_event_info *gpe_event_info)
ACPI_FUNCTION_TRACE(ev_enable_gpe);
/* Clear the GPE status */
status = acpi_hw_clear_gpe(gpe_event_info);
if (ACPI_FAILURE(status))
return_ACPI_STATUS(status);
/* Enable the requested GPE */
status = acpi_hw_low_set_gpe(gpe_event_info, ACPI_GPE_ENABLE);
return_ACPI_STATUS(status);
}

View File

@@ -46,6 +46,8 @@ static struct clk_lookup *clk_find(const char *dev_id, const char *con_id)
if (con_id)
best_possible += 1;
lockdep_assert_held(&clocks_mutex);
list_for_each_entry(p, &clocks, node) {
match = 0;
if (p->dev_id) {
@@ -402,7 +404,10 @@ void devm_clk_release_clkdev(struct device *dev, const char *con_id,
struct clk_lookup *cl;
int rval;
mutex_lock(&clocks_mutex);
cl = clk_find(dev_id, con_id);
mutex_unlock(&clocks_mutex);
WARN_ON(!cl);
rval = devres_release(dev, devm_clkdev_release,
devm_clk_match_clkdev, cl);

View File

@@ -167,7 +167,7 @@ static int ccu_nkmp_set_rate(struct clk_hw *hw, unsigned long rate,
unsigned long parent_rate)
{
struct ccu_nkmp *nkmp = hw_to_ccu_nkmp(hw);
u32 n_mask, k_mask, m_mask, p_mask;
u32 n_mask = 0, k_mask = 0, m_mask = 0, p_mask = 0;
struct _ccu_nkmp _nkmp;
unsigned long flags;
u32 reg;
@@ -186,10 +186,24 @@ static int ccu_nkmp_set_rate(struct clk_hw *hw, unsigned long rate,
ccu_nkmp_find_best(parent_rate, rate, &_nkmp);
n_mask = GENMASK(nkmp->n.width + nkmp->n.shift - 1, nkmp->n.shift);
k_mask = GENMASK(nkmp->k.width + nkmp->k.shift - 1, nkmp->k.shift);
m_mask = GENMASK(nkmp->m.width + nkmp->m.shift - 1, nkmp->m.shift);
p_mask = GENMASK(nkmp->p.width + nkmp->p.shift - 1, nkmp->p.shift);
/*
* If width is 0, GENMASK() macro may not generate expected mask (0)
* as it falls under undefined behaviour by C standard due to shifts
* which are equal or greater than width of left operand. This can
* be easily avoided by explicitly checking if width is 0.
*/
if (nkmp->n.width)
n_mask = GENMASK(nkmp->n.width + nkmp->n.shift - 1,
nkmp->n.shift);
if (nkmp->k.width)
k_mask = GENMASK(nkmp->k.width + nkmp->k.shift - 1,
nkmp->k.shift);
if (nkmp->m.width)
m_mask = GENMASK(nkmp->m.width + nkmp->m.shift - 1,
nkmp->m.shift);
if (nkmp->p.width)
p_mask = GENMASK(nkmp->p.width + nkmp->p.shift - 1,
nkmp->p.shift);
spin_lock_irqsave(nkmp->common.lock, flags);

View File

@@ -255,10 +255,14 @@ static struct drm_driver qxl_driver = {
#if defined(CONFIG_DEBUG_FS)
.debugfs_init = qxl_debugfs_init,
#endif
.prime_handle_to_fd = drm_gem_prime_handle_to_fd,
.prime_fd_to_handle = drm_gem_prime_fd_to_handle,
.gem_prime_export = drm_gem_prime_export,
.gem_prime_import = drm_gem_prime_import,
.gem_prime_pin = qxl_gem_prime_pin,
.gem_prime_unpin = qxl_gem_prime_unpin,
.gem_prime_get_sg_table = qxl_gem_prime_get_sg_table,
.gem_prime_import_sg_table = qxl_gem_prime_import_sg_table,
.gem_prime_vmap = qxl_gem_prime_vmap,
.gem_prime_vunmap = qxl_gem_prime_vunmap,
.gem_prime_mmap = qxl_gem_prime_mmap,

View File

@@ -42,6 +42,18 @@ void qxl_gem_prime_unpin(struct drm_gem_object *obj)
qxl_bo_unpin(bo);
}
struct sg_table *qxl_gem_prime_get_sg_table(struct drm_gem_object *obj)
{
return ERR_PTR(-ENOSYS);
}
struct drm_gem_object *qxl_gem_prime_import_sg_table(
struct drm_device *dev, struct dma_buf_attachment *attach,
struct sg_table *table)
{
return ERR_PTR(-ENOSYS);
}
void *qxl_gem_prime_vmap(struct drm_gem_object *obj)
{
struct qxl_bo *bo = gem_to_qxl_bo(obj);

View File

@@ -426,8 +426,7 @@ i2c_dw_xfer(struct i2c_adapter *adap, struct i2c_msg msgs[], int num)
pm_runtime_get_sync(dev->dev);
if (dev->suspended) {
dev_err(dev->dev, "Error %s call while suspended\n", __func__);
if (dev_WARN_ONCE(dev->dev, dev->suspended, "Transfer while suspended\n")) {
ret = -ESHUTDOWN;
goto done_nolock;
}

View File

@@ -515,9 +515,9 @@ static int i2c_imx_clk_notifier_call(struct notifier_block *nb,
unsigned long action, void *data)
{
struct clk_notifier_data *ndata = data;
struct imx_i2c_struct *i2c_imx = container_of(&ndata->clk,
struct imx_i2c_struct *i2c_imx = container_of(nb,
struct imx_i2c_struct,
clk);
clk_change_nb);
if (action & POST_RATE_CHANGE)
i2c_imx_set_clk(i2c_imx, ndata->new_rate);

View File

@@ -597,6 +597,8 @@ static int synquacer_i2c_probe(struct platform_device *pdev)
i2c->adapter = synquacer_i2c_ops;
i2c_set_adapdata(&i2c->adapter, i2c);
i2c->adapter.dev.parent = &pdev->dev;
i2c->adapter.dev.of_node = pdev->dev.of_node;
ACPI_COMPANION_SET(&i2c->adapter.dev, ACPI_COMPANION(&pdev->dev));
i2c->adapter.nr = pdev->id;
init_completion(&i2c->completion);

View File

@@ -185,7 +185,7 @@ static int i2c_generic_bus_free(struct i2c_adapter *adap)
int i2c_generic_scl_recovery(struct i2c_adapter *adap)
{
struct i2c_bus_recovery_info *bri = adap->bus_recovery_info;
int i = 0, scl = 1, ret;
int i = 0, scl = 1, ret = 0;
if (bri->prepare_recovery)
bri->prepare_recovery(adap);
@@ -327,6 +327,8 @@ static int i2c_device_probe(struct device *dev)
if (client->flags & I2C_CLIENT_HOST_NOTIFY) {
dev_dbg(dev, "Using Host Notify IRQ\n");
/* Keep adapter active when Host Notify is required */
pm_runtime_get_sync(&client->adapter->dev);
irq = i2c_smbus_host_notify_to_irq(client);
} else if (dev->of_node) {
irq = of_irq_get_byname(dev->of_node, "irq");
@@ -431,6 +433,8 @@ static int i2c_device_remove(struct device *dev)
device_init_wakeup(&client->dev, false);
client->irq = client->init_irq;
if (client->flags & I2C_CLIENT_HOST_NOTIFY)
pm_runtime_put(&client->adapter->dev);
return status;
}

View File

@@ -895,7 +895,7 @@ static vm_fault_t rdma_umap_fault(struct vm_fault *vmf)
/* Read only pages can just use the system zero page. */
if (!(vmf->vma->vm_flags & (VM_WRITE | VM_MAYWRITE))) {
vmf->page = ZERO_PAGE(vmf->vm_start);
vmf->page = ZERO_PAGE(vmf->address);
get_page(vmf->page);
return 0;
}

View File

@@ -722,12 +722,6 @@ static void marvell_nfc_select_target(struct nand_chip *chip,
struct marvell_nfc *nfc = to_marvell_nfc(chip->controller);
u32 ndcr_generic;
if (chip == nfc->selected_chip && die_nr == marvell_nand->selected_die)
return;
writel_relaxed(marvell_nand->ndtr0, nfc->regs + NDTR0);
writel_relaxed(marvell_nand->ndtr1, nfc->regs + NDTR1);
/*
* Reset the NDCR register to a clean state for this particular chip,
* also clear ND_RUN bit.
@@ -739,6 +733,12 @@ static void marvell_nfc_select_target(struct nand_chip *chip,
/* Also reset the interrupt status register */
marvell_nfc_clear_int(nfc, NDCR_ALL_INT);
if (chip == nfc->selected_chip && die_nr == marvell_nand->selected_die)
return;
writel_relaxed(marvell_nand->ndtr0, nfc->regs + NDTR0);
writel_relaxed(marvell_nand->ndtr1, nfc->regs + NDTR1);
nfc->selected_chip = chip;
marvell_nand->selected_die = die_nr;
}

View File

@@ -886,6 +886,9 @@ static int bcm_sf2_cfp_rule_set(struct dsa_switch *ds, int port,
fs->m_ext.data[1]))
return -EINVAL;
if (fs->location != RX_CLS_LOC_ANY && fs->location >= CFP_NUM_RULES)
return -EINVAL;
if (fs->location != RX_CLS_LOC_ANY &&
test_bit(fs->location, priv->cfp.used))
return -EBUSY;
@@ -974,6 +977,9 @@ static int bcm_sf2_cfp_rule_del(struct bcm_sf2_priv *priv, int port, u32 loc)
struct cfp_rule *rule;
int ret;
if (loc >= CFP_NUM_RULES)
return -EINVAL;
/* Refuse deleting unused rules, and those that are not unique since
* that could leave IPv6 rules with one of the chained rule in the
* table.

View File

@@ -1625,7 +1625,7 @@ static int bnxt_rx_pkt(struct bnxt *bp, struct bnxt_cp_ring_info *cpr,
netdev_warn(bp->dev, "RX buffer error %x\n", rx_err);
bnxt_sched_reset(bp, rxr);
}
goto next_rx;
goto next_rx_no_len;
}
len = le32_to_cpu(rxcmp->rx_cmp_len_flags_type) >> RX_CMP_LEN_SHIFT;
@@ -1706,12 +1706,13 @@ static int bnxt_rx_pkt(struct bnxt *bp, struct bnxt_cp_ring_info *cpr,
rc = 1;
next_rx:
rxr->rx_prod = NEXT_RX(prod);
rxr->rx_next_cons = NEXT_RX(cons);
cpr->rx_packets += 1;
cpr->rx_bytes += len;
next_rx_no_len:
rxr->rx_prod = NEXT_RX(prod);
rxr->rx_next_cons = NEXT_RX(cons);
next_rx_no_prod_no_len:
*raw_cons = tmp_raw_cons;
@@ -5135,10 +5136,10 @@ static void bnxt_hwrm_ring_free(struct bnxt *bp, bool close_path)
for (i = 0; i < bp->tx_nr_rings; i++) {
struct bnxt_tx_ring_info *txr = &bp->tx_ring[i];
struct bnxt_ring_struct *ring = &txr->tx_ring_struct;
u32 cmpl_ring_id;
cmpl_ring_id = bnxt_cp_ring_for_tx(bp, txr);
if (ring->fw_ring_id != INVALID_HW_RING_ID) {
u32 cmpl_ring_id = bnxt_cp_ring_for_tx(bp, txr);
hwrm_ring_free_send_msg(bp, ring,
RING_FREE_REQ_RING_TYPE_TX,
close_path ? cmpl_ring_id :
@@ -5151,10 +5152,10 @@ static void bnxt_hwrm_ring_free(struct bnxt *bp, bool close_path)
struct bnxt_rx_ring_info *rxr = &bp->rx_ring[i];
struct bnxt_ring_struct *ring = &rxr->rx_ring_struct;
u32 grp_idx = rxr->bnapi->index;
u32 cmpl_ring_id;
cmpl_ring_id = bnxt_cp_ring_for_rx(bp, rxr);
if (ring->fw_ring_id != INVALID_HW_RING_ID) {
u32 cmpl_ring_id = bnxt_cp_ring_for_rx(bp, rxr);
hwrm_ring_free_send_msg(bp, ring,
RING_FREE_REQ_RING_TYPE_RX,
close_path ? cmpl_ring_id :
@@ -5173,10 +5174,10 @@ static void bnxt_hwrm_ring_free(struct bnxt *bp, bool close_path)
struct bnxt_rx_ring_info *rxr = &bp->rx_ring[i];
struct bnxt_ring_struct *ring = &rxr->rx_agg_ring_struct;
u32 grp_idx = rxr->bnapi->index;
u32 cmpl_ring_id;
cmpl_ring_id = bnxt_cp_ring_for_rx(bp, rxr);
if (ring->fw_ring_id != INVALID_HW_RING_ID) {
u32 cmpl_ring_id = bnxt_cp_ring_for_rx(bp, rxr);
hwrm_ring_free_send_msg(bp, ring, type,
close_path ? cmpl_ring_id :
INVALID_HW_RING_ID);
@@ -5315,17 +5316,16 @@ __bnxt_hwrm_reserve_pf_rings(struct bnxt *bp, struct hwrm_func_cfg_input *req,
req->num_tx_rings = cpu_to_le16(tx_rings);
if (BNXT_NEW_RM(bp)) {
enables |= rx_rings ? FUNC_CFG_REQ_ENABLES_NUM_RX_RINGS : 0;
enables |= stats ? FUNC_CFG_REQ_ENABLES_NUM_STAT_CTXS : 0;
if (bp->flags & BNXT_FLAG_CHIP_P5) {
enables |= cp_rings ? FUNC_CFG_REQ_ENABLES_NUM_MSIX : 0;
enables |= tx_rings + ring_grps ?
FUNC_CFG_REQ_ENABLES_NUM_CMPL_RINGS |
FUNC_CFG_REQ_ENABLES_NUM_STAT_CTXS : 0;
FUNC_CFG_REQ_ENABLES_NUM_CMPL_RINGS : 0;
enables |= rx_rings ?
FUNC_CFG_REQ_ENABLES_NUM_RSSCOS_CTXS : 0;
} else {
enables |= cp_rings ?
FUNC_CFG_REQ_ENABLES_NUM_CMPL_RINGS |
FUNC_CFG_REQ_ENABLES_NUM_STAT_CTXS : 0;
FUNC_CFG_REQ_ENABLES_NUM_CMPL_RINGS : 0;
enables |= ring_grps ?
FUNC_CFG_REQ_ENABLES_NUM_HW_RING_GRPS |
FUNC_CFG_REQ_ENABLES_NUM_RSSCOS_CTXS : 0;
@@ -5365,14 +5365,13 @@ __bnxt_hwrm_reserve_vf_rings(struct bnxt *bp,
enables |= tx_rings ? FUNC_VF_CFG_REQ_ENABLES_NUM_TX_RINGS : 0;
enables |= rx_rings ? FUNC_VF_CFG_REQ_ENABLES_NUM_RX_RINGS |
FUNC_VF_CFG_REQ_ENABLES_NUM_RSSCOS_CTXS : 0;
enables |= stats ? FUNC_VF_CFG_REQ_ENABLES_NUM_STAT_CTXS : 0;
if (bp->flags & BNXT_FLAG_CHIP_P5) {
enables |= tx_rings + ring_grps ?
FUNC_VF_CFG_REQ_ENABLES_NUM_CMPL_RINGS |
FUNC_VF_CFG_REQ_ENABLES_NUM_STAT_CTXS : 0;
FUNC_VF_CFG_REQ_ENABLES_NUM_CMPL_RINGS : 0;
} else {
enables |= cp_rings ?
FUNC_VF_CFG_REQ_ENABLES_NUM_CMPL_RINGS |
FUNC_VF_CFG_REQ_ENABLES_NUM_STAT_CTXS : 0;
FUNC_VF_CFG_REQ_ENABLES_NUM_CMPL_RINGS : 0;
enables |= ring_grps ?
FUNC_VF_CFG_REQ_ENABLES_NUM_HW_RING_GRPS : 0;
}
@@ -6753,6 +6752,7 @@ static int bnxt_hwrm_port_qstats_ext(struct bnxt *bp)
struct hwrm_queue_pri2cos_qcfg_input req2 = {0};
struct hwrm_port_qstats_ext_input req = {0};
struct bnxt_pf_info *pf = &bp->pf;
u32 tx_stat_size;
int rc;
if (!(bp->flags & BNXT_FLAG_PORT_STATS_EXT))
@@ -6762,13 +6762,16 @@ static int bnxt_hwrm_port_qstats_ext(struct bnxt *bp)
req.port_id = cpu_to_le16(pf->port_id);
req.rx_stat_size = cpu_to_le16(sizeof(struct rx_port_stats_ext));
req.rx_stat_host_addr = cpu_to_le64(bp->hw_rx_port_stats_ext_map);
req.tx_stat_size = cpu_to_le16(sizeof(struct tx_port_stats_ext));
tx_stat_size = bp->hw_tx_port_stats_ext ?
sizeof(*bp->hw_tx_port_stats_ext) : 0;
req.tx_stat_size = cpu_to_le16(tx_stat_size);
req.tx_stat_host_addr = cpu_to_le64(bp->hw_tx_port_stats_ext_map);
mutex_lock(&bp->hwrm_cmd_lock);
rc = _hwrm_send_message(bp, &req, sizeof(req), HWRM_CMD_TIMEOUT);
if (!rc) {
bp->fw_rx_stats_ext_size = le16_to_cpu(resp->rx_stat_size) / 8;
bp->fw_tx_stats_ext_size = le16_to_cpu(resp->tx_stat_size) / 8;
bp->fw_tx_stats_ext_size = tx_stat_size ?
le16_to_cpu(resp->tx_stat_size) / 8 : 0;
} else {
bp->fw_rx_stats_ext_size = 0;
bp->fw_tx_stats_ext_size = 0;
@@ -8961,8 +8964,15 @@ static int bnxt_cfg_rx_mode(struct bnxt *bp)
skip_uc:
rc = bnxt_hwrm_cfa_l2_set_rx_mask(bp, 0);
if (rc && vnic->mc_list_count) {
netdev_info(bp->dev, "Failed setting MC filters rc: %d, turning on ALL_MCAST mode\n",
rc);
vnic->rx_mask |= CFA_L2_SET_RX_MASK_REQ_MASK_ALL_MCAST;
vnic->mc_list_count = 0;
rc = bnxt_hwrm_cfa_l2_set_rx_mask(bp, 0);
}
if (rc)
netdev_err(bp->dev, "HWRM cfa l2 rx mask failure rc: %x\n",
netdev_err(bp->dev, "HWRM cfa l2 rx mask failure rc: %d\n",
rc);
return rc;
@@ -10685,6 +10695,7 @@ init_err_cleanup_tc:
bnxt_clear_int_mode(bp);
init_err_pci_clean:
bnxt_free_hwrm_short_cmd_req(bp);
bnxt_free_hwrm_resources(bp);
bnxt_free_ctx_mem(bp);
kfree(bp->ctx);

View File

@@ -333,6 +333,9 @@ static int stm32mp1_parse_data(struct stm32_dwmac *dwmac,
*/
dwmac->irq_pwr_wakeup = platform_get_irq_byname(pdev,
"stm32_pwr_wakeup");
if (dwmac->irq_pwr_wakeup == -EPROBE_DEFER)
return -EPROBE_DEFER;
if (!dwmac->clk_eth_ck && dwmac->irq_pwr_wakeup >= 0) {
err = device_init_wakeup(&pdev->dev, true);
if (err) {

View File

@@ -160,7 +160,7 @@ static const struct dmi_system_id quark_pci_dmi[] = {
.driver_data = (void *)&galileo_stmmac_dmi_data,
},
/*
* There are 2 types of SIMATIC IOT2000: IOT20202 and IOT2040.
* There are 2 types of SIMATIC IOT2000: IOT2020 and IOT2040.
* The asset tag "6ES7647-0AA00-0YA2" is only for IOT2020 which
* has only one pci network device while other asset tags are
* for IOT2040 which has two.

View File

@@ -533,6 +533,8 @@ mcr20a_start(struct ieee802154_hw *hw)
dev_dbg(printdev(lp), "no slotted operation\n");
ret = regmap_update_bits(lp->regmap_dar, DAR_PHY_CTRL1,
DAR_PHY_CTRL1_SLOTTED, 0x0);
if (ret < 0)
return ret;
/* enable irq */
enable_irq(lp->spi->irq);
@@ -540,11 +542,15 @@ mcr20a_start(struct ieee802154_hw *hw)
/* Unmask SEQ interrupt */
ret = regmap_update_bits(lp->regmap_dar, DAR_PHY_CTRL2,
DAR_PHY_CTRL2_SEQMSK, 0x0);
if (ret < 0)
return ret;
/* Start the RX sequence */
dev_dbg(printdev(lp), "start the RX sequence\n");
ret = regmap_update_bits(lp->regmap_dar, DAR_PHY_CTRL1,
DAR_PHY_CTRL1_XCVSEQ_MASK, MCR20A_XCVSEQ_RX);
if (ret < 0)
return ret;
return 0;
}

View File

@@ -1489,9 +1489,10 @@ static int marvell_get_sset_count(struct phy_device *phydev)
static void marvell_get_strings(struct phy_device *phydev, u8 *data)
{
int count = marvell_get_sset_count(phydev);
int i;
for (i = 0; i < ARRAY_SIZE(marvell_hw_stats); i++) {
for (i = 0; i < count; i++) {
strlcpy(data + i * ETH_GSTRING_LEN,
marvell_hw_stats[i].string, ETH_GSTRING_LEN);
}
@@ -1519,9 +1520,10 @@ static u64 marvell_get_stat(struct phy_device *phydev, int i)
static void marvell_get_stats(struct phy_device *phydev,
struct ethtool_stats *stats, u64 *data)
{
int count = marvell_get_sset_count(phydev);
int i;
for (i = 0; i < ARRAY_SIZE(marvell_hw_stats); i++)
for (i = 0; i < count; i++)
data[i] = marvell_get_stat(phydev, i);
}

View File

@@ -1122,9 +1122,16 @@ static const struct usb_device_id products[] = {
{QMI_FIXED_INTF(0x0846, 0x68d3, 8)}, /* Netgear Aircard 779S */
{QMI_FIXED_INTF(0x12d1, 0x140c, 1)}, /* Huawei E173 */
{QMI_FIXED_INTF(0x12d1, 0x14ac, 1)}, /* Huawei E1820 */
{QMI_FIXED_INTF(0x1435, 0x0918, 3)}, /* Wistron NeWeb D16Q1 */
{QMI_FIXED_INTF(0x1435, 0x0918, 4)}, /* Wistron NeWeb D16Q1 */
{QMI_FIXED_INTF(0x1435, 0x0918, 5)}, /* Wistron NeWeb D16Q1 */
{QMI_FIXED_INTF(0x1435, 0x3185, 4)}, /* Wistron NeWeb M18Q5 */
{QMI_FIXED_INTF(0x1435, 0xd111, 4)}, /* M9615A DM11-1 D51QC */
{QMI_FIXED_INTF(0x1435, 0xd181, 3)}, /* Wistron NeWeb D18Q1 */
{QMI_FIXED_INTF(0x1435, 0xd181, 4)}, /* Wistron NeWeb D18Q1 */
{QMI_FIXED_INTF(0x1435, 0xd181, 5)}, /* Wistron NeWeb D18Q1 */
{QMI_FIXED_INTF(0x1435, 0xd182, 4)}, /* Wistron NeWeb D18 */
{QMI_FIXED_INTF(0x1435, 0xd182, 5)}, /* Wistron NeWeb D18 */
{QMI_FIXED_INTF(0x1435, 0xd191, 4)}, /* Wistron NeWeb D19Q1 */
{QMI_QUIRK_SET_DTR(0x1508, 0x1001, 4)}, /* Fibocom NL668 series */
{QMI_FIXED_INTF(0x16d8, 0x6003, 0)}, /* CMOTech 6003 */
@@ -1180,6 +1187,7 @@ static const struct usb_device_id products[] = {
{QMI_FIXED_INTF(0x19d2, 0x0265, 4)}, /* ONDA MT8205 4G LTE */
{QMI_FIXED_INTF(0x19d2, 0x0284, 4)}, /* ZTE MF880 */
{QMI_FIXED_INTF(0x19d2, 0x0326, 4)}, /* ZTE MF821D */
{QMI_FIXED_INTF(0x19d2, 0x0396, 3)}, /* ZTE ZM8620 */
{QMI_FIXED_INTF(0x19d2, 0x0412, 4)}, /* Telewell TW-LTE 4G */
{QMI_FIXED_INTF(0x19d2, 0x1008, 4)}, /* ZTE (Vodafone) K3570-Z */
{QMI_FIXED_INTF(0x19d2, 0x1010, 4)}, /* ZTE (Vodafone) K3571-Z */
@@ -1200,7 +1208,9 @@ static const struct usb_device_id products[] = {
{QMI_FIXED_INTF(0x19d2, 0x1425, 2)},
{QMI_FIXED_INTF(0x19d2, 0x1426, 2)}, /* ZTE MF91 */
{QMI_FIXED_INTF(0x19d2, 0x1428, 2)}, /* Telewell TW-LTE 4G v2 */
{QMI_FIXED_INTF(0x19d2, 0x1432, 3)}, /* ZTE ME3620 */
{QMI_FIXED_INTF(0x19d2, 0x2002, 4)}, /* ZTE (Vodafone) K3765-Z */
{QMI_FIXED_INTF(0x2001, 0x7e16, 3)}, /* D-Link DWM-221 */
{QMI_FIXED_INTF(0x2001, 0x7e19, 4)}, /* D-Link DWM-221 B1 */
{QMI_FIXED_INTF(0x2001, 0x7e35, 4)}, /* D-Link DWM-222 */
{QMI_FIXED_INTF(0x2020, 0x2031, 4)}, /* Olicard 600 */

View File

@@ -1855,7 +1855,7 @@ void ath10k_ce_dump_registers(struct ath10k *ar,
struct ath10k_ce_crash_data ce_data;
u32 addr, id;
lockdep_assert_held(&ar->data_lock);
lockdep_assert_held(&ar->dump_mutex);
ath10k_err(ar, "Copy Engine register dump:\n");

View File

@@ -3119,6 +3119,7 @@ struct ath10k *ath10k_core_create(size_t priv_size, struct device *dev,
goto err_free_wq;
mutex_init(&ar->conf_mutex);
mutex_init(&ar->dump_mutex);
spin_lock_init(&ar->data_lock);
INIT_LIST_HEAD(&ar->peers);

View File

@@ -1063,6 +1063,9 @@ struct ath10k {
/* prevents concurrent FW reconfiguration */
struct mutex conf_mutex;
/* protects coredump data */
struct mutex dump_mutex;
/* protects shared structure data */
spinlock_t data_lock;

View File

@@ -1102,7 +1102,7 @@ struct ath10k_fw_crash_data *ath10k_coredump_new(struct ath10k *ar)
{
struct ath10k_fw_crash_data *crash_data = ar->coredump.fw_crash_data;
lockdep_assert_held(&ar->data_lock);
lockdep_assert_held(&ar->dump_mutex);
if (ath10k_coredump_mask == 0)
/* coredump disabled */
@@ -1146,7 +1146,7 @@ static struct ath10k_dump_file_data *ath10k_coredump_build(struct ath10k *ar)
if (!buf)
return NULL;
spin_lock_bh(&ar->data_lock);
mutex_lock(&ar->dump_mutex);
dump_data = (struct ath10k_dump_file_data *)(buf);
strlcpy(dump_data->df_magic, "ATH10K-FW-DUMP",
@@ -1213,7 +1213,7 @@ static struct ath10k_dump_file_data *ath10k_coredump_build(struct ath10k *ar)
sofar += sizeof(*dump_tlv) + crash_data->ramdump_buf_len;
}
spin_unlock_bh(&ar->data_lock);
mutex_unlock(&ar->dump_mutex);
return dump_data;
}

View File

@@ -5774,7 +5774,7 @@ static void ath10k_bss_info_changed(struct ieee80211_hw *hw,
}
if (changed & BSS_CHANGED_MCAST_RATE &&
!WARN_ON(ath10k_mac_vif_chan(arvif->vif, &def))) {
!ath10k_mac_vif_chan(arvif->vif, &def)) {
band = def.chan->band;
rateidx = vif->bss_conf.mcast_rate[band] - 1;
@@ -5812,7 +5812,7 @@ static void ath10k_bss_info_changed(struct ieee80211_hw *hw,
}
if (changed & BSS_CHANGED_BASIC_RATES) {
if (WARN_ON(ath10k_mac_vif_chan(vif, &def))) {
if (ath10k_mac_vif_chan(vif, &def)) {
mutex_unlock(&ar->conf_mutex);
return;
}

View File

@@ -1441,7 +1441,7 @@ static void ath10k_pci_dump_registers(struct ath10k *ar,
__le32 reg_dump_values[REG_DUMP_COUNT_QCA988X] = {};
int i, ret;
lockdep_assert_held(&ar->data_lock);
lockdep_assert_held(&ar->dump_mutex);
ret = ath10k_pci_diag_read_hi(ar, &reg_dump_values[0],
hi_failure_state,
@@ -1656,7 +1656,7 @@ static void ath10k_pci_dump_memory(struct ath10k *ar,
int ret, i;
u8 *buf;
lockdep_assert_held(&ar->data_lock);
lockdep_assert_held(&ar->dump_mutex);
if (!crash_data)
return;
@@ -1734,14 +1734,19 @@ static void ath10k_pci_dump_memory(struct ath10k *ar,
}
}
static void ath10k_pci_fw_crashed_dump(struct ath10k *ar)
static void ath10k_pci_fw_dump_work(struct work_struct *work)
{
struct ath10k_pci *ar_pci = container_of(work, struct ath10k_pci,
dump_work);
struct ath10k_fw_crash_data *crash_data;
struct ath10k *ar = ar_pci->ar;
char guid[UUID_STRING_LEN + 1];
spin_lock_bh(&ar->data_lock);
mutex_lock(&ar->dump_mutex);
spin_lock_bh(&ar->data_lock);
ar->stats.fw_crash_counter++;
spin_unlock_bh(&ar->data_lock);
crash_data = ath10k_coredump_new(ar);
@@ -1756,11 +1761,18 @@ static void ath10k_pci_fw_crashed_dump(struct ath10k *ar)
ath10k_ce_dump_registers(ar, crash_data);
ath10k_pci_dump_memory(ar, crash_data);
spin_unlock_bh(&ar->data_lock);
mutex_unlock(&ar->dump_mutex);
queue_work(ar->workqueue, &ar->restart_work);
}
static void ath10k_pci_fw_crashed_dump(struct ath10k *ar)
{
struct ath10k_pci *ar_pci = ath10k_pci_priv(ar);
queue_work(ar->workqueue, &ar_pci->dump_work);
}
void ath10k_pci_hif_send_complete_check(struct ath10k *ar, u8 pipe,
int force)
{
@@ -3442,6 +3454,8 @@ int ath10k_pci_setup_resource(struct ath10k *ar)
spin_lock_init(&ar_pci->ps_lock);
mutex_init(&ar_pci->ce_diag_mutex);
INIT_WORK(&ar_pci->dump_work, ath10k_pci_fw_dump_work);
timer_setup(&ar_pci->rx_post_retry, ath10k_pci_rx_replenish_retry, 0);
if (QCA_REV_6174(ar) || QCA_REV_9377(ar))

View File

@@ -121,6 +121,8 @@ struct ath10k_pci {
/* For protecting ce_diag */
struct mutex ce_diag_mutex;
struct work_struct dump_work;
struct ath10k_ce ce;
struct timer_list rx_post_retry;

View File

@@ -201,7 +201,7 @@ static const struct iwl_ht_params iwl_22000_ht_params = {
#define IWL_DEVICE_AX210 \
IWL_DEVICE_AX200_COMMON, \
.device_family = IWL_DEVICE_FAMILY_AX210, \
.base_params = &iwl_22000_base_params, \
.base_params = &iwl_22560_base_params, \
.csr = &iwl_csr_v1, \
.min_txq_size = 128

View File

@@ -1,7 +1,7 @@
/******************************************************************************
*
* Copyright(c) 2007 - 2014 Intel Corporation. All rights reserved.
* Copyright(c) 2018 Intel Corporation
* Copyright(c) 2018 - 2019 Intel Corporation
*
* This program is free software; you can redistribute it and/or modify it
* under the terms of version 2 of the GNU General Public License as
@@ -136,6 +136,7 @@ const struct iwl_cfg iwl5350_agn_cfg = {
.ht_params = &iwl5000_ht_params,
.led_mode = IWL_LED_BLINK,
.internal_wimax_coex = true,
.csr = &iwl_csr_v1,
};
#define IWL_DEVICE_5150 \

View File

@@ -93,7 +93,7 @@ struct iwl_ucode_header {
} u;
};
#define IWL_UCODE_INI_TLV_GROUP BIT(24)
#define IWL_UCODE_INI_TLV_GROUP 0x1000000
/*
* new TLV uCode file layout
@@ -148,11 +148,14 @@ enum iwl_ucode_tlv_type {
IWL_UCODE_TLV_UMAC_DEBUG_ADDRS = 54,
IWL_UCODE_TLV_LMAC_DEBUG_ADDRS = 55,
IWL_UCODE_TLV_FW_RECOVERY_INFO = 57,
IWL_UCODE_TLV_TYPE_BUFFER_ALLOCATION = IWL_UCODE_INI_TLV_GROUP | 0x1,
IWL_UCODE_TLV_TYPE_HCMD = IWL_UCODE_INI_TLV_GROUP | 0x2,
IWL_UCODE_TLV_TYPE_REGIONS = IWL_UCODE_INI_TLV_GROUP | 0x3,
IWL_UCODE_TLV_TYPE_TRIGGERS = IWL_UCODE_INI_TLV_GROUP | 0x4,
IWL_UCODE_TLV_TYPE_DEBUG_FLOW = IWL_UCODE_INI_TLV_GROUP | 0x5,
IWL_UCODE_TLV_TYPE_BUFFER_ALLOCATION = IWL_UCODE_INI_TLV_GROUP + 0x1,
IWL_UCODE_TLV_DEBUG_BASE = IWL_UCODE_TLV_TYPE_BUFFER_ALLOCATION,
IWL_UCODE_TLV_TYPE_HCMD = IWL_UCODE_INI_TLV_GROUP + 0x2,
IWL_UCODE_TLV_TYPE_REGIONS = IWL_UCODE_INI_TLV_GROUP + 0x3,
IWL_UCODE_TLV_TYPE_TRIGGERS = IWL_UCODE_INI_TLV_GROUP + 0x4,
IWL_UCODE_TLV_TYPE_DEBUG_FLOW = IWL_UCODE_INI_TLV_GROUP + 0x5,
IWL_UCODE_TLV_DEBUG_MAX = IWL_UCODE_TLV_TYPE_DEBUG_FLOW,
/* TLVs 0x1000-0x2000 are for internal driver usage */
IWL_UCODE_TLV_FW_DBG_DUMP_LST = 0x1000,

View File

@@ -126,7 +126,8 @@ void iwl_alloc_dbg_tlv(struct iwl_trans *trans, size_t len, const u8 *data,
len -= ALIGN(tlv_len, 4);
data += sizeof(*tlv) + ALIGN(tlv_len, 4);
if (!(tlv_type & IWL_UCODE_INI_TLV_GROUP))
if (tlv_type < IWL_UCODE_TLV_DEBUG_BASE ||
tlv_type > IWL_UCODE_TLV_DEBUG_MAX)
continue;
hdr = (void *)&tlv->data[0];

View File

@@ -774,8 +774,7 @@ void iwl_mvm_vif_dbgfs_register(struct iwl_mvm *mvm, struct ieee80211_vif *vif)
return;
mvmvif->dbgfs_dir = debugfs_create_dir("iwlmvm", dbgfs_dir);
if (!mvmvif->dbgfs_dir) {
if (IS_ERR_OR_NULL(mvmvif->dbgfs_dir)) {
IWL_ERR(mvm, "Failed to create debugfs directory under %pd\n",
dbgfs_dir);
return;

View File

@@ -1121,7 +1121,9 @@ int iwl_mvm_up(struct iwl_mvm *mvm)
ret = iwl_mvm_load_rt_fw(mvm);
if (ret) {
IWL_ERR(mvm, "Failed to start RT ucode: %d\n", ret);
iwl_fw_dbg_error_collect(&mvm->fwrt, FW_DBG_TRIGGER_DRIVER);
if (ret != -ERFKILL)
iwl_fw_dbg_error_collect(&mvm->fwrt,
FW_DBG_TRIGGER_DRIVER);
goto error;
}

View File

@@ -834,7 +834,7 @@ iwl_op_mode_mvm_start(struct iwl_trans *trans, const struct iwl_cfg *cfg,
mutex_lock(&mvm->mutex);
iwl_mvm_ref(mvm, IWL_MVM_REF_INIT_UCODE);
err = iwl_run_init_mvm_ucode(mvm, true);
if (err)
if (err && err != -ERFKILL)
iwl_fw_dbg_error_collect(&mvm->fwrt, FW_DBG_TRIGGER_DRIVER);
if (!iwlmvm_mod_params.init_dbg || !err)
iwl_mvm_stop_device(mvm);

View File

@@ -169,9 +169,9 @@ static inline int iwl_mvm_check_pn(struct iwl_mvm *mvm, struct sk_buff *skb,
}
/* iwl_mvm_create_skb Adds the rxb to a new skb */
static void iwl_mvm_create_skb(struct sk_buff *skb, struct ieee80211_hdr *hdr,
u16 len, u8 crypt_len,
struct iwl_rx_cmd_buffer *rxb)
static int iwl_mvm_create_skb(struct iwl_mvm *mvm, struct sk_buff *skb,
struct ieee80211_hdr *hdr, u16 len, u8 crypt_len,
struct iwl_rx_cmd_buffer *rxb)
{
struct iwl_rx_packet *pkt = rxb_addr(rxb);
struct iwl_rx_mpdu_desc *desc = (void *)pkt->data;
@@ -204,6 +204,20 @@ static void iwl_mvm_create_skb(struct sk_buff *skb, struct ieee80211_hdr *hdr,
* present before copying packet data.
*/
hdrlen += crypt_len;
if (WARN_ONCE(headlen < hdrlen,
"invalid packet lengths (hdrlen=%d, len=%d, crypt_len=%d)\n",
hdrlen, len, crypt_len)) {
/*
* We warn and trace because we want to be able to see
* it in trace-cmd as well.
*/
IWL_DEBUG_RX(mvm,
"invalid packet lengths (hdrlen=%d, len=%d, crypt_len=%d)\n",
hdrlen, len, crypt_len);
return -EINVAL;
}
skb_put_data(skb, hdr, hdrlen);
skb_put_data(skb, (u8 *)hdr + hdrlen + pad_len, headlen - hdrlen);
@@ -216,6 +230,8 @@ static void iwl_mvm_create_skb(struct sk_buff *skb, struct ieee80211_hdr *hdr,
skb_add_rx_frag(skb, 0, rxb_steal_page(rxb), offset,
fraglen, rxb->truesize);
}
return 0;
}
static void iwl_mvm_add_rtap_sniffer_config(struct iwl_mvm *mvm,
@@ -1671,7 +1687,11 @@ void iwl_mvm_rx_mpdu_mq(struct iwl_mvm *mvm, struct napi_struct *napi,
rx_status->boottime_ns = ktime_get_boot_ns();
}
iwl_mvm_create_skb(skb, hdr, len, crypt_len, rxb);
if (iwl_mvm_create_skb(mvm, skb, hdr, len, crypt_len, rxb)) {
kfree_skb(skb);
goto out;
}
if (!iwl_mvm_reorder(mvm, napi, queue, sta, skb, desc))
iwl_mvm_pass_packet_to_mac80211(mvm, napi, skb, queue,
sta, csi);

View File

@@ -3644,20 +3644,27 @@ out_no_pci:
void iwl_trans_pcie_sync_nmi(struct iwl_trans *trans)
{
struct iwl_trans_pcie *trans_pcie = IWL_TRANS_GET_PCIE_TRANS(trans);
unsigned long timeout = jiffies + IWL_TRANS_NMI_TIMEOUT;
u32 inta_addr, sw_err_bit;
if (trans_pcie->msix_enabled) {
inta_addr = CSR_MSIX_HW_INT_CAUSES_AD;
sw_err_bit = MSIX_HW_INT_CAUSES_REG_SW_ERR;
} else {
inta_addr = CSR_INT;
sw_err_bit = CSR_INT_BIT_SW_ERR;
}
iwl_disable_interrupts(trans);
iwl_force_nmi(trans);
while (time_after(timeout, jiffies)) {
u32 inta_hw = iwl_read32(trans,
CSR_MSIX_HW_INT_CAUSES_AD);
u32 inta_hw = iwl_read32(trans, inta_addr);
/* Error detected by uCode */
if (inta_hw & MSIX_HW_INT_CAUSES_REG_SW_ERR) {
if (inta_hw & sw_err_bit) {
/* Clear causes register */
iwl_write32(trans, CSR_MSIX_HW_INT_CAUSES_AD,
inta_hw &
MSIX_HW_INT_CAUSES_REG_SW_ERR);
iwl_write32(trans, inta_addr, inta_hw & sw_err_bit);
break;
}

View File

@@ -181,7 +181,7 @@ static int mwifiex_sdio_resume(struct device *dev)
adapter = card->adapter;
if (test_bit(MWIFIEX_IS_SUSPENDED, &adapter->work_flags)) {
if (!test_bit(MWIFIEX_IS_SUSPENDED, &adapter->work_flags)) {
mwifiex_dbg(adapter, WARN,
"device already resumed\n");
return 0;

View File

@@ -6262,8 +6262,7 @@ static int __init pci_setup(char *str)
} else if (!strncmp(str, "pcie_scan_all", 13)) {
pci_add_flags(PCI_SCAN_ALL_PCIE_DEVS);
} else if (!strncmp(str, "disable_acs_redir=", 18)) {
disable_acs_redir_param =
kstrdup(str + 18, GFP_KERNEL);
disable_acs_redir_param = str + 18;
} else {
printk(KERN_ERR "PCI: Unknown option `%s'\n",
str);
@@ -6274,3 +6273,19 @@ static int __init pci_setup(char *str)
return 0;
}
early_param("pci", pci_setup);
/*
* 'disable_acs_redir_param' is initialized in pci_setup(), above, to point
* to data in the __initdata section which will be freed after the init
* sequence is complete. We can't allocate memory in pci_setup() because some
* architectures do not have any memory allocation service available during
* an early_param() call. So we allocate memory and copy the variable here
* before the init section is freed.
*/
static int __init pci_realloc_setup_params(void)
{
disable_acs_redir_param = kstrdup(disable_acs_redir_param, GFP_KERNEL);
return 0;
}
pure_initcall(pci_realloc_setup_params);

View File

@@ -142,3 +142,11 @@ config PCIE_PTM
This is only useful if you have devices that support PTM, but it
is safe to enable even if you don't.
config PCIE_BW
bool "PCI Express Bandwidth Change Notification"
depends on PCIEPORTBUS
help
This enables PCI Express Bandwidth Change Notification. If
you know link width or rate changes occur only to correct
unreliable links, you may answer Y.

View File

@@ -3,7 +3,6 @@
# Makefile for PCI Express features and port driver
pcieportdrv-y := portdrv_core.o portdrv_pci.o err.o
pcieportdrv-y += bw_notification.o
obj-$(CONFIG_PCIEPORTBUS) += pcieportdrv.o
@@ -13,3 +12,4 @@ obj-$(CONFIG_PCIEAER_INJECT) += aer_inject.o
obj-$(CONFIG_PCIE_PME) += pme.o
obj-$(CONFIG_PCIE_DPC) += dpc.o
obj-$(CONFIG_PCIE_PTM) += ptm.o
obj-$(CONFIG_PCIE_BW) += bw_notification.o

View File

@@ -49,7 +49,11 @@ int pcie_dpc_init(void);
static inline int pcie_dpc_init(void) { return 0; }
#endif
#ifdef CONFIG_PCIE_BW
int pcie_bandwidth_notification_init(void);
#else
static inline int pcie_bandwidth_notification_init(void) { return 0; }
#endif
/* Port Type */
#define PCIE_ANY_PORT (~0)

View File

@@ -55,7 +55,8 @@ static int pcie_message_numbers(struct pci_dev *dev, int mask,
* 7.8.2, 7.10.10, 7.31.2.
*/
if (mask & (PCIE_PORT_SERVICE_PME | PCIE_PORT_SERVICE_HP)) {
if (mask & (PCIE_PORT_SERVICE_PME | PCIE_PORT_SERVICE_HP |
PCIE_PORT_SERVICE_BWNOTIF)) {
pcie_capability_read_word(dev, PCI_EXP_FLAGS, &reg16);
*pme = (reg16 & PCI_EXP_FLAGS_IRQ) >> 9;
nvec = *pme + 1;

View File

@@ -221,6 +221,9 @@ static int cpcap_battery_cc_raw_div(struct cpcap_battery_ddata *ddata,
int avg_current;
u32 cc_lsb;
if (!divider)
return 0;
sample &= 0xffffff; /* 24-bits, unsigned */
offset &= 0x7ff; /* 10-bits, signed */

View File

@@ -383,15 +383,11 @@ int power_supply_uevent(struct device *dev, struct kobj_uevent_env *env)
char *prop_buf;
char *attrname;
dev_dbg(dev, "uevent\n");
if (!psy || !psy->desc) {
dev_dbg(dev, "No power supply yet\n");
return ret;
}
dev_dbg(dev, "POWER_SUPPLY_NAME=%s\n", psy->desc->name);
ret = add_uevent_var(env, "POWER_SUPPLY_NAME=%s", psy->desc->name);
if (ret)
return ret;
@@ -427,8 +423,6 @@ int power_supply_uevent(struct device *dev, struct kobj_uevent_env *env)
goto out;
}
dev_dbg(dev, "prop %s=%s\n", attrname, prop_buf);
ret = add_uevent_var(env, "POWER_SUPPLY_%s=%s", attrname, prop_buf);
kfree(attrname);
if (ret)

View File

@@ -473,11 +473,6 @@ static int usb_unbind_interface(struct device *dev)
pm_runtime_disable(dev);
pm_runtime_set_suspended(dev);
/* Undo any residual pm_autopm_get_interface_* calls */
for (r = atomic_read(&intf->pm_usage_cnt); r > 0; --r)
usb_autopm_put_interface_no_suspend(intf);
atomic_set(&intf->pm_usage_cnt, 0);
if (!error)
usb_autosuspend_device(udev);
@@ -1633,7 +1628,6 @@ void usb_autopm_put_interface(struct usb_interface *intf)
int status;
usb_mark_last_busy(udev);
atomic_dec(&intf->pm_usage_cnt);
status = pm_runtime_put_sync(&intf->dev);
dev_vdbg(&intf->dev, "%s: cnt %d -> %d\n",
__func__, atomic_read(&intf->dev.power.usage_count),
@@ -1662,7 +1656,6 @@ void usb_autopm_put_interface_async(struct usb_interface *intf)
int status;
usb_mark_last_busy(udev);
atomic_dec(&intf->pm_usage_cnt);
status = pm_runtime_put(&intf->dev);
dev_vdbg(&intf->dev, "%s: cnt %d -> %d\n",
__func__, atomic_read(&intf->dev.power.usage_count),
@@ -1684,7 +1677,6 @@ void usb_autopm_put_interface_no_suspend(struct usb_interface *intf)
struct usb_device *udev = interface_to_usbdev(intf);
usb_mark_last_busy(udev);
atomic_dec(&intf->pm_usage_cnt);
pm_runtime_put_noidle(&intf->dev);
}
EXPORT_SYMBOL_GPL(usb_autopm_put_interface_no_suspend);
@@ -1715,8 +1707,6 @@ int usb_autopm_get_interface(struct usb_interface *intf)
status = pm_runtime_get_sync(&intf->dev);
if (status < 0)
pm_runtime_put_sync(&intf->dev);
else
atomic_inc(&intf->pm_usage_cnt);
dev_vdbg(&intf->dev, "%s: cnt %d -> %d\n",
__func__, atomic_read(&intf->dev.power.usage_count),
status);
@@ -1750,8 +1740,6 @@ int usb_autopm_get_interface_async(struct usb_interface *intf)
status = pm_runtime_get(&intf->dev);
if (status < 0 && status != -EINPROGRESS)
pm_runtime_put_noidle(&intf->dev);
else
atomic_inc(&intf->pm_usage_cnt);
dev_vdbg(&intf->dev, "%s: cnt %d -> %d\n",
__func__, atomic_read(&intf->dev.power.usage_count),
status);
@@ -1775,7 +1763,6 @@ void usb_autopm_get_interface_no_resume(struct usb_interface *intf)
struct usb_device *udev = interface_to_usbdev(intf);
usb_mark_last_busy(udev);
atomic_inc(&intf->pm_usage_cnt);
pm_runtime_get_noresume(&intf->dev);
}
EXPORT_SYMBOL_GPL(usb_autopm_get_interface_no_resume);

View File

@@ -820,9 +820,11 @@ int usb_string(struct usb_device *dev, int index, char *buf, size_t size)
if (dev->state == USB_STATE_SUSPENDED)
return -EHOSTUNREACH;
if (size <= 0 || !buf || !index)
if (size <= 0 || !buf)
return -EINVAL;
buf[0] = 0;
if (index <= 0 || index >= 256)
return -EINVAL;
tbuf = kmalloc(256, GFP_NOIO);
if (!tbuf)
return -ENOMEM;

View File

@@ -979,8 +979,18 @@ static int dummy_udc_start(struct usb_gadget *g,
struct dummy_hcd *dum_hcd = gadget_to_dummy_hcd(g);
struct dummy *dum = dum_hcd->dum;
if (driver->max_speed == USB_SPEED_UNKNOWN)
switch (g->speed) {
/* All the speeds we support */
case USB_SPEED_LOW:
case USB_SPEED_FULL:
case USB_SPEED_HIGH:
case USB_SPEED_SUPER:
break;
default:
dev_err(dummy_dev(dum_hcd), "Unsupported driver max speed %d\n",
driver->max_speed);
return -EINVAL;
}
/*
* SLAVE side init ... the layer above hardware, which
@@ -1784,9 +1794,10 @@ static void dummy_timer(struct timer_list *t)
/* Bus speed is 500000 bytes/ms, so use a little less */
total = 490000;
break;
default:
default: /* Can't happen */
dev_err(dummy_dev(dum_hcd), "bogus device speed\n");
return;
total = 0;
break;
}
/* FIXME if HZ != 1000 this will probably misbehave ... */
@@ -1828,7 +1839,7 @@ restart:
/* Used up this frame's bandwidth? */
if (total <= 0)
break;
continue;
/* find the gadget's ep for this request (if configured) */
address = usb_pipeendpoint (urb->pipe);

View File

@@ -314,6 +314,7 @@ static void yurex_disconnect(struct usb_interface *interface)
usb_deregister_dev(interface, &yurex_class);
/* prevent more I/O from starting */
usb_poison_urb(dev->urb);
mutex_lock(&dev->io_mutex);
dev->interface = NULL;
mutex_unlock(&dev->io_mutex);

View File

@@ -763,18 +763,16 @@ static void rts51x_suspend_timer_fn(struct timer_list *t)
break;
case RTS51X_STAT_IDLE:
case RTS51X_STAT_SS:
usb_stor_dbg(us, "RTS51X_STAT_SS, intf->pm_usage_cnt:%d, power.usage:%d\n",
atomic_read(&us->pusb_intf->pm_usage_cnt),
usb_stor_dbg(us, "RTS51X_STAT_SS, power.usage:%d\n",
atomic_read(&us->pusb_intf->dev.power.usage_count));
if (atomic_read(&us->pusb_intf->pm_usage_cnt) > 0) {
if (atomic_read(&us->pusb_intf->dev.power.usage_count) > 0) {
usb_stor_dbg(us, "Ready to enter SS state\n");
rts51x_set_stat(chip, RTS51X_STAT_SS);
/* ignore mass storage interface's children */
pm_suspend_ignore_children(&us->pusb_intf->dev, true);
usb_autopm_put_interface_async(us->pusb_intf);
usb_stor_dbg(us, "RTS51X_STAT_SS 01, intf->pm_usage_cnt:%d, power.usage:%d\n",
atomic_read(&us->pusb_intf->pm_usage_cnt),
usb_stor_dbg(us, "RTS51X_STAT_SS 01, power.usage:%d\n",
atomic_read(&us->pusb_intf->dev.power.usage_count));
}
break;
@@ -807,11 +805,10 @@ static void rts51x_invoke_transport(struct scsi_cmnd *srb, struct us_data *us)
int ret;
if (working_scsi(srb)) {
usb_stor_dbg(us, "working scsi, intf->pm_usage_cnt:%d, power.usage:%d\n",
atomic_read(&us->pusb_intf->pm_usage_cnt),
usb_stor_dbg(us, "working scsi, power.usage:%d\n",
atomic_read(&us->pusb_intf->dev.power.usage_count));
if (atomic_read(&us->pusb_intf->pm_usage_cnt) <= 0) {
if (atomic_read(&us->pusb_intf->dev.power.usage_count) <= 0) {
ret = usb_autopm_get_interface(us->pusb_intf);
usb_stor_dbg(us, "working scsi, ret=%d\n", ret);
}

View File

@@ -361,16 +361,10 @@ static int get_pipe(struct stub_device *sdev, struct usbip_header *pdu)
}
if (usb_endpoint_xfer_isoc(epd)) {
/* validate packet size and number of packets */
unsigned int maxp, packets, bytes;
maxp = usb_endpoint_maxp(epd);
maxp *= usb_endpoint_maxp_mult(epd);
bytes = pdu->u.cmd_submit.transfer_buffer_length;
packets = DIV_ROUND_UP(bytes, maxp);
/* validate number of packets */
if (pdu->u.cmd_submit.number_of_packets < 0 ||
pdu->u.cmd_submit.number_of_packets > packets) {
pdu->u.cmd_submit.number_of_packets >
USBIP_MAX_ISO_PACKETS) {
dev_err(&sdev->udev->dev,
"CMD_SUBMIT: isoc invalid num packets %d\n",
pdu->u.cmd_submit.number_of_packets);

View File

@@ -121,6 +121,13 @@ extern struct device_attribute dev_attr_usbip_debug;
#define USBIP_DIR_OUT 0x00
#define USBIP_DIR_IN 0x01
/*
* Arbitrary limit for the maximum number of isochronous packets in an URB,
* compare for example the uhci_submit_isochronous function in
* drivers/usb/host/uhci-q.c
*/
#define USBIP_MAX_ISO_PACKETS 1024
/**
* struct usbip_header_basic - data pertinent to every request
* @command: the usbip request type

View File

@@ -1016,15 +1016,15 @@ static int ds_probe(struct usb_interface *intf,
/* alternative 3, 1ms interrupt (greatly speeds search), 64 byte bulk */
alt = 3;
err = usb_set_interface(dev->udev,
intf->altsetting[alt].desc.bInterfaceNumber, alt);
intf->cur_altsetting->desc.bInterfaceNumber, alt);
if (err) {
dev_err(&dev->udev->dev, "Failed to set alternative setting %d "
"for %d interface: err=%d.\n", alt,
intf->altsetting[alt].desc.bInterfaceNumber, err);
intf->cur_altsetting->desc.bInterfaceNumber, err);
goto err_out_clear;
}
iface_desc = &intf->altsetting[alt];
iface_desc = intf->cur_altsetting;
if (iface_desc->desc.bNumEndpoints != NUM_EP-1) {
pr_info("Num endpoints=%d. It is not DS9490R.\n",
iface_desc->desc.bNumEndpoints);

View File

@@ -264,7 +264,8 @@ __blkdev_direct_IO_simple(struct kiocb *iocb, struct iov_iter *iter,
bio_for_each_segment_all(bvec, &bio, i, iter_all) {
if (should_dirty && !PageCompound(bvec->bv_page))
set_page_dirty_lock(bvec->bv_page);
put_page(bvec->bv_page);
if (!bio_flagged(&bio, BIO_NO_PAGE_REF))
put_page(bvec->bv_page);
}
if (unlikely(bio.bi_status))

View File

@@ -6783,7 +6783,7 @@ struct extent_map *btrfs_get_extent(struct btrfs_inode *inode,
u64 extent_start = 0;
u64 extent_end = 0;
u64 objectid = btrfs_ino(inode);
u8 extent_type;
int extent_type = -1;
struct btrfs_path *path = NULL;
struct btrfs_root *root = inode->root;
struct btrfs_file_extent_item *item;

View File

@@ -1817,8 +1817,13 @@ int file_remove_privs(struct file *file)
int kill;
int error = 0;
/* Fast path for nothing security related */
if (IS_NOSEC(inode))
/*
* Fast path for nothing security related.
* As well for non-regular files, e.g. blkdev inodes.
* For example, blkdev_write_iter() might get here
* trying to remove privs which it is not allowed to.
*/
if (IS_NOSEC(inode) || !S_ISREG(inode->i_mode))
return 0;
kill = dentry_needs_remove_privs(dentry);

View File

@@ -4,15 +4,28 @@
* supporting fast/efficient IO.
*
* A note on the read/write ordering memory barriers that are matched between
* the application and kernel side. When the application reads the CQ ring
* tail, it must use an appropriate smp_rmb() to order with the smp_wmb()
* the kernel uses after writing the tail. Failure to do so could cause a
* delay in when the application notices that completion events available.
* This isn't a fatal condition. Likewise, the application must use an
* appropriate smp_wmb() both before writing the SQ tail, and after writing
* the SQ tail. The first one orders the sqe writes with the tail write, and
* the latter is paired with the smp_rmb() the kernel will issue before
* reading the SQ tail on submission.
* the application and kernel side.
*
* After the application reads the CQ ring tail, it must use an
* appropriate smp_rmb() to pair with the smp_wmb() the kernel uses
* before writing the tail (using smp_load_acquire to read the tail will
* do). It also needs a smp_mb() before updating CQ head (ordering the
* entry load(s) with the head store), pairing with an implicit barrier
* through a control-dependency in io_get_cqring (smp_store_release to
* store head will do). Failure to do so could lead to reading invalid
* CQ entries.
*
* Likewise, the application must use an appropriate smp_wmb() before
* writing the SQ tail (ordering SQ entry stores with the tail store),
* which pairs with smp_load_acquire in io_get_sqring (smp_store_release
* to store the tail will do). And it needs a barrier ordering the SQ
* head load before writing new SQ entries (smp_load_acquire to read
* head will do).
*
* When using the SQ poll thread (IORING_SETUP_SQPOLL), the application
* needs to check the SQ flags for IORING_SQ_NEED_WAKEUP *after*
* updating the SQ tail; a full memory barrier smp_mb() is needed
* between.
*
* Also see the examples in the liburing library:
*
@@ -70,20 +83,108 @@ struct io_uring {
u32 tail ____cacheline_aligned_in_smp;
};
/*
* This data is shared with the application through the mmap at offset
* IORING_OFF_SQ_RING.
*
* The offsets to the member fields are published through struct
* io_sqring_offsets when calling io_uring_setup.
*/
struct io_sq_ring {
/*
* Head and tail offsets into the ring; the offsets need to be
* masked to get valid indices.
*
* The kernel controls head and the application controls tail.
*/
struct io_uring r;
/*
* Bitmask to apply to head and tail offsets (constant, equals
* ring_entries - 1)
*/
u32 ring_mask;
/* Ring size (constant, power of 2) */
u32 ring_entries;
/*
* Number of invalid entries dropped by the kernel due to
* invalid index stored in array
*
* Written by the kernel, shouldn't be modified by the
* application (i.e. get number of "new events" by comparing to
* cached value).
*
* After a new SQ head value was read by the application this
* counter includes all submissions that were dropped reaching
* the new SQ head (and possibly more).
*/
u32 dropped;
/*
* Runtime flags
*
* Written by the kernel, shouldn't be modified by the
* application.
*
* The application needs a full memory barrier before checking
* for IORING_SQ_NEED_WAKEUP after updating the sq tail.
*/
u32 flags;
/*
* Ring buffer of indices into array of io_uring_sqe, which is
* mmapped by the application using the IORING_OFF_SQES offset.
*
* This indirection could e.g. be used to assign fixed
* io_uring_sqe entries to operations and only submit them to
* the queue when needed.
*
* The kernel modifies neither the indices array nor the entries
* array.
*/
u32 array[];
};
/*
* This data is shared with the application through the mmap at offset
* IORING_OFF_CQ_RING.
*
* The offsets to the member fields are published through struct
* io_cqring_offsets when calling io_uring_setup.
*/
struct io_cq_ring {
/*
* Head and tail offsets into the ring; the offsets need to be
* masked to get valid indices.
*
* The application controls head and the kernel tail.
*/
struct io_uring r;
/*
* Bitmask to apply to head and tail offsets (constant, equals
* ring_entries - 1)
*/
u32 ring_mask;
/* Ring size (constant, power of 2) */
u32 ring_entries;
/*
* Number of completion events lost because the queue was full;
* this should be avoided by the application by making sure
* there are not more requests pending thatn there is space in
* the completion queue.
*
* Written by the kernel, shouldn't be modified by the
* application (i.e. get number of "new events" by comparing to
* cached value).
*
* As completion events come in out of order this counter is not
* ordered with any other data.
*/
u32 overflow;
/*
* Ring buffer of completion events.
*
* The kernel writes completion events fresh every time they are
* produced, so the application is allowed to modify pending
* entries.
*/
struct io_uring_cqe cqes[];
};
@@ -221,7 +322,7 @@ struct io_kiocb {
struct list_head list;
unsigned int flags;
refcount_t refs;
#define REQ_F_FORCE_NONBLOCK 1 /* inline submission attempt */
#define REQ_F_NOWAIT 1 /* must not punt to workers */
#define REQ_F_IOPOLL_COMPLETED 2 /* polled IO has completed */
#define REQ_F_FIXED_FILE 4 /* ctx owns file */
#define REQ_F_SEQ_PREV 8 /* sequential with previous */
@@ -317,12 +418,6 @@ static void io_commit_cqring(struct io_ring_ctx *ctx)
/* order cqe stores with ring update */
smp_store_release(&ring->r.tail, ctx->cached_cq_tail);
/*
* Write sider barrier of tail update, app has read side. See
* comment at the top of this file.
*/
smp_wmb();
if (wq_has_sleeper(&ctx->cq_wait)) {
wake_up_interruptible(&ctx->cq_wait);
kill_fasync(&ctx->cq_fasync, SIGIO, POLL_IN);
@@ -336,8 +431,11 @@ static struct io_uring_cqe *io_get_cqring(struct io_ring_ctx *ctx)
unsigned tail;
tail = ctx->cached_cq_tail;
/* See comment at the top of the file */
smp_rmb();
/*
* writes to the cq entry need to come after reading head; the
* control dependency is enough as we're using WRITE_ONCE to
* fill the cq entry
*/
if (tail - READ_ONCE(ring->r.head) == ring->ring_entries)
return NULL;
@@ -774,10 +872,14 @@ static int io_prep_rw(struct io_kiocb *req, const struct sqe_submit *s,
ret = kiocb_set_rw_flags(kiocb, READ_ONCE(sqe->rw_flags));
if (unlikely(ret))
return ret;
if (force_nonblock) {
/* don't allow async punt if RWF_NOWAIT was requested */
if (kiocb->ki_flags & IOCB_NOWAIT)
req->flags |= REQ_F_NOWAIT;
if (force_nonblock)
kiocb->ki_flags |= IOCB_NOWAIT;
req->flags |= REQ_F_FORCE_NONBLOCK;
}
if (ctx->flags & IORING_SETUP_IOPOLL) {
if (!(kiocb->ki_flags & IOCB_DIRECT) ||
!kiocb->ki_filp->f_op->iopoll)
@@ -1436,8 +1538,7 @@ restart:
struct sqe_submit *s = &req->submit;
const struct io_uring_sqe *sqe = s->sqe;
/* Ensure we clear previously set forced non-block flag */
req->flags &= ~REQ_F_FORCE_NONBLOCK;
/* Ensure we clear previously set non-block flag */
req->rw.ki_flags &= ~IOCB_NOWAIT;
ret = 0;
@@ -1467,10 +1568,11 @@ restart:
break;
cond_resched();
} while (1);
/* drop submission reference */
io_put_req(req);
}
/* drop submission reference */
io_put_req(req);
if (ret) {
io_cqring_add_event(ctx, sqe->user_data, ret, 0);
io_put_req(req);
@@ -1623,7 +1725,7 @@ static int io_submit_sqe(struct io_ring_ctx *ctx, struct sqe_submit *s,
goto out;
ret = __io_submit_sqe(ctx, req, s, true);
if (ret == -EAGAIN) {
if (ret == -EAGAIN && !(req->flags & REQ_F_NOWAIT)) {
struct io_uring_sqe *sqe_copy;
sqe_copy = kmalloc(sizeof(*sqe_copy), GFP_KERNEL);
@@ -1697,23 +1799,9 @@ static void io_commit_sqring(struct io_ring_ctx *ctx)
* write new data to them.
*/
smp_store_release(&ring->r.head, ctx->cached_sq_head);
/*
* write side barrier of head update, app has read side. See
* comment at the top of this file
*/
smp_wmb();
}
}
/*
* Undo last io_get_sqring()
*/
static void io_drop_sqring(struct io_ring_ctx *ctx)
{
ctx->cached_sq_head--;
}
/*
* Fetch an sqe, if one is available. Note that s->sqe will point to memory
* that is mapped by userspace. This means that care needs to be taken to
@@ -1736,8 +1824,6 @@ static bool io_get_sqring(struct io_ring_ctx *ctx, struct sqe_submit *s)
* though the application is the one updating it.
*/
head = ctx->cached_sq_head;
/* See comment at the top of this file */
smp_rmb();
/* make sure SQ entry isn't read before tail */
if (head == smp_load_acquire(&ring->r.tail))
return false;
@@ -1753,8 +1839,6 @@ static bool io_get_sqring(struct io_ring_ctx *ctx, struct sqe_submit *s)
/* drop invalid entries */
ctx->cached_sq_head++;
ring->dropped++;
/* See comment at the top of this file */
smp_wmb();
return false;
}
@@ -1878,13 +1962,11 @@ static int io_sq_thread(void *data)
finish_wait(&ctx->sqo_wait, &wait);
ctx->sq_ring->flags &= ~IORING_SQ_NEED_WAKEUP;
smp_wmb();
continue;
}
finish_wait(&ctx->sqo_wait, &wait);
ctx->sq_ring->flags &= ~IORING_SQ_NEED_WAKEUP;
smp_wmb();
}
i = 0;
@@ -1929,7 +2011,7 @@ static int io_sq_thread(void *data)
static int io_ring_submit(struct io_ring_ctx *ctx, unsigned int to_submit)
{
struct io_submit_state state, *statep = NULL;
int i, ret = 0, submit = 0;
int i, submit = 0;
if (to_submit > IO_PLUG_THRESHOLD) {
io_submit_state_start(&state, ctx, to_submit);
@@ -1938,6 +2020,7 @@ static int io_ring_submit(struct io_ring_ctx *ctx, unsigned int to_submit)
for (i = 0; i < to_submit; i++) {
struct sqe_submit s;
int ret;
if (!io_get_sqring(ctx, &s))
break;
@@ -1945,21 +2028,18 @@ static int io_ring_submit(struct io_ring_ctx *ctx, unsigned int to_submit)
s.has_user = true;
s.needs_lock = false;
s.needs_fixed_file = false;
submit++;
ret = io_submit_sqe(ctx, &s, statep);
if (ret) {
io_drop_sqring(ctx);
break;
}
submit++;
if (ret)
io_cqring_add_event(ctx, s.sqe->user_data, ret, 0);
}
io_commit_sqring(ctx);
if (statep)
io_submit_state_end(statep);
return submit ? submit : ret;
return submit;
}
static unsigned io_cqring_events(struct io_cq_ring *ring)
@@ -2240,10 +2320,6 @@ static int io_sq_offload_start(struct io_ring_ctx *ctx,
mmgrab(current->mm);
ctx->sqo_mm = current->mm;
ret = -EINVAL;
if (!cpu_possible(p->sq_thread_cpu))
goto err;
if (ctx->flags & IORING_SETUP_SQPOLL) {
ret = -EPERM;
if (!capable(CAP_SYS_ADMIN))
@@ -2254,11 +2330,11 @@ static int io_sq_offload_start(struct io_ring_ctx *ctx,
ctx->sq_thread_idle = HZ;
if (p->flags & IORING_SETUP_SQ_AFF) {
int cpu;
int cpu = array_index_nospec(p->sq_thread_cpu,
nr_cpu_ids);
cpu = array_index_nospec(p->sq_thread_cpu, NR_CPUS);
ret = -EINVAL;
if (!cpu_possible(p->sq_thread_cpu))
if (!cpu_possible(cpu))
goto err;
ctx->sqo_thread = kthread_create_on_cpu(io_sq_thread,
@@ -2321,8 +2397,12 @@ static int io_account_mem(struct user_struct *user, unsigned long nr_pages)
static void io_mem_free(void *ptr)
{
struct page *page = virt_to_head_page(ptr);
struct page *page;
if (!ptr)
return;
page = virt_to_head_page(ptr);
if (put_page_testzero(page))
free_compound_page(page);
}
@@ -2363,7 +2443,7 @@ static int io_sqe_buffer_unregister(struct io_ring_ctx *ctx)
if (ctx->account_mem)
io_unaccount_mem(ctx->user, imu->nr_bvecs);
kfree(imu->bvec);
kvfree(imu->bvec);
imu->nr_bvecs = 0;
}
@@ -2455,9 +2535,9 @@ static int io_sqe_buffer_register(struct io_ring_ctx *ctx, void __user *arg,
if (!pages || nr_pages > got_pages) {
kfree(vmas);
kfree(pages);
pages = kmalloc_array(nr_pages, sizeof(struct page *),
pages = kvmalloc_array(nr_pages, sizeof(struct page *),
GFP_KERNEL);
vmas = kmalloc_array(nr_pages,
vmas = kvmalloc_array(nr_pages,
sizeof(struct vm_area_struct *),
GFP_KERNEL);
if (!pages || !vmas) {
@@ -2469,7 +2549,7 @@ static int io_sqe_buffer_register(struct io_ring_ctx *ctx, void __user *arg,
got_pages = nr_pages;
}
imu->bvec = kmalloc_array(nr_pages, sizeof(struct bio_vec),
imu->bvec = kvmalloc_array(nr_pages, sizeof(struct bio_vec),
GFP_KERNEL);
ret = -ENOMEM;
if (!imu->bvec) {
@@ -2508,6 +2588,7 @@ static int io_sqe_buffer_register(struct io_ring_ctx *ctx, void __user *arg,
}
if (ctx->account_mem)
io_unaccount_mem(ctx->user, nr_pages);
kvfree(imu->bvec);
goto err;
}
@@ -2530,12 +2611,12 @@ static int io_sqe_buffer_register(struct io_ring_ctx *ctx, void __user *arg,
ctx->nr_user_bufs++;
}
kfree(pages);
kfree(vmas);
kvfree(pages);
kvfree(vmas);
return 0;
err:
kfree(pages);
kfree(vmas);
kvfree(pages);
kvfree(vmas);
io_sqe_buffer_unregister(ctx);
return ret;
}
@@ -2573,7 +2654,10 @@ static __poll_t io_uring_poll(struct file *file, poll_table *wait)
__poll_t mask = 0;
poll_wait(file, &ctx->cq_wait, wait);
/* See comment at the top of this file */
/*
* synchronizes with barrier from wq_has_sleeper call in
* io_commit_cqring
*/
smp_rmb();
if (READ_ONCE(ctx->sq_ring->r.tail) - ctx->cached_sq_head !=
ctx->sq_ring->ring_entries)
@@ -2687,24 +2771,12 @@ SYSCALL_DEFINE6(io_uring_enter, unsigned int, fd, u32, to_submit,
mutex_lock(&ctx->uring_lock);
submitted = io_ring_submit(ctx, to_submit);
mutex_unlock(&ctx->uring_lock);
if (submitted < 0)
goto out_ctx;
}
if (flags & IORING_ENTER_GETEVENTS) {
unsigned nr_events = 0;
min_complete = min(min_complete, ctx->cq_entries);
/*
* The application could have included the 'to_submit' count
* in how many events it wanted to wait for. If we failed to
* submit the desired count, we may need to adjust the number
* of events to poll/wait for.
*/
if (submitted < to_submit)
min_complete = min_t(unsigned, submitted, min_complete);
if (ctx->flags & IORING_SETUP_IOPOLL) {
mutex_lock(&ctx->uring_lock);
ret = io_iopoll_check(ctx, &nr_events, min_complete);
@@ -2750,17 +2822,12 @@ static int io_allocate_scq_urings(struct io_ring_ctx *ctx,
return -EOVERFLOW;
ctx->sq_sqes = io_mem_alloc(size);
if (!ctx->sq_sqes) {
io_mem_free(ctx->sq_ring);
if (!ctx->sq_sqes)
return -ENOMEM;
}
cq_ring = io_mem_alloc(struct_size(cq_ring, cqes, p->cq_entries));
if (!cq_ring) {
io_mem_free(ctx->sq_ring);
io_mem_free(ctx->sq_sqes);
if (!cq_ring)
return -ENOMEM;
}
ctx->cq_ring = cq_ring;
cq_ring->ring_mask = p->cq_entries - 1;

View File

@@ -346,10 +346,16 @@ static __kernel_fsid_t fanotify_get_fsid(struct fsnotify_iter_info *iter_info)
__kernel_fsid_t fsid = {};
fsnotify_foreach_obj_type(type) {
struct fsnotify_mark_connector *conn;
if (!fsnotify_iter_should_report_type(iter_info, type))
continue;
fsid = iter_info->marks[type]->connector->fsid;
conn = READ_ONCE(iter_info->marks[type]->connector);
/* Mark is just getting destroyed or created? */
if (!conn)
continue;
fsid = conn->fsid;
if (WARN_ON_ONCE(!fsid.val[0] && !fsid.val[1]))
continue;
return fsid;
@@ -408,8 +414,12 @@ static int fanotify_handle_event(struct fsnotify_group *group,
return 0;
}
if (FAN_GROUP_FLAG(group, FAN_REPORT_FID))
if (FAN_GROUP_FLAG(group, FAN_REPORT_FID)) {
fsid = fanotify_get_fsid(iter_info);
/* Racing with mark destruction or creation? */
if (!fsid.val[0] && !fsid.val[1])
return 0;
}
event = fanotify_alloc_event(group, inode, mask, data, data_type,
&fsid);

View File

@@ -239,13 +239,13 @@ static void fsnotify_drop_object(unsigned int type, void *objp)
void fsnotify_put_mark(struct fsnotify_mark *mark)
{
struct fsnotify_mark_connector *conn;
struct fsnotify_mark_connector *conn = READ_ONCE(mark->connector);
void *objp = NULL;
unsigned int type = FSNOTIFY_OBJ_TYPE_DETACHED;
bool free_conn = false;
/* Catch marks that were actually never attached to object */
if (!mark->connector) {
if (!conn) {
if (refcount_dec_and_test(&mark->refcnt))
fsnotify_final_mark_destroy(mark);
return;
@@ -255,10 +255,9 @@ void fsnotify_put_mark(struct fsnotify_mark *mark)
* We have to be careful so that traversals of obj_list under lock can
* safely grab mark reference.
*/
if (!refcount_dec_and_lock(&mark->refcnt, &mark->connector->lock))
if (!refcount_dec_and_lock(&mark->refcnt, &conn->lock))
return;
conn = mark->connector;
hlist_del_init_rcu(&mark->obj_list);
if (hlist_empty(&conn->list)) {
objp = fsnotify_detach_connector_from_object(conn, &type);
@@ -266,7 +265,7 @@ void fsnotify_put_mark(struct fsnotify_mark *mark)
} else {
__fsnotify_recalc_mask(conn);
}
mark->connector = NULL;
WRITE_ONCE(mark->connector, NULL);
spin_unlock(&conn->lock);
fsnotify_drop_object(type, objp);
@@ -620,7 +619,7 @@ restart:
/* mark should be the last entry. last is the current last entry */
hlist_add_behind_rcu(&mark->obj_list, &last->obj_list);
added:
mark->connector = conn;
WRITE_ONCE(mark->connector, conn);
out_err:
spin_unlock(&conn->lock);
spin_unlock(&mark->lock);
@@ -808,6 +807,7 @@ void fsnotify_init_mark(struct fsnotify_mark *mark,
refcount_set(&mark->refcnt, 1);
fsnotify_get_group(group);
mark->group = group;
WRITE_ONCE(mark->connector, NULL);
}
/*

View File

@@ -1467,11 +1467,6 @@ int vfs_get_tree(struct fs_context *fc)
struct super_block *sb;
int error;
if (fc->fs_type->fs_flags & FS_REQUIRES_DEV && !fc->source) {
errorf(fc, "Filesystem requires source device");
return -ENOENT;
}
if (fc->root)
return -EBUSY;

View File

@@ -229,7 +229,7 @@ ufs_get_inode_gid(struct super_block *sb, struct ufs_inode *inode)
case UFS_UID_44BSD:
return fs32_to_cpu(sb, inode->ui_u3.ui_44.ui_gid);
case UFS_UID_EFT:
if (inode->ui_u1.oldids.ui_suid == 0xFFFF)
if (inode->ui_u1.oldids.ui_sgid == 0xFFFF)
return fs32_to_cpu(sb, inode->ui_u3.ui_sun.ui_gid);
/* Fall through */
default:

View File

@@ -510,7 +510,7 @@ int bpf_prog_array_copy(struct bpf_prog_array __rcu *old_array,
} \
_out: \
rcu_read_unlock(); \
preempt_enable_no_resched(); \
preempt_enable(); \
_ret; \
})

View File

@@ -811,6 +811,22 @@ static inline bool clk_has_parent(struct clk *clk, struct clk *parent)
return true;
}
static inline int clk_set_rate_range(struct clk *clk, unsigned long min,
unsigned long max)
{
return 0;
}
static inline int clk_set_min_rate(struct clk *clk, unsigned long rate)
{
return 0;
}
static inline int clk_set_max_rate(struct clk *clk, unsigned long rate)
{
return 0;
}
static inline int clk_set_parent(struct clk *clk, struct clk *parent)
{
return 0;

View File

@@ -240,7 +240,6 @@ struct perf_event;
#define PERF_PMU_CAP_NO_INTERRUPT 0x01
#define PERF_PMU_CAP_NO_NMI 0x02
#define PERF_PMU_CAP_AUX_NO_SG 0x04
#define PERF_PMU_CAP_AUX_SW_DOUBLEBUF 0x08
#define PERF_PMU_CAP_EXCLUSIVE 0x10
#define PERF_PMU_CAP_ITRACE 0x20
#define PERF_PMU_CAP_HETEROGENEOUS_CPUS 0x40

View File

@@ -60,7 +60,7 @@ struct iov_iter {
static inline enum iter_type iov_iter_type(const struct iov_iter *i)
{
return i->type & ~(READ | WRITE);
return i->type & ~(READ | WRITE | ITER_BVEC_FLAG_NO_REF);
}
static inline bool iter_is_iovec(const struct iov_iter *i)

View File

@@ -200,7 +200,6 @@ usb_find_last_int_out_endpoint(struct usb_host_interface *alt,
* @dev: driver model's view of this device
* @usb_dev: if an interface is bound to the USB major, this will point
* to the sysfs representation for that device.
* @pm_usage_cnt: PM usage counter for this interface
* @reset_ws: Used for scheduling resets from atomic context.
* @resetting_device: USB core reset the device, so use alt setting 0 as
* current; needs bandwidth alloc after reset.
@@ -257,7 +256,6 @@ struct usb_interface {
struct device dev; /* interface specific device info */
struct device *usb_dev;
atomic_t pm_usage_cnt; /* usage counter for autosuspend */
struct work_struct reset_ws; /* for resets in atomic context */
};
#define to_usb_interface(d) container_of(d, struct usb_interface, dev)

View File

@@ -105,7 +105,6 @@ enum sctp_verb {
SCTP_CMD_T1_RETRAN, /* Mark for retransmission after T1 timeout */
SCTP_CMD_UPDATE_INITTAG, /* Update peer inittag */
SCTP_CMD_SEND_MSG, /* Send the whole use message */
SCTP_CMD_SEND_NEXT_ASCONF, /* Send the next ASCONF after ACK */
SCTP_CMD_PURGE_ASCONF_QUEUE, /* Purge all asconf queues.*/
SCTP_CMD_SET_ASOC, /* Restore association context */
SCTP_CMD_LAST

View File

@@ -295,7 +295,8 @@ struct xfrm_replay {
};
struct xfrm_if_cb {
struct xfrm_if *(*decode_session)(struct sk_buff *skb);
struct xfrm_if *(*decode_session)(struct sk_buff *skb,
unsigned short family);
};
void xfrm_if_register_cb(const struct xfrm_if_cb *ifcb);
@@ -1404,6 +1405,23 @@ static inline int xfrm_state_kern(const struct xfrm_state *x)
return atomic_read(&x->tunnel_users);
}
static inline bool xfrm_id_proto_valid(u8 proto)
{
switch (proto) {
case IPPROTO_AH:
case IPPROTO_ESP:
case IPPROTO_COMP:
#if IS_ENABLED(CONFIG_IPV6)
case IPPROTO_ROUTING:
case IPPROTO_DSTOPTS:
#endif
return true;
default:
return false;
}
}
/* IPSEC_PROTO_ANY only matches 3 IPsec protocols, 0 could match all. */
static inline int xfrm_id_proto_match(u8 proto, u8 userproto)
{
return (!userproto || proto == userproto ||

View File

@@ -4138,15 +4138,35 @@ static int check_alu_op(struct bpf_verifier_env *env, struct bpf_insn *insn)
return 0;
}
static void __find_good_pkt_pointers(struct bpf_func_state *state,
struct bpf_reg_state *dst_reg,
enum bpf_reg_type type, u16 new_range)
{
struct bpf_reg_state *reg;
int i;
for (i = 0; i < MAX_BPF_REG; i++) {
reg = &state->regs[i];
if (reg->type == type && reg->id == dst_reg->id)
/* keep the maximum range already checked */
reg->range = max(reg->range, new_range);
}
bpf_for_each_spilled_reg(i, state, reg) {
if (!reg)
continue;
if (reg->type == type && reg->id == dst_reg->id)
reg->range = max(reg->range, new_range);
}
}
static void find_good_pkt_pointers(struct bpf_verifier_state *vstate,
struct bpf_reg_state *dst_reg,
enum bpf_reg_type type,
bool range_right_open)
{
struct bpf_func_state *state = vstate->frame[vstate->curframe];
struct bpf_reg_state *regs = state->regs, *reg;
u16 new_range;
int i, j;
int i;
if (dst_reg->off < 0 ||
(dst_reg->off == 0 && range_right_open))
@@ -4211,20 +4231,9 @@ static void find_good_pkt_pointers(struct bpf_verifier_state *vstate,
* the range won't allow anything.
* dst_reg->off is known < MAX_PACKET_OFF, therefore it fits in a u16.
*/
for (i = 0; i < MAX_BPF_REG; i++)
if (regs[i].type == type && regs[i].id == dst_reg->id)
/* keep the maximum range already checked */
regs[i].range = max(regs[i].range, new_range);
for (j = 0; j <= vstate->curframe; j++) {
state = vstate->frame[j];
bpf_for_each_spilled_reg(i, state, reg) {
if (!reg)
continue;
if (reg->type == type && reg->id == dst_reg->id)
reg->range = max(reg->range, new_range);
}
}
for (i = 0; i <= vstate->curframe; i++)
__find_good_pkt_pointers(vstate->frame[i], dst_reg, type,
new_range);
}
/* compute branch direction of the expression "if (reg opcode val) goto target;"
@@ -4698,6 +4707,22 @@ static void mark_ptr_or_null_reg(struct bpf_func_state *state,
}
}
static void __mark_ptr_or_null_regs(struct bpf_func_state *state, u32 id,
bool is_null)
{
struct bpf_reg_state *reg;
int i;
for (i = 0; i < MAX_BPF_REG; i++)
mark_ptr_or_null_reg(state, &state->regs[i], id, is_null);
bpf_for_each_spilled_reg(i, state, reg) {
if (!reg)
continue;
mark_ptr_or_null_reg(state, reg, id, is_null);
}
}
/* The logic is similar to find_good_pkt_pointers(), both could eventually
* be folded together at some point.
*/
@@ -4705,10 +4730,10 @@ static void mark_ptr_or_null_regs(struct bpf_verifier_state *vstate, u32 regno,
bool is_null)
{
struct bpf_func_state *state = vstate->frame[vstate->curframe];
struct bpf_reg_state *reg, *regs = state->regs;
struct bpf_reg_state *regs = state->regs;
u32 ref_obj_id = regs[regno].ref_obj_id;
u32 id = regs[regno].id;
int i, j;
int i;
if (ref_obj_id && ref_obj_id == id && is_null)
/* regs[regno] is in the " == NULL" branch.
@@ -4717,17 +4742,8 @@ static void mark_ptr_or_null_regs(struct bpf_verifier_state *vstate, u32 regno,
*/
WARN_ON_ONCE(release_reference_state(state, id));
for (i = 0; i < MAX_BPF_REG; i++)
mark_ptr_or_null_reg(state, &regs[i], id, is_null);
for (j = 0; j <= vstate->curframe; j++) {
state = vstate->frame[j];
bpf_for_each_spilled_reg(i, state, reg) {
if (!reg)
continue;
mark_ptr_or_null_reg(state, reg, id, is_null);
}
}
for (i = 0; i <= vstate->curframe; i++)
__mark_ptr_or_null_regs(vstate->frame[i], id, is_null);
}
static bool try_match_pkt_pointers(const struct bpf_insn *insn,

View File

@@ -610,8 +610,7 @@ int rb_alloc_aux(struct ring_buffer *rb, struct perf_event *event,
* PMU requests more than one contiguous chunks of memory
* for SW double buffering
*/
if ((event->pmu->capabilities & PERF_PMU_CAP_AUX_SW_DOUBLEBUF) &&
!overwrite) {
if (!overwrite) {
if (!max_order)
return -EINVAL;

View File

@@ -771,6 +771,7 @@ out:
return 0;
fail:
kobject_put(&tunables->attr_set.kobj);
policy->governor_data = NULL;
sugov_tunables_free(tunables);

View File

@@ -502,7 +502,10 @@ out:
*
* Caller must be holding current->sighand->siglock lock.
*
* Returns 0 on success, -ve on error.
* Returns 0 on success, -ve on error, or
* - in TSYNC mode: the pid of a thread which was either not in the correct
* seccomp mode or did not have an ancestral seccomp filter
* - in NEW_LISTENER mode: the fd of the new listener
*/
static long seccomp_attach_filter(unsigned int flags,
struct seccomp_filter *filter)
@@ -1258,6 +1261,16 @@ static long seccomp_set_mode_filter(unsigned int flags,
if (flags & ~SECCOMP_FILTER_FLAG_MASK)
return -EINVAL;
/*
* In the successful case, NEW_LISTENER returns the new listener fd.
* But in the failure case, TSYNC returns the thread that died. If you
* combine these two flags, there's no way to tell whether something
* succeeded or failed. So, let's disallow this combination.
*/
if ((flags & SECCOMP_FILTER_FLAG_TSYNC) &&
(flags & SECCOMP_FILTER_FLAG_NEW_LISTENER))
return -EINVAL;
/* Prepare the new filter before holding any locks. */
prepared = seccomp_prepare_user_filter(filter);
if (IS_ERR(prepared))
@@ -1304,7 +1317,7 @@ out:
mutex_unlock(&current->signal->cred_guard_mutex);
out_put_fd:
if (flags & SECCOMP_FILTER_FLAG_NEW_LISTENER) {
if (ret < 0) {
if (ret) {
listener_f->private_data = NULL;
fput(listener_f);
put_unused_fd(listener);

View File

@@ -17,6 +17,17 @@ KCOV_INSTRUMENT_list_debug.o := n
KCOV_INSTRUMENT_debugobjects.o := n
KCOV_INSTRUMENT_dynamic_debug.o := n
# Early boot use of cmdline, don't instrument it
ifdef CONFIG_AMD_MEM_ENCRYPT
KASAN_SANITIZE_string.o := n
ifdef CONFIG_FUNCTION_TRACER
CFLAGS_REMOVE_string.o = -pg
endif
CFLAGS_string.o := $(call cc-option, -fno-stack-protector)
endif
lib-y := ctype.o string.o vsprintf.o cmdline.o \
rbtree.o radix-tree.o timerqueue.o xarray.o \
idr.o int_sqrt.o extable.o \

Some files were not shown because too many files have changed in this diff Show More