Commit Graph

40 Commits

Author SHA1 Message Date
Michael Bestas
9c70abfc5e
Merge tag 'ASB-2022-11-01_11-5.4' of https://android.googlesource.com/kernel/common into android13-5.4-lahaina
https://source.android.com/docs/security/bulletin/2022-11-01

* tag 'ASB-2022-11-01_11-5.4' of https://android.googlesource.com/kernel/common:
  UPSTREAM: mm/mremap: hold the rmap lock in write mode when moving page table entries.
  FROMLIST: binder: fix UAF of alloc->vma in race with munmap()
  UPSTREAM: mm: Fix TLB flush for not-first PFNMAP mappings in unmap_region()
  UPSTREAM: mm: Force TLB flush for PFNMAP mappings before unlink_file_vma()
  UPSTREAM: af_key: Do not call xfrm_probe_algs in parallel
  UPSTREAM: wifi: cfg80211: fix u8 overflow in cfg80211_update_notlisted_nontrans()
  UPSTREAM: wifi: cfg80211/mac80211: reject bad MBSSID elements
  UPSTREAM: wifi: cfg80211: ensure length byte is present before access
  UPSTREAM: wifi: cfg80211: fix BSS refcounting bugs
  UPSTREAM: wifi: cfg80211: avoid nontransmitted BSS list corruption
  UPSTREAM: wifi: mac80211_hwsim: avoid mac80211 warning on bad rate
  UPSTREAM: wifi: cfg80211: update hidden BSSes to avoid WARN_ON
  UPSTREAM: mac80211: mlme: find auth challenge directly
  UPSTREAM: wifi: mac80211: don't parse mbssid in assoc response
  UPSTREAM: wifi: mac80211: fix MBSSID parsing use-after-free
  ANDROID: Drop explicit 'CONFIG_INIT_STACK_ALL_ZERO=y' from gki_defconfig
  UPSTREAM: hardening: Remove Clang's enable flag for -ftrivial-auto-var-init=zero
  UPSTREAM: hardening: Avoid harmless Clang option under CONFIG_INIT_STACK_ALL_ZERO
  UPSTREAM: hardening: Clarify Kconfig text for auto-var-init
  ANDROID: GKI: Update FCNT KMI symbol list
  ANDROID: Fix kenelci build-break for !CONFIG_PERF_EVENTS
  BACKPORT: HID: steam: Prevent NULL pointer dereference in steam_{recv,send}_report
  ANDROID: ABI: Update allowed list for QCOM
  UPSTREAM: wifi: mac80211_hwsim: use 32-bit skb cookie
  UPSTREAM: wifi: mac80211_hwsim: add back erroneously removed cast
  UPSTREAM: wifi: mac80211_hwsim: fix race condition in pending packet
  ANDROID: incfs: Add check for ATTR_KILL_SUID and ATTR_MODE in incfs_setattr
  Linux 5.4.210
  x86/speculation: Add LFENCE to RSB fill sequence
  x86/speculation: Add RSB VM Exit protections
  macintosh/adb: fix oob read in do_adb_query() function
  media: v4l2-mem2mem: Apply DST_QUEUE_OFF_BASE on MMAP buffers across ioctls
  selftests: KVM: Handle compiler optimizations in ucall
  KVM: Don't null dereference ops->destroy
  selftests/bpf: Fix "dubious pointer arithmetic" test
  selftests/bpf: Fix test_align verifier log patterns
  bpf: Test_verifier, #70 error message updates for 32-bit right shift
  selftests/bpf: Extend verifier and bpf_sock tests for dst_port loads
  bpf: Verifer, adjust_scalar_min_max_vals to always call update_reg_bounds()
  ACPI: APEI: Better fix to avoid spamming the console with old error logs
  ACPI: video: Shortening quirk list by identifying Clevo by board_name only
  ACPI: video: Force backlight native for some TongFang devices
  thermal: Fix NULL pointer dereferences in of_thermal_ functions
  ANDROID: GKI: db845c: Update symbols list and ABI
  Linux 5.4.209
  scsi: core: Fix race between handling STS_RESOURCE and completion
  mt7601u: add USB device ID for some versions of XiaoDu WiFi Dongle.
  ARM: crypto: comment out gcc warning that breaks clang builds
  sctp: leave the err path free in sctp_stream_init to sctp_stream_free
  sfc: disable softirqs for ptp TX
  perf symbol: Correct address for bss symbols
  virtio-net: fix the race between refill work and close
  netfilter: nf_queue: do not allow packet truncation below transport header offset
  sctp: fix sleep in atomic context bug in timer handlers
  i40e: Fix interface init with MSI interrupts (no MSI-X)
  tcp: Fix a data-race around sysctl_tcp_comp_sack_nr.
  tcp: Fix a data-race around sysctl_tcp_comp_sack_delay_ns.
  Documentation: fix sctp_wmem in ip-sysctl.rst
  tcp: Fix a data-race around sysctl_tcp_invalid_ratelimit.
  tcp: Fix a data-race around sysctl_tcp_autocorking.
  tcp: Fix a data-race around sysctl_tcp_min_rtt_wlen.
  tcp: Fix a data-race around sysctl_tcp_min_tso_segs.
  net: sungem_phy: Add of_node_put() for reference returned by of_get_parent()
  igmp: Fix data-races around sysctl_igmp_qrv.
  ipv6/addrconf: fix a null-ptr-deref bug for ip6_ptr
  net: ping6: Fix memleak in ipv6_renew_options().
  tcp: Fix a data-race around sysctl_tcp_challenge_ack_limit.
  tcp: Fix a data-race around sysctl_tcp_limit_output_bytes.
  scsi: ufs: host: Hold reference returned by of_parse_phandle()
  ice: do not setup vlan for loopback VSI
  ice: check (DD | EOF) bits on Rx descriptor rather than (EOP | RS)
  tcp: Fix a data-race around sysctl_tcp_nometrics_save.
  tcp: Fix a data-race around sysctl_tcp_frto.
  tcp: Fix a data-race around sysctl_tcp_adv_win_scale.
  tcp: Fix a data-race around sysctl_tcp_app_win.
  tcp: Fix data-races around sysctl_tcp_dsack.
  s390/archrandom: prevent CPACF trng invocations in interrupt context
  ntfs: fix use-after-free in ntfs_ucsncmp()
  Bluetooth: L2CAP: Fix use-after-free caused by l2cap_chan_put
  ANDROID: restore some removed refcount functions
  ANDROID: add tty_schedule_flip() back to the kernel
  Linux 5.4.208
  x86: drop bogus "cc" clobber from __try_cmpxchg_user_asm()
  net: usb: ax88179_178a needs FLAG_SEND_ZLP
  tty: use new tty_insert_flip_string_and_push_buffer() in pty_write()
  tty: extract tty_flip_buffer_commit() from tty_flip_buffer_push()
  tty: drop tty_schedule_flip()
  tty: the rest, stop using tty_schedule_flip()
  tty: drivers/tty/, stop using tty_schedule_flip()
  Bluetooth: Fix bt_skb_sendmmsg not allocating partial chunks
  Bluetooth: SCO: Fix sco_send_frame returning skb->len
  Bluetooth: Fix passing NULL to PTR_ERR
  Bluetooth: RFCOMM: Replace use of memcpy_from_msg with bt_skb_sendmmsg
  Bluetooth: SCO: Replace use of memcpy_from_msg with bt_skb_sendmsg
  Bluetooth: Add bt_skb_sendmmsg helper
  Bluetooth: Add bt_skb_sendmsg helper
  ALSA: memalloc: Align buffer allocations in page size
  bitfield.h: Fix "type of reg too small for mask" test
  x86/mce: Deduplicate exception handling
  mmap locking API: initial implementation as rwsem wrappers
  x86/uaccess: Implement macros for CMPXCHG on user addresses
  x86: get rid of small constant size cases in raw_copy_{to,from}_user()
  locking/refcount: Consolidate implementations of refcount_t
  locking/refcount: Consolidate REFCOUNT_{MAX,SATURATED} definitions
  locking/refcount: Move saturation warnings out of line
  locking/refcount: Improve performance of generic REFCOUNT_FULL code
  locking/refcount: Move the bulk of the REFCOUNT_FULL implementation into the <linux/refcount.h> header
  locking/refcount: Remove unused refcount_*_checked() variants
  locking/refcount: Ensure integer operands are treated as signed
  locking/refcount: Define constants for saturation and max refcount values
  ima: remove the IMA_TEMPLATE Kconfig option
  dlm: fix pending remove if msg allocation fails
  bpf: Make sure mac_header was set before using it
  mm/mempolicy: fix uninit-value in mpol_rebind_policy()
  spi: bcm2835: bcm2835_spi_handle_err(): fix NULL pointer deref for non DMA transfers
  tcp: Fix data-races around sysctl_tcp_max_reordering.
  tcp: Fix a data-race around sysctl_tcp_rfc1337.
  tcp: Fix a data-race around sysctl_tcp_stdurg.
  tcp: Fix a data-race around sysctl_tcp_retrans_collapse.
  tcp: Fix data-races around sysctl_tcp_slow_start_after_idle.
  tcp: Fix a data-race around sysctl_tcp_thin_linear_timeouts.
  tcp: Fix data-races around sysctl_tcp_recovery.
  tcp: Fix a data-race around sysctl_tcp_early_retrans.
  tcp: Fix data-races around sysctl knobs related to SYN option.
  udp: Fix a data-race around sysctl_udp_l3mdev_accept.
  ipv4: Fix a data-race around sysctl_fib_multipath_use_neigh.
  be2net: Fix buffer overflow in be_get_module_eeprom
  gpio: pca953x: only use single read/write for No AI mode
  ixgbe: Add locking to prevent panic when setting sriov_numvfs to zero
  i40e: Fix erroneous adapter reinitialization during recovery process
  iavf: Fix handling of dummy receive descriptors
  tcp: Fix data-races around sysctl_tcp_fastopen.
  tcp: Fix data-races around sysctl_max_syn_backlog.
  tcp: Fix a data-race around sysctl_tcp_tw_reuse.
  tcp: Fix a data-race around sysctl_tcp_notsent_lowat.
  tcp: Fix data-races around some timeout sysctl knobs.
  tcp: Fix data-races around sysctl_tcp_reordering.
  tcp: Fix data-races around sysctl_tcp_syncookies.
  igmp: Fix a data-race around sysctl_igmp_max_memberships.
  igmp: Fix data-races around sysctl_igmp_llm_reports.
  net/tls: Fix race in TLS device down flow
  net: stmmac: fix dma queue left shift overflow issue
  i2c: cadence: Change large transfer count reset logic to be unconditional
  tcp: Fix a data-race around sysctl_tcp_probe_interval.
  tcp: Fix a data-race around sysctl_tcp_probe_threshold.
  tcp: Fix a data-race around sysctl_tcp_mtu_probe_floor.
  tcp: Fix data-races around sysctl_tcp_min_snd_mss.
  tcp: Fix data-races around sysctl_tcp_base_mss.
  tcp: Fix data-races around sysctl_tcp_mtu_probing.
  tcp/dccp: Fix a data-race around sysctl_tcp_fwmark_accept.
  ip: Fix a data-race around sysctl_fwmark_reflect.
  ip: Fix data-races around sysctl_ip_nonlocal_bind.
  ip: Fix data-races around sysctl_ip_fwd_use_pmtu.
  ip: Fix data-races around sysctl_ip_no_pmtu_disc.
  igc: Reinstate IGC_REMOVED logic and implement it properly
  perf/core: Fix data race between perf_event_set_output() and perf_mmap_close()
  pinctrl: ralink: Check for null return of devm_kcalloc
  power/reset: arm-versatile: Fix refcount leak in versatile_reboot_probe
  xfrm: xfrm_policy: fix a possible double xfrm_pols_put() in xfrm_bundle_lookup()
  serial: mvebu-uart: correctly report configured baudrate value
  PCI: hv: Fix interrupt mapping for multi-MSI
  PCI: hv: Reuse existing IRTE allocation in compose_msi_msg()
  PCI: hv: Fix hv_arch_irq_unmask() for multi-MSI
  PCI: hv: Fix multi-MSI to allow more than one MSI vector
  xen/gntdev: Ignore failure to unmap INVALID_GRANT_HANDLE
  lockdown: Fix kexec lockdown bypass with ima policy
  mlxsw: spectrum_router: Fix IPv4 nexthop gateway indication
  riscv: add as-options for modules with assembly compontents
  pinctrl: stm32: fix optional IRQ support to gpios
  Revert "cgroup: Use separate src/dst nodes when preloading css_sets for migration"
  Linux 5.4.207
  can: m_can: m_can_tx_handler(): fix use after free of skb
  serial: pl011: UPSTAT_AUTORTS requires .throttle/unthrottle
  serial: stm32: Clear prev values before setting RTS delays
  serial: 8250: fix return error code in serial8250_request_std_resource()
  tty: serial: samsung_tty: set dma burst_size to 1
  usb: dwc3: gadget: Fix event pending check
  usb: typec: add missing uevent when partner support PD
  USB: serial: ftdi_sio: add Belimo device ids
  signal handling: don't use BUG_ON() for debugging
  ARM: dts: stm32: use the correct clock source for CEC on stm32mp151
  soc: ixp4xx/npe: Fix unused match warning
  x86: Clear .brk area at early boot
  irqchip: or1k-pic: Undefine mask_ack for level triggered hardware
  ASoC: madera: Fix event generation for rate controls
  ASoC: madera: Fix event generation for OUT1 demux
  ASoC: cs47l15: Fix event generation for low power mux control
  ASoC: wm5110: Fix DRE control
  ASoC: ops: Fix off by one in range control validation
  net: sfp: fix memory leak in sfp_probe()
  nvme: fix regression when disconnect a recovering ctrl
  NFC: nxp-nci: don't print header length mismatch on i2c error
  net: tipc: fix possible refcount leak in tipc_sk_create()
  platform/x86: hp-wmi: Ignore Sanitization Mode event
  cpufreq: pmac32-cpufreq: Fix refcount leak bug
  netfilter: br_netfilter: do not skip all hooks with 0 priority
  virtio_mmio: Restore guest page size on resume
  virtio_mmio: Add missing PM calls to freeze/restore
  mm: sysctl: fix missing numa_stat when !CONFIG_HUGETLB_PAGE
  sfc: fix kernel panic when creating VF
  seg6: bpf: fix skb checksum in bpf_push_seg6_encap()
  seg6: fix skb checksum in SRv6 End.B6 and End.B6.Encaps behaviors
  seg6: fix skb checksum evaluation in SRH encapsulation/insertion
  sfc: fix use after free when disabling sriov
  net: ftgmac100: Hold reference returned by of_get_child_by_name()
  ipv4: Fix data-races around sysctl_ip_dynaddr.
  raw: Fix a data-race around sysctl_raw_l3mdev_accept.
  icmp: Fix a data-race around sysctl_icmp_ratemask.
  icmp: Fix a data-race around sysctl_icmp_ratelimit.
  drm/i915/gt: Serialize TLB invalidates with GT resets
  ARM: dts: sunxi: Fix SPI NOR campatible on Orange Pi Zero
  ARM: dts: at91: sama5d2: Fix typo in i2s1 node
  ipv4: Fix a data-race around sysctl_fib_sync_mem.
  icmp: Fix data-races around sysctl.
  cipso: Fix data-races around sysctl.
  net: Fix data-races around sysctl_mem.
  inetpeer: Fix data-races around sysctl.
  net: stmmac: dwc-qos: Disable split header for Tegra194
  ASoC: sgtl5000: Fix noise on shutdown/remove
  ima: Fix a potential integer overflow in ima_appraise_measurement
  drm/i915: fix a possible refcount leak in intel_dp_add_mst_connector()
  ARM: 9210/1: Mark the FDT_FIXED sections as shareable
  ARM: 9209/1: Spectre-BHB: avoid pr_info() every time a CPU comes out of idle
  ARM: dts: imx6qdl-ts7970: Fix ngpio typo and count
  ext4: fix race condition between ext4_write and ext4_convert_inline_data
  sched/rt: Disable RT_RUNTIME_SHARE by default
  Revert "evm: Fix memleak in init_desc"
  nilfs2: fix incorrect masking of permission flags for symlinks
  drm/panfrost: Fix shrinker list corruption by madvise IOCTL
  cgroup: Use separate src/dst nodes when preloading css_sets for migration
  wifi: mac80211: fix queue selection for mesh/OCB interfaces
  ARM: 9214/1: alignment: advance IT state after emulating Thumb instruction
  ARM: 9213/1: Print message about disabled Spectre workarounds only once
  ip: fix dflt addr selection for connected nexthop
  net: sock: tracing: Fix sock_exceed_buf_limit not to dereference stale pointer
  tracing/histograms: Fix memory leak problem
  xen/netback: avoid entering xenvif_rx_next_skb() with an empty rx queue
  ALSA: hda/realtek - Enable the headset-mic on a Xiaomi's laptop
  ALSA: hda/realtek - Fix headset mic problem for a HP machine with alc221
  ALSA: hda/realtek - Fix headset mic problem for a HP machine with alc671
  ALSA: hda/conexant: Apply quirk for another HP ProDesk 600 G3 model
  ALSA: hda - Add fixup for Dell Latitidue E5430
  Linux 5.4.206
  Revert "mtd: rawnand: gpmi: Fix setting busy timeout setting"
  Linux 5.4.205
  dmaengine: ti: Add missing put_device in ti_dra7_xbar_route_allocate
  dmaengine: ti: Fix refcount leak in ti_dra7_xbar_route_allocate
  dmaengine: at_xdma: handle errors of at_xdmac_alloc_desc() correctly
  dmaengine: pl330: Fix lockdep warning about non-static key
  ida: don't use BUG_ON() for debugging
  dt-bindings: dma: allwinner,sun50i-a64-dma: Fix min/max typo
  misc: rtsx_usb: set return value in rsp_buf alloc err path
  misc: rtsx_usb: use separate command and response buffers
  misc: rtsx_usb: fix use of dma mapped buffer for usb bulk transfer
  dmaengine: imx-sdma: Allow imx8m for imx7 FW revs
  i2c: cadence: Unregister the clk notifier in error path
  selftests: forwarding: fix error message in learning_test
  selftests: forwarding: fix learning_test when h1 supports IFF_UNICAST_FLT
  selftests: forwarding: fix flood_unicast_test when h2 supports IFF_UNICAST_FLT
  ibmvnic: Properly dispose of all skbs during a failover.
  ARM: at91: pm: use proper compatibles for sam9x60's rtc and rtt
  ARM: at91: pm: use proper compatible for sama5d2's rtc
  pinctrl: sunxi: sunxi_pconf_set: use correct offset
  pinctrl: sunxi: a83t: Fix NAND function name for some pins
  ARM: meson: Fix refcount leak in meson_smp_prepare_cpus
  xfs: remove incorrect ASSERT in xfs_rename
  can: kvaser_usb: kvaser_usb_leaf: fix bittiming limits
  can: kvaser_usb: kvaser_usb_leaf: fix CAN clock frequency regression
  can: kvaser_usb: replace run-time checks with struct kvaser_usb_driver_info
  powerpc/powernv: delay rng platform device creation until later in boot
  video: of_display_timing.h: include errno.h
  fbcon: Prevent that screen size is smaller than font size
  fbcon: Disallow setting font bigger than screen size
  fbmem: Check virtual screen sizes in fb_set_var()
  fbdev: fbmem: Fix logo center image dx issue
  iommu/vt-d: Fix PCI bus rescan device hot add
  net: rose: fix UAF bug caused by rose_t0timer_expiry
  usbnet: fix memory leak in error case
  can: gs_usb: gs_usb_open/close(): fix memory leak
  can: grcan: grcan_probe(): remove extra of_node_get()
  can: bcm: use call_rcu() instead of costly synchronize_rcu()
  mm/slub: add missing TID updates on slab deactivation
  esp: limit skb_page_frag_refill use to a single page
  Linux 5.4.204
  clocksource/drivers/ixp4xx: remove EXPORT_SYMBOL_GPL from ixp4xx_timer_setup()
  net: usb: qmi_wwan: add Telit 0x1070 composition
  net: usb: qmi_wwan: add Telit 0x1060 composition
  xen/arm: Fix race in RB-tree based P2M accounting
  xen/blkfront: force data bouncing when backend is untrusted
  xen/netfront: force data bouncing when backend is untrusted
  xen/netfront: fix leaking data in shared pages
  xen/blkfront: fix leaking data in shared pages
  selftests/rseq: Change type of rseq_offset to ptrdiff_t
  selftests/rseq: x86-32: use %gs segment selector for accessing rseq thread area
  selftests/rseq: x86-64: use %fs segment selector for accessing rseq thread area
  selftests/rseq: Fix: work-around asm goto compiler bugs
  selftests/rseq: Remove arm/mips asm goto compiler work-around
  selftests/rseq: Fix warnings about #if checks of undefined tokens
  selftests/rseq: Fix ppc32 offsets by using long rather than off_t
  selftests/rseq: Fix ppc32 missing instruction selection "u" and "x" for load/store
  selftests/rseq: Fix ppc32: wrong rseq_cs 32-bit field pointer on big endian
  selftests/rseq: Uplift rseq selftests for compatibility with glibc-2.35
  selftests/rseq: Introduce thread pointer getters
  selftests/rseq: Introduce rseq_get_abi() helper
  selftests/rseq: Remove volatile from __rseq_abi
  selftests/rseq: Remove useless assignment to cpu variable
  selftests/rseq: introduce own copy of rseq uapi header
  selftests/rseq: remove ARRAY_SIZE define from individual tests
  rseq/selftests,x86_64: Add rseq_offset_deref_addv()
  ipv6/sit: fix ipip6_tunnel_get_prl return value
  sit: use min
  net: dsa: bcm_sf2: force pause link settings
  hwmon: (ibmaem) don't call platform_device_del() if platform_device_add() fails
  xen/gntdev: Avoid blocking in unmap_grant_pages()
  net: tun: avoid disabling NAPI twice
  NFC: nxp-nci: Don't issue a zero length i2c_master_read()
  nfc: nfcmrvl: Fix irq_of_parse_and_map() return value
  net: bonding: fix use-after-free after 802.3ad slave unbind
  net: bonding: fix possible NULL deref in rlb code
  net/sched: act_api: Notify user space if any actions were flushed before error
  netfilter: nft_dynset: restore set element counter when failing to update
  s390: remove unneeded 'select BUILD_BIN2C'
  PM / devfreq: exynos-ppmu: Fix refcount leak in of_get_devfreq_events
  caif_virtio: fix race between virtio_device_ready() and ndo_open()
  net: ipv6: unexport __init-annotated seg6_hmac_net_init()
  usbnet: fix memory allocation in helpers
  linux/dim: Fix divide by 0 in RDMA DIM
  RDMA/qedr: Fix reporting QP timeout attribute
  net: tun: stop NAPI when detaching queues
  net: tun: unlink NAPI from device on destruction
  selftests/net: pass ipv6_args to udpgso_bench's IPv6 TCP test
  virtio-net: fix race between ndo_open() and virtio_device_ready()
  net: usb: ax88179_178a: Fix packet receiving
  net: rose: fix UAF bugs caused by timer handler
  SUNRPC: Fix READ_PLUS crasher
  s390/archrandom: simplify back to earlier design and initialize earlier
  dm raid: fix KASAN warning in raid5_add_disks
  dm raid: fix accesses beyond end of raid member array
  powerpc/bpf: Fix use of user_pt_regs in uapi
  powerpc/prom_init: Fix kernel config grep
  nvdimm: Fix badblocks clear off-by-one error
  ipv6: take care of disable_policy when restoring routes
  Linux 5.4.203
  crypto: arm/ghash-ce - define fpu before fpu registers are referenced
  crypto: arm - use Kconfig based compiler checks for crypto opcodes
  ARM: 9029/1: Make iwmmxt.S support Clang's integrated assembler
  ARM: OMAP2+: drop unnecessary adrl
  ARM: 8929/1: use APSR_nzcv instead of r15 as mrc operand
  ARM: 8933/1: replace Sun/Solaris style flag on section directive
  crypto: arm/sha512-neon - avoid ADRL pseudo instruction
  crypto: arm/sha256-neon - avoid ADRL pseudo instruction
  ARM: 8971/1: replace the sole use of a symbol with its definition
  ARM: 8990/1: use VFP assembler mnemonics in register load/store macros
  ARM: 8989/1: use .fpu assembler directives instead of assembler arguments
  net: mscc: ocelot: allow unregistered IP multicast flooding
  kexec_file: drop weak attribute from arch_kexec_apply_relocations[_add]
  powerpc/ftrace: Remove ftrace init tramp once kernel init is complete
  drm: remove drm_fb_helper_modinit
  Linux 5.4.202
  powerpc/pseries: wire up rng during setup_arch()
  kbuild: link vmlinux only once for CONFIG_TRIM_UNUSED_KSYMS (2nd attempt)
  random: update comment from copy_to_user() -> copy_to_iter()
  modpost: fix section mismatch check for exported init/exit sections
  ARM: cns3xxx: Fix refcount leak in cns3xxx_init
  ARM: Fix refcount leak in axxia_boot_secondary
  soc: bcm: brcmstb: pm: pm-arm: Fix refcount leak in brcmstb_pm_probe
  ARM: exynos: Fix refcount leak in exynos_map_pmu
  ARM: dts: imx6qdl: correct PU regulator ramp delay
  powerpc/powernv: wire up rng during setup_arch
  powerpc/rtas: Allow ibm,platform-dump RTAS call with null buffer address
  powerpc: Enable execve syscall exit tracepoint
  parisc: Enable ARCH_HAS_STRICT_MODULE_RWX
  xtensa: Fix refcount leak bug in time.c
  xtensa: xtfpga: Fix refcount leak bug in setup
  iio: adc: axp288: Override TS pin bias current for some models
  iio: adc: stm32: fix maximum clock rate for stm32mp15x
  iio: trigger: sysfs: fix use-after-free on remove
  iio: gyro: mpu3050: Fix the error handling in mpu3050_power_up()
  iio: accel: mma8452: ignore the return value of reset operation
  iio:accel:mxc4005: rearrange iio trigger get and register
  iio:accel:bma180: rearrange iio trigger get and register
  iio:chemical:ccs811: rearrange iio trigger get and register
  usb: chipidea: udc: check request status before setting device address
  xhci: turn off port power in shutdown
  iio: adc: vf610: fix conversion mode sysfs node name
  s390/cpumf: Handle events cycles and instructions identical
  gpio: winbond: Fix error code in winbond_gpio_get()
  Revert "net/tls: fix tls_sk_proto_close executed repeatedly"
  virtio_net: fix xdp_rxq_info bug after suspend/resume
  igb: Make DMA faster when CPU is active on the PCIe link
  regmap-irq: Fix a bug in regmap_irq_enable() for type_in_mask chips
  ice: ethtool: advertise 1000M speeds properly
  afs: Fix dynamic root getattr
  MIPS: Remove repetitive increase irq_err_count
  x86/xen: Remove undefined behavior in setup_features()
  udmabuf: add back sanity check
  net/tls: fix tls_sk_proto_close executed repeatedly
  erspan: do not assume transport header is always set
  drm/msm/mdp4: Fix refcount leak in mdp4_modeset_init_intf
  net/sched: sch_netem: Fix arithmetic in netem_dump() for 32-bit platforms
  bonding: ARP monitor spams NETDEV_NOTIFY_PEERS notifiers
  phy: aquantia: Fix AN when higher speeds than 1G are not advertised
  bpf: Fix request_sock leak in sk lookup helpers
  USB: serial: option: add Quectel RM500K module support
  USB: serial: option: add Quectel EM05-G modem
  USB: serial: option: add Telit LE910Cx 0x1250 composition
  random: quiet urandom warning ratelimit suppression message
  dm mirror log: clear log bits up to BITS_PER_LONG boundary
  dm era: commit metadata in postsuspend after worker stops
  ata: libata: add qc->flags in ata_qc_complete_template tracepoint
  mtd: rawnand: gpmi: Fix setting busy timeout setting
  mmc: sdhci-pci-o2micro: Fix card detect by dealing with debouncing
  net: openvswitch: fix parsing of nw_proto for IPv6 fragments
  ALSA: hda/realtek: Add quirk for Clevo PD70PNT
  ALSA: hda/realtek - ALC897 headset MIC no sound
  ALSA: hda/conexant: Fix missing beep setup
  ALSA: hda/via: Fix missing beep setup
  random: schedule mix_interrupt_randomness() less often
  vt: drop old FONT ioctls
  Linux 5.4.201
  Revert "hwmon: Make chip parameter for with_info API mandatory"
  arm64: mm: Don't invalidate FROM_DEVICE buffers at start of DMA transfer
  tcp: drop the hash_32() part from the index calculation
  tcp: increase source port perturb table to 2^16
  tcp: dynamically allocate the perturb table used by source ports
  tcp: add small random increments to the source port
  tcp: use different parts of the port_offset for index and offset
  tcp: add some entropy in __inet_hash_connect()
  usb: gadget: u_ether: fix regression in setting fixed MAC address
  dm: remove special-casing of bio-based immutable singleton target on NVMe
  s390/mm: use non-quiescing sske for KVM switch to keyed guest
  UPSTREAM: ext4: verify dir block before splitting it
  UPSTREAM: ext4: fix use-after-free in ext4_rename_dir_prepare
  BACKPORT: ext4: Only advertise encrypted_casefold when encryption and unicode are enabled
  BACKPORT: ext4: fix no-key deletion for encrypt+casefold
  BACKPORT: ext4: optimize match for casefolded encrypted dirs
  BACKPORT: ext4: handle casefolding with encryption
  Revert "ANDROID: ext4: Handle casefolding with encryption"
  Revert "ANDROID: ext4: Optimize match for casefolded encrypted dirs"
  ANDROID: cpu/hotplug: avoid breaking Android ABI by fusing cpuhp steps
  ANDROID: change function signatures for some random functions.
  Revert "mailbox: forward the hrtimer if not queued and under a lock"
  Revert "drm: fix EDID struct for old ARM OABI format"
  Revert "ALSA: jack: Access input_dev under mutex"
  Linux 5.4.200
  powerpc/mm: Switch obsolete dssall to .long
  riscv: Less inefficient gcc tishift helpers (and export their symbols)
  RISC-V: fix barrier() use in <vdso/processor.h>
  arm64: kprobes: Use BRK instead of single-step when executing instructions out-of-line
  net: openvswitch: fix leak of nested actions
  net: openvswitch: fix misuse of the cached connection on tuple changes
  net/sched: act_police: more accurate MTU policing
  virtio-pci: Remove wrong address verification in vp_del_vqs()
  ALSA: hda/realtek: fix right sounds and mute/micmute LEDs for HP machine
  ALSA: hda/realtek: fix mute/micmute LEDs for HP 440 G8
  ext4: add reserved GDT blocks check
  ext4: make variable "count" signed
  ext4: fix bug_on ext4_mb_use_inode_pa
  dm mirror log: round up region bitmap size to BITS_PER_LONG
  serial: 8250: Store to lsr_save_flags after lsr read
  usb: gadget: lpc32xx_udc: Fix refcount leak in lpc32xx_udc_probe
  usb: dwc2: Fix memory leak in dwc2_hcd_init
  USB: serial: io_ti: add Agilent E5805A support
  USB: serial: option: add support for Cinterion MV31 with new baseline
  comedi: vmk80xx: fix expression for tx buffer size
  i2c: designware: Use standard optional ref clock implementation
  irqchip/gic-v3: Fix refcount leak in gic_populate_ppi_partitions
  irqchip/gic-v3: Fix error handling in gic_populate_ppi_partitions
  irqchip/gic/realview: Fix refcount leak in realview_gic_of_init
  faddr2line: Fix overlapping text section failures, the sequel
  certs/blacklist_hashes.c: fix const confusion in certs blacklist
  arm64: ftrace: fix branch range checks
  net: bgmac: Fix an erroneous kfree() in bgmac_remove()
  mlxsw: spectrum_cnt: Reorder counter pools
  misc: atmel-ssc: Fix IRQ check in ssc_probe
  tty: goldfish: Fix free_irq() on remove
  i40e: Fix call trace in setup_tx_descriptors
  i40e: Fix calculating the number of queue pairs
  i40e: Fix adding ADQ filter to TC0
  clocksource: hyper-v: unexport __init-annotated hv_init_clocksource()
  pNFS: Don't keep retrying if the server replied NFS4ERR_LAYOUTUNAVAILABLE
  random: credit cpu and bootloader seeds by default
  net: ethernet: mtk_eth_soc: fix misuse of mem alloc interface netdev[napi]_alloc_frag
  ipv6: Fix signed integer overflow in l2tp_ip6_sendmsg
  nfc: nfcmrvl: Fix memory leak in nfcmrvl_play_deferred
  virtio-mmio: fix missing put_device() when vm_cmdline_parent registration failed
  ALSA: hda/realtek - Add HW8326 support
  scsi: pmcraid: Fix missing resource cleanup in error case
  scsi: ipr: Fix missing/incorrect resource cleanup in error case
  scsi: lpfc: Allow reduced polling rate for nvme_admin_async_event cmd completion
  scsi: lpfc: Fix port stuck in bypassed state after LIP in PT2PT topology
  scsi: vmw_pvscsi: Expand vcpuHint to 16 bits
  ASoC: wm_adsp: Fix event generation for wm_adsp_fw_put()
  ASoC: es8328: Fix event generation for deemphasis control
  ASoC: wm8962: Fix suspend while playing music
  ata: libata-core: fix NULL pointer deref in ata_host_alloc_pinfo()
  ASoC: cs42l56: Correct typo in minimum level for SX volume controls
  ASoC: cs42l52: Correct TLV for Bypass Volume
  ASoC: cs53l30: Correct number of volume levels on SX controls
  ASoC: cs35l36: Update digital volume TLV
  ASoC: cs42l52: Fix TLV scales for mixer controls
  dma-debug: make things less spammy under memory pressure
  ASoC: nau8822: Add operation for internal PLL off and on
  powerpc/kasan: Silence KASAN warnings in __get_wchan()
  random: account for arch randomness in bits
  random: mark bootloader randomness code as __init
  random: avoid checking crng_ready() twice in random_init()
  crypto: drbg - make reseeding from get_random_bytes() synchronous
  crypto: drbg - always try to free Jitter RNG instance
  crypto: drbg - move dynamic ->reseed_threshold adjustments to __drbg_seed()
  crypto: drbg - track whether DRBG was seeded with !rng_is_initialized()
  crypto: drbg - prepare for more fine-grained tracking of seeding state
  crypto: drbg - always seeded with SP800-90B compliant noise source
  Revert "random: use static branch for crng_ready()"
  random: check for signals after page of pool writes
  random: wire up fops->splice_{read,write}_iter()
  random: convert to using fops->write_iter()
  random: convert to using fops->read_iter()
  random: unify batched entropy implementations
  random: move randomize_page() into mm where it belongs
  random: move initialization functions out of hot pages
  random: make consistent use of buf and len
  random: use proper return types on get_random_{int,long}_wait()
  random: remove extern from functions in header
  random: use static branch for crng_ready()
  random: credit architectural init the exact amount
  random: handle latent entropy and command line from random_init()
  random: use proper jiffies comparison macro
  random: remove ratelimiting for in-kernel unseeded randomness
  random: move initialization out of reseeding hot path
  random: avoid initializing twice in credit race
  random: use symbolic constants for crng_init states
  siphash: use one source of truth for siphash permutations
  random: help compiler out with fast_mix() by using simpler arguments
  random: do not use input pool from hard IRQs
  random: order timer entropy functions below interrupt functions
  random: do not pretend to handle premature next security model
  random: use first 128 bits of input as fast init
  random: do not use batches when !crng_ready()
  random: insist on random_get_entropy() existing in order to simplify
  xtensa: use fallback for random_get_entropy() instead of zero
  sparc: use fallback for random_get_entropy() instead of zero
  um: use fallback for random_get_entropy() instead of zero
  x86/tsc: Use fallback for random_get_entropy() instead of zero
  nios2: use fallback for random_get_entropy() instead of zero
  arm: use fallback for random_get_entropy() instead of zero
  mips: use fallback for random_get_entropy() instead of just c0 random
  m68k: use fallback for random_get_entropy() instead of zero
  timekeeping: Add raw clock fallback for random_get_entropy()
  powerpc: define get_cycles macro for arch-override
  alpha: define get_cycles macro for arch-override
  parisc: define get_cycles macro for arch-override
  s390: define get_cycles macro for arch-override
  ia64: define get_cycles macro for arch-override
  init: call time_init() before rand_initialize()
  random: fix sysctl documentation nits
  random: document crng_fast_key_erasure() destination possibility
  random: make random_get_entropy() return an unsigned long
  random: allow partial reads if later user copies fail
  random: check for signals every PAGE_SIZE chunk of /dev/[u]random
  random: check for signal_pending() outside of need_resched() check
  random: do not allow user to keep crng key around on stack
  random: do not split fast init input in add_hwgenerator_randomness()
  random: mix build-time latent entropy into pool at init
  random: re-add removed comment about get_random_{u32,u64} reseeding
  random: treat bootloader trust toggle the same way as cpu trust toggle
  random: skip fast_init if hwrng provides large chunk of entropy
  random: check for signal and try earlier when generating entropy
  random: reseed more often immediately after booting
  random: make consistent usage of crng_ready()
  random: use SipHash as interrupt entropy accumulator
  random: replace custom notifier chain with standard one
  random: don't let 644 read-only sysctls be written to
  random: give sysctl_random_min_urandom_seed a more sensible value
  random: do crng pre-init loading in worker rather than irq
  random: unify cycles_t and jiffies usage and types
  random: cleanup UUID handling
  random: only wake up writers after zap if threshold was passed
  random: round-robin registers as ulong, not u32
  random: clear fast pool, crng, and batches in cpuhp bring up
  random: pull add_hwgenerator_randomness() declaration into random.h
  random: check for crng_init == 0 in add_device_randomness()
  random: unify early init crng load accounting
  random: do not take pool spinlock at boot
  random: defer fast pool mixing to worker
  random: rewrite header introductory comment
  random: group sysctl functions
  random: group userspace read/write functions
  random: group entropy collection functions
  random: group entropy extraction functions
  random: group crng functions
  random: group initialization wait functions
  random: remove whitespace and reorder includes
  random: remove useless header comment
  random: introduce drain_entropy() helper to declutter crng_reseed()
  random: deobfuscate irq u32/u64 contributions
  random: add proper SPDX header
  random: remove unused tracepoints
  random: remove ifdef'd out interrupt bench
  random: tie batched entropy generation to base_crng generation
  random: fix locking for crng_init in crng_reseed()
  random: zero buffer after reading entropy from userspace
  random: remove outdated INT_MAX >> 6 check in urandom_read()
  random: make more consistent use of integer types
  random: use hash function for crng_slow_load()
  random: use simpler fast key erasure flow on per-cpu keys
  random: absorb fast pool into input pool after fast load
  random: do not xor RDRAND when writing into /dev/random
  random: ensure early RDSEED goes through mixer on init
  random: inline leaves of rand_initialize()
  random: get rid of secondary crngs
  random: use RDSEED instead of RDRAND in entropy extraction
  random: fix locking in crng_fast_load()
  random: remove batched entropy locking
  random: remove use_input_pool parameter from crng_reseed()
  random: make credit_entropy_bits() always safe
  random: always wake up entropy writers after extraction
  random: use linear min-entropy accumulation crediting
  random: simplify entropy debiting
  random: use computational hash for entropy extraction
  random: only call crng_finalize_init() for primary_crng
  random: access primary_pool directly rather than through pointer
  random: continually use hwgenerator randomness
  random: simplify arithmetic function flow in account()
  random: selectively clang-format where it makes sense
  random: access input_pool_data directly rather than through pointer
  random: cleanup fractional entropy shift constants
  random: prepend remaining pool constants with POOL_
  random: de-duplicate INPUT_POOL constants
  random: remove unused OUTPUT_POOL constants
  random: rather than entropy_store abstraction, use global
  random: remove unused extract_entropy() reserved argument
  random: remove incomplete last_data logic
  random: cleanup integer types
  random: cleanup poolinfo abstraction
  random: fix typo in comments
  random: don't reset crng_init_cnt on urandom_read()
  random: avoid superfluous call to RDRAND in CRNG extraction
  random: early initialization of ChaCha constants
  random: initialize ChaCha20 constants with correct endianness
  random: use IS_ENABLED(CONFIG_NUMA) instead of ifdefs
  random: harmonize "crng init done" messages
  random: mix bootloader randomness into pool
  random: do not re-init if crng_reseed completes before primary init
  random: do not sign extend bytes for rotation when mixing
  random: use BLAKE2s instead of SHA1 in extraction
  random: remove unused irq_flags argument from add_interrupt_randomness()
  random: document add_hwgenerator_randomness() with other input functions
  crypto: blake2s - adjust include guard naming
  crypto: blake2s - include <linux/bug.h> instead of <asm/bug.h>
  MAINTAINERS: co-maintain random.c
  random: remove dead code left over from blocking pool
  random: avoid arch_get_random_seed_long() when collecting IRQ randomness
  random: add arch_get_random_*long_early()
  powerpc: Use bool in archrandom.h
  linux/random.h: Mark CONFIG_ARCH_RANDOM functions __must_check
  linux/random.h: Use false with bool
  linux/random.h: Remove arch_has_random, arch_has_random_seed
  s390: Remove arch_has_random, arch_has_random_seed
  powerpc: Remove arch_has_random, arch_has_random_seed
  x86: Remove arch_has_random, arch_has_random_seed
  random: avoid warnings for !CONFIG_NUMA builds
  random: split primary/secondary crng init paths
  random: remove some dead code of poolinfo
  random: fix typo in add_timer_randomness()
  random: Add and use pr_fmt()
  random: convert to ENTROPY_BITS for better code readability
  random: remove unnecessary unlikely()
  random: remove kernel.random.read_wakeup_threshold
  random: delete code to pull data into pools
  random: remove the blocking pool
  random: make /dev/random be almost like /dev/urandom
  random: ignore GRND_RANDOM in getentropy(2)
  random: add GRND_INSECURE to return best-effort non-cryptographic bytes
  random: Add a urandom_read_nowait() for random APIs that don't warn
  random: Don't wake crng_init_wait when crng_init == 1
  random: don't forget compat_ioctl on urandom
  compat_ioctl: remove /dev/random commands
  lib/crypto: sha1: re-roll loops to reduce code size
  lib/crypto: blake2s: move hmac construction into wireguard
  crypto: blake2s - generic C library implementation and selftest
  nfc: st21nfca: fix incorrect sizing calculations in EVT_TRANSACTION
  bpf: Fix incorrect memory charge cost calculation in stack_map_alloc()
  9p: missing chunk of "fs/9p: Don't update file type when updating file attributes"
  Revert "ext4: fix use-after-free in ext4_rename_dir_prepare"
  Revert "ext4: verify dir block before splitting it"
  Linux 5.4.199
  x86/speculation/mmio: Print SMT warning
  KVM: x86/speculation: Disable Fill buffer clear within guests
  x86/speculation/mmio: Reuse SRBDS mitigation for SBDS
  x86/speculation/srbds: Update SRBDS mitigation selection
  x86/speculation/mmio: Add sysfs reporting for Processor MMIO Stale Data
  x86/speculation/mmio: Enable CPU Fill buffer clearing on idle
  x86/bugs: Group MDS, TAA & Processor MMIO Stale Data mitigations
  x86/speculation/mmio: Add mitigation for Processor MMIO Stale Data
  x86/speculation: Add a common function for MD_CLEAR mitigation update
  x86/speculation/mmio: Enumerate Processor MMIO Stale Data bug
  Documentation: Add documentation for Processor MMIO Stale Data
  x86/cpu: Add another Alder Lake CPU to the Intel family
  x86/cpu: Add Lakefield, Alder Lake and Rocket Lake models to the to Intel CPU family
  x86/cpu: Add Jasper Lake to Intel family
  cpu/speculation: Add prototype for cpu_show_srbds()
  Linux 5.4.198
  tcp: fix tcp_mtup_probe_success vs wrong snd_cwnd
  mtd: cfi_cmdset_0002: Use chip_ready() for write on S29GL064N
  md/raid0: Ignore RAID0 layout if the second zone has only one device
  powerpc/32: Fix overread/overwrite of thread_struct via ptrace
  Input: bcm5974 - set missing URB_NO_TRANSFER_DMA_MAP urb flag
  ixgbe: fix unexpected VLAN Rx in promisc mode on VF
  ixgbe: fix bcast packets Rx on VF after promisc removal
  nfc: st21nfca: fix memory leaks in EVT_TRANSACTION handling
  nfc: st21nfca: fix incorrect validating logic in EVT_TRANSACTION
  mmc: block: Fix CQE recovery reset success
  ata: libata-transport: fix {dma|pio|xfer}_mode sysfs files
  cifs: return errors during session setup during reconnects
  ALSA: hda/conexant - Fix loopback issue with CX20632
  scripts/gdb: change kernel config dumping method
  vringh: Fix loop descriptors check in the indirect cases
  nodemask: Fix return values to be unsigned
  cifs: version operations for smb20 unneeded when legacy support disabled
  s390/gmap: voluntarily schedule during key setting
  nbd: fix io hung while disconnecting device
  nbd: fix race between nbd_alloc_config() and module removal
  nbd: call genl_unregister_family() first in nbd_cleanup()
  x86/cpu: Elide KCSAN for cpu_has() and friends
  modpost: fix undefined behavior of is_arm_mapping_symbol()
  drm/radeon: fix a possible null pointer dereference
  ceph: allow ceph.dir.rctime xattr to be updatable
  Revert "net: af_key: add check for pfkey_broadcast in function pfkey_process"
  scsi: myrb: Fix up null pointer access on myrb_cleanup()
  md: protect md_unregister_thread from reentrancy
  watchdog: wdat_wdt: Stop watchdog when rebooting the system
  kernfs: Separate kernfs_pr_cont_buf and rename_lock.
  serial: msm_serial: disable interrupts in __msm_console_write()
  staging: rtl8712: fix uninit-value in r871xu_drv_init()
  staging: rtl8712: fix uninit-value in usb_read8() and friends
  clocksource/drivers/sp804: Avoid error on multiple instances
  extcon: Modify extcon device to be created after driver data is set
  misc: rtsx: set NULL intfdata when probe fails
  usb: dwc2: gadget: don't reset gadget's driver->bus
  USB: hcd-pci: Fully suspend across freeze/thaw cycle
  drivers: usb: host: Fix deadlock in oxu_bus_suspend()
  drivers: tty: serial: Fix deadlock in sa1100_set_termios()
  USB: host: isp116x: check return value after calling platform_get_resource()
  drivers: staging: rtl8192e: Fix deadlock in rtllib_beacons_stop()
  drivers: staging: rtl8192u: Fix deadlock in ieee80211_beacons_stop()
  tty: Fix a possible resource leak in icom_probe
  tty: synclink_gt: Fix null-pointer-dereference in slgt_clean()
  lkdtm/usercopy: Expand size of "out of frame" object
  iio: st_sensors: Add a local lock for protecting odr
  iio: dummy: iio_simple_dummy: check the return value of kstrdup()
  drm: imx: fix compiler warning with gcc-12
  net: altera: Fix refcount leak in altera_tse_mdio_create
  ip_gre: test csum_start instead of transport header
  net/mlx5: fs, fail conflicting actions
  net/mlx5: Rearm the FW tracer after each tracer event
  net: ipv6: unexport __init-annotated seg6_hmac_init()
  net: xfrm: unexport __init-annotated xfrm4_protocol_init()
  net: mdio: unexport __init-annotated mdio_bus_init()
  SUNRPC: Fix the calculation of xdr->end in xdr_get_next_encode_buffer()
  net/mlx4_en: Fix wrong return value on ioctl EEPROM query failure
  net: dsa: lantiq_gswip: Fix refcount leak in gswip_gphy_fw_list
  bpf, arm64: Clear prog->jited_len along prog->jited
  af_unix: Fix a data-race in unix_dgram_peer_wake_me().
  xen: unexport __init-annotated xen_xlate_map_ballooned_pages()
  netfilter: nf_tables: memleak flow rule from commit path
  ata: pata_octeon_cf: Fix refcount leak in octeon_cf_probe
  netfilter: nat: really support inet nat without l3 address
  xprtrdma: treat all calls not a bcall when bc_serv is NULL
  video: fbdev: pxa3xx-gcu: release the resources correctly in pxa3xx_gcu_probe/remove()
  NFSv4: Don't hold the layoutget locks across multiple RPC calls
  dmaengine: zynqmp_dma: In struct zynqmp_dma_chan fix desc_size data type
  m68knommu: fix undefined reference to `_init_sp'
  m68knommu: set ZERO_PAGE() to the allocated zeroed page
  i2c: cadence: Increase timeout per message if necessary
  f2fs: remove WARN_ON in f2fs_is_valid_blkaddr
  tracing: Avoid adding tracer option before update_tracer_options
  tracing: Fix sleeping function called from invalid context on RT kernel
  mips: cpc: Fix refcount leak in mips_cpc_default_phys_base
  perf c2c: Fix sorting in percent_rmt_hitm_cmp()
  tipc: check attribute length for bearer name
  afs: Fix infinite loop found by xfstest generic/676
  tcp: tcp_rtx_synack() can be called from process context
  net: sched: add barrier to fix packet stuck problem for lockless qdisc
  net/mlx5e: Update netdev features after changing XDP state
  net/mlx5: Don't use already freed action pointer
  nfp: only report pause frame configuration for physical device
  ubi: ubi_create_volume: Fix use-after-free when volume creation failed
  jffs2: fix memory leak in jffs2_do_fill_super
  modpost: fix removing numeric suffixes
  net: dsa: mv88e6xxx: Fix refcount leak in mv88e6xxx_mdios_register
  net: ethernet: mtk_eth_soc: out of bounds read in mtk_hwlro_get_fdir_entry()
  net: sched: fixed barrier to prevent skbuff sticking in qdisc backlog
  s390/crypto: fix scatterwalk_unmap() callers in AES-GCM
  clocksource/drivers/oxnas-rps: Fix irq_of_parse_and_map() return value
  ASoC: fsl_sai: Fix FSL_SAI_xDR/xFR definition
  watchdog: ts4800_wdt: Fix refcount leak in ts4800_wdt_probe
  driver core: fix deadlock in __device_attach
  driver: base: fix UAF when driver_attach failed
  bus: ti-sysc: Fix warnings for unbind for serial
  firmware: dmi-sysfs: Fix memory leak in dmi_sysfs_register_handle
  serial: stm32-usart: Correct CSIZE, bits, and parity
  serial: st-asc: Sanitize CSIZE and correct PARENB for CS7
  serial: sifive: Sanitize CSIZE and c_iflag
  serial: sh-sci: Don't allow CS5-6
  serial: txx9: Don't allow CS5-6
  serial: rda-uart: Don't allow CS5-6
  serial: digicolor-usart: Don't allow CS5-6
  serial: 8250_fintek: Check SER_RS485_RTS_* only with RS485
  serial: meson: acquire port->lock in startup()
  rtc: mt6397: check return value after calling platform_get_resource()
  clocksource/drivers/riscv: Events are stopped during CPU suspend
  soc: rockchip: Fix refcount leak in rockchip_grf_init
  coresight: cpu-debug: Replace mutex with mutex_trylock on panic notifier
  serial: sifive: Report actual baud base rather than fixed 115200
  phy: qcom-qmp: fix pipe-clock imbalance on power-on failure
  rpmsg: qcom_smd: Fix returning 0 if irq_of_parse_and_map() fails
  iio: adc: sc27xx: Fine tune the scale calibration values
  iio: adc: sc27xx: fix read big scale voltage not right
  iio: adc: stmpe-adc: Fix wait_for_completion_timeout return value check
  firmware: stratix10-svc: fix a missing check on list iterator
  usb: dwc3: pci: Fix pm_runtime_get_sync() error checking
  rpmsg: qcom_smd: Fix irq_of_parse_and_map() return value
  pwm: lp3943: Fix duty calculation in case period was clamped
  staging: fieldbus: Fix the error handling path in anybuss_host_common_probe()
  usb: musb: Fix missing of_node_put() in omap2430_probe
  USB: storage: karma: fix rio_karma_init return
  usb: usbip: add missing device lock on tweak configuration cmd
  usb: usbip: fix a refcount leak in stub_probe()
  tty: serial: fsl_lpuart: fix potential bug when using both of_alias_get_id and ida_simple_get
  tty: serial: owl: Fix missing clk_disable_unprepare() in owl_uart_probe
  tty: goldfish: Use tty_port_destroy() to destroy port
  iio: adc: ad7124: Remove shift from scan_type
  staging: greybus: codecs: fix type confusion of list iterator variable
  pcmcia: db1xxx_ss: restrict to MIPS_DB1XXX boards
  md: bcache: check the return value of kzalloc() in detached_dev_do_request()
  block: fix bio_clone_blkg_association() to associate with proper blkcg_gq
  bfq: Make sure bfqg for which we are queueing requests is online
  bfq: Get rid of __bio_blkcg() usage
  bfq: Remove pointless bfq_init_rq() calls
  bfq: Drop pointless unlock-lock pair
  bfq: Avoid merging queues with different parents
  MIPS: IP27: Remove incorrect `cpu_has_fpu' override
  RDMA/rxe: Generate a completion for unsupported/invalid opcode
  Kconfig: add config option for asm goto w/ outputs
  phy: qcom-qmp: fix reset-controller leak on probe errors
  blk-iolatency: Fix inflight count imbalances and IO hangs on offline
  dt-bindings: gpio: altera: correct interrupt-cells
  docs/conf.py: Cope with removal of language=None in Sphinx 5.0.0
  ARM: pxa: maybe fix gpio lookup tables
  phy: qcom-qmp: fix struct clk leak on probe errors
  arm64: dts: qcom: ipq8074: fix the sleep clock frequency
  gma500: fix an incorrect NULL check on list iterator
  tilcdc: tilcdc_external: fix an incorrect NULL check on list iterator
  serial: pch: don't overwrite xmit->buf[0] by x_char
  carl9170: tx: fix an incorrect use of list iterator
  ASoC: rt5514: Fix event generation for "DSP Voice Wake Up" control
  rtl818x: Prevent using not initialized queues
  hugetlb: fix huge_pmd_unshare address update
  nodemask.h: fix compilation error with GCC12
  iommu/msm: Fix an incorrect NULL check on list iterator
  um: Fix out-of-bounds read in LDT setup
  um: chan_user: Fix winch_tramp() return value
  mac80211: upgrade passive scan to active scan on DFS channels after beacon rx
  irqchip: irq-xtensa-mx: fix initial IRQ affinity
  irqchip/armada-370-xp: Do not touch Performance Counter Overflow on A375, A38x, A39x
  RDMA/hfi1: Fix potential integer multiplication overflow errors
  Kconfig: Add option for asm goto w/ tied outputs to workaround clang-13 bug
  media: coda: Add more H264 levels for CODA960
  media: coda: Fix reported H264 profile
  mtd: cfi_cmdset_0002: Move and rename chip_check/chip_ready/chip_good_for_write
  md: fix an incorrect NULL check in md_reload_sb
  md: fix an incorrect NULL check in does_sb_need_changing
  drm/bridge: analogix_dp: Grab runtime PM reference for DP-AUX
  drm/nouveau/clk: Fix an incorrect NULL check on list iterator
  drm/etnaviv: check for reaped mapping in etnaviv_iommu_unmap_gem
  drm/amdgpu/cs: make commands with 0 chunks illegal behaviour.
  scsi: ufs: qcom: Add a readl() to make sure ref_clk gets enabled
  scsi: dc395x: Fix a missing check on list iterator
  ocfs2: dlmfs: fix error handling of user_dlm_destroy_lock
  dlm: fix missing lkb refcount handling
  dlm: fix plock invalid read
  mm, compaction: fast_find_migrateblock() should return pfn in the target zone
  PCI: qcom: Fix unbalanced PHY init on probe errors
  PCI: qcom: Fix runtime PM imbalance on probe errors
  PCI/PM: Fix bridge_d3_blacklist[] Elo i2 overwrite of Gigabyte X299
  tracing: Fix potential double free in create_var_ref()
  ACPI: property: Release subnode properties with data nodes
  ext4: avoid cycles in directory h-tree
  ext4: verify dir block before splitting it
  ext4: fix bug_on in ext4_writepages
  ext4: fix warning in ext4_handle_inode_extension
  ext4: fix use-after-free in ext4_rename_dir_prepare
  netfilter: nf_tables: disallow non-stateful expression in sets earlier
  bfq: Track whether bfq_group is still online
  bfq: Update cgroup information before merging bio
  bfq: Split shared queues on move between cgroups
  efi: Do not import certificates from UEFI Secure Boot for T2 Macs
  fs-writeback: writeback_sb_inodes:Recalculate 'wrote' according skipped pages
  iwlwifi: mvm: fix assert 1F04 upon reconfig
  wifi: mac80211: fix use-after-free in chanctx code
  f2fs: fix fallocate to use file_modified to update permissions consistently
  f2fs: don't need inode lock for system hidden quota
  f2fs: fix deadloop in foreground GC
  f2fs: fix to clear dirty inode in f2fs_evict_inode()
  f2fs: fix to do sanity check on block address in f2fs_do_zero_range()
  f2fs: fix to avoid f2fs_bug_on() in dec_valid_node_count()
  perf jevents: Fix event syntax error caused by ExtSel
  perf c2c: Use stdio interface if slang is not supported
  iommu/amd: Increase timeout waiting for GA log enablement
  dmaengine: stm32-mdma: remove GISR1 register
  video: fbdev: clcdfb: Fix refcount leak in clcdfb_of_vram_setup
  NFSv4/pNFS: Do not fail I/O when we fail to allocate the pNFS layout
  NFS: Don't report errors from nfs_pageio_complete() more than once
  NFS: Do not report flush errors in nfs_write_end()
  NFS: Do not report EINTR/ERESTARTSYS as mapping errors
  i2c: at91: Initialize dma_buf in at91_twi_xfer()
  i2c: at91: use dma safe buffers
  iommu/mediatek: Add list_del in mtk_iommu_remove
  f2fs: fix dereference of stale list iterator after loop body
  Input: stmfts - do not leave device disabled in stmfts_input_open
  RDMA/hfi1: Prevent use of lock before it is initialized
  mailbox: forward the hrtimer if not queued and under a lock
  mfd: davinci_voicecodec: Fix possible null-ptr-deref davinci_vc_probe()
  powerpc/fsl_rio: Fix refcount leak in fsl_rio_setup
  macintosh: via-pmu and via-cuda need RTC_LIB
  powerpc/perf: Fix the threshold compare group constraint for power9
  powerpc/64: Only WARN if __pa()/__va() called with bad addresses
  Input: sparcspkr - fix refcount leak in bbc_beep_probe
  crypto: cryptd - Protect per-CPU resource by disabling BH.
  tty: fix deadlock caused by calling printk() under tty_port->lock
  PCI: imx6: Fix PERST# start-up sequence
  ipc/mqueue: use get_tree_nodev() in mqueue_get_tree()
  proc: fix dentry/inode overinstantiating under /proc/${pid}/net
  powerpc/4xx/cpm: Fix return value of __setup() handler
  powerpc/idle: Fix return value of __setup() handler
  powerpc/8xx: export 'cpm_setbrg' for modules
  dax: fix cache flush on PMD-mapped pages
  drivers/base/node.c: fix compaction sysfs file leak
  pinctrl: mvebu: Fix irq_of_parse_and_map() return value
  nvdimm: Allow overwrite in the presence of disabled dimms
  firmware: arm_scmi: Fix list protocols enumeration in the base protocol
  scsi: fcoe: Fix Wstringop-overflow warnings in fcoe_wwn_from_mac()
  mfd: ipaq-micro: Fix error check return value of platform_get_irq()
  powerpc/fadump: fix PT_LOAD segment for boot memory area
  arm: mediatek: select arch timer for mt7629
  crypto: marvell/cesa - ECB does not IV
  misc: ocxl: fix possible double free in ocxl_file_register_afu
  ARM: dts: bcm2835-rpi-b: Fix GPIO line names
  ARM: dts: bcm2837-rpi-3-b-plus: Fix GPIO line name of power LED
  ARM: dts: bcm2837-rpi-cm3-io3: Fix GPIO line names for SMPS I2C
  ARM: dts: bcm2835-rpi-zero-w: Fix GPIO line name for Wifi/BT
  can: xilinx_can: mark bit timing constants as const
  KVM: nVMX: Leave most VM-Exit info fields unmodified on failed VM-Entry
  PCI: rockchip: Fix find_first_zero_bit() limit
  PCI: cadence: Fix find_first_zero_bit() limit
  soc: qcom: smsm: Fix missing of_node_put() in smsm_parse_ipc
  soc: qcom: smp2p: Fix missing of_node_put() in smp2p_parse_ipc
  ARM: dts: suniv: F1C100: fix watchdog compatible
  arm64: dts: rockchip: Move drive-impedance-ohm to emmc phy on rk3399
  net/smc: postpone sk_refcnt increment in connect()
  rxrpc: Fix decision on when to generate an IDLE ACK
  rxrpc: Don't let ack.previousPacket regress
  rxrpc: Fix overlapping ACK accounting
  rxrpc: Don't try to resend the request if we're receiving the reply
  rxrpc: Fix listen() setting the bar too high for the prealloc rings
  NFC: hci: fix sleep in atomic context bugs in nfc_hci_hcp_message_tx
  ASoC: wm2000: fix missing clk_disable_unprepare() on error in wm2000_anc_transition()
  thermal/drivers/broadcom: Fix potential NULL dereference in sr_thermal_probe
  drm: msm: fix possible memory leak in mdp5_crtc_cursor_set()
  drm/msm/a6xx: Fix refcount leak in a6xx_gpu_init
  ext4: reject the 'commit' option on ext2 filesystems
  media: ov7670: remove ov7670_power_off from ov7670_remove
  sctp: read sk->sk_bound_dev_if once in sctp_rcv()
  m68k: math-emu: Fix dependencies of math emulation support
  Bluetooth: fix dangling sco_conn and use-after-free in sco_sock_timeout
  media: vsp1: Fix offset calculation for plane cropping
  media: pvrusb2: fix array-index-out-of-bounds in pvr2_i2c_core_init
  media: exynos4-is: Change clk_disable to clk_disable_unprepare
  media: st-delta: Fix PM disable depth imbalance in delta_probe
  media: aspeed: Fix an error handling path in aspeed_video_probe()
  scripts/faddr2line: Fix overlapping text section failures
  regulator: pfuze100: Fix refcount leak in pfuze_parse_regulators_dt
  ASoC: mxs-saif: Fix refcount leak in mxs_saif_probe
  ASoC: fsl: Fix refcount leak in imx_sgtl5000_probe
  perf/amd/ibs: Use interrupt regs ip for stack unwinding
  Revert "cpufreq: Fix possible race in cpufreq online error path"
  iomap: iomap_write_failed fix
  media: uvcvideo: Fix missing check to determine if element is found in list
  drm/msm: return an error pointer in msm_gem_prime_get_sg_table()
  drm/msm/mdp5: Return error code in mdp5_mixer_release when deadlock is detected
  drm/msm/mdp5: Return error code in mdp5_pipe_release when deadlock is detected
  regulator: core: Fix enable_count imbalance with EXCLUSIVE_GET
  x86/mm: Cleanup the control_va_addr_alignment() __setup handler
  irqchip/aspeed-i2c-ic: Fix irq_of_parse_and_map() return value
  irqchip/exiu: Fix acknowledgment of edge triggered interrupts
  x86: Fix return value of __setup handlers
  virtio_blk: fix the discard_granularity and discard_alignment queue limits
  drm/rockchip: vop: fix possible null-ptr-deref in vop_bind()
  drm/msm/hdmi: fix error check return value of irq_of_parse_and_map()
  drm/msm/hdmi: check return value after calling platform_get_resource_byname()
  drm/msm/dsi: fix error checks and return values for DSI xmit functions
  drm/msm/disp/dpu1: set vbif hw config to NULL to avoid use after memory free during pm runtime resume
  perf tools: Add missing headers needed by util/data.h
  ASoC: rk3328: fix disabling mclk on pclk probe failure
  x86/speculation: Add missing prototype for unpriv_ebpf_notify()
  x86/pm: Fix false positive kmemleak report in msr_build_context()
  scsi: ufs: core: Exclude UECxx from SFR dump list
  of: overlay: do not break notify on NOTIFY_{OK|STOP}
  fsnotify: fix wrong lockdep annotations
  inotify: show inotify mask flags in proc fdinfo
  ath9k_htc: fix potential out of bounds access with invalid rxstatus->rs_keyix
  cpufreq: Fix possible race in cpufreq online error path
  spi: img-spfi: Fix pm_runtime_get_sync() error checking
  sched/fair: Fix cfs_rq_clock_pelt() for throttled cfs_rq
  drm/bridge: Fix error handling in analogix_dp_probe
  HID: elan: Fix potential double free in elan_input_configured
  HID: hid-led: fix maximum brightness for Dream Cheeky
  drbd: fix duplicate array initializer
  efi: Add missing prototype for efi_capsule_setup_info
  NFC: NULL out the dev->rfkill to prevent UAF
  spi: spi-ti-qspi: Fix return value handling of wait_for_completion_timeout
  drm: mali-dp: potential dereference of null pointer
  drm/komeda: Fix an undefined behavior bug in komeda_plane_add()
  nl80211: show SSID for P2P_GO interfaces
  bpf: Fix excessive memory allocation in stack_map_alloc()
  drm/vc4: txp: Force alpha to be 0xff if it's disabled
  drm/vc4: txp: Don't set TXP_VSTART_AT_EOF
  drm/mediatek: Fix mtk_cec_mask()
  x86/delay: Fix the wrong asm constraint in delay_loop()
  ASoC: mediatek: Fix missing of_node_put in mt2701_wm8960_machine_probe
  ASoC: mediatek: Fix error handling in mt8173_max98090_dev_probe
  drm/bridge: adv7511: clean up CEC adapter when probe fails
  drm/edid: fix invalid EDID extension block filtering
  ath9k: fix ar9003_get_eepmisc
  drm: fix EDID struct for old ARM OABI format
  RDMA/hfi1: Prevent panic when SDMA is disabled
  powerpc/iommu: Add missing of_node_put in iommu_init_early_dart
  macintosh/via-pmu: Fix build failure when CONFIG_INPUT is disabled
  powerpc/powernv: fix missing of_node_put in uv_init()
  powerpc/xics: fix refcount leak in icp_opal_init()
  tracing: incorrect isolate_mote_t cast in mm_vmscan_lru_isolate
  PCI: Avoid pci_dev_lock() AB/BA deadlock with sriov_numvfs_store()
  ARM: hisi: Add missing of_node_put after of_find_compatible_node
  ARM: dts: exynos: add atmel,24c128 fallback to Samsung EEPROM
  ARM: versatile: Add missing of_node_put in dcscb_init
  fat: add ratelimit to fat*_ent_bread()
  powerpc/fadump: Fix fadump to work with a different endian capture kernel
  ARM: OMAP1: clock: Fix UART rate reporting algorithm
  fs: jfs: fix possible NULL pointer dereference in dbFree()
  PM / devfreq: rk3399_dmc: Disable edev on remove()
  ARM: dts: ox820: align interrupt controller node name with dtschema
  IB/rdmavt: add missing locks in rvt_ruc_loopback
  selftests/bpf: fix btf_dump/btf_dump due to recent clang change
  eth: tg3: silence the GCC 12 array-bounds warning
  rxrpc: Return an error to sendmsg if call failed
  hwmon: Make chip parameter for with_info API mandatory
  ASoC: max98357a: remove dependency on GPIOLIB
  media: exynos4-is: Fix compile warning
  net: phy: micrel: Allow probing without .driver_data
  nbd: Fix hung on disconnect request if socket is closed before
  ASoC: rt5645: Fix errorenous cleanup order
  nvme-pci: fix a NULL pointer dereference in nvme_alloc_admin_tags
  openrisc: start CPU timer early in boot
  media: cec-adap.c: fix is_configuring state
  media: coda: limit frame interval enumeration to supported encoder frame sizes
  rtlwifi: Use pr_warn instead of WARN_ONCE
  ipmi: Fix pr_fmt to avoid compilation issues
  ipmi:ssif: Check for NULL msg when handling events and messages
  ACPI: PM: Block ASUS B1400CEAE from suspend to idle by default
  dma-debug: change allocation mode from GFP_NOWAIT to GFP_ATIOMIC
  spi: stm32-qspi: Fix wait_cmd timeout in APM mode
  s390/preempt: disable __preempt_count_add() optimization for PROFILE_ALL_BRANCHES
  ASoC: tscs454: Add endianness flag in snd_soc_component_driver
  HID: bigben: fix slab-out-of-bounds Write in bigben_probe
  drm/amdgpu/ucode: Remove firmware load type check in amdgpu_ucode_free_bo
  mlxsw: spectrum_dcb: Do not warn about priority changes
  ASoC: dapm: Don't fold register value changes into notifications
  net/mlx5: fs, delete the FTE when there are no rules attached to it
  ipv6: Don't send rs packets to the interface of ARPHRD_TUNNEL
  drm: msm: fix error check return value of irq_of_parse_and_map()
  arm64: compat: Do not treat syscall number as ESR_ELx for a bad syscall
  drm/amd/pm: fix the compile warning
  drm/plane: Move range check for format_count earlier
  scsi: megaraid: Fix error check return value of register_chrdev()
  mmc: jz4740: Apply DMA engine limits to maximum segment size
  md/bitmap: don't set sb values if can't pass sanity check
  media: cx25821: Fix the warning when removing the module
  media: pci: cx23885: Fix the error handling in cx23885_initdev()
  media: venus: hfi: avoid null dereference in deinit
  ath9k: fix QCA9561 PA bias level
  drm/amd/pm: fix double free in si_parse_power_table()
  tools/power turbostat: fix ICX DRAM power numbers
  spi: spi-rspi: Remove setting {src,dst}_{addr,addr_width} based on DMA direction
  ALSA: jack: Access input_dev under mutex
  drm/komeda: return early if drm_universal_plane_init() fails.
  ACPICA: Avoid cache flush inside virtual machines
  fbcon: Consistently protect deferred_takeover with console_lock()
  ipv6: fix locking issues with loops over idev->addr_list
  ipw2x00: Fix potential NULL dereference in libipw_xmit()
  b43: Fix assigning negative value to unsigned variable
  b43legacy: Fix assigning negative value to unsigned variable
  mwifiex: add mutex lock for call in mwifiex_dfs_chan_sw_work_queue
  drm/virtio: fix NULL pointer dereference in virtio_gpu_conn_get_modes
  btrfs: repair super block num_devices automatically
  btrfs: add "0x" prefix for unsupported optional features
  ptrace: Reimplement PTRACE_KILL by always sending SIGKILL
  ptrace/xtensa: Replace PT_SINGLESTEP with TIF_SINGLESTEP
  ptrace/um: Replace PT_DTRACE with TIF_SINGLESTEP
  perf/x86/intel: Fix event constraints for ICL
  usb: core: hcd: Add support for deferring roothub registration
  USB: new quirk for Dell Gen 2 devices
  USB: serial: option: add Quectel BG95 modem
  ALSA: hda/realtek - Fix microphone noise on ASUS TUF B550M-PLUS
  binfmt_flat: do not stop relocating GOT entries prematurely on riscv

 Conflicts:
	Documentation/devicetree/bindings/dma/allwinner,sun50i-a64-dma.yaml
	Documentation/devicetree/bindings~HEAD
	drivers/char/Kconfig
	drivers/mmc/core/block.c
	kernel/sysctl.c

Change-Id: If11e1865055bfb94b3268960268c88c3dfc032c3
2022-11-09 19:53:28 +02:00
Greg Kroah-Hartman
836d95bfdc This is the 5.4.207 stable release
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAmLZohoACgkQONu9yGCS
 aT4D3RAA1Je6ingEk1e/IMmfWhNu/0AOrULMbrNBdM/WDOlIQLNBchhMa81UXEh7
 OJzt+NyBcuV4x5UdXj1fK3erJXA7tKM3s7BGY7OcXPDMpZjf9uOUV2Tg1Jw1jDbW
 TV7lnWv1YA7ze3eOx6qoR9sNPh4kYiW5DG2ivY8JMblKEz5EPCdvyPSSW+s8kmpg
 ZdyJ0pa3fnS0Di421DzJ+7R1U2t4C1eAz1FkngAyPM47GzwJoJxgcP4Q8syBmwGY
 qylUnrLTBMRtpngayaP15tQtYckGTbsnTUNCTjoW7BhbABkWysc2aVnCYZDLqBck
 C4XjEfBMLByICokuab0ewrzeVzvvHaY31hnhf33hYn6pgIoS5oy4T3mN7T8yEJz9
 zsr+unBioZFiIOqiVgu5A2Rwn3+1x8qOmLZ/x35jqZQCmh0ndlmHUhkdjl3y/68S
 XWvP4zpYBAR7QlW3WsGtFeI9Kbeh6y2tH0J79N5CjctAZFAvUaZd3cSfh3Vck02/
 7Wo9vs5zV8ZvRkdRWEawkrfe/PUImnDmvkv56nTH79bI7qIlpOU6kS6gy0sDzdGl
 YRKv4+jwE9/hJAcWW5S/U3wbfZMxMA6wdt8QcWsn0pXs1WFUQgWeNuyO2HNodff3
 jlp25lEi3C3NSUycmm9IjuG2241hPDYnhqeX0Q4B5ciPHCD4w3o=
 =KtMr
 -----END PGP SIGNATURE-----

Merge 5.4.207 into android11-5.4-lts

Changes in 5.4.207
	ALSA: hda - Add fixup for Dell Latitidue E5430
	ALSA: hda/conexant: Apply quirk for another HP ProDesk 600 G3 model
	ALSA: hda/realtek - Fix headset mic problem for a HP machine with alc671
	ALSA: hda/realtek - Fix headset mic problem for a HP machine with alc221
	ALSA: hda/realtek - Enable the headset-mic on a Xiaomi's laptop
	xen/netback: avoid entering xenvif_rx_next_skb() with an empty rx queue
	tracing/histograms: Fix memory leak problem
	net: sock: tracing: Fix sock_exceed_buf_limit not to dereference stale pointer
	ip: fix dflt addr selection for connected nexthop
	ARM: 9213/1: Print message about disabled Spectre workarounds only once
	ARM: 9214/1: alignment: advance IT state after emulating Thumb instruction
	wifi: mac80211: fix queue selection for mesh/OCB interfaces
	cgroup: Use separate src/dst nodes when preloading css_sets for migration
	drm/panfrost: Fix shrinker list corruption by madvise IOCTL
	nilfs2: fix incorrect masking of permission flags for symlinks
	Revert "evm: Fix memleak in init_desc"
	sched/rt: Disable RT_RUNTIME_SHARE by default
	ext4: fix race condition between ext4_write and ext4_convert_inline_data
	ARM: dts: imx6qdl-ts7970: Fix ngpio typo and count
	ARM: 9209/1: Spectre-BHB: avoid pr_info() every time a CPU comes out of idle
	ARM: 9210/1: Mark the FDT_FIXED sections as shareable
	drm/i915: fix a possible refcount leak in intel_dp_add_mst_connector()
	ima: Fix a potential integer overflow in ima_appraise_measurement
	ASoC: sgtl5000: Fix noise on shutdown/remove
	net: stmmac: dwc-qos: Disable split header for Tegra194
	inetpeer: Fix data-races around sysctl.
	net: Fix data-races around sysctl_mem.
	cipso: Fix data-races around sysctl.
	icmp: Fix data-races around sysctl.
	ipv4: Fix a data-race around sysctl_fib_sync_mem.
	ARM: dts: at91: sama5d2: Fix typo in i2s1 node
	ARM: dts: sunxi: Fix SPI NOR campatible on Orange Pi Zero
	drm/i915/gt: Serialize TLB invalidates with GT resets
	icmp: Fix a data-race around sysctl_icmp_ratelimit.
	icmp: Fix a data-race around sysctl_icmp_ratemask.
	raw: Fix a data-race around sysctl_raw_l3mdev_accept.
	ipv4: Fix data-races around sysctl_ip_dynaddr.
	net: ftgmac100: Hold reference returned by of_get_child_by_name()
	sfc: fix use after free when disabling sriov
	seg6: fix skb checksum evaluation in SRH encapsulation/insertion
	seg6: fix skb checksum in SRv6 End.B6 and End.B6.Encaps behaviors
	seg6: bpf: fix skb checksum in bpf_push_seg6_encap()
	sfc: fix kernel panic when creating VF
	mm: sysctl: fix missing numa_stat when !CONFIG_HUGETLB_PAGE
	virtio_mmio: Add missing PM calls to freeze/restore
	virtio_mmio: Restore guest page size on resume
	netfilter: br_netfilter: do not skip all hooks with 0 priority
	cpufreq: pmac32-cpufreq: Fix refcount leak bug
	platform/x86: hp-wmi: Ignore Sanitization Mode event
	net: tipc: fix possible refcount leak in tipc_sk_create()
	NFC: nxp-nci: don't print header length mismatch on i2c error
	nvme: fix regression when disconnect a recovering ctrl
	net: sfp: fix memory leak in sfp_probe()
	ASoC: ops: Fix off by one in range control validation
	ASoC: wm5110: Fix DRE control
	ASoC: cs47l15: Fix event generation for low power mux control
	ASoC: madera: Fix event generation for OUT1 demux
	ASoC: madera: Fix event generation for rate controls
	irqchip: or1k-pic: Undefine mask_ack for level triggered hardware
	x86: Clear .brk area at early boot
	soc: ixp4xx/npe: Fix unused match warning
	ARM: dts: stm32: use the correct clock source for CEC on stm32mp151
	signal handling: don't use BUG_ON() for debugging
	USB: serial: ftdi_sio: add Belimo device ids
	usb: typec: add missing uevent when partner support PD
	usb: dwc3: gadget: Fix event pending check
	tty: serial: samsung_tty: set dma burst_size to 1
	serial: 8250: fix return error code in serial8250_request_std_resource()
	serial: stm32: Clear prev values before setting RTS delays
	serial: pl011: UPSTAT_AUTORTS requires .throttle/unthrottle
	can: m_can: m_can_tx_handler(): fix use after free of skb
	Linux 5.4.207

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: Ica75b787bd813b75db73739df2a831dbb4958668
2022-07-23 13:54:07 +02:00
Daniel Bristot de Oliveira
423f269500 sched/rt: Disable RT_RUNTIME_SHARE by default
commit 2586af1ac187f6b3a50930a4e33497074e81762d upstream.

The RT_RUNTIME_SHARE sched feature enables the sharing of rt_runtime
between CPUs, allowing a CPU to run a real-time task up to 100% of the
time while leaving more space for non-real-time tasks to run on the CPU
that lend rt_runtime.

The problem is that a CPU can easily borrow enough rt_runtime to allow
a spinning rt-task to run forever, starving per-cpu tasks like kworkers,
which are non-real-time by design.

This patch disables RT_RUNTIME_SHARE by default, avoiding this problem.
The feature will still be present for users that want to enable it,
though.

Signed-off-by: Daniel Bristot de Oliveira <bristot@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Wei Wang <wvw@google.com>
Link: https://lkml.kernel.org/r/b776ab46817e3db5d8ef79175fa0d71073c051c7.1600697903.git.bristot@redhat.com
Signed-off-by: Mark-PK Tsai <mark-pk.tsai@mediatek.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-07-21 20:59:18 +02:00
Blagovest Kolenichev
d9a7f5879a Merge android11-5.4.61+ (874de1d) into msm-5.4
* refs/heads/tmp-874de1d:
  ANDROID: GKI: enable CONFIG_CPU_FREQ_STAT and more thermal configs
  ANDROID: ABI: Update allowed list for QCOM
  ANDROID: sched: add restrict vendor hook to modify load balance behavior
  ANDROID: GKI: Update abi_gki_aarch64_oneplus
  ANDROID: scs: use vmapped shadow stacks by default
  ANDROID: ABI: update allowed list for QCOM
  UPSTREAM: sched/fair/util_est: Implement faster ramp-up EWMA on utilization increases
  ANDROID: GKI: Update abi_gki_aarch64_exynos
  ANDROID: kbuild: disable GCOV with CFI
  ANDROID: GKI: add built-in PCIE_DW_PLAT_EP
  ANDROID: PCI: dwc: export symbols for ep driver
  ANDROID: recordmcount: avoid STT_FILE as base for mcount offset relocation
  ANDROID: iommu: Enable CONFIG_IOMMU_IO_PGTABLE_ARMV7S
  BACKPORT: FROMLIST: iommu/io-pgtable-arm-v7s: Quad lvl1 pgtable for MediaTek
  BACKPORT: FROMLIST: iommu/io-pgtable-arm-v7s: Add cfg as a param in some macros
  BACKPORT: FROMLIST: iommu/io-pgtable-arm-v7s: Extend PA34 for MediaTek
  BACKPORT: FROMLIST: iommu/io-pgtable-arm-v7s: Use ias to check the valid iova in unmap
  ANDROID: ABI: Update allowed list for QCOM
  ANDROID: modules: fix suspicious rcu usage

 Conflicts:
	kernel/sched/fair.c
	kernel/sched/features.h

Change-Id: I5d8dc058a0aca2b90ca702f18f71f663af9774fc
Signed-off-by: Blagovest Kolenichev <bkolenichev@codeaurora.org>
2020-10-29 15:36:09 -07:00
Patrick Bellasi
d82457153c UPSTREAM: sched/fair/util_est: Implement faster ramp-up EWMA on utilization increases
The estimated utilization for a task:

   util_est = max(util_avg, est.enqueue, est.ewma)

is defined based on:

 - util_avg: the PELT defined utilization
 - est.enqueued: the util_avg at the end of the last activation
 - est.ewma:     a exponential moving average on the est.enqueued samples

According to this definition, when a task suddenly changes its bandwidth
requirements from small to big, the EWMA will need to collect multiple
samples before converging up to track the new big utilization.

This slow convergence towards bigger utilization values is not
aligned to the default scheduler behavior, which is to optimize for
performance. Moreover, the est.ewma component fails to compensate for
temporarely utilization drops which spans just few est.enqueued samples.

To let util_est do a better job in the scenario depicted above, change
its definition by making util_est directly follow upward motion and
only decay the est.ewma on downward.

Bug: 120440300
Signed-off-by: Patrick Bellasi <patrick.bellasi@matbug.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Vincent Guittot <vincent.guittot@linaro.org>
Cc: Dietmar Eggemann <dietmar.eggemann@arm.com>
Cc: Douglas Raillard <douglas.raillard@arm.com>
Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Quentin Perret <qperret@google.com>
Cc: Rafael J . Wysocki <rafael.j.wysocki@intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/20191023205630.14469-1-patrick.bellasi@matbug.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
(cherry picked from commit b8c96361402aa3e74ad48ceef18aed99153d8da8)
Signed-off-by: Qais Yousef <qais.yousef@arm.com>
Change-Id: Iaa75dec917748375d9043c3ebf21bf272299c8dd
2020-10-26 12:24:35 +00:00
Shaleen Agrawal
dc8050ca5d sched: fair: Improve the Scheduler
This change is for general scheduler improvement.

Change-Id: I92c13d8e6681adb2655a1dae1b5c92fd0fe32166
Signed-off-by: Shaleen Agrawal <shalagra@codeaurora.org>
2020-04-28 18:57:12 -07:00
Quentin Perret
9e864eaf88 ANDROID: cpufreq/schedutil: Select frequency using util_avg for RT
Schedutil always requests max frequency whenever a RT task is running.
Now that we have a better estimate of the utilization of RT runqueues,
it is possible to make a less conservative decision and scale frequency
according to the needs of the RT tasks.

To do so, protect the RT-go-to-max code with a new sched_feature.
The sched_feature is disabled by default, hence favoring energy savings
as required in mobile environments.

Signed-off-by: Quentin Perret <quentin.perret@arm.com>
Change-Id: Ic9f01c8703d4f843addaa0d684012a422fe9f3b8
Git-commit: c74d264106b150fad8949a985514d127c1e117c5
Git-repo: https://android.googlesource.com/kernel/common/
[satyap@codeaurora.org: fix trivial merge conflicts]
Signed-off-by: Satya Durga Srinivasu Prabhala <satyap@codeaurora.org>
2020-01-27 19:48:32 -08:00
Satya Durga Srinivasu Prabhala
fb1332306e sched/fair: Add snapshot of placement changes
This snapshot is taken from msm-4.19 as of commit 22d004c08a27
("sched/fair: Improve the scheduler").

Change-Id: I8fc95a4a4650de0dc36bd979d374b9335f6af774
Signed-off-by: Satya Durga Srinivasu Prabhala <satyap@codeaurora.org>
2019-12-10 12:45:41 -08:00
Syed Rameez Mustafa
61f31539cf sched: turn off the TTWU_QUEUE feature
While the feature TTWU_QUEUE has the advantage of reducing cache
bouncing of runqueue locks, it has the side effect that runqueue
statistics are not updated until the remote CPU has a chance to
enqueue the task. Since there is no upper bound on the amount of
time it can take the remote CPU to enqueue the task, several
sequential wakeups can result in suboptimal task placement based
on the stale statistics. Turn off the feature as the cost of
sub-optimal placement is much higher than the cost of cache bouncing
spinlocks for msm based systems.

Change-Id: I0b85c0225237b2bc44f54934769f5e3750c0f3d6
Signed-off-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org>
[satyap@codeaurora.org: fix trivial merge conflict]
Signed-off-by: Satya Durga Srinivasu Prabhala <satyap@codeaurora.org>
2019-12-04 16:27:54 -08:00
Dietmar Eggemann
1c1b8a7b03 sched/fair: Replace source_load() & target_load() with weighted_cpuload()
With LB_BIAS disabled, source_load() & target_load() return
weighted_cpuload(). Replace both with calls to weighted_cpuload().

The function to obtain the load index (sd->*_idx) for an sd,
get_sd_load_idx(), can be removed as well.

Finally, get rid of the sched feature LB_BIAS.

Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Rik van Riel <riel@surriel.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Morten Rasmussen <morten.rasmussen@arm.com>
Cc: Patrick Bellasi <patrick.bellasi@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Quentin Perret <quentin.perret@arm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Valentin Schneider <valentin.schneider@arm.com>
Cc: Vincent Guittot <vincent.guittot@linaro.org>
Link: https://lkml.kernel.org/r/20190527062116.11512-3-dietmar.eggemann@arm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-06-03 11:49:39 +02:00
Dietmar Eggemann
fdf5f315d5 sched/fair: Disable LB_BIAS by default
LB_BIAS allows the adjustment on how conservative load should be
balanced.

The rq->cpu_load[idx] array is used for this functionality. It contains
weighted CPU load decayed average values over different intervals
(idx = 1..4). Idx = 0 is the weighted CPU load itself.

The values are updated during scheduler_tick, before idle balance and at
nohz exit.

There are 5 different types of idx's per sched domain (sd). Each of them
is used to index into the rq->cpu_load[idx] array in a specific scenario
(busy, idle and newidle for load balancing, forkexec for wake-up
slow-path load balancing and wake for affine wakeup based on weight).
Only the sd idx's for busy and idle load balancing are set to 2,3 or 1,2
respectively. All the other sd idx's are set to 0.

Conservative load balancing is achieved for sd idx's >= 1 by using the
min/max (source_load()/target_load()) value between the current weighted
CPU load and the rq->cpu_load[sd idx -1] for the busiest(idlest)/local
CPU load in load balancing or vice versa in the wake-up slow-path load
balancing.
There is no conservative balancing for sd idx = 0 since only current
weighted CPU load is used in this case.

It is very likely that LB_BIAS' influence on load balancing can be
neglected (see test results below). This is further supported by:

(1) Weighted CPU load today is by itself a decayed average value (PELT)
    (cfs_rq->avg->runnable_load_avg) and not the instantaneous load
    (rq->load.weight) it was when LB_BIAS was introduced.

(2) Sd imbalance_pct is used for CPU_NEWLY_IDLE and CPU_NOT_IDLE (relate
    to sd's newidle and busy idx) in find_busiest_group() when comparing
    busiest and local avg load to make load balancing even more
    conservative.

(3) The sd forkexec and newidle idx are always set to 0 so there is no
    adjustment on how conservatively load balancing is done here.

(4) Affine wakeup based on weight (wake_affine_weight()) will not be
    impacted since the sd wake idx is always set to 0.

Let's disable LB_BIAS by default for a few kernel releases to make sure
that no workload and no scheduler topology is affected. The benefit of
being able to remove the LB_BIAS dependency from source_load() and
target_load() is that the entire rq->cpu_load[idx] code could be removed
in this case.

It is really hard to say if there is no regression w/o testing this with
a lot of different workloads on a lot of different platforms, especially
NUMA machines.
The following 104 LKP (Linux Kernel Performance) tests were run by the
0-Day guys mostly on multi-socket hosts with a larger number of logical
cpus (88, 192).
The base for the test was commit b3dae109fa ("sched/swait: Rename to
exclusive") (tip/sched/core v4.18-rc1).
Only 2 out of the 104 tests had a significant change in one of the
metrics (fsmark/1x-1t-1HDD-btrfs-nfsv4-4M-60G-NoSync-performance +7%
files_per_sec, unixbench/300s-100%-syscall-performance -11% score).
Tests which showed a change in one of the metrics are marked with a '*'
and this change is listed as well.

(a) lkp-bdw-ep3:
      88 threads Intel(R) Xeon(R) CPU E5-2699 v4 @ 2.20GHz 64G

    dd-write/10m-1HDD-cfq-btrfs-100dd-performance
    fsmark/1x-1t-1HDD-xfs-nfsv4-4M-60G-NoSync-performance
  * fsmark/1x-1t-1HDD-btrfs-nfsv4-4M-60G-NoSync-performance
      7.50  7%  8.00  ±  6%  fsmark.files_per_sec
    fsmark/1x-1t-1HDD-btrfs-nfsv4-4M-60G-fsyncBeforeClose-performance
    fsmark/1x-1t-1HDD-btrfs-4M-60G-NoSync-performance
    fsmark/1x-1t-1HDD-btrfs-4M-60G-fsyncBeforeClose-performance
    kbuild/300s-50%-vmlinux_prereq-performance
    kbuild/300s-200%-vmlinux_prereq-performance
    kbuild/300s-50%-vmlinux_prereq-performance-1HDD-ext4
    kbuild/300s-200%-vmlinux_prereq-performance-1HDD-ext4

(b) lkp-skl-4sp1:
      192 threads Intel(R) Xeon(R) Platinum 8160 768G

    dbench/100%-performance
    ebizzy/200%-100x-10s-performance
    hackbench/1600%-process-pipe-performance
    iperf/300s-cs-localhost-tcp-performance
    iperf/300s-cs-localhost-udp-performance
    perf-bench-numa-mem/2t-300M-performance
    perf-bench-sched-pipe/10000000ops-process-performance
    perf-bench-sched-pipe/10000000ops-threads-performance
    schbench/2-16-300-30000-30000-performance
    tbench/100%-cs-localhost-performance

(c) lkp-bdw-ep6:
      88 threads Intel(R) Xeon(R) CPU E5-2699 v4 @ 2.20GHz 128G

    stress-ng/100%-60s-pipe-performance
    unixbench/300s-1-whetstone-double-performance
    unixbench/300s-1-shell1-performance
    unixbench/300s-1-shell8-performance
    unixbench/300s-1-pipe-performance
  * unixbench/300s-1-context1-performance
      312  315  unixbench.score
    unixbench/300s-1-spawn-performance
    unixbench/300s-1-syscall-performance
    unixbench/300s-1-dhry2reg-performance
    unixbench/300s-1-fstime-performance
    unixbench/300s-1-fsbuffer-performance
    unixbench/300s-1-fsdisk-performance
    unixbench/300s-100%-whetstone-double-performance
    unixbench/300s-100%-shell1-performance
    unixbench/300s-100%-shell8-performance
    unixbench/300s-100%-pipe-performance
    unixbench/300s-100%-context1-performance
    unixbench/300s-100%-spawn-performance
  * unixbench/300s-100%-syscall-performance
      3571  ±  3%  -11%  3183  ±  4%  unixbench.score
    unixbench/300s-100%-dhry2reg-performance
    unixbench/300s-100%-fstime-performance
    unixbench/300s-100%-fsbuffer-performance
    unixbench/300s-100%-fsdisk-performance
    unixbench/300s-1-execl-performance
    unixbench/300s-100%-execl-performance
  * will-it-scale/brk1-performance
      365004  360387  will-it-scale.per_thread_ops
  * will-it-scale/dup1-performance
      432401  437596  will-it-scale.per_thread_ops
    will-it-scale/eventfd1-performance
    will-it-scale/futex1-performance
    will-it-scale/futex2-performance
    will-it-scale/futex3-performance
    will-it-scale/futex4-performance
    will-it-scale/getppid1-performance
    will-it-scale/lock1-performance
    will-it-scale/lseek1-performance
    will-it-scale/lseek2-performance
  * will-it-scale/malloc1-performance
      47025  45817  will-it-scale.per_thread_ops
      77499  76529  will-it-scale.per_process_ops
    will-it-scale/malloc2-performance
  * will-it-scale/mmap1-performance
      123399  120815  will-it-scale.per_thread_ops
      152219  149833  will-it-scale.per_process_ops
  * will-it-scale/mmap2-performance
      107327  104714  will-it-scale.per_thread_ops
      136405  133765  will-it-scale.per_process_ops
    will-it-scale/open1-performance
  * will-it-scale/open2-performance
      171570  168805  will-it-scale.per_thread_ops
      532644  526202  will-it-scale.per_process_ops
    will-it-scale/page_fault1-performance
    will-it-scale/page_fault2-performance
    will-it-scale/page_fault3-performance
    will-it-scale/pipe1-performance
    will-it-scale/poll1-performance
  * will-it-scale/poll2-performance
      176134  172848  will-it-scale.per_thread_ops
      281361  275053  will-it-scale.per_process_ops
    will-it-scale/posix_semaphore1-performance
    will-it-scale/pread1-performance
    will-it-scale/pread2-performance
    will-it-scale/pread3-performance
    will-it-scale/pthread_mutex1-performance
    will-it-scale/pthread_mutex2-performance
    will-it-scale/pwrite1-performance
    will-it-scale/pwrite2-performance
    will-it-scale/pwrite3-performance
  * will-it-scale/read1-performance
      1190563  1174833  will-it-scale.per_thread_ops
  * will-it-scale/read2-performance
      1105369  1080427  will-it-scale.per_thread_ops
    will-it-scale/readseek1-performance
  * will-it-scale/readseek2-performance
      261818  259040  will-it-scale.per_thread_ops
    will-it-scale/readseek3-performance
  * will-it-scale/sched_yield-performance
      2408059  2382034  will-it-scale.per_thread_ops
    will-it-scale/signal1-performance
    will-it-scale/unix1-performance
    will-it-scale/unlink1-performance
    will-it-scale/unlink2-performance
  * will-it-scale/write1-performance
      976701  961588  will-it-scale.per_thread_ops
  * will-it-scale/writeseek1-performance
      831898  822448  will-it-scale.per_thread_ops
  * will-it-scale/writeseek2-performance
      228248  225065  will-it-scale.per_thread_ops
  * will-it-scale/writeseek3-performance
      226670  224058  will-it-scale.per_thread_ops
    will-it-scale/context_switch1-performance
    aim7/performance-fork_test-2000
  * aim7/performance-brk_test-3000
      74869  76676  aim7.jobs-per-min
    aim7/performance-disk_cp-3000
    aim7/performance-disk_rd-3000
    aim7/performance-sieve-3000
    aim7/performance-page_test-3000
    aim7/performance-creat-clo-3000
    aim7/performance-mem_rtns_1-8000
    aim7/performance-disk_wrt-8000
    aim7/performance-pipe_cpy-8000
    aim7/performance-ram_copy-8000

(d) lkp-avoton3:
      8 threads Intel(R) Atom(TM) CPU C2750 @ 2.40GHz 16G

    netperf/ipv4-900s-200%-cs-localhost-TCP_STREAM-performance

Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Fengguang Wu <fengguang.wu@intel.com>
Cc: Li Zhijian <zhijianx.li@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20180809135753.21077-1-dietmar.eggemann@arm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-10-02 09:45:01 +02:00
Patrick Bellasi
d519329f72 sched/fair: Update util_est only on util_avg updates
The estimated utilization of a task is currently updated every time the
task is dequeued. However, to keep overheads under control, PELT signals
are effectively updated at maximum once every 1ms.

Thus, for really short running tasks, it can happen that their util_avg
value has not been updates since their last enqueue.  If such tasks are
also frequently running tasks (e.g. the kind of workload generated by
hackbench) it can also happen that their util_avg is updated only every
few activations.

This means that updating util_est at every dequeue potentially introduces
not necessary overheads and it's also conceptually wrong if the util_avg
signal has never been updated during a task activation.

Let's introduce a throttling mechanism on task's util_est updates
to sync them with util_avg updates. To make the solution memory
efficient, both in terms of space and load/store operations, we encode a
synchronization flag into the LSB of util_est.enqueued.
This makes util_est an even values only metric, which is still
considered good enough for its purpose.
The synchronization bit is (re)set by __update_load_avg_se() once the
PELT signal of a task has been updated during its last activation.

Such a throttling mechanism allows to keep under control util_est
overheads in the wakeup hot path, thus making it a suitable mechanism
which can be enabled also on high-intensity workload systems.
Thus, this now switches on by default the estimation utilization
scheduler feature.

Suggested-by: Chris Redpath <chris.redpath@arm.com>
Signed-off-by: Patrick Bellasi <patrick.bellasi@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Dietmar Eggemann <dietmar.eggemann@arm.com>
Cc: Joel Fernandes <joelaf@google.com>
Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Morten Rasmussen <morten.rasmussen@arm.com>
Cc: Paul Turner <pjt@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rafael J . Wysocki <rafael.j.wysocki@intel.com>
Cc: Steve Muckle <smuckle@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Todd Kjos <tkjos@android.com>
Cc: Vincent Guittot <vincent.guittot@linaro.org>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Link: http://lkml.kernel.org/r/20180309095245.11071-5-patrick.bellasi@arm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-03-20 08:11:09 +01:00
Patrick Bellasi
7f65ea42eb sched/fair: Add util_est on top of PELT
The util_avg signal computed by PELT is too variable for some use-cases.
For example, a big task waking up after a long sleep period will have its
utilization almost completely decayed. This introduces some latency before
schedutil will be able to pick the best frequency to run a task.

The same issue can affect task placement. Indeed, since the task
utilization is already decayed at wakeup, when the task is enqueued in a
CPU, this can result in a CPU running a big task as being temporarily
represented as being almost empty. This leads to a race condition where
other tasks can be potentially allocated on a CPU which just started to run
a big task which slept for a relatively long period.

Moreover, the PELT utilization of a task can be updated every [ms], thus
making it a continuously changing value for certain longer running
tasks. This means that the instantaneous PELT utilization of a RUNNING
task is not really meaningful to properly support scheduler decisions.

For all these reasons, a more stable signal can do a better job of
representing the expected/estimated utilization of a task/cfs_rq.
Such a signal can be easily created on top of PELT by still using it as
an estimator which produces values to be aggregated on meaningful
events.

This patch adds a simple implementation of util_est, a new signal built on
top of PELT's util_avg where:

    util_est(task) = max(task::util_avg, f(task::util_avg@dequeue))

This allows to remember how big a task has been reported by PELT in its
previous activations via f(task::util_avg@dequeue), which is the new
_task_util_est(struct task_struct*) function added by this patch.

If a task should change its behavior and it runs longer in a new
activation, after a certain time its util_est will just track the
original PELT signal (i.e. task::util_avg).

The estimated utilization of cfs_rq is defined only for root ones.
That's because the only sensible consumer of this signal are the
scheduler and schedutil when looking for the overall CPU utilization
due to FAIR tasks.

For this reason, the estimated utilization of a root cfs_rq is simply
defined as:

    util_est(cfs_rq) = max(cfs_rq::util_avg, cfs_rq::util_est::enqueued)

where:

    cfs_rq::util_est::enqueued = sum(_task_util_est(task))
                                 for each RUNNABLE task on that root cfs_rq

It's worth noting that the estimated utilization is tracked only for
objects of interests, specifically:

 - Tasks: to better support tasks placement decisions
 - root cfs_rqs: to better support both tasks placement decisions as
                 well as frequencies selection

Signed-off-by: Patrick Bellasi <patrick.bellasi@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Cc: Joel Fernandes <joelaf@google.com>
Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Morten Rasmussen <morten.rasmussen@arm.com>
Cc: Paul Turner <pjt@google.com>
Cc: Rafael J . Wysocki <rafael.j.wysocki@intel.com>
Cc: Steve Muckle <smuckle@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Todd Kjos <tkjos@android.com>
Cc: Vincent Guittot <vincent.guittot@linaro.org>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Link: http://lkml.kernel.org/r/20180309095245.11071-2-patrick.bellasi@arm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-03-20 08:11:06 +01:00
Greg Kroah-Hartman
b24413180f License cleanup: add SPDX GPL-2.0 license identifier to files with no license
Many source files in the tree are missing licensing information, which
makes it harder for compliance tools to determine the correct license.

By default all files without license information are under the default
license of the kernel, which is GPL version 2.

Update the files which contain no license information with the 'GPL-2.0'
SPDX license identifier.  The SPDX identifier is a legally binding
shorthand, which can be used instead of the full boiler plate text.

This patch is based on work done by Thomas Gleixner and Kate Stewart and
Philippe Ombredanne.

How this work was done:

Patches were generated and checked against linux-4.14-rc6 for a subset of
the use cases:
 - file had no licensing information it it.
 - file was a */uapi/* one with no licensing information in it,
 - file was a */uapi/* one with existing licensing information,

Further patches will be generated in subsequent months to fix up cases
where non-standard license headers were used, and references to license
had to be inferred by heuristics based on keywords.

The analysis to determine which SPDX License Identifier to be applied to
a file was done in a spreadsheet of side by side results from of the
output of two independent scanners (ScanCode & Windriver) producing SPDX
tag:value files created by Philippe Ombredanne.  Philippe prepared the
base worksheet, and did an initial spot review of a few 1000 files.

The 4.13 kernel was the starting point of the analysis with 60,537 files
assessed.  Kate Stewart did a file by file comparison of the scanner
results in the spreadsheet to determine which SPDX license identifier(s)
to be applied to the file. She confirmed any determination that was not
immediately clear with lawyers working with the Linux Foundation.

Criteria used to select files for SPDX license identifier tagging was:
 - Files considered eligible had to be source code files.
 - Make and config files were included as candidates if they contained >5
   lines of source
 - File already had some variant of a license header in it (even if <5
   lines).

All documentation files were explicitly excluded.

The following heuristics were used to determine which SPDX license
identifiers to apply.

 - when both scanners couldn't find any license traces, file was
   considered to have no license information in it, and the top level
   COPYING file license applied.

   For non */uapi/* files that summary was:

   SPDX license identifier                            # files
   ---------------------------------------------------|-------
   GPL-2.0                                              11139

   and resulted in the first patch in this series.

   If that file was a */uapi/* path one, it was "GPL-2.0 WITH
   Linux-syscall-note" otherwise it was "GPL-2.0".  Results of that was:

   SPDX license identifier                            # files
   ---------------------------------------------------|-------
   GPL-2.0 WITH Linux-syscall-note                        930

   and resulted in the second patch in this series.

 - if a file had some form of licensing information in it, and was one
   of the */uapi/* ones, it was denoted with the Linux-syscall-note if
   any GPL family license was found in the file or had no licensing in
   it (per prior point).  Results summary:

   SPDX license identifier                            # files
   ---------------------------------------------------|------
   GPL-2.0 WITH Linux-syscall-note                       270
   GPL-2.0+ WITH Linux-syscall-note                      169
   ((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause)    21
   ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause)    17
   LGPL-2.1+ WITH Linux-syscall-note                      15
   GPL-1.0+ WITH Linux-syscall-note                       14
   ((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause)    5
   LGPL-2.0+ WITH Linux-syscall-note                       4
   LGPL-2.1 WITH Linux-syscall-note                        3
   ((GPL-2.0 WITH Linux-syscall-note) OR MIT)              3
   ((GPL-2.0 WITH Linux-syscall-note) AND MIT)             1

   and that resulted in the third patch in this series.

 - when the two scanners agreed on the detected license(s), that became
   the concluded license(s).

 - when there was disagreement between the two scanners (one detected a
   license but the other didn't, or they both detected different
   licenses) a manual inspection of the file occurred.

 - In most cases a manual inspection of the information in the file
   resulted in a clear resolution of the license that should apply (and
   which scanner probably needed to revisit its heuristics).

 - When it was not immediately clear, the license identifier was
   confirmed with lawyers working with the Linux Foundation.

 - If there was any question as to the appropriate license identifier,
   the file was flagged for further research and to be revisited later
   in time.

In total, over 70 hours of logged manual review was done on the
spreadsheet to determine the SPDX license identifiers to apply to the
source files by Kate, Philippe, Thomas and, in some cases, confirmation
by lawyers working with the Linux Foundation.

Kate also obtained a third independent scan of the 4.13 code base from
FOSSology, and compared selected files where the other two scanners
disagreed against that SPDX file, to see if there was new insights.  The
Windriver scanner is based on an older version of FOSSology in part, so
they are related.

Thomas did random spot checks in about 500 files from the spreadsheets
for the uapi headers and agreed with SPDX license identifier in the
files he inspected. For the non-uapi files Thomas did random spot checks
in about 15000 files.

In initial set of patches against 4.14-rc6, 3 files were found to have
copy/paste license identifier errors, and have been fixed to reflect the
correct identifier.

Additionally Philippe spent 10 hours this week doing a detailed manual
inspection and review of the 12,461 patched files from the initial patch
version early this week with:
 - a full scancode scan run, collecting the matched texts, detected
   license ids and scores
 - reviewing anything where there was a license detected (about 500+
   files) to ensure that the applied SPDX license was correct
 - reviewing anything where there was no detection but the patch license
   was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied
   SPDX license was correct

This produced a worksheet with 20 files needing minor correction.  This
worksheet was then exported into 3 different .csv files for the
different types of files to be modified.

These .csv files were then reviewed by Greg.  Thomas wrote a script to
parse the csv files and add the proper SPDX tag to the file, in the
format that the file expected.  This script was further refined by Greg
based on the output to detect more types of files automatically and to
distinguish between header and source .c files (which need different
comment types.)  Finally Greg ran the script using the .csv files to
generate the patches.

Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-02 11:10:55 +01:00
Peter Zijlstra
f2cdd9cc6c sched/core: Address more wake_affine() regressions
The trivial wake_affine_idle() implementation is very good for a
number of workloads, but it comes apart at the moment there are no
idle CPUs left, IOW. the overloaded case.

hackbench:

		NO_WA_WEIGHT		WA_WEIGHT

hackbench-20  : 7.362717561 seconds	6.450509391 seconds

(win)

netperf:

		  NO_WA_WEIGHT		WA_WEIGHT

TCP_SENDFILE-1	: Avg: 54524.6		Avg: 52224.3
TCP_SENDFILE-10	: Avg: 48185.2          Avg: 46504.3
TCP_SENDFILE-20	: Avg: 29031.2          Avg: 28610.3
TCP_SENDFILE-40	: Avg: 9819.72          Avg: 9253.12
TCP_SENDFILE-80	: Avg: 5355.3           Avg: 4687.4

TCP_STREAM-1	: Avg: 41448.3          Avg: 42254
TCP_STREAM-10	: Avg: 24123.2          Avg: 25847.9
TCP_STREAM-20	: Avg: 15834.5          Avg: 18374.4
TCP_STREAM-40	: Avg: 5583.91          Avg: 5599.57
TCP_STREAM-80	: Avg: 2329.66          Avg: 2726.41

TCP_RR-1	: Avg: 80473.5          Avg: 82638.8
TCP_RR-10	: Avg: 72660.5          Avg: 73265.1
TCP_RR-20	: Avg: 52607.1          Avg: 52634.5
TCP_RR-40	: Avg: 57199.2          Avg: 56302.3
TCP_RR-80	: Avg: 25330.3          Avg: 26867.9

UDP_RR-1	: Avg: 108266           Avg: 107844
UDP_RR-10	: Avg: 95480            Avg: 95245.2
UDP_RR-20	: Avg: 68770.8          Avg: 68673.7
UDP_RR-40	: Avg: 76231            Avg: 75419.1
UDP_RR-80	: Avg: 34578.3          Avg: 35639.1

UDP_STREAM-1	: Avg: 64684.3          Avg: 66606
UDP_STREAM-10	: Avg: 52701.2          Avg: 52959.5
UDP_STREAM-20	: Avg: 30376.4          Avg: 29704
UDP_STREAM-40	: Avg: 15685.8          Avg: 15266.5
UDP_STREAM-80	: Avg: 8415.13          Avg: 7388.97

(wins and losses)

sysbench:

		    NO_WA_WEIGHT		WA_WEIGHT

sysbench-mysql-2  :  2135.17 per sec.		 2142.51 per sec.
sysbench-mysql-5  :  4809.68 per sec.            4800.19 per sec.
sysbench-mysql-10 :  9158.59 per sec.            9157.05 per sec.
sysbench-mysql-20 : 14570.70 per sec.           14543.55 per sec.
sysbench-mysql-40 : 22130.56 per sec.           22184.82 per sec.
sysbench-mysql-80 : 20995.56 per sec.           21904.18 per sec.

sysbench-psql-2   :  1679.58 per sec.            1705.06 per sec.
sysbench-psql-5   :  3797.69 per sec.            3879.93 per sec.
sysbench-psql-10  :  7253.22 per sec.            7258.06 per sec.
sysbench-psql-20  : 11166.75 per sec.           11220.00 per sec.
sysbench-psql-40  : 17277.28 per sec.           17359.78 per sec.
sysbench-psql-80  : 17112.44 per sec.           17221.16 per sec.

(increase on the top end)

tbench:

NO_WA_WEIGHT

Throughput 685.211 MB/sec   2 clients   2 procs  max_latency=0.123 ms
Throughput 1596.64 MB/sec   5 clients   5 procs  max_latency=0.119 ms
Throughput 2985.47 MB/sec  10 clients  10 procs  max_latency=0.262 ms
Throughput 4521.15 MB/sec  20 clients  20 procs  max_latency=0.506 ms
Throughput 9438.1  MB/sec  40 clients  40 procs  max_latency=2.052 ms
Throughput 8210.5  MB/sec  80 clients  80 procs  max_latency=8.310 ms

WA_WEIGHT

Throughput 697.292 MB/sec   2 clients   2 procs  max_latency=0.127 ms
Throughput 1596.48 MB/sec   5 clients   5 procs  max_latency=0.080 ms
Throughput 2975.22 MB/sec  10 clients  10 procs  max_latency=0.254 ms
Throughput 4575.14 MB/sec  20 clients  20 procs  max_latency=0.502 ms
Throughput 9468.65 MB/sec  40 clients  40 procs  max_latency=2.069 ms
Throughput 8631.73 MB/sec  80 clients  80 procs  max_latency=8.605 ms

(increase on the top end)

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Rik van Riel <riel@redhat.com>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-10-10 10:14:03 +02:00
Peter Zijlstra
d153b15344 sched/core: Fix wake_affine() performance regression
Eric reported a sysbench regression against commit:

  3fed382b46 ("sched/numa: Implement NUMA node level wake_affine()")

Similarly, Rik was looking at the NAS-lu.C benchmark, which regressed
against his v3.10 enterprise kernel.

PRE (current tip/master):

 ivb-ep sysbench:

   2: [30 secs]     transactions:                        64110  (2136.94 per sec.)
   5: [30 secs]     transactions:                        143644 (4787.99 per sec.)
  10: [30 secs]     transactions:                        274298 (9142.93 per sec.)
  20: [30 secs]     transactions:                        418683 (13955.45 per sec.)
  40: [30 secs]     transactions:                        320731 (10690.15 per sec.)
  80: [30 secs]     transactions:                        355096 (11834.28 per sec.)

 hsw-ex NAS:

 OMP_PROC_BIND/lu.C.x_threads_144_run_1.log: Time in seconds =                    18.01
 OMP_PROC_BIND/lu.C.x_threads_144_run_2.log: Time in seconds =                    17.89
 OMP_PROC_BIND/lu.C.x_threads_144_run_3.log: Time in seconds =                    17.93
 lu.C.x_threads_144_run_1.log: Time in seconds =                   434.68
 lu.C.x_threads_144_run_2.log: Time in seconds =                   405.36
 lu.C.x_threads_144_run_3.log: Time in seconds =                   433.83

POST (+patch):

 ivb-ep sysbench:

   2: [30 secs]     transactions:                        64494  (2149.75 per sec.)
   5: [30 secs]     transactions:                        145114 (4836.99 per sec.)
  10: [30 secs]     transactions:                        278311 (9276.69 per sec.)
  20: [30 secs]     transactions:                        437169 (14571.60 per sec.)
  40: [30 secs]     transactions:                        669837 (22326.73 per sec.)
  80: [30 secs]     transactions:                        631739 (21055.88 per sec.)

 hsw-ex NAS:

 lu.C.x_threads_144_run_1.log: Time in seconds =                    23.36
 lu.C.x_threads_144_run_2.log: Time in seconds =                    22.96
 lu.C.x_threads_144_run_3.log: Time in seconds =                    22.52

This patch takes out all the shiny wake_affine() stuff and goes back to
utter basics. Between the two CPUs involved with the wakeup (the CPU
doing the wakeup and the CPU we ran on previously) pick the CPU we can
run on _now_.

This restores much of the regressions against the older kernels,
but leaves some ground in the overloaded case. The default-enabled
WA_WEIGHT (which will be introduced in the next patch) is an attempt
to address the overloaded situation.

Reported-by: Eric Farman <farman@linux.vnet.ibm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Matthew Rosato <mjrosato@linux.vnet.ibm.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: jinpuwang@gmail.com
Cc: vcaputo@pengaru.com
Fixes: 3fed382b46 ("sched/numa: Implement NUMA node level wake_affine()")
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-10-10 10:14:02 +02:00
Peter Zijlstra
1ad3aaf3fc sched/core: Implement new approach to scale select_idle_cpu()
Hackbench recently suffered a bunch of pain, first by commit:

  4c77b18cf8 ("sched/fair: Make select_idle_cpu() more aggressive")

and then by commit:

  c743f0a5c5 ("sched/fair, cpumask: Export for_each_cpu_wrap()")

which fixed a bug in the initial for_each_cpu_wrap() implementation
that made select_idle_cpu() even more expensive. The bug was that it
would skip over CPUs when bits were consequtive in the bitmask.

This however gave me an idea to fix select_idle_cpu(); where the old
scheme was a cliff-edge throttle on idle scanning, this introduces a
more gradual approach. Instead of stopping to scan entirely, we limit
how many CPUs we scan.

Initial benchmarks show that it mostly recovers hackbench while not
hurting anything else, except Mason's schbench, but not as bad as the
old thing.

It also appears to recover the tbench high-end, which also suffered like
hackbench.

Tested-by: Matt Fleming <matt@codeblueprint.co.uk>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Chris Mason <clm@fb.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: hpa@zytor.com
Cc: kitsunyan <kitsunyan@inbox.ru>
Cc: linux-kernel@vger.kernel.org
Cc: lvenanci@redhat.com
Cc: riel@redhat.com
Cc: xiaolong.ye@intel.com
Link: http://lkml.kernel.org/r/20170517105350.hk5m4h4jb6dfr65a@hirez.programming.kicks-ass.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-06-08 10:25:17 +02:00
Peter Zijlstra
af85596c74 sched/topology: Remove FORCE_SD_OVERLAP
Its an obsolete debug mechanism and future code wants to rely on
properties this undermines.

Namely, it would be good to assume that SD_OVERLAP domains have
children, but if we build the entire hierarchy with SD_OVERLAP this is
obviously false.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-05-15 10:15:28 +02:00
Peter Zijlstra
26ae58d23b sched/core: Add WARNING for multiple update_rq_clock() calls
Now that we have no missing calls, add a warning to find multiple
calls.

By having only a single update_rq_clock() call per rq-lock section,
the section appears 'atomic' wrt time.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-03-16 09:46:21 +01:00
Peter Zijlstra
4c77b18cf8 sched/fair: Make select_idle_cpu() more aggressive
Kitsunyan reported desktop latency issues on his Celeron 887 because
of commit:

  1b568f0aab ("sched/core: Optimize SCHED_SMT")

... even though his CPU doesn't do SMT.

The effect of running the SMT code on a !SMT part is basically a more
aggressive select_idle_cpu(). Removing the avg condition fixed things
for him.

I also know FB likes this test gone, even though other workloads like
having it.

For now, take it out by default, until we get a better idea.

Reported-by: kitsunyan <kitsunyan@inbox.ru>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Chris Mason <clm@fb.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Mike Galbraith <umgwanakikbuti@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-03-02 08:50:17 +01:00
Morten Rasmussen
8cd5601c50 sched/fair: Convert arch_scale_cpu_capacity() from weak function to #define
Bring arch_scale_cpu_capacity() in line with the recent change of its
arch_scale_freq_capacity() sibling in commit dfbca41f34 ("sched:
Optimize freq invariant accounting") from weak function to #define to
allow inlining of the function.

While at it, remove the ARCH_CAPACITY sched_feature as well. With the
change to #define there isn't a straightforward way to allow runtime
switch between an arch implementation and the default implementation of
arch_scale_cpu_capacity() using sched_feature. The default was to use
the arch-specific implementation, but only the arm architecture provides
one and that is essentially equivalent to the default implementation.

Signed-off-by: Morten Rasmussen <morten.rasmussen@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Dietmar Eggemann <Dietmar.Eggemann@arm.com>
Cc: Juri Lelli <Juri.Lelli@arm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: daniel.lezcano@linaro.org
Cc: mturquette@baylibre.com
Cc: pang.xunlei@zte.com.cn
Cc: rjw@rjwysocki.net
Cc: sgurrappadi@nvidia.com
Cc: vincent.guittot@linaro.org
Cc: yuyang.du@intel.com
Link: http://lkml.kernel.org/r/1439569394-11974-3-git-send-email-morten.rasmussen@arm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-09-13 09:52:55 +02:00
Srikar Dronamraju
2b49d84b25 sched/numa: Remove the NUMA sched_feature
Variable sched_numa_balancing is available for both CONFIG_SCHED_DEBUG
and !CONFIG_SCHED_DEBUG. All code paths now check for
sched_numa_balancing. Hence remove sched_feat(NUMA).

Suggested-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1439290813-6683-4-git-send-email-srikar@linux.vnet.ibm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-09-13 09:52:53 +02:00
Peter Zijlstra
a9280514bf sched/fair: Make the entity load aging on attaching tunable
In case there are problems with the aging on attach, provide a debug
knob to turn it off.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Byungchul Park <byungchul.park@lge.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Cc: yuyang.du@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-09-13 09:52:48 +02:00
Srikar Dronamraju
2a1ed24ce9 sched/numa: Prefer NUMA hotness over cache hotness
The current load balancer may not try to prevent a task from moving
out of a preferred node to a less preferred node. The reason for this
being:

 - Since sched features NUMA and NUMA_RESIST_LOWER are disabled by
   default, migrate_degrades_locality() always returns false.

 - Even if NUMA_RESIST_LOWER were to be enabled, if its cache hot,
   migrate_degrades_locality() never gets called.

The above behaviour can mean that tasks can move out of their
preferred node but they may be eventually be brought back to their
preferred node by numa balancer (due to higher numa faults).

To avoid the above, this commit merges migrate_degrades_locality() and
migrate_improves_locality(). It also replaces 3 sched features NUMA,
NUMA_FAVOUR_HIGHER and NUMA_RESIST_LOWER by a single sched feature
NUMA.

Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Rik van Riel <riel@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Mike Galbraith <efault@gmx.de>
Link: http://lkml.kernel.org/r/1434455762-30857-2-git-send-email-srikar@linux.vnet.ibm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-07-07 08:46:10 +02:00
Steven Rostedt
b6366f048e sched/rt: Use IPI to trigger RT task push migration instead of pulling
When debugging the latencies on a 40 core box, where we hit 300 to
500 microsecond latencies, I found there was a huge contention on the
runqueue locks.

Investigating it further, running ftrace, I found that it was due to
the pulling of RT tasks.

The test that was run was the following:

 cyclictest --numa -p95 -m -d0 -i100

This created a thread on each CPU, that would set its wakeup in iterations
of 100 microseconds. The -d0 means that all the threads had the same
interval (100us). Each thread sleeps for 100us and wakes up and measures
its latencies.

cyclictest is maintained at:
 git://git.kernel.org/pub/scm/linux/kernel/git/clrkwllms/rt-tests.git

What happened was another RT task would be scheduled on one of the CPUs
that was running our test, when the other CPU tests went to sleep and
scheduled idle. This caused the "pull" operation to execute on all
these CPUs. Each one of these saw the RT task that was overloaded on
the CPU of the test that was still running, and each one tried
to grab that task in a thundering herd way.

To grab the task, each thread would do a double rq lock grab, grabbing
its own lock as well as the rq of the overloaded CPU. As the sched
domains on this box was rather flat for its size, I saw up to 12 CPUs
block on this lock at once. This caused a ripple affect with the
rq locks especially since the taking was done via a double rq lock, which
means that several of the CPUs had their own rq locks held while trying
to take this rq lock. As these locks were blocked, any wakeups or load
balanceing on these CPUs would also block on these locks, and the wait
time escalated.

I've tried various methods to lessen the load, but things like an
atomic counter to only let one CPU grab the task wont work, because
the task may have a limited affinity, and we may pick the wrong
CPU to take that lock and do the pull, to only find out that the
CPU we picked isn't in the task's affinity.

Instead of doing the PULL, I now have the CPUs that want the pull to
send over an IPI to the overloaded CPU, and let that CPU pick what
CPU to push the task to. No more need to grab the rq lock, and the
push/pull algorithm still works fine.

With this patch, the latency dropped to just 150us over a 20 hour run.
Without the patch, the huge latencies would trigger in seconds.

I've created a new sched feature called RT_PUSH_IPI, which is enabled
by default.

When RT_PUSH_IPI is not enabled, the old method of grabbing the rq locks
and having the pulling CPU do the work is implemented. When RT_PUSH_IPI
is enabled, the IPI is sent to the overloaded CPU to do a push.

To enabled or disable this at run time:

 # mount -t debugfs nodev /sys/kernel/debug
 # echo RT_PUSH_IPI > /sys/kernel/debug/sched_features
or
 # echo NO_RT_PUSH_IPI > /sys/kernel/debug/sched_features

Update: This original patch would send an IPI to all CPUs in the RT overload
list. But that could theoretically cause the reverse issue. That is, there
could be lots of overloaded RT queues and one CPU lowers its priority. It would
then send an IPI to all the overloaded RT queues and they could then all try
to grab the rq lock of the CPU lowering its priority, and then we have the
same problem.

The latest design sends out only one IPI to the first overloaded CPU. It tries to
push any tasks that it can, and then looks for the next overloaded CPU that can
push to the source CPU. The IPIs stop when all overloaded CPUs that have pushable
tasks that have priorities greater than the source CPU are covered. In case the
source CPU lowers its priority again, a flag is set to tell the IPI traversal to
restart with the first RT overloaded CPU after the source CPU.

Parts-suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Joern Engel <joern@purestorage.com>
Cc: Clark Williams <williams@redhat.com>
Cc: Mike Galbraith <umgwanakikbuti@gmail.com>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20150318144946.2f3cc982@gandalf.local.home
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-03-23 10:55:22 +01:00
Nicolas Pitre
5d4dfddd4f sched: Rename capacity related flags
It is better not to think about compute capacity as being equivalent
to "CPU power".  The upcoming "power aware" scheduler work may create
confusion with the notion of energy consumption if "power" is used too
liberally.

Let's rename the following feature flags since they do relate to capacity:

	SD_SHARE_CPUPOWER  -> SD_SHARE_CPUCAPACITY
	ARCH_POWER         -> ARCH_CAPACITY
	NONTASK_POWER      -> NONTASK_CAPACITY

Signed-off-by: Nicolas Pitre <nico@linaro.org>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: Vincent Guittot <vincent.guittot@linaro.org>
Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
Cc: Morten Rasmussen <morten.rasmussen@arm.com>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: linaro-kernel@lists.linaro.org
Cc: Andy Fleming <afleming@freescale.com>
Cc: Anton Blanchard <anton@samba.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Grant Likely <grant.likely@linaro.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Preeti U Murthy <preeti@linux.vnet.ibm.com>
Cc: Rob Herring <robh+dt@kernel.org>
Cc: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Cc: Toshi Kani <toshi.kani@hp.com>
Cc: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
Cc: Vincent Guittot <vincent.guittot@linaro.org>
Cc: devicetree@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: linuxppc-dev@lists.ozlabs.org
Link: http://lkml.kernel.org/n/tip-e93lpnxb87owfievqatey6b5@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-06-05 11:52:32 +02:00
Mel Gorman
7a0f308337 sched/numa: Resist moving tasks towards nodes with fewer hinting faults
Just as "sched: Favour moving tasks towards the preferred node" favours
moving tasks towards nodes with a higher number of recorded NUMA hinting
faults, this patch resists moving tasks towards nodes with lower faults.

Signed-off-by: Mel Gorman <mgorman@suse.de>
Reviewed-by: Rik van Riel <riel@redhat.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1381141781-10992-24-git-send-email-mgorman@suse.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-10-09 12:40:27 +02:00
Mel Gorman
3a7053b322 sched/numa: Favour moving tasks towards the preferred node
This patch favours moving tasks towards NUMA node that recorded a higher
number of NUMA faults during active load balancing.  Ideally this is
self-reinforcing as the longer the task runs on that node, the more faults
it should incur causing task_numa_placement to keep the task running on that
node. In reality a big weakness is that the nodes CPUs can be overloaded
and it would be more efficient to queue tasks on an idle node and migrate
to the new node. This would require additional smarts in the balancer so
for now the balancer will simply prefer to place the task on the preferred
node for a PTE scans which is controlled by the numa_balancing_settle_count
sysctl. Once the settle_count number of scans has complete the schedule
is free to place the task on an alternative node if the load is imbalanced.

[srikar@linux.vnet.ibm.com: Fixed statistics]
Signed-off-by: Mel Gorman <mgorman@suse.de>
Reviewed-by: Rik van Riel <riel@redhat.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
[ Tunable and use higher faults instead of preferred. ]
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1381141781-10992-23-git-send-email-mgorman@suse.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-10-09 12:40:26 +02:00
Mel Gorman
b726b7dfb4 Revert "mm: sched: numa: Delay PTE scanning until a task is scheduled on a new node"
PTE scanning and NUMA hinting fault handling is expensive so commit
5bca2303 ("mm: sched: numa: Delay PTE scanning until a task is scheduled
on a new node") deferred the PTE scan until a task had been scheduled on
another node. The problem is that in the purely shared memory case that
this may never happen and no NUMA hinting fault information will be
captured. We are not ruling out the possibility that something better
can be done here but for now, this patch needs to be reverted and depend
entirely on the scan_delay to avoid punishing short-lived processes.

Signed-off-by: Mel Gorman <mgorman@suse.de>
Reviewed-by: Rik van Riel <riel@redhat.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1381141781-10992-16-git-send-email-mgorman@suse.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-10-09 12:40:17 +02:00
Waiman Long
41fcb9f230 mutex: Move mutex spinning code from sched/core.c back to mutex.c
As mentioned by Ingo, the SCHED_FEAT_OWNER_SPIN scheduler
feature bit was really just an early hack to make with/without
mutex-spinning testable. So it is no longer necessary.

This patch removes the SCHED_FEAT_OWNER_SPIN feature bit and
move the mutex spinning code from kernel/sched/core.c back to
kernel/mutex.c which is where they should belong.

Signed-off-by: Waiman Long <Waiman.Long@hp.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Chandramouleeswaran Aswin <aswin@hp.com>
Cc: Davidlohr Bueso <davidlohr.bueso@hp.com>
Cc: Norton Scott J <scott.norton@hp.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Dave Jones <davej@redhat.com>
Cc: Clark Williams <williams@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1366226594-5506-2-git-send-email-Waiman.Long@hp.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-04-19 09:33:34 +02:00
Linus Torvalds
3d59eebc5e Automatic NUMA Balancing V11
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.18 (GNU/Linux)
 
 iQIcBAABAgAGBQJQx0kQAAoJEHzG/DNEskfi4fQP/R5PRovayroZALBMLnVJDaLD
 Ttr9p40VNXbiJ+MfRgatJjSSJZ4Jl+fC3NEqBhcwVZhckZZb9R2s0WtrSQo5+ZbB
 vdRfiuKoCaKM4cSZ08C12uTvsF6xjhjd27CTUlMkyOcDoKxMEFKelv0hocSxe4Wo
 xqlv3eF+VsY7kE1BNbgBP06SX4tDpIHRxXfqJPMHaSKQmre+cU0xG2GcEu3QGbHT
 DEDTI788YSaWLmBfMC+kWoaQl1+bV/FYvavIAS8/o4K9IKvgR42VzrXmaFaqrbgb
 72ksa6xfAi57yTmZHqyGmts06qYeBbPpKI+yIhCMInxA9CY3lPbvHppRf0RQOyzj
 YOi4hovGEMJKE+BCILukhJcZ9jCTtS3zut6v1rdvR88f4y7uhR9RfmRfsxuW7PNj
 3Rmh191+n0lVWDmhOs2psXuCLJr3LEiA0dFffN1z8REUTtTAZMsj8Rz+SvBNAZDR
 hsJhERVeXB6X5uQ5rkLDzbn1Zic60LjVw7LIp6SF2OYf/YKaF8vhyWOA8dyCEu8W
 CGo7AoG0BO8tIIr8+LvFe8CweypysZImx4AjCfIs4u9pu/v11zmBvO9NO5yfuObF
 BreEERYgTes/UITxn1qdIW4/q+Nr0iKO3CTqsmu6L1GfCz3/XzPGs3U26fUhllqi
 Ka0JKgnWvsa6ez6FSzKI
 =ivQa
 -----END PGP SIGNATURE-----

Merge tag 'balancenuma-v11' of git://git.kernel.org/pub/scm/linux/kernel/git/mel/linux-balancenuma

Pull Automatic NUMA Balancing bare-bones from Mel Gorman:
 "There are three implementations for NUMA balancing, this tree
  (balancenuma), numacore which has been developed in tip/master and
  autonuma which is in aa.git.

  In almost all respects balancenuma is the dumbest of the three because
  its main impact is on the VM side with no attempt to be smart about
  scheduling.  In the interest of getting the ball rolling, it would be
  desirable to see this much merged for 3.8 with the view to building
  scheduler smarts on top and adapting the VM where required for 3.9.

  The most recent set of comparisons available from different people are

    mel:    https://lkml.org/lkml/2012/12/9/108
    mingo:  https://lkml.org/lkml/2012/12/7/331
    tglx:   https://lkml.org/lkml/2012/12/10/437
    srikar: https://lkml.org/lkml/2012/12/10/397

  The results are a mixed bag.  In my own tests, balancenuma does
  reasonably well.  It's dumb as rocks and does not regress against
  mainline.  On the other hand, Ingo's tests shows that balancenuma is
  incapable of converging for this workloads driven by perf which is bad
  but is potentially explained by the lack of scheduler smarts.  Thomas'
  results show balancenuma improves on mainline but falls far short of
  numacore or autonuma.  Srikar's results indicate we all suffer on a
  large machine with imbalanced node sizes.

  My own testing showed that recent numacore results have improved
  dramatically, particularly in the last week but not universally.
  We've butted heads heavily on system CPU usage and high levels of
  migration even when it shows that overall performance is better.
  There are also cases where it regresses.  Of interest is that for
  specjbb in some configurations it will regress for lower numbers of
  warehouses and show gains for higher numbers which is not reported by
  the tool by default and sometimes missed in treports.  Recently I
  reported for numacore that the JVM was crashing with
  NullPointerExceptions but currently it's unclear what the source of
  this problem is.  Initially I thought it was in how numacore batch
  handles PTEs but I'm no longer think this is the case.  It's possible
  numacore is just able to trigger it due to higher rates of migration.

  These reports were quite late in the cycle so I/we would like to start
  with this tree as it contains much of the code we can agree on and has
  not changed significantly over the last 2-3 weeks."

* tag 'balancenuma-v11' of git://git.kernel.org/pub/scm/linux/kernel/git/mel/linux-balancenuma: (50 commits)
  mm/rmap, migration: Make rmap_walk_anon() and try_to_unmap_anon() more scalable
  mm/rmap: Convert the struct anon_vma::mutex to an rwsem
  mm: migrate: Account a transhuge page properly when rate limiting
  mm: numa: Account for failed allocations and isolations as migration failures
  mm: numa: Add THP migration for the NUMA working set scanning fault case build fix
  mm: numa: Add THP migration for the NUMA working set scanning fault case.
  mm: sched: numa: Delay PTE scanning until a task is scheduled on a new node
  mm: sched: numa: Control enabling and disabling of NUMA balancing if !SCHED_DEBUG
  mm: sched: numa: Control enabling and disabling of NUMA balancing
  mm: sched: Adapt the scanning rate if a NUMA hinting fault does not migrate
  mm: numa: Use a two-stage filter to restrict pages being migrated for unlikely task<->node relationships
  mm: numa: migrate: Set last_nid on newly allocated page
  mm: numa: split_huge_page: Transfer last_nid on tail page
  mm: numa: Introduce last_nid to the page frame
  sched: numa: Slowly increase the scanning period as NUMA faults are handled
  mm: numa: Rate limit setting of pte_numa if node is saturated
  mm: numa: Rate limit the amount of memory that is migrated between nodes
  mm: numa: Structures for Migrate On Fault per NUMA migration rate limiting
  mm: numa: Migrate pages handled during a pmd_numa hinting fault
  mm: numa: Migrate on reference policy
  ...
2012-12-16 15:18:08 -08:00
Mel Gorman
5bca230353 mm: sched: numa: Delay PTE scanning until a task is scheduled on a new node
Due to the fact that migrations are driven by the CPU a task is running
on there is no point tracking NUMA faults until one task runs on a new
node. This patch tracks the first node used by an address space. Until
it changes, PTE scanning is disabled and no NUMA hinting faults are
trapped. This should help workloads that are short-lived, do not care
about NUMA placement or have bound themselves to a single node.

This takes advantage of the logic in "mm: sched: numa: Implement slow
start for working set sampling" to delay when the checks are made. This
will take advantage of processes that set their CPU and node bindings
early in their lifetime. It will also potentially allow any initial load
balancing to take place.

Signed-off-by: Mel Gorman <mgorman@suse.de>
2012-12-11 14:42:56 +00:00
Mel Gorman
1a687c2e9a mm: sched: numa: Control enabling and disabling of NUMA balancing
This patch adds Kconfig options and kernel parameters to allow the
enabling and disabling of automatic NUMA balancing. The existance
of such a switch was and is very important when debugging problems
related to transparent hugepages and we should have the same for
automatic NUMA placement.

Signed-off-by: Mel Gorman <mgorman@suse.de>
2012-12-11 14:42:55 +00:00
Peter Zijlstra
cbee9f88ec mm: numa: Add fault driven placement and migration
NOTE: This patch is based on "sched, numa, mm: Add fault driven
	placement and migration policy" but as it throws away all the policy
	to just leave a basic foundation I had to drop the signed-offs-by.

This patch creates a bare-bones method for setting PTEs pte_numa in the
context of the scheduler that when faulted later will be faulted onto the
node the CPU is running on.  In itself this does nothing useful but any
placement policy will fundamentally depend on receiving hints on placement
from fault context and doing something intelligent about it.

Signed-off-by: Mel Gorman <mgorman@suse.de>
Acked-by: Rik van Riel <riel@redhat.com>
2012-12-11 14:42:45 +00:00
Ingo Molnar
8ed92e51f9 sched: Add WAKEUP_PREEMPTION feature flag, on by default
As per the recent discussion with Mike and Linus, make it easier to
test with/without this feature. No change in default behavior.

Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/n/tip-izoxq4haeg4mTognnDbwcevt@git.kernel.org
2012-10-16 10:05:27 +02:00
Vincent Guittot
bc2a27cd27 sched: cpu_power: enable ARCH_POWER
Heteregeneous ARM platform uses arch_scale_freq_power function
to reflect the relative capacity of each core

Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1341826026-6504-6-git-send-email-vincent.guittot@linaro.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-09-13 16:52:06 +02:00
Namhyung Kim
c751134ef8 sched: Remove AFFINE_WAKEUPS feature flag
Commit beac4c7e4a ("sched: Remove AFFINE_WAKEUPS feature") removed
use of the flag but left the definition. Get rid of it.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Link: http://lkml.kernel.org/r/1345090865-20851-1-git-send-email-namhyung@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-09-04 14:31:31 +02:00
Peter Zijlstra
eb95308ee2 sched: Fix more load-balancing fallout
Commits 367456c756 ("sched: Ditch per cgroup task lists for
load-balancing") and 5d6523ebd ("sched: Fix load-balance wreckage")
left some more wreckage.

By setting loop_max unconditionally to ->nr_running load-balancing
could take a lot of time on very long runqueues (hackbench!). So keep
the sysctl as max limit of the amount of tasks we'll iterate.

Furthermore, the min load filter for migration completely fails with
cgroups since inequality in per-cpu state can easily lead to such
small loads :/

Furthermore the change to add new tasks to the tail of the queue
instead of the head seems to have some effect.. not quite sure I
understand why.

Combined these fixes solve the huge hackbench regression reported by
Tim when hackbench is ran in a cgroup.

Reported-by: Tim Chen <tim.c.chen@linux.intel.com>
Acked-by: Tim Chen <tim.c.chen@linux.intel.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Link: http://lkml.kernel.org/r/1335365763.28150.267.camel@twins
[ got rid of the CONFIG_PREEMPT tuning and made small readability edits ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-04-26 12:54:52 +02:00
Peter Zijlstra
f8b6d1cc7d sched: Use jump_labels for sched_feat
Now that we initialize jump_labels before sched_init() we can use them
for the debug features without having to worry about a window where
they have the wrong setting.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/n/tip-vpreo4hal9e0kzqmg5y0io2k@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-12-06 20:51:26 +01:00
Peter Zijlstra
391e43da79 sched: Move all scheduler bits into kernel/sched/
There's too many sched*.[ch] files in kernel/, give them their own
directory.

(No code changed, other than Makefile glue added.)

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-11-17 12:20:22 +01:00