android_kernel_xiaomi_sm8350/drivers/acpi/acpi_ipmi.c

644 lines
16 KiB
C
Raw Permalink Normal View History

Merge remote-tracking branch 'remotes/origin/tmp-f686d9f' into msm-lahaina * remotes/origin/tmp-f686d9f: ANDROID: update abi_gki_aarch64.xml for 5.2-rc6 Linux 5.2-rc6 Revert "iommu/vt-d: Fix lock inversion between iommu->lock and device_domain_lock" Bluetooth: Fix regression with minimum encryption key size alignment tcp: refine memory limit test in tcp_fragment() x86/vdso: Prevent segfaults due to hoisted vclock reads SUNRPC: Fix a credential refcount leak Revert "SUNRPC: Declare RPC timers as TIMER_DEFERRABLE" net :sunrpc :clnt :Fix xps refcount imbalance on the error path NFS4: Only set creation opendata if O_CREAT ANDROID: gki_defconfig: workaround to enable configs ANDROID: gki_defconfig: more configs for partners ARM: 8867/1: vdso: pass --be8 to linker if necessary KVM: nVMX: reorganize initial steps of vmx_set_nested_state KVM: PPC: Book3S HV: Invalidate ERAT when flushing guest TLB entries habanalabs: use u64_to_user_ptr() for reading user pointers nfsd: replace Jeff by Chuck as nfsd co-maintainer inet: clear num_timeout reqsk_alloc() PCI/P2PDMA: Ignore root complex whitelist when an IOMMU is present net: mvpp2: debugfs: Add pmap to fs dump ipv6: Default fib6_type to RTN_UNICAST when not set net: hns3: Fix inconsistent indenting net/af_iucv: always register net_device notifier net/af_iucv: build proper skbs for HiperTransport net/af_iucv: remove GFP_DMA restriction for HiperTransport doc: fix documentation about UIO_MEM_LOGICAL using MAINTAINERS / Documentation: Thorsten Scherer is the successor of Gavin Schenk docs: fb: Add TER16x32 to the available font names MAINTAINERS: fpga: hand off maintainership to Moritz treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 507 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 506 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 505 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 504 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 503 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 502 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 501 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 500 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 499 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 498 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 497 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 496 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 495 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 491 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 490 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 489 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 488 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 487 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 486 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 485 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 484 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 482 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 481 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 480 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 479 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 477 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 475 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 474 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 473 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 472 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 471 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 469 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 468 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 467 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 466 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 465 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 464 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 463 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 462 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 461 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 460 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 459 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 457 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 456 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 455 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 454 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 452 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 451 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 250 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 248 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 247 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 246 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 245 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 244 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 243 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 239 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 238 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 237 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 235 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 234 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 233 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 232 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 231 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 230 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 226 KVM: arm/arm64: Fix emulated ptimer irq injection net: dsa: mv88e6xxx: fix shift of FID bits in mv88e6185_g1_vtu_loadpurge() tests: kvm: Check for a kernel warning kvm: tests: Sort tests in the Makefile alphabetically KVM: x86/mmu: Allocate PAE root array when using SVM's 32-bit NPT KVM: x86: Modify struct kvm_nested_state to have explicit fields for data fanotify: update connector fsid cache on add mark quota: fix a problem about transfer quota drm/i915: Don't clobber M/N values during fastset check powerpc: enable a 30-bit ZONE_DMA for 32-bit pmac ovl: make i_ino consistent with st_ino in more cases scsi: qla2xxx: Fix hardlockup in abort command during driver remove scsi: ufs: Avoid runtime suspend possibly being blocked forever scsi: qedi: update driver version to 8.37.0.20 scsi: qedi: Check targetname while finding boot target information hvsock: fix epollout hang from race condition net/udp_gso: Allow TX timestamp with UDP GSO net: netem: fix use after free and double free with packet corruption net: netem: fix backlog accounting for corrupted GSO frames net: lio_core: fix potential sign-extension overflow on large shift tipc: pass tunnel dev as NULL to udp_tunnel(6)_xmit_skb ip6_tunnel: allow not to count pkts on tstats by passing dev as NULL ip_tunnel: allow not to count pkts on tstats by setting skb's dev to NULL apparmor: reset pos on failure to unpack for various functions apparmor: enforce nullbyte at end of tag string apparmor: fix PROFILE_MEDIATES for untrusted input RDMA/efa: Handle mmap insertions overflow tun: wake up waitqueues after IFF_UP is set drm: return -EFAULT if copy_to_user() fails net: remove duplicate fetch in sock_getsockopt tipc: fix issues with early FAILOVER_MSG from peer bnx2x: Check if transceiver implements DDM before access xhci: detect USB 3.2 capable host controllers correctly usb: xhci: Don't try to recover an endpoint if port is in error state. KVM: fix typo in documentation drm/panfrost: Make sure a BO is only unmapped when appropriate md: fix for divide error in status_resync soc: ixp4xx: npe: Fix an IS_ERR() vs NULL check in probe arm64/mm: don't initialize pgd_cache twice MAINTAINERS: Update my email address arm64/sve: <uapi/asm/ptrace.h> should not depend on <uapi/linux/prctl.h> ovl: fix typo in MODULE_PARM_DESC ovl: fix bogus -Wmaybe-unitialized warning ovl: don't fail with disconnected lower NFS mmc: core: Prevent processing SDIO IRQs when the card is suspended mmc: sdhci: sdhci-pci-o2micro: Correctly set bus width when tuning brcmfmac: sdio: Don't tune while the card is off mmc: core: Add sdio_retune_hold_now() and sdio_retune_release() brcmfmac: sdio: Disable auto-tuning around commands expected to fail mmc: core: API to temporarily disable retuning for SDIO CRC errors Revert "brcmfmac: disable command decode in sdio_aos" ARM: ixp4xx: include irqs.h where needed ARM: ixp4xx: mark ixp4xx_irq_setup as __init ARM: ixp4xx: don't select SERIAL_OF_PLATFORM firmware: trusted_foundations: add ARMv7 dependency usb: dwc2: Use generic PHY width in params setup RDMA/efa: Fix success return value in case of error IB/hfi1: Handle port down properly in pio IB/hfi1: Handle wakeup of orphaned QPs for pio IB/hfi1: Wakeup QPs orphaned on wait list after flush IB/hfi1: Use aborts to trigger RC throttling IB/hfi1: Create inline to get extended headers IB/hfi1: Silence txreq allocation warnings IB/hfi1: Avoid hardlockup with flushlist_lock KVM: PPC: Book3S HV: Only write DAWR[X] when handling h_set_dawr in real mode KVM: PPC: Book3S HV: Fix r3 corruption in h_set_dabr() fs/namespace: fix unprivileged mount propagation vfs: fsmount: add missing mntget() cifs: fix GlobalMid_Lock bug in cifs_reconnect SMB3: retry on STATUS_INSUFFICIENT_RESOURCES instead of failing write staging: erofs: add requirements field in superblock arm64: ssbd: explicitly depend on <linux/prctl.h> block: fix page leak when merging to same page block: return from __bio_try_merge_page if merging occured in the same page Btrfs: fix failure to persist compression property xattr deletion on fsync riscv: remove unused barrier defines usb: chipidea: udc: workaround for endpoint conflict issue MAINTAINERS: Change QCOM repo location mmc: mediatek: fix SDIO IRQ detection issue mmc: mediatek: fix SDIO IRQ interrupt handle flow mmc: core: complete HS400 before checking status riscv: mm: synchronize MMU after pte change MAINTAINERS: Update my email address to use @kernel.org ANDROID: update abi_gki_aarch64.xml for 5.2-rc5 riscv: dts: add initial board data for the SiFive HiFive Unleashed riscv: dts: add initial support for the SiFive FU540-C000 SoC dt-bindings: riscv: convert cpu binding to json-schema dt-bindings: riscv: sifive: add YAML documentation for the SiFive FU540 arch: riscv: add support for building DTB files from DT source data drm/i915/gvt: ignore unexpected pvinfo write lapb: fixed leak of control-blocks. tipc: purge deferredq list for each grp member in tipc_group_delete ax25: fix inconsistent lock state in ax25_destroy_timer neigh: fix use-after-free read in pneigh_get_next tcp: fix compile error if !CONFIG_SYSCTL hv_sock: Suppress bogus "may be used uninitialized" warnings be2net: Fix number of Rx queues used for flow hashing net: handle 802.1P vlan 0 packets properly Linux 5.2-rc5 tcp: enforce tcp_min_snd_mss in tcp_mtu_probing() tcp: add tcp_min_snd_mss sysctl tcp: tcp_fragment() should apply sane memory limits tcp: limit payload size of sacked skbs Revert "net: phylink: set the autoneg state in phylink_phy_change" bpf: fix nested bpf tracepoints with per-cpu data bpf: Fix out of bounds memory access in bpf_sk_storage vsock/virtio: set SOCK_DONE on peer shutdown net: dsa: rtl8366: Fix up VLAN filtering net: phylink: set the autoneg state in phylink_phy_change powerpc/32: fix build failure on book3e with KVM powerpc/booke: fix fast syscall entry on SMP powerpc/32s: fix initial setup of segment registers on secondary CPU x86/microcode, cpuhotplug: Add a microcode loader CPU hotplug callback net: add high_order_alloc_disable sysctl/static key tcp: add tcp_tx_skb_cache sysctl tcp: add tcp_rx_skb_cache sysctl sysctl: define proc_do_static_key() hv_netvsc: Set probe mode to sync net: sched: flower: don't call synchronize_rcu() on mask creation net: dsa: fix warning same module names sctp: Free cookie before we memdup a new one net: dsa: microchip: Don't try to read stats for unused ports qmi_wwan: extend permitted QMAP mux_id value range qmi_wwan: avoid RCU stalls on device disconnect when in QMAP mode qmi_wwan: add network device usage statistics for qmimux devices qmi_wwan: add support for QMAP padding in the RX path bpf, x64: fix stack layout of JITed bpf code Smack: Restore the smackfsdef mount option and add missing prefixes bpf, devmap: Add missing RCU read lock on flush bpf, devmap: Add missing bulk queue free bpf, devmap: Fix premature entry free on destroying map ftrace: Fix NULL pointer dereference in free_ftrace_func_mapper() module: Fix livepatch/ftrace module text permissions race tracing/uprobe: Fix obsolete comment on trace_uprobe_create() tracing/uprobe: Fix NULL pointer dereference in trace_uprobe_create() tracing: Make two symbols static tracing: avoid build warning with HAVE_NOP_MCOUNT tracing: Fix out-of-range read in trace_stack_print() gfs2: Fix rounding error in gfs2_iomap_page_prepare net: phylink: further mac_config documentation improvements nfc: Ensure presence of required attributes in the deactivate_target handler btrfs: start readahead also in seed devices x86/kasan: Fix boot with 5-level paging and KASAN cfg80211: report measurement start TSF correctly cfg80211: fix memory leak of wiphy device name cfg80211: util: fix bit count off by one mac80211: do not start any work during reconfigure flow cfg80211: use BIT_ULL in cfg80211_parse_mbssid_data() mac80211: only warn once on chanctx_conf being NULL mac80211: drop robust management frames from unknown TA gpu: ipu-v3: image-convert: Fix image downsize coefficients gpu: ipu-v3: image-convert: Fix input bytesperline for packed formats gpu: ipu-v3: image-convert: Fix input bytesperline width/height align thunderbolt: Implement CIO reset correctly for Titan Ridge ARM: davinci: da8xx: specify dma_coherent_mask for lcdc ARM: davinci: da850-evm: call regulator_has_full_constraints() timekeeping: Repair ktime_get_coarse*() granularity Revert "ALSA: hda/realtek - Improve the headset mic for Acer Aspire laptops" ANDROID: update abi_gki_aarch64.xml mm/devm_memremap_pages: fix final page put race PCI/P2PDMA: track pgmap references per resource, not globally lib/genalloc: introduce chunk owners PCI/P2PDMA: fix the gen_pool_add_virt() failure path mm/devm_memremap_pages: introduce devm_memunmap_pages drivers/base/devres: introduce devm_release_action() mm/vmscan.c: fix trying to reclaim unevictable LRU page coredump: fix race condition between collapse_huge_page() and core dumping mm/mlock.c: change count_mm_mlocked_page_nr return type mm: mmu_gather: remove __tlb_reset_range() for force flush fs/ocfs2: fix race in ocfs2_dentry_attach_lock() mm/vmscan.c: fix recent_rotated history mm/mlock.c: mlockall error for flag MCL_ONFAULT scripts/decode_stacktrace.sh: prefix addr2line with $CROSS_COMPILE mm/list_lru.c: fix memory leak in __memcg_init_list_lru_node mm: memcontrol: don't batch updates of local VM stats and events PCI: PM: Skip devices in D0 for suspend-to-idle ANDROID: Removed extraneous configs from gki powerpc/bpf: use unsigned division instruction for 64-bit operations bpf: fix div64 overflow tests to properly detect errors bpf: sync BPF_FIB_LOOKUP flag changes with BPF uapi bpf: simplify definition of BPF_FIB_LOOKUP related flags cifs: add spinlock for the openFileList to cifsInodeInfo cifs: fix panic in smb2_reconnect x86/fpu: Don't use current->mm to check for a kthread KVM: nVMX: use correct clean fields when copying from eVMCS vfio-ccw: Destroy kmem cache region on module exit block/ps3vram: Use %llu to format sector_t after LBDAF removal libata: Extend quirks for the ST1000LM024 drives with NOLPM quirk bcache: only set BCACHE_DEV_WB_RUNNING when cached device attached bcache: fix stack corruption by PRECEDING_KEY() arm64/sve: Fix missing SVE/FPSIMD endianness conversions blk-mq: remove WARN_ON(!q->elevator) from blk_mq_sched_free_requests blkio-controller.txt: Remove references to CFQ block/switching-sched.txt: Update to blk-mq schedulers null_blk: remove duplicate check for report zone blk-mq: no need to check return value of debugfs_create functions io_uring: fix memory leak of UNIX domain socket inode block: force select mq-deadline for zoned block devices binder: fix possible UAF when freeing buffer drm/amdgpu: return 0 by default in amdgpu_pm_load_smu_firmware drm/amdgpu: Fix bounds checking in amdgpu_ras_is_supported() ANDROID: x86 gki_defconfig: enable DMA_CMA ANDROID: Fixed x86 regression ANDROID: gki_defconfig: enable DMA_CMA Input: synaptics - enable SMBus on ThinkPad E480 and E580 net: mvpp2: prs: Use the correct helpers when removing all VID filters net: mvpp2: prs: Fix parser range for VID filtering mlxsw: spectrum: Disallow prio-tagged packets when PVID is removed mlxsw: spectrum_buffers: Reduce pool size on Spectrum-2 selftests: tc_flower: Add TOS matching test mlxsw: spectrum_flower: Fix TOS matching selftests: mlxsw: Test nexthop offload indication mlxsw: spectrum_router: Refresh nexthop neighbour when it becomes dead mlxsw: spectrum: Use different seeds for ECMP and LAG hash net: tls, correctly account for copied bytes with multiple sk_msgs vrf: Increment Icmp6InMsgs on the original netdev cpuset: restore sanity to cpuset_cpus_allowed_fallback() net: ethtool: Allow matching on vlan DEI bit linux-next: DOC: RDS: Fix a typo in rds.txt x86/kgdb: Return 0 from kgdb_arch_set_breakpoint() mpls: fix af_mpls dependencies for real selinux: fix a missing-check bug in selinux_sb_eat_lsm_opts() selinux: fix a missing-check bug in selinux_add_mnt_opt( ) arm64: tlbflush: Ensure start/end of address range are aligned to stride usb: typec: Make sure an alt mode exist before getting its partner KVM: arm/arm64: vgic: Fix kvm_device leak in vgic_its_destroy KVM: arm64: Filter out invalid core register IDs in KVM_GET_REG_LIST KVM: arm64: Implement vq_present() as a macro xdp: check device pointer before clearing bpf: net: Set sk_bpf_storage back to NULL for cloned sk Btrfs: fix race between block group removal and block group allocation clocksource/drivers/arm_arch_timer: Don't trace count reader functions i2c: pca-platform: Fix GPIO lookup code thunderbolt: Make sure device runtime resume completes before taking domain lock drm: add fallback override/firmware EDID modes workaround i2c: acorn: fix i2c warning arm64: Don't unconditionally add -Wno-psabi to KBUILD_CFLAGS drm/edid: abstract override/firmware EDID retrieval platform/mellanox: mlxreg-hotplug: Add devm_free_irq call to remove flow platform/x86: mlx-platform: Fix parent device in i2c-mux-reg device registration platform/x86: intel-vbtn: Report switch events when event wakes device platform/x86: asus-wmi: Only Tell EC the OS will handle display hotkeys from asus_nb_wmi ARM: mvebu_v7_defconfig: fix Ethernet on Clearfog x86/resctrl: Prevent NULL pointer dereference when local MBM is disabled x86/resctrl: Don't stop walking closids when a locksetup group is found iommu/arm-smmu: Avoid constant zero in TLBI writes drm/i915/perf: fix whitelist on Gen10+ drm/i915/sdvo: Implement proper HDMI audio support for SDVO drm/i915: Fix per-pixel alpha with CCS drm/i915/dmc: protect against reading random memory drm/i915/dsi: Use a fuzzy check for burst mode clock check Input: imx_keypad - make sure keyboard can always wake up system selinux: log raw contexts as untrusted strings ptrace: restore smp_rmb() in __ptrace_may_access() IB/hfi1: Correct tid qp rcd to match verbs context IB/hfi1: Close PSM sdma_progress sleep window IB/hfi1: Validate fault injection opcode user input geneve: Don't assume linear buffers in error handler vxlan: Don't assume linear buffers in error handler net: openvswitch: do not free vport if register_netdevice() is failed. net: correct udp zerocopy refcnt also when zerocopy only on append drm/amdgpu/{uvd,vcn}: fetch ring's read_ptr after alloc ovl: fix wrong flags check in FS_IOC_FS[SG]ETXATTR ioctls riscv: Fix udelay in RV32. drm/vmwgfx: fix a warning due to missing dma_parms riscv: export pm_power_off again drm/vmwgfx: Honor the sg list segment size limitation RISC-V: defconfig: enable clocks, serial console drm/vmwgfx: Use the backdoor port if the HB port is not available bpf: lpm_trie: check left child of last leftmost node for NULL Revert "fuse: require /dev/fuse reads to have enough buffer capacity" ALSA: ice1712: Check correct return value to snd_i2c_sendbytes (EWS/DMX 6Fire) ALSA: oxfw: allow PCM capture for Stanton SCS.1m ALSA: firewire-motu: fix destruction of data for isochronous resources s390/ctl_reg: mark __ctl_set_bit and __ctl_clear_bit as __always_inline s390/boot: disable address-of-packed-member warning ANDROID: update gki aarch64 ABI representation cgroup: Fix css_task_iter_advance_css_set() cset skip condition drm/panfrost: Require the simple_ondemand governor drm/panfrost: make devfreq optional again drm/gem_shmem: Use a writecombine mapping for ->vaddr mmc: sdhi: disallow HS400 for M3-W ES1.2, RZ/G2M, and V3H ASoC: Intel: sst: fix kmalloc call with wrong flags ASoC: core: Fix deadlock in snd_soc_instantiate_card() cgroup/bfq: revert bfq.weight symlink change ARM: dts: am335x phytec boards: Fix cd-gpios active level ARM: dts: dra72x: Disable usb4_tm target module nfp: ensure skb network header is set for packet redirect tcp: fix undo spurious SYNACK in passive Fast Open mpls: fix af_mpls dependencies ibmvnic: Fix unchecked return codes of memory allocations ibmvnic: Refresh device multicast list after reset ibmvnic: Do not close unopened driver during reset mpls: fix warning with multi-label encap net: phy: rename Asix Electronics PHY driver ipv6: flowlabel: fl6_sock_lookup() must use atomic_inc_not_zero net: ipv4: fib_semantics: fix uninitialized variable Input: iqs5xx - get axis info before calling input_mt_init_slots() Linux 5.2-rc4 drm: panel-orientation-quirks: Add quirk for GPD MicroPC drm: panel-orientation-quirks: Add quirk for GPD pocket2 counter/ftm-quaddec: Add missing dependencies in Kconfig staging: iio: adt7316: Fix build errors when GPIOLIB is not set x86/fpu: Update kernel's FPU state before using for the fsave header MAINTAINERS: Karthikeyan Ramasubramanian is MIA i2c: xiic: Add max_read_len quirk ANDROID: update ABI representation gpio: pca953x: hack to fix 24 bit gpio expanders net/mlx5e: Support tagged tunnel over bond net/mlx5e: Avoid detaching non-existing netdev under switchdev mode net/mlx5e: Fix source port matching in fdb peer flow rule net/mlx5e: Replace reciprocal_scale in TX select queue function net/mlx5e: Add ndo_set_feature for uplink representor net/mlx5: Avoid reloading already removed devices net/mlx5: Update pci error handler entries and command translation RAS/CEC: Convert the timer callback to a workqueue RAS/CEC: Fix binary search function x86/mm/KASLR: Compute the size of the vmemmap section properly can: purge socket error queue on sock destruct can: flexcan: Remove unneeded registration message can: af_can: Fix error path of can_init() can: m_can: implement errata "Needless activation of MRAF irq" can: mcp251x: add support for mcp25625 dt-bindings: can: mcp251x: add mcp25625 support can: xilinx_can: use correct bittiming_const for CAN FD core can: flexcan: fix timeout when set small bitrate can: usb: Kconfig: Remove duplicate menu entry lockref: Limit number of cmpxchg loop retries uaccess: add noop untagged_addr definition x86/insn-eval: Fix use-after-free access to LDT entry kbuild: use more portable 'command -v' for cc-cross-prefix s390/unwind: correct stack switching during unwind scsi: hpsa: correct ioaccel2 chaining btrfs: Always trim all unallocated space in btrfs_trim_free_extents netfilter: ipv6: nf_defrag: accept duplicate fragments again powerpc/32s: fix booting with CONFIG_PPC_EARLY_DEBUG_BOOTX drm/meson: fix G12A primary plane disabling drm/meson: fix primary plane disabling drm/meson: fix G12A HDMI PLL settings for 4K60 1000/1001 variations block, bfq: add weight symlink to the bfq.weight cgroup parameter cgroup: let a symlink too be created with a cftype file powerpc/64s: __find_linux_pte() synchronization vs pmdp_invalidate() powerpc/64s: Fix THP PMD collapse serialisation powerpc: Fix kexec failure on book3s/32 drm/nouveau/secboot/gp10[2467]: support newer FW to fix SEC2 failures on some boards drm/nouveau/secboot: enable loading of versioned LS PMU/SEC2 ACR msgqueue FW drm/nouveau/secboot: split out FW version-specific LS function pointers drm/nouveau/secboot: pass max supported FW version to LS load funcs drm/nouveau/core: support versioned firmware loading drm/nouveau/core: pass subdev into nvkm_firmware_get, rather than device block: free sched's request pool in blk_cleanup_queue bpf: expand section tests for test_section_names bpf: more msg_name rewrite tests to test_sock_addr bpf, bpftool: enable recvmsg attach types bpf, libbpf: enable recvmsg attach types bpf: sync tooling uapi header bpf: fix unconnected udp hooks vfio/mdev: Synchronize device create/remove with parent removal vfio/mdev: Avoid creating sysfs remove file on stale device removal pktgen: do not sleep with the thread lock held. net: mvpp2: Use strscpy to handle stat strings net: rds: fix memory leak in rds_ib_flush_mr_pool ipv6: fix EFAULT on sendto with icmpv6 and hdrincl ipv6: use READ_ONCE() for inet->hdrincl as in ipv4 soundwire: intel: set dai min and max channels correctly soundwire: stream: fix bad unlock balance x86/fpu: Use fault_in_pages_writeable() for pre-faulting nvme-rdma: use dynamic dma mapping per command nvme: Fix u32 overflow in the number of namespace list calculation vfio/mdev: Improve the create/remove sequence SoC: rt274: Fix internal jack assignment in set_jack callback ALSA: hdac: fix memory release for SST and SOF drivers ASoC: SOF: Intel: hda: use the defined ppcap functions ASoC: core: move DAI pre-links initiation to snd_soc_instantiate_card ASoC: Intel: cht_bsw_rt5672: fix kernel oops with platform_name override ASoC: Intel: cht_bsw_nau8824: fix kernel oops with platform_name override ASoC: Intel: bytcht_es8316: fix kernel oops with platform_name override ASoC: Intel: cht_bsw_max98090: fix kernel oops with platform_name override Revert "gfs2: Replace gl_revokes with a GLF flag" arm64: Silence gcc warnings about arch ABI drift parisc: Fix crash due alternative coding for NP iopdir_fdc bit parisc: Use lpa instruction to load physical addresses in driver code parisc: configs: Remove useless UEVENT_HELPER_PATH parisc: Use implicit space register selection for loading the coherence index of I/O pdirs usb: gadget: udc: lpc32xx: fix return value check in lpc32xx_udc_probe() usb: gadget: dwc2: fix zlp handling usb: dwc2: Set actual frame number for completed ISOC transfer for none DDMA usb: gadget: udc: lpc32xx: allocate descriptor with GFP_ATOMIC usb: gadget: fusb300_udc: Fix memory leak of fusb300->ep[i] usb: phy: mxs: Disable external charger detect in mxs_phy_hw_init() usb: dwc2: Fix DMA cache alignment issues usb: dwc2: host: Fix wMaxPacketSize handling (fix webcam regression) ARM64: trivial: s/TIF_SECOMP/TIF_SECCOMP/ comment typo fix drm/komeda: Potential error pointer dereference drm/komeda: remove set but not used variable 'kcrtc' x86/CPU: Add more Icelake model numbers hwmon: (pmbus/core) Treat parameters as paged if on multiple pages hwmon: (pmbus/core) mutex_lock write in pmbus_set_samples hwmon: (core) add thermal sensors only if dev->of_node is present Revert "fib_rules: return 0 directly if an exactly same rule exists when NLM_F_EXCL not supplied" net: aquantia: fix wol configuration not applied sometimes ethtool: fix potential userspace buffer overflow Fix memory leak in sctp_process_init net: rds: fix memory leak when unload rds_rdma ipv6: fix the check before getting the cookie in rt6_get_cookie ipv4: not do cache for local delivery if bc_forwarding is enabled selftests: vm: Fix test build failure when built by itself tools: bpftool: Fix JSON output when lookup fails mmc: also set max_segment_size in the device mtip32xx: also set max_segment_size in the device rsxx: don't call dma_set_max_seg_size nvme-pci: don't limit DMA segement size s390/qeth: handle error when updating TX queue count s390/qeth: fix VLAN attribute in bridge_hostnotify udev event s390/qeth: check dst entry before use s390/qeth: handle limited IPv4 broadcast in L3 TX path ceph: fix error handling in ceph_get_caps() ceph: avoid iput_final() while holding mutex or in dispatch thread ceph: single workqueue for inode related works cgroup: css_task_iter_skip()'d iterators must be advanced before accessed drm/amd/amdgpu: add RLC firmware to support raven1 refresh drm/amd/powerplay: add set_power_profile_mode for raven1_refresh drm/amdgpu: fix ring test failure issue during s3 in vce 3.0 (V2) treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 450 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 449 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 448 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 446 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 445 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 444 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 443 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 442 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 441 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 440 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 438 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 437 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 436 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 435 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 434 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 433 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 432 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 431 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 430 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 429 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 428 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 426 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 424 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 423 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 422 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 421 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 420 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 419 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 418 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 417 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 416 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 414 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 412 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 411 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 410 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 409 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 408 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 407 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 406 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 405 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 404 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 403 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 402 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 401 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 400 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 399 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 398 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 397 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 396 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 395 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 394 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 393 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 392 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 391 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 390 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 389 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 388 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 387 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 380 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 378 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 377 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 376 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 375 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 373 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 372 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 371 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 370 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 367 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 365 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 364 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 363 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 362 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 354 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 353 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 352 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 351 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 350 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 349 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 348 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 347 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 346 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 345 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 344 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 343 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 342 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 341 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 340 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 339 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 338 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 336 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 335 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 334 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 333 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 332 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 330 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 328 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 326 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 325 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 324 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 323 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 322 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 321 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 320 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 316 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 315 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 314 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 313 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 312 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 311 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 310 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 309 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 308 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 307 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 305 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 301 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 300 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 299 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 297 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 296 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 295 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 294 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 292 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 291 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 290 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 289 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 288 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 287 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 286 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 285 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 284 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 283 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 282 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 281 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 280 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 278 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 277 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 276 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 275 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 274 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 273 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 272 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 271 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 270 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 269 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 268 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 267 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 266 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 265 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 264 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 263 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 262 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 260 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 258 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 257 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 256 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 254 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 253 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 252 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 251 lib/test_stackinit: Handle Clang auto-initialization pattern block: Drop unlikely before IS_ERR(_OR_NULL) xen/swiotlb: don't initialize swiotlb twice on arm64 s390/mm: fix address space detection in exception handling HID: logitech-dj: Fix 064d:c52f receiver support Revert "HID: core: Call request_module before doing device_add" Revert "HID: core: Do not call request_module() in async context" Revert "HID: Increase maximum report size allowed by hid_field_extract()" tests: fix pidfd-test compilation signal: improve comments samples: fix pidfd-metadata compilation arm64: arch_timer: mark functions as __always_inline arm64: smp: Moved cpu_logical_map[] to smp.h arm64: cpufeature: Fix missing ZFR0 in __read_sysreg_by_encoding() selftests/bpf: move test_lirc_mode2_user to TEST_GEN_PROGS_EXTENDED USB: Fix chipmunk-like voice when using Logitech C270 for recording audio. USB: usb-storage: Add new ID to ums-realtek udmabuf: actually unmap the scatterlist net: fix indirect calls helpers for ptype list hooks. net: ipvlan: Fix ipvlan device tso disabled while NETIF_F_IP_CSUM is set scsi: smartpqi: unlock on error in pqi_submit_raid_request_synchronous() scsi: ufs: Check that space was properly alloced in copy_query_response udp: only choose unbound UDP socket for multicast when not in a VRF net/tls: replace the sleeping lock around RX resync with a bit lock Revert "net/tls: avoid NULL-deref on resync during device removal" block: aoe: no need to check return value of debugfs_create functions net: dsa: sja1105: Fix link speed not working at 100 Mbps and below net: phylink: avoid reducing support mask scripts/checkstack.pl: Fix arm64 wrong or unknown architecture kbuild: tar-pkg: enable communication with jobserver kconfig: tests: fix recursive inclusion unit test kbuild: teach kselftest-merge to find nested config files nvmet: fix data_len to 0 for bdev-backed write_zeroes MAINTAINERS: Hand over skd maintainership ASoC: sun4i-i2s: Add offset to RX channel select ASoC: sun4i-i2s: Fix sun8i tx channel offset mask ASoC: max98090: remove 24-bit format support if RJ is 0 ASoC: da7219: Fix build error without CONFIG_I2C ASoC: SOF: Intel: hda: Fix COMPILE_TEST build error drm/arm/hdlcd: Allow a bit of clock tolerance drm/arm/hdlcd: Actually validate CRTC modes drm/arm/mali-dp: Add a loop around the second set CVAL and try 5 times drm/komeda: fixing of DMA mapping sg segment warning netfilter: ipv6: nf_defrag: fix leakage of unqueued fragments habanalabs: Read upper bits of trace buffer from RWPHI arm64: arch_k3: Fix kconfig dependency warning drm: don't block fb changes for async plane updates drm/vc4: fix fb references in async update drm/msm: fix fb references in async update drm/amd: fix fb references in async update drm/rockchip: fix fb references in async update xen-blkfront: switch kcalloc to kvcalloc for large array allocation drm/mediatek: call mtk_dsi_stop() after mtk_drm_crtc_atomic_disable() drm/mediatek: clear num_pipes when unbind driver drm/mediatek: call drm_atomic_helper_shutdown() when unbinding driver drm/mediatek: unbind components in mtk_drm_unbind() drm/mediatek: fix unbind functions net: sfp: read eeprom in maximum 16 byte increments selftests: set sysctl bc_forwarding properly in router_broadcast.sh ANDROID: update gki aarch64 ABI representation net: ethernet: mediatek: Use NET_IP_ALIGN to judge if HW RX_2BYTE_OFFSET is enabled net: ethernet: mediatek: Use hw_feature to judge if HWLRO is supported net: ethernet: ti: cpsw_ethtool: fix ethtool ring param set ANDROID: gki_defconfig: Enable CMA, SLAB_FREELIST (RANDOM and HARDENED) on x86 bpf: udp: Avoid calling reuseport's bpf_prog from udp_gro bpf: udp: ipv6: Avoid running reuseport's bpf_prog from __udp6_lib_err rcu: locking and unlocking need to always be at least barriers ANDROID: gki_defconfig: enable SLAB_FREELIST_RANDOM, SLAB_FREELIST_HARDENED ANDROID: gki_defconfig: enable CMA and increase CMA_AREAS ASoC: SOF: fix DSP oops definitions in FW ABI ASoC: hda: fix unbalanced codec dev refcount for HDA_DEV_ASOC ASoC: SOF: ipc: replace fw ready bitfield with explicit bit ordering ASoC: SOF: bump to ABI 3.6 ASoC: SOF: soundwire: add initial soundwire support ASoC: SOF: uapi: mirror firmware changes ASoC: Intel: Baytrail: add quirk for Aegex 10 (RU2) tablet xfs: inode btree scrubber should calculate im_boffset correctly mmc: sdhci_am654: Fix SLOTTYPE write usb: typec: ucsi: ccg: fix memory leak in do_flash ANDROID: update gki aarch64 ABI representation habanalabs: Fix virtual address access via debugfs for 2MB pages drm/komeda: Constify the usage of komeda_component/pipeline/dev_funcs x86/power: Fix 'nosmt' vs hibernation triple fault during resume mm/vmalloc: Avoid rare case of flushing TLB with weird arguments mm/vmalloc: Fix calculation of direct map addr range PM: sleep: Add kerneldoc comments to some functions drm/i915/gvt: save RING_HEAD into vreg when vgpu switched out sparc: perf: fix updated event period in response to PERF_EVENT_IOC_PERIOD mdesc: fix a missing-check bug in get_vdev_port_node_info() drm/i915/gvt: add F_CMD_ACCESS flag for wa regs sparc64: Fix regression in non-hypervisor TLB flush xcall packet: unconditionally free po->rollover Update my email address net: hns: Fix loopback test failed at copper ports Linux 5.2-rc3 net: dsa: mv88e6xxx: avoid error message on remove from VLAN 0 mm, compaction: make sure we isolate a valid PFN include/linux/generic-radix-tree.h: fix kerneldoc comment kernel/signal.c: trace_signal_deliver when signal_group_exit drivers/iommu/intel-iommu.c: fix variable 'iommu' set but not used spdxcheck.py: fix directory structures kasan: initialize tag to 0xff in __kasan_kmalloc z3fold: fix sheduling while atomic scripts/gdb: fix invocation when CONFIG_COMMON_CLK is not set mm/gup: continue VM_FAULT_RETRY processing even for pre-faults ocfs2: fix error path kobject memory leak memcg: make it work on sparse non-0-node systems mm, memcg: consider subtrees in memory.events prctl_set_mm: downgrade mmap_sem to read lock prctl_set_mm: refactor checks from validate_prctl_map kernel/fork.c: make max_threads symbol static arch/arm/boot/compressed/decompress.c: fix build error due to lz4 changes arch/parisc/configs/c8000_defconfig: remove obsoleted CONFIG_DEBUG_SLAB_LEAK mm/vmalloc.c: fix typo in comment lib/sort.c: fix kernel-doc notation warnings mm: fix Documentation/vm/hmm.rst Sphinx warnings treewide: fix typos of SPDX-License-Identifier crypto: ux500 - fix license comment syntax error MAINTAINERS: add I2C DT bindings to ARM platforms MAINTAINERS: add DT bindings to i2c drivers mwifiex: Fix heap overflow in mwifiex_uap_parse_tail_ies() iwlwifi: mvm: change TLC config cmd sent by rs to be async iwlwifi: Fix double-free problems in iwl_req_fw_callback() iwlwifi: fix AX201 killer sku loading firmware issue iwlwifi: print fseq info upon fw assert iwlwifi: clear persistence bit according to device family iwlwifi: fix load in rfkill flow for unified firmware iwlwifi: mvm: remove d3_sram debugfs file bpf, riscv: clear high 32 bits for ALU32 add/sub/neg/lsh/rsh/arsh libbpf: Return btf_fd for load_sk_storage_btf HID: a4tech: fix horizontal scrolling HID: hyperv: Add a module description line net: dsa: sja1105: Don't store frame type in skb->cb block: print offending values when cloned rq limits are exceeded blk-mq: Document the blk_mq_hw_queue_to_node() arguments blk-mq: Fix spelling in a source code comment block: Fix bsg_setup_queue() kernel-doc header block: Fix rq_qos_wait() kernel-doc header block: Fix blk_mq_*_map_queues() kernel-doc headers block: Fix throtl_pending_timer_fn() kernel-doc header block: Convert blk_invalidate_devt() header into a non-kernel-doc header block/partitions/ldm: Convert a kernel-doc header into a non-kernel-doc header leds: avoid flush_work in atomic context cgroup: Include dying leaders with live threads in PROCS iterations cgroup: Implement css_task_iter_skip() cgroup: Call cgroup_release() before __exit_signal() netfilter: nf_tables: fix module autoload with inet family Revert "lockd: Show pid of lockd for remote locks" ALSA: hda/realtek - Update headset mode for ALC256 fs/adfs: fix filename fixup handling for "/" and "//" names fs/adfs: move append_filetype_suffix() into adfs_object_fixup() fs/adfs: remove truncated filename hashing fs/adfs: factor out filename fixup fs/adfs: factor out object fixups fs/adfs: factor out filename case lowering fs/adfs: factor out filename comparison ovl: doc: add non-standard corner cases pstore/ram: Run without kernel crash dump region MAINTAINERS: add Vasily Gorbik and Christian Borntraeger for s390 MAINTAINERS: Farewell Martin Schwidefsky pstore: Set tfm to NULL on free_buf_for_compression nds32: add new emulations for floating point instruction nds32: Avoid IEX status being incorrectly modified math-emu: Use statement expressions to fix Wshift-count-overflow warning net: correct zerocopy refcnt with udp MSG_MORE ethtool: Check for vlan etype or vlan tci when parsing flow_rule net: don't clear sock->sk early to avoid trouble in strparser net-gro: fix use-after-free read in napi_gro_frags() net: dsa: tag_8021q: Create a stable binary format net: dsa: tag_8021q: Change order of rx_vid setup net: mvpp2: fix bad MVPP2_TXQ_SCHED_TOKEN_CNTR_REG queue value docs cgroups: add another example size for hugetlb NFSv4.1: Fix bug only first CB_NOTIFY_LOCK is handled NFSv4.1: Again fix a race where CB_NOTIFY_LOCK fails to wake a waiter ipv4: tcp_input: fix stack out of bounds when parsing TCP options. mlxsw: spectrum: Prevent force of 56G mlxsw: spectrum_acl: Avoid warning after identical rules insertion SUNRPC: Fix a use after free when a server rejects the RPCSEC_GSS credential net: dsa: mv88e6xxx: fix handling of upper half of STATS_TYPE_PORT SUNRPC fix regression in umount of a secure mount r8169: fix MAC address being lost in PCI D3 treewide: Add SPDX license identifier - Kbuild treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 225 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 224 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 223 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 222 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 221 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 220 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 218 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 217 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 216 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 215 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 214 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 213 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 211 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 210 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 209 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 207 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 206 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 203 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 201 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 200 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 199 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 198 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 197 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 195 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 194 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 193 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 191 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 190 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 188 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 185 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 183 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 182 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 180 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 179 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 178 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 177 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 176 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 175 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 174 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 173 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 172 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 171 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 170 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 167 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 166 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 165 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 164 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 162 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 161 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 160 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 159 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 158 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 157 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 156 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 155 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 154 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 153 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 151 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 150 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 149 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 148 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 147 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 145 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 144 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 143 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 142 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 140 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 139 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 138 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 137 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 136 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 135 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 133 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 132 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 131 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 130 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 129 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 128 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 127 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 126 net: core: support XDP generic on stacked devices. netvsc: unshare skb in VF rx handler udp: Avoid post-GRO UDP checksum recalculation nvme-tcp: fix queue mapping when queue count is limited nvme-rdma: fix queue mapping when queue count is limited fpga: zynqmp-fpga: Correctly handle error pointer selftests: vm: install test_vmalloc.sh for run_vmtests userfaultfd: selftest: fix compiler warning kselftest/cgroup: fix incorrect test_core skip kselftest/cgroup: fix unexpected testing failure on test_core kselftest/cgroup: fix unexpected testing failure on test_memcontrol xtensa: Fix section mismatch between memblock_reserve and mem_reserve signal/ptrace: Don't leak unitialized kernel memory with PTRACE_PEEK_SIGINFO mwifiex: Abort at too short BSS descriptor element mwifiex: Fix possible buffer overflows at parsing bss descriptor drm/i915/gvt: Assign NULL to the pointer after memory free. drm/i915/gvt: Check if cur_pt_type is valid x86: intel_epb: Do not build when CONFIG_PM is unset crypto: hmac - fix memory leak in hmac_init_tfm() crypto: jitterentropy - change back to module_init() ARM: dts: Drop bogus CLKSEL for timer12 on dra7 KVM: PPC: Book3S HV: Restore SPRG3 in kvmhv_p9_guest_entry() KVM: PPC: Book3S HV: Fix lockdep warning when entering guest on POWER9 KVM: PPC: Book3S HV: XIVE: Fix page offset when clearing ESB pages KVM: PPC: Book3S HV: XIVE: Take the srcu read lock when accessing memslots KVM: PPC: Book3S HV: XIVE: Do not clear IRQ data of passthrough interrupts KVM: PPC: Book3S HV: XIVE: Introduce a new mutex for the XIVE device drm/i915/gvt: Fix cmd length of VEB_DI_IECP drm/i915/gvt: refine ggtt range validation drm/i915/gvt: Fix vGPU CSFE_CHICKEN1_REG mmio handler drm/i915/gvt: Fix GFX_MODE handling drm/i915/gvt: Update force-to-nonpriv register whitelist drm/i915/gvt: Initialize intel_gvt_gtt_entry in stack ima: show rules with IMA_INMASK correctly evm: check hash algorithm passed to init_desc() scsi: libsas: delete sas port if expander discover failed scsi: libsas: only clear phy->in_shutdown after shutdown event done scsi: scsi_dh_alua: Fix possible null-ptr-deref scsi: smartpqi: properly set both the DMA mask and the coherent DMA mask scsi: zfcp: fix to prevent port_remove with pure auto scan LUNs (only sdevs) scsi: zfcp: fix missing zfcp_port reference put on -EBUSY from port_remove scsi: libcxgbi: add a check for NULL pointer in cxgbi_check_route() net: phy: dp83867: Set up RGMII TX delay net: phy: dp83867: do not call config_init twice net: phy: dp83867: increase SGMII autoneg timer duration net: phy: dp83867: fix speed 10 in sgmii mode net: phy: marvell10g: report if the PHY fails to boot firmware net: phylink: ensure consistent phy interface mode cgroup: Use css_tryget() instead of css_tryget_online() in task_get_css() blk-mq: Fix memory leak in error handling usbip: usbip_host: fix stub_dev lock context imbalance regression net: sh_eth: fix mdio access in sh_eth_close() for R-Car Gen2 and RZ/A1 SoCs MIPS: uprobes: remove set but not used variable 'epc' s390/crypto: fix possible sleep during spinlock aquired MIPS: pistachio: Build uImage.gz by default MIPS: Make virt_addr_valid() return bool MIPS: Bounds check virt_addr_valid CIFS: cifs_read_allocate_pages: don't iterate through whole page array on ENOMEM RDMA/efa: Remove MAYEXEC flag check from mmap flow mlx5: avoid 64-bit division IB/hfi1: Validate page aligned for a given virtual address IB/{qib, hfi1, rdmavt}: Correct ibv_devinfo max_mr value IB/hfi1: Insure freeze_work work_struct is canceled on shutdown IB/rdmavt: Fix alloc_qpn() WARN_ON() ASoC: sun4i-codec: fix first delay on Speaker drm/amdgpu: reserve stollen vram for raven series media: venus: hfi_parser: fix a regression in parser selftests: bpf: fix compiler warning in flow_dissector test arm64: use the correct function type for __arm64_sys_ni_syscall arm64: use the correct function type in SYSCALL_DEFINE0 arm64: fix syscall_fn_t type block: don't protect generic_make_request_checks with blk_queue_enter block: move blk_exit_queue into __blk_release_queue selftests: bpf: complete sub-register zero extension checks selftests: bpf: move sub-register zero extension checks into subreg.c ovl: detect overlapping layers drm/i915/icl: Add WaDisableBankHangMode ALSA: fireface: Use ULL suffixes for 64-bit constants signal/arm64: Use force_sig not force_sig_fault for SIGKILL nl80211: fill all policy .type entries mac80211: free peer keys before vif down in mesh ANDROID: ABI out: Use the extension .xml rather then .out drm/mediatek: respect page offset for PRIME mmap calls drm/mediatek: adjust ddp clock control flow ALSA: hda/realtek - Improve the headset mic for Acer Aspire laptops KVM: PPC: Book3S HV: XIVE: Fix the enforced limit on the vCPU identifier KVM: PPC: Book3S HV: XIVE: Do not test the EQ flag validity when resetting KVM: PPC: Book3S HV: XIVE: Clear file mapping when device is released KVM: PPC: Book3S HV: Don't take kvm->lock around kvm_for_each_vcpu KVM: PPC: Book3S: Use new mutex to synchronize access to rtas token list KVM: PPC: Book3S HV: Use new mutex to synchronize MMU setup KVM: PPC: Book3S HV: Avoid touching arch.mmu_ready in XIVE release functions Revert "drivers: thermal: tsens: Add new operation to check if a sensor is enabled" net/mlx5e: Disable rxhash when CQE compress is enabled net/mlx5e: restrict the real_dev of vlan device is the same as uplink device net/mlx5: Allocate root ns memory using kzalloc to match kfree net/mlx5: Avoid double free in fs init error unwinding path net/mlx5: Avoid double free of root ns in the error flow path net/mlx5: Fix error handling in mlx5_load() Documentation: net-sysfs: Remove duplicate PHY device documentation llc: fix skb leak in llc_build_and_send_ui_pkt() selftests: pmtu: Fix encapsulating device in pmtu_vti6_link_change_mtu dfs_cache: fix a wrong use of kfree in flush_cache_ent() fs/cifs/smb2pdu.c: fix buffer free in SMB2_ioctl_free cifs: fix memory leak of pneg_inbuf on -EOPNOTSUPP ioctl case xenbus: Avoid deadlock during suspend due to open transactions xen/pvcalls: Remove set but not used variable tracing: Avoid memory leak in predicate_parse() habanalabs: fix bug in checking huge page optimization mmc: sdhci: Fix SDIO IRQ thread deadlock dpaa_eth: use only online CPU portals net: mvneta: Fix err code path of probe net: stmmac: Do not output error on deferred probe Btrfs: fix race updating log root item during fsync Btrfs: fix wrong ctime and mtime of a directory after log replay ARC: [plat-hsdk] Get rid of inappropriate PHY settings ARC: [plat-hsdk]: Add support of Vivante GPU ARC: [plat-hsdk]: enable creg-gpio controller Btrfs: fix fsync not persisting changed attributes of a directory btrfs: qgroup: Check bg while resuming relocation to avoid NULL pointer dereference btrfs: reloc: Also queue orphan reloc tree for cleanup to avoid BUG_ON() Btrfs: incremental send, fix emission of invalid clone operations Btrfs: incremental send, fix file corruption when no-holes feature is enabled btrfs: correct zstd workspace manager lock to use spin_lock_bh() btrfs: Ensure replaced device doesn't have pending chunk allocation ia64: fix build errors by exporting paddr_to_nid() ASoC: SOF: Intel: hda: fix the hda init chip ASoC: SOF: ipc: fix a race, leading to IPC timeouts ASoC: SOF: control: correct the copy size for bytes kcontrol put ASoC: SOF: pcm: remove warning - initialize workqueue on open ASoC: SOF: pcm: clear hw_params_upon_resume flag correctly ASoC: SOF: core: fix error handling with the probe workqueue ASoC: SOF: core: remove snd_soc_unregister_component in case of error ASoC: SOF: core: remove DSP after unregistering machine driver ASoC: soc-core: fixup references at soc_cleanup_card_resources() arm64/module: revert to unsigned interpretation of ABS16/32 relocations KVM: s390: Do not report unusabled IDs via KVM_CAP_MAX_VCPU_ID kvm: fix compile on s390 part 2 xprtrdma: Use struct_size() in kzalloc() tools headers UAPI: Sync kvm.h headers with the kernel sources perf record: Fix s390 missing module symbol and warning for non-root users perf machine: Read also the end of the kernel perf test vmlinux-kallsyms: Ignore aliases to _etext when searching on kallsyms perf session: Add missing swap ops for namespace events perf namespace: Protect reading thread's namespace tools headers UAPI: Sync drm/drm.h with the kernel s390/crypto: fix gcm-aes-s390 selftest failures s390/zcrypt: Fix wrong dispatching for control domain CPRBs s390/pci: fix assignment of bus resources s390/pci: fix struct definition for set PCI function s390: mark __cpacf_check_opcode() and cpacf_query_func() as __always_inline s390: add unreachable() to dump_fault_info() to fix -Wmaybe-uninitialized tools headers UAPI: Sync drm/i915_drm.h with the kernel tools headers UAPI: Sync linux/fs.h with the kernel tools headers UAPI: Sync linux/sched.h with the kernel tools arch x86: Sync asm/cpufeatures.h with the with the kernel tools include UAPI: Update copy of files related to new fspick, fsmount, fsconfig, fsopen, move_mount and open_tree syscalls perf arm64: Fix mksyscalltbl when system kernel headers are ahead of the kernel perf data: Fix 'strncat may truncate' build failure with recent gcc arm64: Fix the arm64_personality() syscall wrapper redirection rtw88: Make some symbols static rtw88: avoid circular locking between local->iflist_mtx and rtwdev->mutex rsi: Properly initialize data in rsi_sdio_ta_reset rtw88: fix unassigned rssi_level in rtw_sta_info rtw88: fix subscript above array bounds compiler warning fuse: extract helper for range writeback fuse: fix copy_file_range() in the writeback case mmc: meson-gx: fix irq ack mmc: tmio: fix SCC error handling to avoid false positive CRC error mmc: tegra: Fix a warning message memstick: mspro_block: Fix an error code in mspro_block_issue_req() mac80211: mesh: fix RCU warning nl80211: fix station_info pertid memory leak mac80211: Do not use stack memory with scatterlist for GMAC ALSA: line6: Assure canceling delayed work at disconnection configfs: Fix use-after-free when accessing sd->s_dentry ALSA: hda - Force polling mode on CNL for fixing codec communication i2c: synquacer: fix synquacer_i2c_doxfer() return value i2c: mlxcpld: Fix wrong initialization order in probe i2c: dev: fix potential memory leak in i2cdev_ioctl_rdwr RDMA/core: Fix panic when port_data isn't initialized RDMA/uverbs: Pass udata on uverbs error unwind RDMA/core: Clear out the udata before error unwind net: aquantia: tcp checksum 0xffff being handled incorrectly net: aquantia: fix LRO with FCS error net: aquantia: check rx csum for all packets in LRO session net: aquantia: tx clean budget logic error vhost: scsi: add weight support vhost: vsock: add weight support vhost_net: fix possible infinite loop vhost: introduce vhost_exceeds_weight() virtio: Fix indentation of VIRTIO_MMIO virtio: add unlikely() to WARN_ON_ONCE() iommu/vt-d: Set the right field for Page Walk Snoop iommu/vt-d: Fix lock inversion between iommu->lock and device_domain_lock iommu: Add missing new line for dma type drm/etnaviv: lock MMU while dumping core block: Don't revalidate bdev of hidden gendisk loop: Don't change loop device under exclusive opener drm/imx: ipuv3-plane: fix atomic update status query for non-plus i.MX6Q drm/qxl: drop WARN_ONCE() iio: temperature: mlx90632 Relax the compatibility check iio: imu: st_lsm6dsx: fix PM support for st_lsm6dsx i2c controller staging:iio:ad7150: fix threshold mode config bit fuse: add FUSE_WRITE_KILL_PRIV fuse: fallocate: fix return with locked inode PCI: PM: Avoid possible suspend-to-idle issue ACPI: PM: Call pm_set_suspend_via_firmware() during hibernation ACPI/PCI: PM: Add missing wakeup.flags.valid checks ovl: support the FS_IOC_FS[SG]ETXATTR ioctls soundwire: stream: fix out of boundary access on port properties net: tulip: de4x5: Drop redundant MODULE_DEVICE_TABLE() selftests/tls: add test for sleeping even though there is data net/tls: fix no wakeup on partial reads selftests/tls: test for lowat overshoot with multiple records net/tls: fix lowat calculation if some data came from previous record dpaa2-eth: Make constant 64-bit long dpaa2-eth: Use PTR_ERR_OR_ZERO where appropriate dpaa2-eth: Fix potential spectre issue bonding/802.3ad: fix slave link initialization transition states io_uring: Fix __io_uring_register() false success net: ethtool: Document get_rxfh_context and set_rxfh_context ethtool ops net: stmmac: dwmac-mediatek: modify csr_clk value to fix mdio read/write fail net: stmmac: fix csr_clk can't be zero issue net: stmmac: update rx tail pointer register to fix rx dma hang issue. ip_sockglue: Fix missing-check bug in ip_ra_control() ipv6_sockglue: Fix a missing-check bug in ip6_ra_control() efi: Allow the number of EFI configuration tables entries to be zero efi/x86/Add missing error handling to old_memmap 1:1 mapping code parisc: Fix compiler warnings in float emulation code parisc/slab: cleanup after /proc/slab_allocators removal bpf: sockmap, fix use after free from sleep in psock backlog workqueue net: sched: don't use tc_action->order during action dump cxgb4: Revert "cxgb4: Remove SGE_HOST_PAGE_SIZE dependency on page size" net: fec: fix the clk mismatch in failed_reset path habanalabs: Avoid using a non-initialized MMU cache mutex habanalabs: fix debugfs code uapi/habanalabs: add opcode for enable/disable device debug mode habanalabs: halt debug engines on user process close selftests: rtc: rtctest: specify timeouts selftests/harness: Allow test to configure timeout selftests/ftrace: Add checkbashisms meta-testcase selftests/ftrace: Make a script checkbashisms clean media: smsusb: better handle optional alignment test_firmware: Use correct snprintf() limit genwqe: Prevent an integer overflow in the ioctl parport: Fix mem leak in parport_register_dev_model fpga: dfl: expand minor range when registering chrdev region fpga: dfl: Add lockdep classes for pdata->lock fpga: dfl: afu: Pass the correct device to dma_mapping_error() fpga: stratix10-soc: fix use-after-free on s10_init() w1: ds2408: Fix typo after 49695ac46861 (reset on output_write retry with readback) kheaders: Do not regenerate archive if config is not changed kheaders: Move from proc to sysfs drm/amd/display: Don't load DMCU for Raven 1 (v2) drm/i915: Maintain consistent documentation subsection ordering scripts/sphinx-pre-install: make it handle Sphinx versions docs: Fix conf.py for Sphinx 2.0 vt/fbcon: deinitialize resources in visual_init() after failed memory allocation xfs: fix broken log reservation debugging clocksource/drivers/timer-ti-dm: Change to new style declaration ASoC: core: lock client_mutex while removing link components ASoC: simple-card: Restore original configuration of DAI format {nl,mac}80211: allow 4addr AP operation on crypto controlled devices mac80211_hwsim: mark expected switch fall-through mac80211: fix rate reporting inside cfg80211_calculate_bitrate_he() mac80211: remove set but not used variable 'old' mac80211: handle deauthentication/disassociation from TDLS peer gpio: fix gpio-adp5588 build errors pinctrl: stmfx: Fix compile issue when CONFIG_OF_GPIO is not defined staging: kpc2000: Add dependency on MFD_CORE to kconfig symbol 'KPC2000' perf/ring-buffer: Use regular variables for nesting perf/ring-buffer: Always use {READ,WRITE}_ONCE() for rb->user_page data perf/ring_buffer: Add ordering to rb->nest increment perf/ring_buffer: Fix exposing a temporarily decreased data_head x86/CPU/AMD: Don't force the CPB cap when running under a hypervisor x86/boot: Provide KASAN compatible aliases for string routines ALSA: hda/realtek - Enable micmute LED for Huawei laptops Input: uinput - add compat ioctl number translation for UI_*_FF_UPLOAD Input: silead - add MSSL0017 to acpi_device_id cxgb4: offload VLAN flows regardless of VLAN ethtype hsr: fix don't prune the master node from the node_db net: mvpp2: cls: Fix leaked ethtool_rx_flow_rule docs: fix multiple doc build warnings in enumeration.rst lib/list_sort: fix kerneldoc build error docs: fix numaperf.rst and add it to the doc tree doc: Cope with the deprecation of AutoReporter doc: Cope with Sphinx logging deprecations bpf: sockmap, restore sk_write_space when psock gets dropped selftests: bpf: add zero extend checks for ALU32 and/or/xor bpf, riscv: clear target register high 32-bits for and/or/xor on ALU32 spi: abort spi_sync if failed to prepare_transfer_hardware ALSA: hda/realtek - Set default power save node to 0 ipv4/igmp: fix build error if !CONFIG_IP_MULTICAST powerpc/kexec: Fix loading of kernel + initramfs with kexec_file_load() MIPS: TXx9: Fix boot crash in free_initmem() MIPS: remove a space after -I to cope with header search paths for VDSO MIPS: mark ginvt() as __always_inline ipv4/igmp: fix another memory leak in igmpv3_del_delrec() bnxt_en: Device serial number is supported only for PFs. bnxt_en: Reduce memory usage when running in kdump kernel. bnxt_en: Fix possible BUG() condition when calling pci_disable_msix(). bnxt_en: Fix aggregation buffer leak under OOM condition. ipv6: Fix redirect with VRF net: stmmac: fix reset gpio free missing mISDN: make sure device name is NUL terminated net: macb: save/restore the remaining registers and features media: dvb: warning about dvb frequency limits produces too much noise net/tls: don't ignore netdev notifications if no TLS features net/tls: fix state removal with feature flags off net/tls: avoid NULL-deref on resync during device removal Documentation: add TLS offload documentation Documentation: tls: RSTify the ktls documentation Documentation: net: move device drivers docs to a submenu mISDN: Fix indenting in dsp_cmx.c ocelot: Dont allocate another multicast list, use __dev_mc_sync Validate required parameters in inet6_validate_link_af xhci: Use %zu for printing size_t type xhci: Convert xhci_handshake() to use readl_poll_timeout_atomic() xhci: Fix immediate data transfer if buffer is already DMA mapped usb: xhci: avoid null pointer deref when bos field is NULL usb: xhci: Fix a potential null pointer dereference in xhci_debugfs_create_endpoint() xhci: update bounce buffer with correct sg num media: usb: siano: Fix false-positive "uninitialized variable" warning spi: spi-fsl-spi: call spi_finalize_current_message() at the end ALSA: hda/realtek - Check headset type by unplug and resume powerpc/perf: Fix MMCRA corruption by bhrb_filter powerpc/powernv: Return for invalid IMC domain HID: logitech-hidpp: Add support for the S510 remote control HID: multitouch: handle faulty Elo touch device selftests: netfilter: add flowtable test script netfilter: nft_flow_offload: IPCB is only valid for ipv4 family netfilter: nft_flow_offload: don't offload when sequence numbers need adjustment netfilter: nft_flow_offload: set liberal tracking mode for tcp netfilter: nf_flow_table: ignore DF bit setting ASoC: Intel: sof-rt5682: fix AMP quirk support ASoC: Intel: sof-rt5682: fix for codec button mapping clk: ti: clkctrl: Fix clkdm_clk handling clk: imx: imx8mm: fix int pll clk gate clk: sifive: restrict Kconfig scope for the FU540 PRCI driver RDMA/hns: Fix PD memory leak for internal allocation netfilter: nat: fix udp checksum corruption selftests: netfilter: missing error check when setting up veth interface RDMA/srp: Rename SRP sysfs name after IB device rename trigger ipvs: Fix use-after-free in ip_vs_in ARC: [plat-hsdk]: Add missing FIFO size entry in GMAC node ARC: [plat-hsdk]: Add missing multicast filter bins number to GMAC node samples, bpf: suppress compiler warning samples, bpf: fix to change the buffer size for read() bpf: Check sk_fullsock() before returning from bpf_sk_lookup() bpf: fix out-of-bounds read in __bpf_skc_lookup Documentation/networking: fix af_xdp.rst Sphinx warnings netfilter: nft_fib: Fix existence check support netfilter: nf_queue: fix reinject verdict handling dmaengine: sprd: Add interrupt support for 2-stage transfer dmaengine: sprd: Fix the right place to configure 2-stage transfer dmaengine: sprd: Fix block length overflow dmaengine: sprd: Fix the incorrect start for 2-stage destination channels dmaengine: sprd: Add validation of current descriptor in irq handler dmaengine: sprd: Fix the possible crash when getting descriptor status tty: max310x: Fix external crystal register setup serial: sh-sci: disable DMA for uart_console serial: imx: remove log spamming error message tty: serial: msm_serial: Fix XON/XOFF USB: serial: option: add Telit 0x1260 and 0x1261 compositions USB: serial: pl2303: add Allied Telesis VT-Kit3 USB: serial: option: add support for Simcom SIM7500/SIM7600 RNDIS mode dmaengine: tegra210-adma: Fix spelling dmaengine: tegra210-adma: Fix channel FIFO configuration dmaengine: tegra210-adma: Fix crash during probe dmaengine: mediatek-cqdma: sleeping in atomic context dmaengine: dw-axi-dmac: fix null dereference when pointer first is null perf/x86/intel/ds: Fix EVENT vs. UEVENT PEBS constraints USB: rio500: update Documentation USB: rio500: simplify locking USB: rio500: fix memory leak in close after disconnect USB: rio500: refuse more than one device at a time usbip: usbip_host: fix BUG: sleeping function called from invalid context USB: sisusbvga: fix oops in error path of sisusb_probe USB: Add LPM quirk for Surface Dock GigE adapter media: usb: siano: Fix general protection fault in smsusb usb: mtu3: fix up undefined reference to usb_debug_root USB: Fix slab-out-of-bounds write in usb_get_bos_descriptor Input: elantech - enable middle button support on 2 ThinkPads dmaengine: fsl-qdma: Add improvement dmaengine: jz4780: Fix transfers being ACKed too soon gcc-plugins: Fix build failures under Darwin host MAINTAINERS: Update Stefan Wahren email address netfilter: nf_tables: fix oops during rule dump ARC: mm: SIGSEGV userspace trying to access kernel virtual memory ARC: fix build warnings ARM: dts: bcm: Add missing device_type = "memory" property soc: bcm: brcmstb: biuctrl: Register writes require a barrier soc: brcmstb: Fix error path for unsupported CPUs ARM: dts: dra71x: Disable usb4_tm target module ARM: dts: dra71x: Disable rtc target module ARM: dts: dra76x: Disable usb4_tm target module ARM: dts: dra76x: Disable rtc target module ASoC: simple-card: Fix configuration of DAI format ASoC: Intel: soc-acpi: Fix machine selection order ASoC: rt5677-spi: Handle over reading when flipping bytes ASoC: soc-dpm: fixup DAI active unbalance pinctrl: intel: Clear interrupt status in mask/unmask callback pinctrl: intel: Use GENMASK() consistently parisc: Allow building 64-bit kernel without -mlong-calls compiler option parisc: Kconfig: remove ARCH_DISCARD_MEMBLOCK staging: wilc1000: Fix some double unlock bugs in wilc_wlan_cleanup() staging: vc04_services: prevent integer overflow in create_pagelist() Staging: vc04_services: Fix a couple error codes staging: wlan-ng: fix adapter initialization failure staging: kpc2000: double unlock in error handling in kpc_dma_transfer() staging: kpc2000: Fix build error without CONFIG_UIO staging: kpc2000: fix build error on xtensa staging: erofs: set sb->s_root to NULL when failing from __getname() ARM: imx: cpuidle-imx6sx: Restrict the SW2ISO increase to i.MX6SX firmware: imx: SCU irq should ONLY be enabled after SCU IPC is ready arm64: imx: Fix build error without CONFIG_SOC_BUS ima: fix wrong signed policy requirement when not appraising x86/ima: Check EFI_RUNTIME_SERVICES before using stacktrace: Unbreak stack_trace_save_tsk_reliable() HID: wacom: Sync INTUOSP2_BT touch state after each frame if necessary HID: wacom: Correct button numbering 2nd-gen Intuos Pro over Bluetooth HID: wacom: Send BTN_TOUCH in response to INTUOSP2_BT eraser contact HID: wacom: Don't report anything prior to the tool entering range HID: wacom: Don't set tool type until we're in range ASoC: cs42xx8: Add regcache mask dirty regulator: tps6507x: Fix boot regression due to testing wrong init_data pointer ASoC: fsl_asrc: Fix the issue about unsupported rate spi: bitbang: Fix NULL pointer dereference in spi_unregister_master Input: elan_i2c - increment wakeup count if wake source wireless: Skip directory when generating certificates ASoC: ak4458: rstn_control - return a non-zero on error only ASoC: soc-pcm: BE dai needs prepare when pause release after resume ASoC: ak4458: add return value for ak4458_probe ASoC : cs4265 : readable register too low ASoC: SOF: fix error in verbose ipc command parsing ASoC: SOF: fix race in FW boot timeout handling ASoC: SOF: nocodec: fix undefined reference iio: adc: ti-ads8688: fix timestamp is not updated in buffer iio: dac: ds4422/ds4424 fix chip verification HID: rmi: Use SET_REPORT request on control endpoint for Acer Switch 3 and 5 HID: logitech-hidpp: add support for the MX5500 keyboard HID: logitech-dj: add support for the Logitech MX5500's Bluetooth Mini-Receiver HID: i2c-hid: add iBall Aer3 to descriptor override spi: Fix Raspberry Pi breakage ARM: dts: dra76x: Update MMC2_HS200_MANUAL1 iodelay values ARM: dts: am57xx-idk: Remove support for voltage switching for SD card bus: ti-sysc: Handle devices with no control registers ARM: dts: Configure osc clock for d_can on am335x iio: imu: mpu6050: Fix FIFO layout for ICM20602 lkdtm/bugs: Adjust recursion test to avoid elision lkdtm/usercopy: Moves the KERNEL_DS test to non-canonical iio: adc: ads124: avoid buffer overflow iio: adc: modify NPCM ADC read reference voltage Change-Id: I98c823993370027391cc21dfb239c3049f025136 Signed-off-by: Raghavendra Rao Ananta <rananta@codeaurora.org>
2019-06-24 20:30:20 -04:00
// SPDX-License-Identifier: GPL-2.0-or-later
/*
* acpi_ipmi.c - ACPI IPMI opregion
*
ACPI / IPMI: Fix race caused by the unprotected ACPI IPMI user This patch uses reference counting to fix the race caused by the unprotected ACPI IPMI user. There are two rules for using the ipmi_si APIs: 1. In ipmi_si, ipmi_destroy_user() can ensure that no ipmi_recv_msg will be passed to ipmi_msg_handler(), but ipmi_request_settime() can not use an invalid ipmi_user_t. This means the ipmi_si users must ensure that there won't be any local references on ipmi_user_t before invoking ipmi_destroy_user(). 2. In ipmi_si, the smi_gone()/new_smi() callbacks are protected by smi_watchers_mutex, so their execution is serialized. But as a new smi can re-use a freed intf_num, it requires that the callback implementation must not use intf_num as an identification mean or it must ensure all references to the previous smi are all dropped before exiting smi_gone() callback. As the acpi_ipmi_device->user_interface check in acpi_ipmi_space_handler() can happen before setting user_interface to NULL and codes after the check in acpi_ipmi_space_handler() can happen after user_interface becomes NULL, the on-going acpi_ipmi_space_handler() still can pass an invalid acpi_ipmi_device->user_interface to ipmi_request_settime(). Such race conditions are not allowed by the IPMI layer's API design as a crash will happen in ipmi_request_settime() if something like that happens. This patch follows the ipmi_devintf.c design: 1. Invoke ipmi_destroy_user() after the reference count of acpi_ipmi_device drops to 0. References of acpi_ipmi_device dropping to 0 also means tx_msg related to this acpi_ipmi_device are all freed. This matches the IPMI layer's API calling rule on ipmi_destroy_user() and ipmi_request_settime(). 2. ipmi_flush_tx_msg() is performed so that no on-going tx_msg can still be running in acpi_ipmi_space_handler(). And it is invoked after invoking __ipmi_dev_kill() where acpi_ipmi_device is deleted from the list with a "dead" flag set, and the "dead" flag check is also introduced to the point where a tx_msg is going to be added to the tx_msg_list so that no new tx_msg can be created after returning from the __ipmi_dev_kill(). 3. The waiting codes in ipmi_flush_tx_msg() is deleted because it is not required since this patch ensures no acpi_ipmi reference is still held for ipmi_user_t before calling ipmi_destroy_user() and ipmi_destroy_user() can ensure no more ipmi_msg_handler() can happen after returning from ipmi_destroy_user(). 4. The flushing of tx_msg is also moved out of ipmi_lock in this patch. The forthcoming IPMI operation region handler installation changes also requires acpi_ipmi_device be handled in this style. The header comment of the file is also updated due to this design change. Signed-off-by: Lv Zheng <lv.zheng@intel.com> Reviewed-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-09-13 01:13:54 -04:00
* Copyright (C) 2010, 2013 Intel Corporation
* Author: Zhao Yakui <yakui.zhao@intel.com>
* Lv Zheng <lv.zheng@intel.com>
*/
#include <linux/module.h>
#include <linux/acpi.h>
#include <linux/ipmi.h>
ACPI / IPMI: Fix atomic context requirement of ipmi_msg_handler() This patch fixes the issues indicated by the test results that ipmi_msg_handler() is invoked in atomic context. BUG: scheduling while atomic: kipmi0/18933/0x10000100 Modules linked in: ipmi_si acpi_ipmi ... CPU: 3 PID: 18933 Comm: kipmi0 Tainted: G AW 3.10.0-rc7+ #2 Hardware name: QCI QSSC-S4R/QSSC-S4R, BIOS QSSC-S4R.QCI.01.00.0027.070120100606 07/01/2010 ffff8838245eea00 ffff88103fc63c98 ffffffff814c4a1e ffff88103fc63ca8 ffffffff814bfbab ffff88103fc63d28 ffffffff814c73e0 ffff88103933cbd4 0000000000000096 ffff88103fc63ce8 ffff88102f618000 ffff881035c01fd8 Call Trace: <IRQ> [<ffffffff814c4a1e>] dump_stack+0x19/0x1b [<ffffffff814bfbab>] __schedule_bug+0x46/0x54 [<ffffffff814c73e0>] __schedule+0x83/0x59c [<ffffffff81058853>] __cond_resched+0x22/0x2d [<ffffffff814c794b>] _cond_resched+0x14/0x1d [<ffffffff814c6d82>] mutex_lock+0x11/0x32 [<ffffffff8101e1e9>] ? __default_send_IPI_dest_field.constprop.0+0x53/0x58 [<ffffffffa09e3f9c>] ipmi_msg_handler+0x23/0x166 [ipmi_si] [<ffffffff812bf6e4>] deliver_response+0x55/0x5a [<ffffffff812c0fd4>] handle_new_recv_msgs+0xb67/0xc65 [<ffffffff81007ad1>] ? read_tsc+0x9/0x19 [<ffffffff814c8620>] ? _raw_spin_lock_irq+0xa/0xc [<ffffffffa09e1128>] ipmi_thread+0x5c/0x146 [ipmi_si] ... Also Tony Camuso says: We were getting occasional "Scheduling while atomic" call traces during boot on some systems. Problem was first seen on a Cisco C210 but we were able to reproduce it on a Cisco c220m3. Setting CONFIG_LOCKDEP and LOCKDEP_SUPPORT to 'y' exposed a lockdep around tx_msg_lock in acpi_ipmi.c struct acpi_ipmi_device. ================================= [ INFO: inconsistent lock state ] 2.6.32-415.el6.x86_64-debug-splck #1 --------------------------------- inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage. ksoftirqd/3/17 [HC0[0]:SC1[1]:HE1:SE0] takes: (&ipmi_device->tx_msg_lock){+.?...}, at: [<ffffffff81337a27>] ipmi_msg_handler+0x71/0x126 {SOFTIRQ-ON-W} state was registered at: [<ffffffff810ba11c>] __lock_acquire+0x63c/0x1570 [<ffffffff810bb0f4>] lock_acquire+0xa4/0x120 [<ffffffff815581cc>] __mutex_lock_common+0x4c/0x400 [<ffffffff815586ea>] mutex_lock_nested+0x4a/0x60 [<ffffffff8133789d>] acpi_ipmi_space_handler+0x11b/0x234 [<ffffffff81321c62>] acpi_ev_address_space_dispatch+0x170/0x1be The fix implemented by this change has been tested by Tony: Tested the patch in a boot loop with lockdep debug enabled and never saw the problem in over 400 reboots. Reported-and-tested-by: Tony Camuso <tcamuso@redhat.com> Signed-off-by: Lv Zheng <lv.zheng@intel.com> Reviewed-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-09-13 01:13:23 -04:00
#include <linux/spinlock.h>
MODULE_AUTHOR("Zhao Yakui");
MODULE_DESCRIPTION("ACPI IPMI Opregion driver");
MODULE_LICENSE("GPL");
#define ACPI_IPMI_OK 0
#define ACPI_IPMI_TIMEOUT 0x10
#define ACPI_IPMI_UNKNOWN 0x07
/* the IPMI timeout is 5s */
#define IPMI_TIMEOUT (5000)
#define ACPI_IPMI_MAX_MSG_LENGTH 64
struct acpi_ipmi_device {
/* the device list attached to driver_data.ipmi_devices */
struct list_head head;
/* the IPMI request message list */
struct list_head tx_msg_list;
spinlock_t tx_msg_lock;
acpi_handle handle;
struct device *dev;
struct ipmi_user *user_interface;
int ipmi_ifnum; /* IPMI interface number */
long curr_msgid;
ACPI / IPMI: Fix race caused by the unprotected ACPI IPMI user This patch uses reference counting to fix the race caused by the unprotected ACPI IPMI user. There are two rules for using the ipmi_si APIs: 1. In ipmi_si, ipmi_destroy_user() can ensure that no ipmi_recv_msg will be passed to ipmi_msg_handler(), but ipmi_request_settime() can not use an invalid ipmi_user_t. This means the ipmi_si users must ensure that there won't be any local references on ipmi_user_t before invoking ipmi_destroy_user(). 2. In ipmi_si, the smi_gone()/new_smi() callbacks are protected by smi_watchers_mutex, so their execution is serialized. But as a new smi can re-use a freed intf_num, it requires that the callback implementation must not use intf_num as an identification mean or it must ensure all references to the previous smi are all dropped before exiting smi_gone() callback. As the acpi_ipmi_device->user_interface check in acpi_ipmi_space_handler() can happen before setting user_interface to NULL and codes after the check in acpi_ipmi_space_handler() can happen after user_interface becomes NULL, the on-going acpi_ipmi_space_handler() still can pass an invalid acpi_ipmi_device->user_interface to ipmi_request_settime(). Such race conditions are not allowed by the IPMI layer's API design as a crash will happen in ipmi_request_settime() if something like that happens. This patch follows the ipmi_devintf.c design: 1. Invoke ipmi_destroy_user() after the reference count of acpi_ipmi_device drops to 0. References of acpi_ipmi_device dropping to 0 also means tx_msg related to this acpi_ipmi_device are all freed. This matches the IPMI layer's API calling rule on ipmi_destroy_user() and ipmi_request_settime(). 2. ipmi_flush_tx_msg() is performed so that no on-going tx_msg can still be running in acpi_ipmi_space_handler(). And it is invoked after invoking __ipmi_dev_kill() where acpi_ipmi_device is deleted from the list with a "dead" flag set, and the "dead" flag check is also introduced to the point where a tx_msg is going to be added to the tx_msg_list so that no new tx_msg can be created after returning from the __ipmi_dev_kill(). 3. The waiting codes in ipmi_flush_tx_msg() is deleted because it is not required since this patch ensures no acpi_ipmi reference is still held for ipmi_user_t before calling ipmi_destroy_user() and ipmi_destroy_user() can ensure no more ipmi_msg_handler() can happen after returning from ipmi_destroy_user(). 4. The flushing of tx_msg is also moved out of ipmi_lock in this patch. The forthcoming IPMI operation region handler installation changes also requires acpi_ipmi_device be handled in this style. The header comment of the file is also updated due to this design change. Signed-off-by: Lv Zheng <lv.zheng@intel.com> Reviewed-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-09-13 01:13:54 -04:00
bool dead;
struct kref kref;
};
struct ipmi_driver_data {
struct list_head ipmi_devices;
struct ipmi_smi_watcher bmc_events;
const struct ipmi_user_hndl ipmi_hndlrs;
struct mutex ipmi_lock;
ACPI / IPMI: Use global IPMI operation region handler It is found on a real machine, in its ACPI namespace, the IPMI OperationRegions (in the ACPI000D - ACPI power meter) are not defined under the IPMI system interface device (the IPI0001 with KCS type returned from _IFT control method): Device (PMI0) { Name (_HID, "ACPI000D") // _HID: Hardware ID OperationRegion (SYSI, IPMI, 0x0600, 0x0100) Field (SYSI, BufferAcc, Lock, Preserve) { AccessAs (BufferAcc, 0x01), Offset (0x58), SCMD, 8, GCMD, 8 } OperationRegion (POWR, IPMI, 0x3000, 0x0100) Field (POWR, BufferAcc, Lock, Preserve) { AccessAs (BufferAcc, 0x01), Offset (0xB3), GPMM, 8 } } Device (PCI0) { Device (ISA) { Device (NIPM) { Name (_HID, EisaId ("IPI0001")) // _HID: Hardware ID Method (_IFT, 0, NotSerialized) // _IFT: IPMI Interface Type { Return (0x01) } } } } Current ACPI_IPMI code registers IPMI operation region handler on a per-device basis, so for the above namespace the IPMI operation region handler is registered only under the scope of \_SB.PCI0.ISA.NIPM. Thus when an IPMI operation region field of \PMI0 is accessed, there are errors reported on such platform: ACPI Error: No handlers for Region [IPMI] ACPI Error: Region IPMI(7) has no handler The solution is to install an IPMI operation region handler from root node so that every object that defines IPMI OperationRegion can get an address space handler registered. When an IPMI operation region field is accessed, the Network Function (0x06 for SYSI and 0x30 for POWR) and the Command (SCMD, GCMD, GPMM) are passed to the operation region handler, there is no system interface specified by the BIOS. The patch tries to select one system interface by monitoring the system interface notification. IPMI messages passed from the ACPI codes are sent to this selected global IPMI system interface. The ACPI_IPMI will always select the first registered IPMI interface with an ACPI handle (i.e., defined in the ACPI namespace). It's hard to determine the selection when there are multiple IPMI system interfaces defined in the ACPI namespace. According to the IPMI specification: A BMC device may make available multiple system interfaces, but only one management controller is allowed to be 'active' BMC that provides BMC functionality for the system (in case of a 'partitioned' system, there can be only one active BMC per partition). Only the system interface(s) for the active BMC allowed to respond to the 'Get Device Id' command. According to the ipmi_si desigin: The ipmi_si registeration notifications can only happen after a successful "Get Device ID" command. Thus it should be OK for non-partitioned systems to do such selection. However, we do not have much knowledge on 'partitioned' systems. References: https://bugzilla.kernel.org/show_bug.cgi?id=46741 Signed-off-by: Lv Zheng <lv.zheng@intel.com> Reviewed-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-09-13 01:14:02 -04:00
/*
* NOTE: IPMI System Interface Selection
* There is no system interface specified by the IPMI operation
* region access. We try to select one system interface with ACPI
* handle set. IPMI messages passed from the ACPI codes are sent
* to this selected global IPMI system interface.
*/
struct acpi_ipmi_device *selected_smi;
};
struct acpi_ipmi_msg {
struct list_head head;
/*
* General speaking the addr type should be SI_ADDR_TYPE. And
* the addr channel should be BMC.
* In fact it can also be IPMB type. But we will have to
* parse it from the Netfn command buffer. It is so complex
* that it is skipped.
*/
struct ipmi_addr addr;
long tx_msgid;
/* it is used to track whether the IPMI message is finished */
struct completion tx_complete;
struct kernel_ipmi_msg tx_message;
int msg_done;
/* tx/rx data . And copy it from/to ACPI object buffer */
u8 data[ACPI_IPMI_MAX_MSG_LENGTH];
u8 rx_len;
struct acpi_ipmi_device *device;
struct kref kref;
};
/* IPMI request/response buffer per ACPI 4.0, sec 5.5.2.4.3.2 */
struct acpi_ipmi_buffer {
u8 status;
u8 length;
u8 data[ACPI_IPMI_MAX_MSG_LENGTH];
};
static void ipmi_register_bmc(int iface, struct device *dev);
static void ipmi_bmc_gone(int iface);
static void ipmi_msg_handler(struct ipmi_recv_msg *msg, void *user_msg_data);
static struct ipmi_driver_data driver_data = {
.ipmi_devices = LIST_HEAD_INIT(driver_data.ipmi_devices),
.bmc_events = {
.owner = THIS_MODULE,
.new_smi = ipmi_register_bmc,
.smi_gone = ipmi_bmc_gone,
},
.ipmi_hndlrs = {
.ipmi_recv_hndl = ipmi_msg_handler,
},
.ipmi_lock = __MUTEX_INITIALIZER(driver_data.ipmi_lock)
};
ACPI / IPMI: Fix race caused by the unprotected ACPI IPMI user This patch uses reference counting to fix the race caused by the unprotected ACPI IPMI user. There are two rules for using the ipmi_si APIs: 1. In ipmi_si, ipmi_destroy_user() can ensure that no ipmi_recv_msg will be passed to ipmi_msg_handler(), but ipmi_request_settime() can not use an invalid ipmi_user_t. This means the ipmi_si users must ensure that there won't be any local references on ipmi_user_t before invoking ipmi_destroy_user(). 2. In ipmi_si, the smi_gone()/new_smi() callbacks are protected by smi_watchers_mutex, so their execution is serialized. But as a new smi can re-use a freed intf_num, it requires that the callback implementation must not use intf_num as an identification mean or it must ensure all references to the previous smi are all dropped before exiting smi_gone() callback. As the acpi_ipmi_device->user_interface check in acpi_ipmi_space_handler() can happen before setting user_interface to NULL and codes after the check in acpi_ipmi_space_handler() can happen after user_interface becomes NULL, the on-going acpi_ipmi_space_handler() still can pass an invalid acpi_ipmi_device->user_interface to ipmi_request_settime(). Such race conditions are not allowed by the IPMI layer's API design as a crash will happen in ipmi_request_settime() if something like that happens. This patch follows the ipmi_devintf.c design: 1. Invoke ipmi_destroy_user() after the reference count of acpi_ipmi_device drops to 0. References of acpi_ipmi_device dropping to 0 also means tx_msg related to this acpi_ipmi_device are all freed. This matches the IPMI layer's API calling rule on ipmi_destroy_user() and ipmi_request_settime(). 2. ipmi_flush_tx_msg() is performed so that no on-going tx_msg can still be running in acpi_ipmi_space_handler(). And it is invoked after invoking __ipmi_dev_kill() where acpi_ipmi_device is deleted from the list with a "dead" flag set, and the "dead" flag check is also introduced to the point where a tx_msg is going to be added to the tx_msg_list so that no new tx_msg can be created after returning from the __ipmi_dev_kill(). 3. The waiting codes in ipmi_flush_tx_msg() is deleted because it is not required since this patch ensures no acpi_ipmi reference is still held for ipmi_user_t before calling ipmi_destroy_user() and ipmi_destroy_user() can ensure no more ipmi_msg_handler() can happen after returning from ipmi_destroy_user(). 4. The flushing of tx_msg is also moved out of ipmi_lock in this patch. The forthcoming IPMI operation region handler installation changes also requires acpi_ipmi_device be handled in this style. The header comment of the file is also updated due to this design change. Signed-off-by: Lv Zheng <lv.zheng@intel.com> Reviewed-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-09-13 01:13:54 -04:00
static struct acpi_ipmi_device *
ipmi_dev_alloc(int iface, struct device *dev, acpi_handle handle)
ACPI / IPMI: Fix race caused by the unprotected ACPI IPMI user This patch uses reference counting to fix the race caused by the unprotected ACPI IPMI user. There are two rules for using the ipmi_si APIs: 1. In ipmi_si, ipmi_destroy_user() can ensure that no ipmi_recv_msg will be passed to ipmi_msg_handler(), but ipmi_request_settime() can not use an invalid ipmi_user_t. This means the ipmi_si users must ensure that there won't be any local references on ipmi_user_t before invoking ipmi_destroy_user(). 2. In ipmi_si, the smi_gone()/new_smi() callbacks are protected by smi_watchers_mutex, so their execution is serialized. But as a new smi can re-use a freed intf_num, it requires that the callback implementation must not use intf_num as an identification mean or it must ensure all references to the previous smi are all dropped before exiting smi_gone() callback. As the acpi_ipmi_device->user_interface check in acpi_ipmi_space_handler() can happen before setting user_interface to NULL and codes after the check in acpi_ipmi_space_handler() can happen after user_interface becomes NULL, the on-going acpi_ipmi_space_handler() still can pass an invalid acpi_ipmi_device->user_interface to ipmi_request_settime(). Such race conditions are not allowed by the IPMI layer's API design as a crash will happen in ipmi_request_settime() if something like that happens. This patch follows the ipmi_devintf.c design: 1. Invoke ipmi_destroy_user() after the reference count of acpi_ipmi_device drops to 0. References of acpi_ipmi_device dropping to 0 also means tx_msg related to this acpi_ipmi_device are all freed. This matches the IPMI layer's API calling rule on ipmi_destroy_user() and ipmi_request_settime(). 2. ipmi_flush_tx_msg() is performed so that no on-going tx_msg can still be running in acpi_ipmi_space_handler(). And it is invoked after invoking __ipmi_dev_kill() where acpi_ipmi_device is deleted from the list with a "dead" flag set, and the "dead" flag check is also introduced to the point where a tx_msg is going to be added to the tx_msg_list so that no new tx_msg can be created after returning from the __ipmi_dev_kill(). 3. The waiting codes in ipmi_flush_tx_msg() is deleted because it is not required since this patch ensures no acpi_ipmi reference is still held for ipmi_user_t before calling ipmi_destroy_user() and ipmi_destroy_user() can ensure no more ipmi_msg_handler() can happen after returning from ipmi_destroy_user(). 4. The flushing of tx_msg is also moved out of ipmi_lock in this patch. The forthcoming IPMI operation region handler installation changes also requires acpi_ipmi_device be handled in this style. The header comment of the file is also updated due to this design change. Signed-off-by: Lv Zheng <lv.zheng@intel.com> Reviewed-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-09-13 01:13:54 -04:00
{
struct acpi_ipmi_device *ipmi_device;
int err;
struct ipmi_user *user;
ACPI / IPMI: Fix race caused by the unprotected ACPI IPMI user This patch uses reference counting to fix the race caused by the unprotected ACPI IPMI user. There are two rules for using the ipmi_si APIs: 1. In ipmi_si, ipmi_destroy_user() can ensure that no ipmi_recv_msg will be passed to ipmi_msg_handler(), but ipmi_request_settime() can not use an invalid ipmi_user_t. This means the ipmi_si users must ensure that there won't be any local references on ipmi_user_t before invoking ipmi_destroy_user(). 2. In ipmi_si, the smi_gone()/new_smi() callbacks are protected by smi_watchers_mutex, so their execution is serialized. But as a new smi can re-use a freed intf_num, it requires that the callback implementation must not use intf_num as an identification mean or it must ensure all references to the previous smi are all dropped before exiting smi_gone() callback. As the acpi_ipmi_device->user_interface check in acpi_ipmi_space_handler() can happen before setting user_interface to NULL and codes after the check in acpi_ipmi_space_handler() can happen after user_interface becomes NULL, the on-going acpi_ipmi_space_handler() still can pass an invalid acpi_ipmi_device->user_interface to ipmi_request_settime(). Such race conditions are not allowed by the IPMI layer's API design as a crash will happen in ipmi_request_settime() if something like that happens. This patch follows the ipmi_devintf.c design: 1. Invoke ipmi_destroy_user() after the reference count of acpi_ipmi_device drops to 0. References of acpi_ipmi_device dropping to 0 also means tx_msg related to this acpi_ipmi_device are all freed. This matches the IPMI layer's API calling rule on ipmi_destroy_user() and ipmi_request_settime(). 2. ipmi_flush_tx_msg() is performed so that no on-going tx_msg can still be running in acpi_ipmi_space_handler(). And it is invoked after invoking __ipmi_dev_kill() where acpi_ipmi_device is deleted from the list with a "dead" flag set, and the "dead" flag check is also introduced to the point where a tx_msg is going to be added to the tx_msg_list so that no new tx_msg can be created after returning from the __ipmi_dev_kill(). 3. The waiting codes in ipmi_flush_tx_msg() is deleted because it is not required since this patch ensures no acpi_ipmi reference is still held for ipmi_user_t before calling ipmi_destroy_user() and ipmi_destroy_user() can ensure no more ipmi_msg_handler() can happen after returning from ipmi_destroy_user(). 4. The flushing of tx_msg is also moved out of ipmi_lock in this patch. The forthcoming IPMI operation region handler installation changes also requires acpi_ipmi_device be handled in this style. The header comment of the file is also updated due to this design change. Signed-off-by: Lv Zheng <lv.zheng@intel.com> Reviewed-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-09-13 01:13:54 -04:00
ipmi_device = kzalloc(sizeof(*ipmi_device), GFP_KERNEL);
if (!ipmi_device)
return NULL;
kref_init(&ipmi_device->kref);
INIT_LIST_HEAD(&ipmi_device->head);
INIT_LIST_HEAD(&ipmi_device->tx_msg_list);
spin_lock_init(&ipmi_device->tx_msg_lock);
ipmi_device->handle = handle;
ipmi_device->dev = get_device(dev);
ACPI / IPMI: Fix race caused by the unprotected ACPI IPMI user This patch uses reference counting to fix the race caused by the unprotected ACPI IPMI user. There are two rules for using the ipmi_si APIs: 1. In ipmi_si, ipmi_destroy_user() can ensure that no ipmi_recv_msg will be passed to ipmi_msg_handler(), but ipmi_request_settime() can not use an invalid ipmi_user_t. This means the ipmi_si users must ensure that there won't be any local references on ipmi_user_t before invoking ipmi_destroy_user(). 2. In ipmi_si, the smi_gone()/new_smi() callbacks are protected by smi_watchers_mutex, so their execution is serialized. But as a new smi can re-use a freed intf_num, it requires that the callback implementation must not use intf_num as an identification mean or it must ensure all references to the previous smi are all dropped before exiting smi_gone() callback. As the acpi_ipmi_device->user_interface check in acpi_ipmi_space_handler() can happen before setting user_interface to NULL and codes after the check in acpi_ipmi_space_handler() can happen after user_interface becomes NULL, the on-going acpi_ipmi_space_handler() still can pass an invalid acpi_ipmi_device->user_interface to ipmi_request_settime(). Such race conditions are not allowed by the IPMI layer's API design as a crash will happen in ipmi_request_settime() if something like that happens. This patch follows the ipmi_devintf.c design: 1. Invoke ipmi_destroy_user() after the reference count of acpi_ipmi_device drops to 0. References of acpi_ipmi_device dropping to 0 also means tx_msg related to this acpi_ipmi_device are all freed. This matches the IPMI layer's API calling rule on ipmi_destroy_user() and ipmi_request_settime(). 2. ipmi_flush_tx_msg() is performed so that no on-going tx_msg can still be running in acpi_ipmi_space_handler(). And it is invoked after invoking __ipmi_dev_kill() where acpi_ipmi_device is deleted from the list with a "dead" flag set, and the "dead" flag check is also introduced to the point where a tx_msg is going to be added to the tx_msg_list so that no new tx_msg can be created after returning from the __ipmi_dev_kill(). 3. The waiting codes in ipmi_flush_tx_msg() is deleted because it is not required since this patch ensures no acpi_ipmi reference is still held for ipmi_user_t before calling ipmi_destroy_user() and ipmi_destroy_user() can ensure no more ipmi_msg_handler() can happen after returning from ipmi_destroy_user(). 4. The flushing of tx_msg is also moved out of ipmi_lock in this patch. The forthcoming IPMI operation region handler installation changes also requires acpi_ipmi_device be handled in this style. The header comment of the file is also updated due to this design change. Signed-off-by: Lv Zheng <lv.zheng@intel.com> Reviewed-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-09-13 01:13:54 -04:00
ipmi_device->ipmi_ifnum = iface;
err = ipmi_create_user(iface, &driver_data.ipmi_hndlrs,
ipmi_device, &user);
if (err) {
put_device(dev);
ACPI / IPMI: Fix race caused by the unprotected ACPI IPMI user This patch uses reference counting to fix the race caused by the unprotected ACPI IPMI user. There are two rules for using the ipmi_si APIs: 1. In ipmi_si, ipmi_destroy_user() can ensure that no ipmi_recv_msg will be passed to ipmi_msg_handler(), but ipmi_request_settime() can not use an invalid ipmi_user_t. This means the ipmi_si users must ensure that there won't be any local references on ipmi_user_t before invoking ipmi_destroy_user(). 2. In ipmi_si, the smi_gone()/new_smi() callbacks are protected by smi_watchers_mutex, so their execution is serialized. But as a new smi can re-use a freed intf_num, it requires that the callback implementation must not use intf_num as an identification mean or it must ensure all references to the previous smi are all dropped before exiting smi_gone() callback. As the acpi_ipmi_device->user_interface check in acpi_ipmi_space_handler() can happen before setting user_interface to NULL and codes after the check in acpi_ipmi_space_handler() can happen after user_interface becomes NULL, the on-going acpi_ipmi_space_handler() still can pass an invalid acpi_ipmi_device->user_interface to ipmi_request_settime(). Such race conditions are not allowed by the IPMI layer's API design as a crash will happen in ipmi_request_settime() if something like that happens. This patch follows the ipmi_devintf.c design: 1. Invoke ipmi_destroy_user() after the reference count of acpi_ipmi_device drops to 0. References of acpi_ipmi_device dropping to 0 also means tx_msg related to this acpi_ipmi_device are all freed. This matches the IPMI layer's API calling rule on ipmi_destroy_user() and ipmi_request_settime(). 2. ipmi_flush_tx_msg() is performed so that no on-going tx_msg can still be running in acpi_ipmi_space_handler(). And it is invoked after invoking __ipmi_dev_kill() where acpi_ipmi_device is deleted from the list with a "dead" flag set, and the "dead" flag check is also introduced to the point where a tx_msg is going to be added to the tx_msg_list so that no new tx_msg can be created after returning from the __ipmi_dev_kill(). 3. The waiting codes in ipmi_flush_tx_msg() is deleted because it is not required since this patch ensures no acpi_ipmi reference is still held for ipmi_user_t before calling ipmi_destroy_user() and ipmi_destroy_user() can ensure no more ipmi_msg_handler() can happen after returning from ipmi_destroy_user(). 4. The flushing of tx_msg is also moved out of ipmi_lock in this patch. The forthcoming IPMI operation region handler installation changes also requires acpi_ipmi_device be handled in this style. The header comment of the file is also updated due to this design change. Signed-off-by: Lv Zheng <lv.zheng@intel.com> Reviewed-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-09-13 01:13:54 -04:00
kfree(ipmi_device);
return NULL;
}
ipmi_device->user_interface = user;
return ipmi_device;
}
static void ipmi_dev_release(struct acpi_ipmi_device *ipmi_device)
{
ipmi_destroy_user(ipmi_device->user_interface);
put_device(ipmi_device->dev);
ACPI / IPMI: Fix race caused by the unprotected ACPI IPMI user This patch uses reference counting to fix the race caused by the unprotected ACPI IPMI user. There are two rules for using the ipmi_si APIs: 1. In ipmi_si, ipmi_destroy_user() can ensure that no ipmi_recv_msg will be passed to ipmi_msg_handler(), but ipmi_request_settime() can not use an invalid ipmi_user_t. This means the ipmi_si users must ensure that there won't be any local references on ipmi_user_t before invoking ipmi_destroy_user(). 2. In ipmi_si, the smi_gone()/new_smi() callbacks are protected by smi_watchers_mutex, so their execution is serialized. But as a new smi can re-use a freed intf_num, it requires that the callback implementation must not use intf_num as an identification mean or it must ensure all references to the previous smi are all dropped before exiting smi_gone() callback. As the acpi_ipmi_device->user_interface check in acpi_ipmi_space_handler() can happen before setting user_interface to NULL and codes after the check in acpi_ipmi_space_handler() can happen after user_interface becomes NULL, the on-going acpi_ipmi_space_handler() still can pass an invalid acpi_ipmi_device->user_interface to ipmi_request_settime(). Such race conditions are not allowed by the IPMI layer's API design as a crash will happen in ipmi_request_settime() if something like that happens. This patch follows the ipmi_devintf.c design: 1. Invoke ipmi_destroy_user() after the reference count of acpi_ipmi_device drops to 0. References of acpi_ipmi_device dropping to 0 also means tx_msg related to this acpi_ipmi_device are all freed. This matches the IPMI layer's API calling rule on ipmi_destroy_user() and ipmi_request_settime(). 2. ipmi_flush_tx_msg() is performed so that no on-going tx_msg can still be running in acpi_ipmi_space_handler(). And it is invoked after invoking __ipmi_dev_kill() where acpi_ipmi_device is deleted from the list with a "dead" flag set, and the "dead" flag check is also introduced to the point where a tx_msg is going to be added to the tx_msg_list so that no new tx_msg can be created after returning from the __ipmi_dev_kill(). 3. The waiting codes in ipmi_flush_tx_msg() is deleted because it is not required since this patch ensures no acpi_ipmi reference is still held for ipmi_user_t before calling ipmi_destroy_user() and ipmi_destroy_user() can ensure no more ipmi_msg_handler() can happen after returning from ipmi_destroy_user(). 4. The flushing of tx_msg is also moved out of ipmi_lock in this patch. The forthcoming IPMI operation region handler installation changes also requires acpi_ipmi_device be handled in this style. The header comment of the file is also updated due to this design change. Signed-off-by: Lv Zheng <lv.zheng@intel.com> Reviewed-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-09-13 01:13:54 -04:00
kfree(ipmi_device);
}
static void ipmi_dev_release_kref(struct kref *kref)
{
struct acpi_ipmi_device *ipmi =
container_of(kref, struct acpi_ipmi_device, kref);
ipmi_dev_release(ipmi);
}
static void __ipmi_dev_kill(struct acpi_ipmi_device *ipmi_device)
{
list_del(&ipmi_device->head);
ACPI / IPMI: Use global IPMI operation region handler It is found on a real machine, in its ACPI namespace, the IPMI OperationRegions (in the ACPI000D - ACPI power meter) are not defined under the IPMI system interface device (the IPI0001 with KCS type returned from _IFT control method): Device (PMI0) { Name (_HID, "ACPI000D") // _HID: Hardware ID OperationRegion (SYSI, IPMI, 0x0600, 0x0100) Field (SYSI, BufferAcc, Lock, Preserve) { AccessAs (BufferAcc, 0x01), Offset (0x58), SCMD, 8, GCMD, 8 } OperationRegion (POWR, IPMI, 0x3000, 0x0100) Field (POWR, BufferAcc, Lock, Preserve) { AccessAs (BufferAcc, 0x01), Offset (0xB3), GPMM, 8 } } Device (PCI0) { Device (ISA) { Device (NIPM) { Name (_HID, EisaId ("IPI0001")) // _HID: Hardware ID Method (_IFT, 0, NotSerialized) // _IFT: IPMI Interface Type { Return (0x01) } } } } Current ACPI_IPMI code registers IPMI operation region handler on a per-device basis, so for the above namespace the IPMI operation region handler is registered only under the scope of \_SB.PCI0.ISA.NIPM. Thus when an IPMI operation region field of \PMI0 is accessed, there are errors reported on such platform: ACPI Error: No handlers for Region [IPMI] ACPI Error: Region IPMI(7) has no handler The solution is to install an IPMI operation region handler from root node so that every object that defines IPMI OperationRegion can get an address space handler registered. When an IPMI operation region field is accessed, the Network Function (0x06 for SYSI and 0x30 for POWR) and the Command (SCMD, GCMD, GPMM) are passed to the operation region handler, there is no system interface specified by the BIOS. The patch tries to select one system interface by monitoring the system interface notification. IPMI messages passed from the ACPI codes are sent to this selected global IPMI system interface. The ACPI_IPMI will always select the first registered IPMI interface with an ACPI handle (i.e., defined in the ACPI namespace). It's hard to determine the selection when there are multiple IPMI system interfaces defined in the ACPI namespace. According to the IPMI specification: A BMC device may make available multiple system interfaces, but only one management controller is allowed to be 'active' BMC that provides BMC functionality for the system (in case of a 'partitioned' system, there can be only one active BMC per partition). Only the system interface(s) for the active BMC allowed to respond to the 'Get Device Id' command. According to the ipmi_si desigin: The ipmi_si registeration notifications can only happen after a successful "Get Device ID" command. Thus it should be OK for non-partitioned systems to do such selection. However, we do not have much knowledge on 'partitioned' systems. References: https://bugzilla.kernel.org/show_bug.cgi?id=46741 Signed-off-by: Lv Zheng <lv.zheng@intel.com> Reviewed-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-09-13 01:14:02 -04:00
if (driver_data.selected_smi == ipmi_device)
driver_data.selected_smi = NULL;
ACPI / IPMI: Fix race caused by the unprotected ACPI IPMI user This patch uses reference counting to fix the race caused by the unprotected ACPI IPMI user. There are two rules for using the ipmi_si APIs: 1. In ipmi_si, ipmi_destroy_user() can ensure that no ipmi_recv_msg will be passed to ipmi_msg_handler(), but ipmi_request_settime() can not use an invalid ipmi_user_t. This means the ipmi_si users must ensure that there won't be any local references on ipmi_user_t before invoking ipmi_destroy_user(). 2. In ipmi_si, the smi_gone()/new_smi() callbacks are protected by smi_watchers_mutex, so their execution is serialized. But as a new smi can re-use a freed intf_num, it requires that the callback implementation must not use intf_num as an identification mean or it must ensure all references to the previous smi are all dropped before exiting smi_gone() callback. As the acpi_ipmi_device->user_interface check in acpi_ipmi_space_handler() can happen before setting user_interface to NULL and codes after the check in acpi_ipmi_space_handler() can happen after user_interface becomes NULL, the on-going acpi_ipmi_space_handler() still can pass an invalid acpi_ipmi_device->user_interface to ipmi_request_settime(). Such race conditions are not allowed by the IPMI layer's API design as a crash will happen in ipmi_request_settime() if something like that happens. This patch follows the ipmi_devintf.c design: 1. Invoke ipmi_destroy_user() after the reference count of acpi_ipmi_device drops to 0. References of acpi_ipmi_device dropping to 0 also means tx_msg related to this acpi_ipmi_device are all freed. This matches the IPMI layer's API calling rule on ipmi_destroy_user() and ipmi_request_settime(). 2. ipmi_flush_tx_msg() is performed so that no on-going tx_msg can still be running in acpi_ipmi_space_handler(). And it is invoked after invoking __ipmi_dev_kill() where acpi_ipmi_device is deleted from the list with a "dead" flag set, and the "dead" flag check is also introduced to the point where a tx_msg is going to be added to the tx_msg_list so that no new tx_msg can be created after returning from the __ipmi_dev_kill(). 3. The waiting codes in ipmi_flush_tx_msg() is deleted because it is not required since this patch ensures no acpi_ipmi reference is still held for ipmi_user_t before calling ipmi_destroy_user() and ipmi_destroy_user() can ensure no more ipmi_msg_handler() can happen after returning from ipmi_destroy_user(). 4. The flushing of tx_msg is also moved out of ipmi_lock in this patch. The forthcoming IPMI operation region handler installation changes also requires acpi_ipmi_device be handled in this style. The header comment of the file is also updated due to this design change. Signed-off-by: Lv Zheng <lv.zheng@intel.com> Reviewed-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-09-13 01:13:54 -04:00
/*
* Always setting dead flag after deleting from the list or
* list_for_each_entry() codes must get changed.
*/
ipmi_device->dead = true;
}
ACPI / IPMI: Use global IPMI operation region handler It is found on a real machine, in its ACPI namespace, the IPMI OperationRegions (in the ACPI000D - ACPI power meter) are not defined under the IPMI system interface device (the IPI0001 with KCS type returned from _IFT control method): Device (PMI0) { Name (_HID, "ACPI000D") // _HID: Hardware ID OperationRegion (SYSI, IPMI, 0x0600, 0x0100) Field (SYSI, BufferAcc, Lock, Preserve) { AccessAs (BufferAcc, 0x01), Offset (0x58), SCMD, 8, GCMD, 8 } OperationRegion (POWR, IPMI, 0x3000, 0x0100) Field (POWR, BufferAcc, Lock, Preserve) { AccessAs (BufferAcc, 0x01), Offset (0xB3), GPMM, 8 } } Device (PCI0) { Device (ISA) { Device (NIPM) { Name (_HID, EisaId ("IPI0001")) // _HID: Hardware ID Method (_IFT, 0, NotSerialized) // _IFT: IPMI Interface Type { Return (0x01) } } } } Current ACPI_IPMI code registers IPMI operation region handler on a per-device basis, so for the above namespace the IPMI operation region handler is registered only under the scope of \_SB.PCI0.ISA.NIPM. Thus when an IPMI operation region field of \PMI0 is accessed, there are errors reported on such platform: ACPI Error: No handlers for Region [IPMI] ACPI Error: Region IPMI(7) has no handler The solution is to install an IPMI operation region handler from root node so that every object that defines IPMI OperationRegion can get an address space handler registered. When an IPMI operation region field is accessed, the Network Function (0x06 for SYSI and 0x30 for POWR) and the Command (SCMD, GCMD, GPMM) are passed to the operation region handler, there is no system interface specified by the BIOS. The patch tries to select one system interface by monitoring the system interface notification. IPMI messages passed from the ACPI codes are sent to this selected global IPMI system interface. The ACPI_IPMI will always select the first registered IPMI interface with an ACPI handle (i.e., defined in the ACPI namespace). It's hard to determine the selection when there are multiple IPMI system interfaces defined in the ACPI namespace. According to the IPMI specification: A BMC device may make available multiple system interfaces, but only one management controller is allowed to be 'active' BMC that provides BMC functionality for the system (in case of a 'partitioned' system, there can be only one active BMC per partition). Only the system interface(s) for the active BMC allowed to respond to the 'Get Device Id' command. According to the ipmi_si desigin: The ipmi_si registeration notifications can only happen after a successful "Get Device ID" command. Thus it should be OK for non-partitioned systems to do such selection. However, we do not have much knowledge on 'partitioned' systems. References: https://bugzilla.kernel.org/show_bug.cgi?id=46741 Signed-off-by: Lv Zheng <lv.zheng@intel.com> Reviewed-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-09-13 01:14:02 -04:00
static struct acpi_ipmi_device *acpi_ipmi_dev_get(void)
ACPI / IPMI: Fix race caused by the unprotected ACPI IPMI user This patch uses reference counting to fix the race caused by the unprotected ACPI IPMI user. There are two rules for using the ipmi_si APIs: 1. In ipmi_si, ipmi_destroy_user() can ensure that no ipmi_recv_msg will be passed to ipmi_msg_handler(), but ipmi_request_settime() can not use an invalid ipmi_user_t. This means the ipmi_si users must ensure that there won't be any local references on ipmi_user_t before invoking ipmi_destroy_user(). 2. In ipmi_si, the smi_gone()/new_smi() callbacks are protected by smi_watchers_mutex, so their execution is serialized. But as a new smi can re-use a freed intf_num, it requires that the callback implementation must not use intf_num as an identification mean or it must ensure all references to the previous smi are all dropped before exiting smi_gone() callback. As the acpi_ipmi_device->user_interface check in acpi_ipmi_space_handler() can happen before setting user_interface to NULL and codes after the check in acpi_ipmi_space_handler() can happen after user_interface becomes NULL, the on-going acpi_ipmi_space_handler() still can pass an invalid acpi_ipmi_device->user_interface to ipmi_request_settime(). Such race conditions are not allowed by the IPMI layer's API design as a crash will happen in ipmi_request_settime() if something like that happens. This patch follows the ipmi_devintf.c design: 1. Invoke ipmi_destroy_user() after the reference count of acpi_ipmi_device drops to 0. References of acpi_ipmi_device dropping to 0 also means tx_msg related to this acpi_ipmi_device are all freed. This matches the IPMI layer's API calling rule on ipmi_destroy_user() and ipmi_request_settime(). 2. ipmi_flush_tx_msg() is performed so that no on-going tx_msg can still be running in acpi_ipmi_space_handler(). And it is invoked after invoking __ipmi_dev_kill() where acpi_ipmi_device is deleted from the list with a "dead" flag set, and the "dead" flag check is also introduced to the point where a tx_msg is going to be added to the tx_msg_list so that no new tx_msg can be created after returning from the __ipmi_dev_kill(). 3. The waiting codes in ipmi_flush_tx_msg() is deleted because it is not required since this patch ensures no acpi_ipmi reference is still held for ipmi_user_t before calling ipmi_destroy_user() and ipmi_destroy_user() can ensure no more ipmi_msg_handler() can happen after returning from ipmi_destroy_user(). 4. The flushing of tx_msg is also moved out of ipmi_lock in this patch. The forthcoming IPMI operation region handler installation changes also requires acpi_ipmi_device be handled in this style. The header comment of the file is also updated due to this design change. Signed-off-by: Lv Zheng <lv.zheng@intel.com> Reviewed-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-09-13 01:13:54 -04:00
{
ACPI / IPMI: Use global IPMI operation region handler It is found on a real machine, in its ACPI namespace, the IPMI OperationRegions (in the ACPI000D - ACPI power meter) are not defined under the IPMI system interface device (the IPI0001 with KCS type returned from _IFT control method): Device (PMI0) { Name (_HID, "ACPI000D") // _HID: Hardware ID OperationRegion (SYSI, IPMI, 0x0600, 0x0100) Field (SYSI, BufferAcc, Lock, Preserve) { AccessAs (BufferAcc, 0x01), Offset (0x58), SCMD, 8, GCMD, 8 } OperationRegion (POWR, IPMI, 0x3000, 0x0100) Field (POWR, BufferAcc, Lock, Preserve) { AccessAs (BufferAcc, 0x01), Offset (0xB3), GPMM, 8 } } Device (PCI0) { Device (ISA) { Device (NIPM) { Name (_HID, EisaId ("IPI0001")) // _HID: Hardware ID Method (_IFT, 0, NotSerialized) // _IFT: IPMI Interface Type { Return (0x01) } } } } Current ACPI_IPMI code registers IPMI operation region handler on a per-device basis, so for the above namespace the IPMI operation region handler is registered only under the scope of \_SB.PCI0.ISA.NIPM. Thus when an IPMI operation region field of \PMI0 is accessed, there are errors reported on such platform: ACPI Error: No handlers for Region [IPMI] ACPI Error: Region IPMI(7) has no handler The solution is to install an IPMI operation region handler from root node so that every object that defines IPMI OperationRegion can get an address space handler registered. When an IPMI operation region field is accessed, the Network Function (0x06 for SYSI and 0x30 for POWR) and the Command (SCMD, GCMD, GPMM) are passed to the operation region handler, there is no system interface specified by the BIOS. The patch tries to select one system interface by monitoring the system interface notification. IPMI messages passed from the ACPI codes are sent to this selected global IPMI system interface. The ACPI_IPMI will always select the first registered IPMI interface with an ACPI handle (i.e., defined in the ACPI namespace). It's hard to determine the selection when there are multiple IPMI system interfaces defined in the ACPI namespace. According to the IPMI specification: A BMC device may make available multiple system interfaces, but only one management controller is allowed to be 'active' BMC that provides BMC functionality for the system (in case of a 'partitioned' system, there can be only one active BMC per partition). Only the system interface(s) for the active BMC allowed to respond to the 'Get Device Id' command. According to the ipmi_si desigin: The ipmi_si registeration notifications can only happen after a successful "Get Device ID" command. Thus it should be OK for non-partitioned systems to do such selection. However, we do not have much knowledge on 'partitioned' systems. References: https://bugzilla.kernel.org/show_bug.cgi?id=46741 Signed-off-by: Lv Zheng <lv.zheng@intel.com> Reviewed-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-09-13 01:14:02 -04:00
struct acpi_ipmi_device *ipmi_device = NULL;
ACPI / IPMI: Fix race caused by the unprotected ACPI IPMI user This patch uses reference counting to fix the race caused by the unprotected ACPI IPMI user. There are two rules for using the ipmi_si APIs: 1. In ipmi_si, ipmi_destroy_user() can ensure that no ipmi_recv_msg will be passed to ipmi_msg_handler(), but ipmi_request_settime() can not use an invalid ipmi_user_t. This means the ipmi_si users must ensure that there won't be any local references on ipmi_user_t before invoking ipmi_destroy_user(). 2. In ipmi_si, the smi_gone()/new_smi() callbacks are protected by smi_watchers_mutex, so their execution is serialized. But as a new smi can re-use a freed intf_num, it requires that the callback implementation must not use intf_num as an identification mean or it must ensure all references to the previous smi are all dropped before exiting smi_gone() callback. As the acpi_ipmi_device->user_interface check in acpi_ipmi_space_handler() can happen before setting user_interface to NULL and codes after the check in acpi_ipmi_space_handler() can happen after user_interface becomes NULL, the on-going acpi_ipmi_space_handler() still can pass an invalid acpi_ipmi_device->user_interface to ipmi_request_settime(). Such race conditions are not allowed by the IPMI layer's API design as a crash will happen in ipmi_request_settime() if something like that happens. This patch follows the ipmi_devintf.c design: 1. Invoke ipmi_destroy_user() after the reference count of acpi_ipmi_device drops to 0. References of acpi_ipmi_device dropping to 0 also means tx_msg related to this acpi_ipmi_device are all freed. This matches the IPMI layer's API calling rule on ipmi_destroy_user() and ipmi_request_settime(). 2. ipmi_flush_tx_msg() is performed so that no on-going tx_msg can still be running in acpi_ipmi_space_handler(). And it is invoked after invoking __ipmi_dev_kill() where acpi_ipmi_device is deleted from the list with a "dead" flag set, and the "dead" flag check is also introduced to the point where a tx_msg is going to be added to the tx_msg_list so that no new tx_msg can be created after returning from the __ipmi_dev_kill(). 3. The waiting codes in ipmi_flush_tx_msg() is deleted because it is not required since this patch ensures no acpi_ipmi reference is still held for ipmi_user_t before calling ipmi_destroy_user() and ipmi_destroy_user() can ensure no more ipmi_msg_handler() can happen after returning from ipmi_destroy_user(). 4. The flushing of tx_msg is also moved out of ipmi_lock in this patch. The forthcoming IPMI operation region handler installation changes also requires acpi_ipmi_device be handled in this style. The header comment of the file is also updated due to this design change. Signed-off-by: Lv Zheng <lv.zheng@intel.com> Reviewed-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-09-13 01:13:54 -04:00
mutex_lock(&driver_data.ipmi_lock);
ACPI / IPMI: Use global IPMI operation region handler It is found on a real machine, in its ACPI namespace, the IPMI OperationRegions (in the ACPI000D - ACPI power meter) are not defined under the IPMI system interface device (the IPI0001 with KCS type returned from _IFT control method): Device (PMI0) { Name (_HID, "ACPI000D") // _HID: Hardware ID OperationRegion (SYSI, IPMI, 0x0600, 0x0100) Field (SYSI, BufferAcc, Lock, Preserve) { AccessAs (BufferAcc, 0x01), Offset (0x58), SCMD, 8, GCMD, 8 } OperationRegion (POWR, IPMI, 0x3000, 0x0100) Field (POWR, BufferAcc, Lock, Preserve) { AccessAs (BufferAcc, 0x01), Offset (0xB3), GPMM, 8 } } Device (PCI0) { Device (ISA) { Device (NIPM) { Name (_HID, EisaId ("IPI0001")) // _HID: Hardware ID Method (_IFT, 0, NotSerialized) // _IFT: IPMI Interface Type { Return (0x01) } } } } Current ACPI_IPMI code registers IPMI operation region handler on a per-device basis, so for the above namespace the IPMI operation region handler is registered only under the scope of \_SB.PCI0.ISA.NIPM. Thus when an IPMI operation region field of \PMI0 is accessed, there are errors reported on such platform: ACPI Error: No handlers for Region [IPMI] ACPI Error: Region IPMI(7) has no handler The solution is to install an IPMI operation region handler from root node so that every object that defines IPMI OperationRegion can get an address space handler registered. When an IPMI operation region field is accessed, the Network Function (0x06 for SYSI and 0x30 for POWR) and the Command (SCMD, GCMD, GPMM) are passed to the operation region handler, there is no system interface specified by the BIOS. The patch tries to select one system interface by monitoring the system interface notification. IPMI messages passed from the ACPI codes are sent to this selected global IPMI system interface. The ACPI_IPMI will always select the first registered IPMI interface with an ACPI handle (i.e., defined in the ACPI namespace). It's hard to determine the selection when there are multiple IPMI system interfaces defined in the ACPI namespace. According to the IPMI specification: A BMC device may make available multiple system interfaces, but only one management controller is allowed to be 'active' BMC that provides BMC functionality for the system (in case of a 'partitioned' system, there can be only one active BMC per partition). Only the system interface(s) for the active BMC allowed to respond to the 'Get Device Id' command. According to the ipmi_si desigin: The ipmi_si registeration notifications can only happen after a successful "Get Device ID" command. Thus it should be OK for non-partitioned systems to do such selection. However, we do not have much knowledge on 'partitioned' systems. References: https://bugzilla.kernel.org/show_bug.cgi?id=46741 Signed-off-by: Lv Zheng <lv.zheng@intel.com> Reviewed-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-09-13 01:14:02 -04:00
if (driver_data.selected_smi) {
ipmi_device = driver_data.selected_smi;
kref_get(&ipmi_device->kref);
ACPI / IPMI: Fix race caused by the unprotected ACPI IPMI user This patch uses reference counting to fix the race caused by the unprotected ACPI IPMI user. There are two rules for using the ipmi_si APIs: 1. In ipmi_si, ipmi_destroy_user() can ensure that no ipmi_recv_msg will be passed to ipmi_msg_handler(), but ipmi_request_settime() can not use an invalid ipmi_user_t. This means the ipmi_si users must ensure that there won't be any local references on ipmi_user_t before invoking ipmi_destroy_user(). 2. In ipmi_si, the smi_gone()/new_smi() callbacks are protected by smi_watchers_mutex, so their execution is serialized. But as a new smi can re-use a freed intf_num, it requires that the callback implementation must not use intf_num as an identification mean or it must ensure all references to the previous smi are all dropped before exiting smi_gone() callback. As the acpi_ipmi_device->user_interface check in acpi_ipmi_space_handler() can happen before setting user_interface to NULL and codes after the check in acpi_ipmi_space_handler() can happen after user_interface becomes NULL, the on-going acpi_ipmi_space_handler() still can pass an invalid acpi_ipmi_device->user_interface to ipmi_request_settime(). Such race conditions are not allowed by the IPMI layer's API design as a crash will happen in ipmi_request_settime() if something like that happens. This patch follows the ipmi_devintf.c design: 1. Invoke ipmi_destroy_user() after the reference count of acpi_ipmi_device drops to 0. References of acpi_ipmi_device dropping to 0 also means tx_msg related to this acpi_ipmi_device are all freed. This matches the IPMI layer's API calling rule on ipmi_destroy_user() and ipmi_request_settime(). 2. ipmi_flush_tx_msg() is performed so that no on-going tx_msg can still be running in acpi_ipmi_space_handler(). And it is invoked after invoking __ipmi_dev_kill() where acpi_ipmi_device is deleted from the list with a "dead" flag set, and the "dead" flag check is also introduced to the point where a tx_msg is going to be added to the tx_msg_list so that no new tx_msg can be created after returning from the __ipmi_dev_kill(). 3. The waiting codes in ipmi_flush_tx_msg() is deleted because it is not required since this patch ensures no acpi_ipmi reference is still held for ipmi_user_t before calling ipmi_destroy_user() and ipmi_destroy_user() can ensure no more ipmi_msg_handler() can happen after returning from ipmi_destroy_user(). 4. The flushing of tx_msg is also moved out of ipmi_lock in this patch. The forthcoming IPMI operation region handler installation changes also requires acpi_ipmi_device be handled in this style. The header comment of the file is also updated due to this design change. Signed-off-by: Lv Zheng <lv.zheng@intel.com> Reviewed-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-09-13 01:13:54 -04:00
}
mutex_unlock(&driver_data.ipmi_lock);
return ipmi_device;
}
static void acpi_ipmi_dev_put(struct acpi_ipmi_device *ipmi_device)
{
kref_put(&ipmi_device->kref, ipmi_dev_release_kref);
}
ACPI / IPMI: Add reference counting for ACPI IPMI transfers This patch adds reference counting for ACPI IPMI transfers to tune the locking granularity of tx_msg_lock. This patch also makes the whole acpi_ipmi module's coding style consistent by using reference counting for all its objects (i.e., acpi_ipmi_device and acpi_ipmi_msg). The acpi_ipmi_msg handling is re-designed using referece counting. 1. tx_msg is always unlinked before complete(), so that it is safe to put complete() out side of tx_msg_lock. 2. tx_msg reference counters are incremented before calling ipmi_request_settime() and tx_msg_lock protection is added to ipmi_cancel_tx_msg() so that a complete() can be safely called in parellel with tx_msg unlinking in failure cases. 3. tx_msg holds a reference to acpi_ipmi_device so that it can be flushed and freed in the contexts other than acpi_ipmi_space_handler(). The lockdep_chains shows all acpi_ipmi locks are leaf locks after the tuning: 1. ipmi_lock is always leaf: irq_context: 0 [ffffffff81a943f8] smi_watchers_mutex [ffffffffa06eca60] driver_data.ipmi_lock irq_context: 0 [ffffffff82767b40] &buffer->mutex [ffffffffa00a6678] s_active#103 [ffffffffa06eca60] driver_data.ipmi_lock 2. without this patch applied, lock used by complete() is held after holding tx_msg_lock: irq_context: 0 [ffffffff82767b40] &buffer->mutex [ffffffffa00a6678] s_active#103 [ffffffffa06ecce8] &(&ipmi_device->tx_msg_lock)->rlock irq_context: 1 [ffffffffa06ecce8] &(&ipmi_device->tx_msg_lock)->rlock irq_context: 1 [ffffffffa06ecce8] &(&ipmi_device->tx_msg_lock)->rlock [ffffffffa06eccf0] &x->wait#25 irq_context: 1 [ffffffffa06ecce8] &(&ipmi_device->tx_msg_lock)->rlock [ffffffffa06eccf0] &x->wait#25 [ffffffff81e36620] &p->pi_lock irq_context: 1 [ffffffffa06ecce8] &(&ipmi_device->tx_msg_lock)->rlock [ffffffffa06eccf0] &x->wait#25 [ffffffff81e36620] &p->pi_lock [ffffffff81e5d0a8] &rq->lock 3. with this patch applied, tx_msg_lock is always leaf: irq_context: 0 [ffffffff82767b40] &buffer->mutex [ffffffffa00a66d8] s_active#107 [ffffffffa07ecdc8] &(&ipmi_device->tx_msg_lock)->rlock irq_context: 1 [ffffffffa07ecdc8] &(&ipmi_device->tx_msg_lock)->rlock Signed-off-by: Lv Zheng <lv.zheng@intel.com> Reviewed-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-09-13 01:14:11 -04:00
static struct acpi_ipmi_msg *ipmi_msg_alloc(void)
{
ACPI / IPMI: Add reference counting for ACPI IPMI transfers This patch adds reference counting for ACPI IPMI transfers to tune the locking granularity of tx_msg_lock. This patch also makes the whole acpi_ipmi module's coding style consistent by using reference counting for all its objects (i.e., acpi_ipmi_device and acpi_ipmi_msg). The acpi_ipmi_msg handling is re-designed using referece counting. 1. tx_msg is always unlinked before complete(), so that it is safe to put complete() out side of tx_msg_lock. 2. tx_msg reference counters are incremented before calling ipmi_request_settime() and tx_msg_lock protection is added to ipmi_cancel_tx_msg() so that a complete() can be safely called in parellel with tx_msg unlinking in failure cases. 3. tx_msg holds a reference to acpi_ipmi_device so that it can be flushed and freed in the contexts other than acpi_ipmi_space_handler(). The lockdep_chains shows all acpi_ipmi locks are leaf locks after the tuning: 1. ipmi_lock is always leaf: irq_context: 0 [ffffffff81a943f8] smi_watchers_mutex [ffffffffa06eca60] driver_data.ipmi_lock irq_context: 0 [ffffffff82767b40] &buffer->mutex [ffffffffa00a6678] s_active#103 [ffffffffa06eca60] driver_data.ipmi_lock 2. without this patch applied, lock used by complete() is held after holding tx_msg_lock: irq_context: 0 [ffffffff82767b40] &buffer->mutex [ffffffffa00a6678] s_active#103 [ffffffffa06ecce8] &(&ipmi_device->tx_msg_lock)->rlock irq_context: 1 [ffffffffa06ecce8] &(&ipmi_device->tx_msg_lock)->rlock irq_context: 1 [ffffffffa06ecce8] &(&ipmi_device->tx_msg_lock)->rlock [ffffffffa06eccf0] &x->wait#25 irq_context: 1 [ffffffffa06ecce8] &(&ipmi_device->tx_msg_lock)->rlock [ffffffffa06eccf0] &x->wait#25 [ffffffff81e36620] &p->pi_lock irq_context: 1 [ffffffffa06ecce8] &(&ipmi_device->tx_msg_lock)->rlock [ffffffffa06eccf0] &x->wait#25 [ffffffff81e36620] &p->pi_lock [ffffffff81e5d0a8] &rq->lock 3. with this patch applied, tx_msg_lock is always leaf: irq_context: 0 [ffffffff82767b40] &buffer->mutex [ffffffffa00a66d8] s_active#107 [ffffffffa07ecdc8] &(&ipmi_device->tx_msg_lock)->rlock irq_context: 1 [ffffffffa07ecdc8] &(&ipmi_device->tx_msg_lock)->rlock Signed-off-by: Lv Zheng <lv.zheng@intel.com> Reviewed-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-09-13 01:14:11 -04:00
struct acpi_ipmi_device *ipmi;
struct acpi_ipmi_msg *ipmi_msg;
ACPI / IPMI: Add reference counting for ACPI IPMI transfers This patch adds reference counting for ACPI IPMI transfers to tune the locking granularity of tx_msg_lock. This patch also makes the whole acpi_ipmi module's coding style consistent by using reference counting for all its objects (i.e., acpi_ipmi_device and acpi_ipmi_msg). The acpi_ipmi_msg handling is re-designed using referece counting. 1. tx_msg is always unlinked before complete(), so that it is safe to put complete() out side of tx_msg_lock. 2. tx_msg reference counters are incremented before calling ipmi_request_settime() and tx_msg_lock protection is added to ipmi_cancel_tx_msg() so that a complete() can be safely called in parellel with tx_msg unlinking in failure cases. 3. tx_msg holds a reference to acpi_ipmi_device so that it can be flushed and freed in the contexts other than acpi_ipmi_space_handler(). The lockdep_chains shows all acpi_ipmi locks are leaf locks after the tuning: 1. ipmi_lock is always leaf: irq_context: 0 [ffffffff81a943f8] smi_watchers_mutex [ffffffffa06eca60] driver_data.ipmi_lock irq_context: 0 [ffffffff82767b40] &buffer->mutex [ffffffffa00a6678] s_active#103 [ffffffffa06eca60] driver_data.ipmi_lock 2. without this patch applied, lock used by complete() is held after holding tx_msg_lock: irq_context: 0 [ffffffff82767b40] &buffer->mutex [ffffffffa00a6678] s_active#103 [ffffffffa06ecce8] &(&ipmi_device->tx_msg_lock)->rlock irq_context: 1 [ffffffffa06ecce8] &(&ipmi_device->tx_msg_lock)->rlock irq_context: 1 [ffffffffa06ecce8] &(&ipmi_device->tx_msg_lock)->rlock [ffffffffa06eccf0] &x->wait#25 irq_context: 1 [ffffffffa06ecce8] &(&ipmi_device->tx_msg_lock)->rlock [ffffffffa06eccf0] &x->wait#25 [ffffffff81e36620] &p->pi_lock irq_context: 1 [ffffffffa06ecce8] &(&ipmi_device->tx_msg_lock)->rlock [ffffffffa06eccf0] &x->wait#25 [ffffffff81e36620] &p->pi_lock [ffffffff81e5d0a8] &rq->lock 3. with this patch applied, tx_msg_lock is always leaf: irq_context: 0 [ffffffff82767b40] &buffer->mutex [ffffffffa00a66d8] s_active#107 [ffffffffa07ecdc8] &(&ipmi_device->tx_msg_lock)->rlock irq_context: 1 [ffffffffa07ecdc8] &(&ipmi_device->tx_msg_lock)->rlock Signed-off-by: Lv Zheng <lv.zheng@intel.com> Reviewed-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-09-13 01:14:11 -04:00
ipmi = acpi_ipmi_dev_get();
if (!ipmi)
return NULL;
ipmi_msg = kzalloc(sizeof(struct acpi_ipmi_msg), GFP_KERNEL);
ACPI / IPMI: Add reference counting for ACPI IPMI transfers This patch adds reference counting for ACPI IPMI transfers to tune the locking granularity of tx_msg_lock. This patch also makes the whole acpi_ipmi module's coding style consistent by using reference counting for all its objects (i.e., acpi_ipmi_device and acpi_ipmi_msg). The acpi_ipmi_msg handling is re-designed using referece counting. 1. tx_msg is always unlinked before complete(), so that it is safe to put complete() out side of tx_msg_lock. 2. tx_msg reference counters are incremented before calling ipmi_request_settime() and tx_msg_lock protection is added to ipmi_cancel_tx_msg() so that a complete() can be safely called in parellel with tx_msg unlinking in failure cases. 3. tx_msg holds a reference to acpi_ipmi_device so that it can be flushed and freed in the contexts other than acpi_ipmi_space_handler(). The lockdep_chains shows all acpi_ipmi locks are leaf locks after the tuning: 1. ipmi_lock is always leaf: irq_context: 0 [ffffffff81a943f8] smi_watchers_mutex [ffffffffa06eca60] driver_data.ipmi_lock irq_context: 0 [ffffffff82767b40] &buffer->mutex [ffffffffa00a6678] s_active#103 [ffffffffa06eca60] driver_data.ipmi_lock 2. without this patch applied, lock used by complete() is held after holding tx_msg_lock: irq_context: 0 [ffffffff82767b40] &buffer->mutex [ffffffffa00a6678] s_active#103 [ffffffffa06ecce8] &(&ipmi_device->tx_msg_lock)->rlock irq_context: 1 [ffffffffa06ecce8] &(&ipmi_device->tx_msg_lock)->rlock irq_context: 1 [ffffffffa06ecce8] &(&ipmi_device->tx_msg_lock)->rlock [ffffffffa06eccf0] &x->wait#25 irq_context: 1 [ffffffffa06ecce8] &(&ipmi_device->tx_msg_lock)->rlock [ffffffffa06eccf0] &x->wait#25 [ffffffff81e36620] &p->pi_lock irq_context: 1 [ffffffffa06ecce8] &(&ipmi_device->tx_msg_lock)->rlock [ffffffffa06eccf0] &x->wait#25 [ffffffff81e36620] &p->pi_lock [ffffffff81e5d0a8] &rq->lock 3. with this patch applied, tx_msg_lock is always leaf: irq_context: 0 [ffffffff82767b40] &buffer->mutex [ffffffffa00a66d8] s_active#107 [ffffffffa07ecdc8] &(&ipmi_device->tx_msg_lock)->rlock irq_context: 1 [ffffffffa07ecdc8] &(&ipmi_device->tx_msg_lock)->rlock Signed-off-by: Lv Zheng <lv.zheng@intel.com> Reviewed-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-09-13 01:14:11 -04:00
if (!ipmi_msg) {
acpi_ipmi_dev_put(ipmi);
return NULL;
}
ACPI / IPMI: Add reference counting for ACPI IPMI transfers This patch adds reference counting for ACPI IPMI transfers to tune the locking granularity of tx_msg_lock. This patch also makes the whole acpi_ipmi module's coding style consistent by using reference counting for all its objects (i.e., acpi_ipmi_device and acpi_ipmi_msg). The acpi_ipmi_msg handling is re-designed using referece counting. 1. tx_msg is always unlinked before complete(), so that it is safe to put complete() out side of tx_msg_lock. 2. tx_msg reference counters are incremented before calling ipmi_request_settime() and tx_msg_lock protection is added to ipmi_cancel_tx_msg() so that a complete() can be safely called in parellel with tx_msg unlinking in failure cases. 3. tx_msg holds a reference to acpi_ipmi_device so that it can be flushed and freed in the contexts other than acpi_ipmi_space_handler(). The lockdep_chains shows all acpi_ipmi locks are leaf locks after the tuning: 1. ipmi_lock is always leaf: irq_context: 0 [ffffffff81a943f8] smi_watchers_mutex [ffffffffa06eca60] driver_data.ipmi_lock irq_context: 0 [ffffffff82767b40] &buffer->mutex [ffffffffa00a6678] s_active#103 [ffffffffa06eca60] driver_data.ipmi_lock 2. without this patch applied, lock used by complete() is held after holding tx_msg_lock: irq_context: 0 [ffffffff82767b40] &buffer->mutex [ffffffffa00a6678] s_active#103 [ffffffffa06ecce8] &(&ipmi_device->tx_msg_lock)->rlock irq_context: 1 [ffffffffa06ecce8] &(&ipmi_device->tx_msg_lock)->rlock irq_context: 1 [ffffffffa06ecce8] &(&ipmi_device->tx_msg_lock)->rlock [ffffffffa06eccf0] &x->wait#25 irq_context: 1 [ffffffffa06ecce8] &(&ipmi_device->tx_msg_lock)->rlock [ffffffffa06eccf0] &x->wait#25 [ffffffff81e36620] &p->pi_lock irq_context: 1 [ffffffffa06ecce8] &(&ipmi_device->tx_msg_lock)->rlock [ffffffffa06eccf0] &x->wait#25 [ffffffff81e36620] &p->pi_lock [ffffffff81e5d0a8] &rq->lock 3. with this patch applied, tx_msg_lock is always leaf: irq_context: 0 [ffffffff82767b40] &buffer->mutex [ffffffffa00a66d8] s_active#107 [ffffffffa07ecdc8] &(&ipmi_device->tx_msg_lock)->rlock irq_context: 1 [ffffffffa07ecdc8] &(&ipmi_device->tx_msg_lock)->rlock Signed-off-by: Lv Zheng <lv.zheng@intel.com> Reviewed-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-09-13 01:14:11 -04:00
kref_init(&ipmi_msg->kref);
init_completion(&ipmi_msg->tx_complete);
INIT_LIST_HEAD(&ipmi_msg->head);
ipmi_msg->device = ipmi;
ipmi_msg->msg_done = ACPI_IPMI_UNKNOWN;
return ipmi_msg;
}
ACPI / IPMI: Add reference counting for ACPI IPMI transfers This patch adds reference counting for ACPI IPMI transfers to tune the locking granularity of tx_msg_lock. This patch also makes the whole acpi_ipmi module's coding style consistent by using reference counting for all its objects (i.e., acpi_ipmi_device and acpi_ipmi_msg). The acpi_ipmi_msg handling is re-designed using referece counting. 1. tx_msg is always unlinked before complete(), so that it is safe to put complete() out side of tx_msg_lock. 2. tx_msg reference counters are incremented before calling ipmi_request_settime() and tx_msg_lock protection is added to ipmi_cancel_tx_msg() so that a complete() can be safely called in parellel with tx_msg unlinking in failure cases. 3. tx_msg holds a reference to acpi_ipmi_device so that it can be flushed and freed in the contexts other than acpi_ipmi_space_handler(). The lockdep_chains shows all acpi_ipmi locks are leaf locks after the tuning: 1. ipmi_lock is always leaf: irq_context: 0 [ffffffff81a943f8] smi_watchers_mutex [ffffffffa06eca60] driver_data.ipmi_lock irq_context: 0 [ffffffff82767b40] &buffer->mutex [ffffffffa00a6678] s_active#103 [ffffffffa06eca60] driver_data.ipmi_lock 2. without this patch applied, lock used by complete() is held after holding tx_msg_lock: irq_context: 0 [ffffffff82767b40] &buffer->mutex [ffffffffa00a6678] s_active#103 [ffffffffa06ecce8] &(&ipmi_device->tx_msg_lock)->rlock irq_context: 1 [ffffffffa06ecce8] &(&ipmi_device->tx_msg_lock)->rlock irq_context: 1 [ffffffffa06ecce8] &(&ipmi_device->tx_msg_lock)->rlock [ffffffffa06eccf0] &x->wait#25 irq_context: 1 [ffffffffa06ecce8] &(&ipmi_device->tx_msg_lock)->rlock [ffffffffa06eccf0] &x->wait#25 [ffffffff81e36620] &p->pi_lock irq_context: 1 [ffffffffa06ecce8] &(&ipmi_device->tx_msg_lock)->rlock [ffffffffa06eccf0] &x->wait#25 [ffffffff81e36620] &p->pi_lock [ffffffff81e5d0a8] &rq->lock 3. with this patch applied, tx_msg_lock is always leaf: irq_context: 0 [ffffffff82767b40] &buffer->mutex [ffffffffa00a66d8] s_active#107 [ffffffffa07ecdc8] &(&ipmi_device->tx_msg_lock)->rlock irq_context: 1 [ffffffffa07ecdc8] &(&ipmi_device->tx_msg_lock)->rlock Signed-off-by: Lv Zheng <lv.zheng@intel.com> Reviewed-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-09-13 01:14:11 -04:00
static void ipmi_msg_release(struct acpi_ipmi_msg *tx_msg)
{
acpi_ipmi_dev_put(tx_msg->device);
kfree(tx_msg);
}
static void ipmi_msg_release_kref(struct kref *kref)
{
struct acpi_ipmi_msg *tx_msg =
container_of(kref, struct acpi_ipmi_msg, kref);
ipmi_msg_release(tx_msg);
}
static struct acpi_ipmi_msg *acpi_ipmi_msg_get(struct acpi_ipmi_msg *tx_msg)
{
kref_get(&tx_msg->kref);
return tx_msg;
}
static void acpi_ipmi_msg_put(struct acpi_ipmi_msg *tx_msg)
{
kref_put(&tx_msg->kref, ipmi_msg_release_kref);
}
#define IPMI_OP_RGN_NETFN(offset) ((offset >> 8) & 0xff)
#define IPMI_OP_RGN_CMD(offset) (offset & 0xff)
static int acpi_format_ipmi_request(struct acpi_ipmi_msg *tx_msg,
acpi_physical_address address,
acpi_integer *value)
{
struct kernel_ipmi_msg *msg;
struct acpi_ipmi_buffer *buffer;
struct acpi_ipmi_device *device;
ACPI / IPMI: Fix atomic context requirement of ipmi_msg_handler() This patch fixes the issues indicated by the test results that ipmi_msg_handler() is invoked in atomic context. BUG: scheduling while atomic: kipmi0/18933/0x10000100 Modules linked in: ipmi_si acpi_ipmi ... CPU: 3 PID: 18933 Comm: kipmi0 Tainted: G AW 3.10.0-rc7+ #2 Hardware name: QCI QSSC-S4R/QSSC-S4R, BIOS QSSC-S4R.QCI.01.00.0027.070120100606 07/01/2010 ffff8838245eea00 ffff88103fc63c98 ffffffff814c4a1e ffff88103fc63ca8 ffffffff814bfbab ffff88103fc63d28 ffffffff814c73e0 ffff88103933cbd4 0000000000000096 ffff88103fc63ce8 ffff88102f618000 ffff881035c01fd8 Call Trace: <IRQ> [<ffffffff814c4a1e>] dump_stack+0x19/0x1b [<ffffffff814bfbab>] __schedule_bug+0x46/0x54 [<ffffffff814c73e0>] __schedule+0x83/0x59c [<ffffffff81058853>] __cond_resched+0x22/0x2d [<ffffffff814c794b>] _cond_resched+0x14/0x1d [<ffffffff814c6d82>] mutex_lock+0x11/0x32 [<ffffffff8101e1e9>] ? __default_send_IPI_dest_field.constprop.0+0x53/0x58 [<ffffffffa09e3f9c>] ipmi_msg_handler+0x23/0x166 [ipmi_si] [<ffffffff812bf6e4>] deliver_response+0x55/0x5a [<ffffffff812c0fd4>] handle_new_recv_msgs+0xb67/0xc65 [<ffffffff81007ad1>] ? read_tsc+0x9/0x19 [<ffffffff814c8620>] ? _raw_spin_lock_irq+0xa/0xc [<ffffffffa09e1128>] ipmi_thread+0x5c/0x146 [ipmi_si] ... Also Tony Camuso says: We were getting occasional "Scheduling while atomic" call traces during boot on some systems. Problem was first seen on a Cisco C210 but we were able to reproduce it on a Cisco c220m3. Setting CONFIG_LOCKDEP and LOCKDEP_SUPPORT to 'y' exposed a lockdep around tx_msg_lock in acpi_ipmi.c struct acpi_ipmi_device. ================================= [ INFO: inconsistent lock state ] 2.6.32-415.el6.x86_64-debug-splck #1 --------------------------------- inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage. ksoftirqd/3/17 [HC0[0]:SC1[1]:HE1:SE0] takes: (&ipmi_device->tx_msg_lock){+.?...}, at: [<ffffffff81337a27>] ipmi_msg_handler+0x71/0x126 {SOFTIRQ-ON-W} state was registered at: [<ffffffff810ba11c>] __lock_acquire+0x63c/0x1570 [<ffffffff810bb0f4>] lock_acquire+0xa4/0x120 [<ffffffff815581cc>] __mutex_lock_common+0x4c/0x400 [<ffffffff815586ea>] mutex_lock_nested+0x4a/0x60 [<ffffffff8133789d>] acpi_ipmi_space_handler+0x11b/0x234 [<ffffffff81321c62>] acpi_ev_address_space_dispatch+0x170/0x1be The fix implemented by this change has been tested by Tony: Tested the patch in a boot loop with lockdep debug enabled and never saw the problem in over 400 reboots. Reported-and-tested-by: Tony Camuso <tcamuso@redhat.com> Signed-off-by: Lv Zheng <lv.zheng@intel.com> Reviewed-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-09-13 01:13:23 -04:00
unsigned long flags;
msg = &tx_msg->tx_message;
/*
* IPMI network function and command are encoded in the address
* within the IPMI OpRegion; see ACPI 4.0, sec 5.5.2.4.3.
*/
msg->netfn = IPMI_OP_RGN_NETFN(address);
msg->cmd = IPMI_OP_RGN_CMD(address);
msg->data = tx_msg->data;
/*
* value is the parameter passed by the IPMI opregion space handler.
* It points to the IPMI request message buffer
*/
buffer = (struct acpi_ipmi_buffer *)value;
/* copy the tx message data */
if (buffer->length > ACPI_IPMI_MAX_MSG_LENGTH) {
dev_WARN_ONCE(tx_msg->device->dev, true,
"Unexpected request (msg len %d).\n",
buffer->length);
return -EINVAL;
}
msg->data_len = buffer->length;
memcpy(tx_msg->data, buffer->data, msg->data_len);
/*
* now the default type is SYSTEM_INTERFACE and channel type is BMC.
* If the netfn is APP_REQUEST and the cmd is SEND_MESSAGE,
* the addr type should be changed to IPMB. Then we will have to parse
* the IPMI request message buffer to get the IPMB address.
* If so, please fix me.
*/
tx_msg->addr.addr_type = IPMI_SYSTEM_INTERFACE_ADDR_TYPE;
tx_msg->addr.channel = IPMI_BMC_CHANNEL;
tx_msg->addr.data[0] = 0;
/* Get the msgid */
device = tx_msg->device;
ACPI / IPMI: Fix atomic context requirement of ipmi_msg_handler() This patch fixes the issues indicated by the test results that ipmi_msg_handler() is invoked in atomic context. BUG: scheduling while atomic: kipmi0/18933/0x10000100 Modules linked in: ipmi_si acpi_ipmi ... CPU: 3 PID: 18933 Comm: kipmi0 Tainted: G AW 3.10.0-rc7+ #2 Hardware name: QCI QSSC-S4R/QSSC-S4R, BIOS QSSC-S4R.QCI.01.00.0027.070120100606 07/01/2010 ffff8838245eea00 ffff88103fc63c98 ffffffff814c4a1e ffff88103fc63ca8 ffffffff814bfbab ffff88103fc63d28 ffffffff814c73e0 ffff88103933cbd4 0000000000000096 ffff88103fc63ce8 ffff88102f618000 ffff881035c01fd8 Call Trace: <IRQ> [<ffffffff814c4a1e>] dump_stack+0x19/0x1b [<ffffffff814bfbab>] __schedule_bug+0x46/0x54 [<ffffffff814c73e0>] __schedule+0x83/0x59c [<ffffffff81058853>] __cond_resched+0x22/0x2d [<ffffffff814c794b>] _cond_resched+0x14/0x1d [<ffffffff814c6d82>] mutex_lock+0x11/0x32 [<ffffffff8101e1e9>] ? __default_send_IPI_dest_field.constprop.0+0x53/0x58 [<ffffffffa09e3f9c>] ipmi_msg_handler+0x23/0x166 [ipmi_si] [<ffffffff812bf6e4>] deliver_response+0x55/0x5a [<ffffffff812c0fd4>] handle_new_recv_msgs+0xb67/0xc65 [<ffffffff81007ad1>] ? read_tsc+0x9/0x19 [<ffffffff814c8620>] ? _raw_spin_lock_irq+0xa/0xc [<ffffffffa09e1128>] ipmi_thread+0x5c/0x146 [ipmi_si] ... Also Tony Camuso says: We were getting occasional "Scheduling while atomic" call traces during boot on some systems. Problem was first seen on a Cisco C210 but we were able to reproduce it on a Cisco c220m3. Setting CONFIG_LOCKDEP and LOCKDEP_SUPPORT to 'y' exposed a lockdep around tx_msg_lock in acpi_ipmi.c struct acpi_ipmi_device. ================================= [ INFO: inconsistent lock state ] 2.6.32-415.el6.x86_64-debug-splck #1 --------------------------------- inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage. ksoftirqd/3/17 [HC0[0]:SC1[1]:HE1:SE0] takes: (&ipmi_device->tx_msg_lock){+.?...}, at: [<ffffffff81337a27>] ipmi_msg_handler+0x71/0x126 {SOFTIRQ-ON-W} state was registered at: [<ffffffff810ba11c>] __lock_acquire+0x63c/0x1570 [<ffffffff810bb0f4>] lock_acquire+0xa4/0x120 [<ffffffff815581cc>] __mutex_lock_common+0x4c/0x400 [<ffffffff815586ea>] mutex_lock_nested+0x4a/0x60 [<ffffffff8133789d>] acpi_ipmi_space_handler+0x11b/0x234 [<ffffffff81321c62>] acpi_ev_address_space_dispatch+0x170/0x1be The fix implemented by this change has been tested by Tony: Tested the patch in a boot loop with lockdep debug enabled and never saw the problem in over 400 reboots. Reported-and-tested-by: Tony Camuso <tcamuso@redhat.com> Signed-off-by: Lv Zheng <lv.zheng@intel.com> Reviewed-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-09-13 01:13:23 -04:00
spin_lock_irqsave(&device->tx_msg_lock, flags);
device->curr_msgid++;
tx_msg->tx_msgid = device->curr_msgid;
ACPI / IPMI: Fix atomic context requirement of ipmi_msg_handler() This patch fixes the issues indicated by the test results that ipmi_msg_handler() is invoked in atomic context. BUG: scheduling while atomic: kipmi0/18933/0x10000100 Modules linked in: ipmi_si acpi_ipmi ... CPU: 3 PID: 18933 Comm: kipmi0 Tainted: G AW 3.10.0-rc7+ #2 Hardware name: QCI QSSC-S4R/QSSC-S4R, BIOS QSSC-S4R.QCI.01.00.0027.070120100606 07/01/2010 ffff8838245eea00 ffff88103fc63c98 ffffffff814c4a1e ffff88103fc63ca8 ffffffff814bfbab ffff88103fc63d28 ffffffff814c73e0 ffff88103933cbd4 0000000000000096 ffff88103fc63ce8 ffff88102f618000 ffff881035c01fd8 Call Trace: <IRQ> [<ffffffff814c4a1e>] dump_stack+0x19/0x1b [<ffffffff814bfbab>] __schedule_bug+0x46/0x54 [<ffffffff814c73e0>] __schedule+0x83/0x59c [<ffffffff81058853>] __cond_resched+0x22/0x2d [<ffffffff814c794b>] _cond_resched+0x14/0x1d [<ffffffff814c6d82>] mutex_lock+0x11/0x32 [<ffffffff8101e1e9>] ? __default_send_IPI_dest_field.constprop.0+0x53/0x58 [<ffffffffa09e3f9c>] ipmi_msg_handler+0x23/0x166 [ipmi_si] [<ffffffff812bf6e4>] deliver_response+0x55/0x5a [<ffffffff812c0fd4>] handle_new_recv_msgs+0xb67/0xc65 [<ffffffff81007ad1>] ? read_tsc+0x9/0x19 [<ffffffff814c8620>] ? _raw_spin_lock_irq+0xa/0xc [<ffffffffa09e1128>] ipmi_thread+0x5c/0x146 [ipmi_si] ... Also Tony Camuso says: We were getting occasional "Scheduling while atomic" call traces during boot on some systems. Problem was first seen on a Cisco C210 but we were able to reproduce it on a Cisco c220m3. Setting CONFIG_LOCKDEP and LOCKDEP_SUPPORT to 'y' exposed a lockdep around tx_msg_lock in acpi_ipmi.c struct acpi_ipmi_device. ================================= [ INFO: inconsistent lock state ] 2.6.32-415.el6.x86_64-debug-splck #1 --------------------------------- inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage. ksoftirqd/3/17 [HC0[0]:SC1[1]:HE1:SE0] takes: (&ipmi_device->tx_msg_lock){+.?...}, at: [<ffffffff81337a27>] ipmi_msg_handler+0x71/0x126 {SOFTIRQ-ON-W} state was registered at: [<ffffffff810ba11c>] __lock_acquire+0x63c/0x1570 [<ffffffff810bb0f4>] lock_acquire+0xa4/0x120 [<ffffffff815581cc>] __mutex_lock_common+0x4c/0x400 [<ffffffff815586ea>] mutex_lock_nested+0x4a/0x60 [<ffffffff8133789d>] acpi_ipmi_space_handler+0x11b/0x234 [<ffffffff81321c62>] acpi_ev_address_space_dispatch+0x170/0x1be The fix implemented by this change has been tested by Tony: Tested the patch in a boot loop with lockdep debug enabled and never saw the problem in over 400 reboots. Reported-and-tested-by: Tony Camuso <tcamuso@redhat.com> Signed-off-by: Lv Zheng <lv.zheng@intel.com> Reviewed-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-09-13 01:13:23 -04:00
spin_unlock_irqrestore(&device->tx_msg_lock, flags);
return 0;
}
static void acpi_format_ipmi_response(struct acpi_ipmi_msg *msg,
acpi_integer *value)
{
struct acpi_ipmi_buffer *buffer;
/*
* value is also used as output parameter. It represents the response
* IPMI message returned by IPMI command.
*/
buffer = (struct acpi_ipmi_buffer *)value;
/*
* If the flag of msg_done is not set, it means that the IPMI command is
* not executed correctly.
*/
buffer->status = msg->msg_done;
if (msg->msg_done != ACPI_IPMI_OK)
return;
/*
* If the IPMI response message is obtained correctly, the status code
* will be ACPI_IPMI_OK
*/
buffer->length = msg->rx_len;
memcpy(buffer->data, msg->data, msg->rx_len);
}
static void ipmi_flush_tx_msg(struct acpi_ipmi_device *ipmi)
{
ACPI / IPMI: Add reference counting for ACPI IPMI transfers This patch adds reference counting for ACPI IPMI transfers to tune the locking granularity of tx_msg_lock. This patch also makes the whole acpi_ipmi module's coding style consistent by using reference counting for all its objects (i.e., acpi_ipmi_device and acpi_ipmi_msg). The acpi_ipmi_msg handling is re-designed using referece counting. 1. tx_msg is always unlinked before complete(), so that it is safe to put complete() out side of tx_msg_lock. 2. tx_msg reference counters are incremented before calling ipmi_request_settime() and tx_msg_lock protection is added to ipmi_cancel_tx_msg() so that a complete() can be safely called in parellel with tx_msg unlinking in failure cases. 3. tx_msg holds a reference to acpi_ipmi_device so that it can be flushed and freed in the contexts other than acpi_ipmi_space_handler(). The lockdep_chains shows all acpi_ipmi locks are leaf locks after the tuning: 1. ipmi_lock is always leaf: irq_context: 0 [ffffffff81a943f8] smi_watchers_mutex [ffffffffa06eca60] driver_data.ipmi_lock irq_context: 0 [ffffffff82767b40] &buffer->mutex [ffffffffa00a6678] s_active#103 [ffffffffa06eca60] driver_data.ipmi_lock 2. without this patch applied, lock used by complete() is held after holding tx_msg_lock: irq_context: 0 [ffffffff82767b40] &buffer->mutex [ffffffffa00a6678] s_active#103 [ffffffffa06ecce8] &(&ipmi_device->tx_msg_lock)->rlock irq_context: 1 [ffffffffa06ecce8] &(&ipmi_device->tx_msg_lock)->rlock irq_context: 1 [ffffffffa06ecce8] &(&ipmi_device->tx_msg_lock)->rlock [ffffffffa06eccf0] &x->wait#25 irq_context: 1 [ffffffffa06ecce8] &(&ipmi_device->tx_msg_lock)->rlock [ffffffffa06eccf0] &x->wait#25 [ffffffff81e36620] &p->pi_lock irq_context: 1 [ffffffffa06ecce8] &(&ipmi_device->tx_msg_lock)->rlock [ffffffffa06eccf0] &x->wait#25 [ffffffff81e36620] &p->pi_lock [ffffffff81e5d0a8] &rq->lock 3. with this patch applied, tx_msg_lock is always leaf: irq_context: 0 [ffffffff82767b40] &buffer->mutex [ffffffffa00a66d8] s_active#107 [ffffffffa07ecdc8] &(&ipmi_device->tx_msg_lock)->rlock irq_context: 1 [ffffffffa07ecdc8] &(&ipmi_device->tx_msg_lock)->rlock Signed-off-by: Lv Zheng <lv.zheng@intel.com> Reviewed-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-09-13 01:14:11 -04:00
struct acpi_ipmi_msg *tx_msg;
unsigned long flags;
ACPI / IPMI: Fix race caused by the unprotected ACPI IPMI user This patch uses reference counting to fix the race caused by the unprotected ACPI IPMI user. There are two rules for using the ipmi_si APIs: 1. In ipmi_si, ipmi_destroy_user() can ensure that no ipmi_recv_msg will be passed to ipmi_msg_handler(), but ipmi_request_settime() can not use an invalid ipmi_user_t. This means the ipmi_si users must ensure that there won't be any local references on ipmi_user_t before invoking ipmi_destroy_user(). 2. In ipmi_si, the smi_gone()/new_smi() callbacks are protected by smi_watchers_mutex, so their execution is serialized. But as a new smi can re-use a freed intf_num, it requires that the callback implementation must not use intf_num as an identification mean or it must ensure all references to the previous smi are all dropped before exiting smi_gone() callback. As the acpi_ipmi_device->user_interface check in acpi_ipmi_space_handler() can happen before setting user_interface to NULL and codes after the check in acpi_ipmi_space_handler() can happen after user_interface becomes NULL, the on-going acpi_ipmi_space_handler() still can pass an invalid acpi_ipmi_device->user_interface to ipmi_request_settime(). Such race conditions are not allowed by the IPMI layer's API design as a crash will happen in ipmi_request_settime() if something like that happens. This patch follows the ipmi_devintf.c design: 1. Invoke ipmi_destroy_user() after the reference count of acpi_ipmi_device drops to 0. References of acpi_ipmi_device dropping to 0 also means tx_msg related to this acpi_ipmi_device are all freed. This matches the IPMI layer's API calling rule on ipmi_destroy_user() and ipmi_request_settime(). 2. ipmi_flush_tx_msg() is performed so that no on-going tx_msg can still be running in acpi_ipmi_space_handler(). And it is invoked after invoking __ipmi_dev_kill() where acpi_ipmi_device is deleted from the list with a "dead" flag set, and the "dead" flag check is also introduced to the point where a tx_msg is going to be added to the tx_msg_list so that no new tx_msg can be created after returning from the __ipmi_dev_kill(). 3. The waiting codes in ipmi_flush_tx_msg() is deleted because it is not required since this patch ensures no acpi_ipmi reference is still held for ipmi_user_t before calling ipmi_destroy_user() and ipmi_destroy_user() can ensure no more ipmi_msg_handler() can happen after returning from ipmi_destroy_user(). 4. The flushing of tx_msg is also moved out of ipmi_lock in this patch. The forthcoming IPMI operation region handler installation changes also requires acpi_ipmi_device be handled in this style. The header comment of the file is also updated due to this design change. Signed-off-by: Lv Zheng <lv.zheng@intel.com> Reviewed-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-09-13 01:13:54 -04:00
/*
* NOTE: On-going ipmi_recv_msg
* ipmi_msg_handler() may still be invoked by ipmi_si after
* flushing. But it is safe to do a fast flushing on module_exit()
* without waiting for all ipmi_recv_msg(s) to complete from
* ipmi_msg_handler() as it is ensured by ipmi_si that all
* ipmi_recv_msg(s) are freed after invoking ipmi_destroy_user().
*/
spin_lock_irqsave(&ipmi->tx_msg_lock, flags);
ACPI / IPMI: Add reference counting for ACPI IPMI transfers This patch adds reference counting for ACPI IPMI transfers to tune the locking granularity of tx_msg_lock. This patch also makes the whole acpi_ipmi module's coding style consistent by using reference counting for all its objects (i.e., acpi_ipmi_device and acpi_ipmi_msg). The acpi_ipmi_msg handling is re-designed using referece counting. 1. tx_msg is always unlinked before complete(), so that it is safe to put complete() out side of tx_msg_lock. 2. tx_msg reference counters are incremented before calling ipmi_request_settime() and tx_msg_lock protection is added to ipmi_cancel_tx_msg() so that a complete() can be safely called in parellel with tx_msg unlinking in failure cases. 3. tx_msg holds a reference to acpi_ipmi_device so that it can be flushed and freed in the contexts other than acpi_ipmi_space_handler(). The lockdep_chains shows all acpi_ipmi locks are leaf locks after the tuning: 1. ipmi_lock is always leaf: irq_context: 0 [ffffffff81a943f8] smi_watchers_mutex [ffffffffa06eca60] driver_data.ipmi_lock irq_context: 0 [ffffffff82767b40] &buffer->mutex [ffffffffa00a6678] s_active#103 [ffffffffa06eca60] driver_data.ipmi_lock 2. without this patch applied, lock used by complete() is held after holding tx_msg_lock: irq_context: 0 [ffffffff82767b40] &buffer->mutex [ffffffffa00a6678] s_active#103 [ffffffffa06ecce8] &(&ipmi_device->tx_msg_lock)->rlock irq_context: 1 [ffffffffa06ecce8] &(&ipmi_device->tx_msg_lock)->rlock irq_context: 1 [ffffffffa06ecce8] &(&ipmi_device->tx_msg_lock)->rlock [ffffffffa06eccf0] &x->wait#25 irq_context: 1 [ffffffffa06ecce8] &(&ipmi_device->tx_msg_lock)->rlock [ffffffffa06eccf0] &x->wait#25 [ffffffff81e36620] &p->pi_lock irq_context: 1 [ffffffffa06ecce8] &(&ipmi_device->tx_msg_lock)->rlock [ffffffffa06eccf0] &x->wait#25 [ffffffff81e36620] &p->pi_lock [ffffffff81e5d0a8] &rq->lock 3. with this patch applied, tx_msg_lock is always leaf: irq_context: 0 [ffffffff82767b40] &buffer->mutex [ffffffffa00a66d8] s_active#107 [ffffffffa07ecdc8] &(&ipmi_device->tx_msg_lock)->rlock irq_context: 1 [ffffffffa07ecdc8] &(&ipmi_device->tx_msg_lock)->rlock Signed-off-by: Lv Zheng <lv.zheng@intel.com> Reviewed-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-09-13 01:14:11 -04:00
while (!list_empty(&ipmi->tx_msg_list)) {
tx_msg = list_first_entry(&ipmi->tx_msg_list,
struct acpi_ipmi_msg,
head);
list_del(&tx_msg->head);
spin_unlock_irqrestore(&ipmi->tx_msg_lock, flags);
/* wake up the sleep thread on the Tx msg */
complete(&tx_msg->tx_complete);
ACPI / IPMI: Add reference counting for ACPI IPMI transfers This patch adds reference counting for ACPI IPMI transfers to tune the locking granularity of tx_msg_lock. This patch also makes the whole acpi_ipmi module's coding style consistent by using reference counting for all its objects (i.e., acpi_ipmi_device and acpi_ipmi_msg). The acpi_ipmi_msg handling is re-designed using referece counting. 1. tx_msg is always unlinked before complete(), so that it is safe to put complete() out side of tx_msg_lock. 2. tx_msg reference counters are incremented before calling ipmi_request_settime() and tx_msg_lock protection is added to ipmi_cancel_tx_msg() so that a complete() can be safely called in parellel with tx_msg unlinking in failure cases. 3. tx_msg holds a reference to acpi_ipmi_device so that it can be flushed and freed in the contexts other than acpi_ipmi_space_handler(). The lockdep_chains shows all acpi_ipmi locks are leaf locks after the tuning: 1. ipmi_lock is always leaf: irq_context: 0 [ffffffff81a943f8] smi_watchers_mutex [ffffffffa06eca60] driver_data.ipmi_lock irq_context: 0 [ffffffff82767b40] &buffer->mutex [ffffffffa00a6678] s_active#103 [ffffffffa06eca60] driver_data.ipmi_lock 2. without this patch applied, lock used by complete() is held after holding tx_msg_lock: irq_context: 0 [ffffffff82767b40] &buffer->mutex [ffffffffa00a6678] s_active#103 [ffffffffa06ecce8] &(&ipmi_device->tx_msg_lock)->rlock irq_context: 1 [ffffffffa06ecce8] &(&ipmi_device->tx_msg_lock)->rlock irq_context: 1 [ffffffffa06ecce8] &(&ipmi_device->tx_msg_lock)->rlock [ffffffffa06eccf0] &x->wait#25 irq_context: 1 [ffffffffa06ecce8] &(&ipmi_device->tx_msg_lock)->rlock [ffffffffa06eccf0] &x->wait#25 [ffffffff81e36620] &p->pi_lock irq_context: 1 [ffffffffa06ecce8] &(&ipmi_device->tx_msg_lock)->rlock [ffffffffa06eccf0] &x->wait#25 [ffffffff81e36620] &p->pi_lock [ffffffff81e5d0a8] &rq->lock 3. with this patch applied, tx_msg_lock is always leaf: irq_context: 0 [ffffffff82767b40] &buffer->mutex [ffffffffa00a66d8] s_active#107 [ffffffffa07ecdc8] &(&ipmi_device->tx_msg_lock)->rlock irq_context: 1 [ffffffffa07ecdc8] &(&ipmi_device->tx_msg_lock)->rlock Signed-off-by: Lv Zheng <lv.zheng@intel.com> Reviewed-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-09-13 01:14:11 -04:00
acpi_ipmi_msg_put(tx_msg);
spin_lock_irqsave(&ipmi->tx_msg_lock, flags);
}
spin_unlock_irqrestore(&ipmi->tx_msg_lock, flags);
}
static void ipmi_cancel_tx_msg(struct acpi_ipmi_device *ipmi,
struct acpi_ipmi_msg *msg)
{
struct acpi_ipmi_msg *tx_msg, *temp;
bool msg_found = false;
unsigned long flags;
spin_lock_irqsave(&ipmi->tx_msg_lock, flags);
list_for_each_entry_safe(tx_msg, temp, &ipmi->tx_msg_list, head) {
if (msg == tx_msg) {
msg_found = true;
list_del(&tx_msg->head);
break;
}
}
spin_unlock_irqrestore(&ipmi->tx_msg_lock, flags);
ACPI / IPMI: Add reference counting for ACPI IPMI transfers This patch adds reference counting for ACPI IPMI transfers to tune the locking granularity of tx_msg_lock. This patch also makes the whole acpi_ipmi module's coding style consistent by using reference counting for all its objects (i.e., acpi_ipmi_device and acpi_ipmi_msg). The acpi_ipmi_msg handling is re-designed using referece counting. 1. tx_msg is always unlinked before complete(), so that it is safe to put complete() out side of tx_msg_lock. 2. tx_msg reference counters are incremented before calling ipmi_request_settime() and tx_msg_lock protection is added to ipmi_cancel_tx_msg() so that a complete() can be safely called in parellel with tx_msg unlinking in failure cases. 3. tx_msg holds a reference to acpi_ipmi_device so that it can be flushed and freed in the contexts other than acpi_ipmi_space_handler(). The lockdep_chains shows all acpi_ipmi locks are leaf locks after the tuning: 1. ipmi_lock is always leaf: irq_context: 0 [ffffffff81a943f8] smi_watchers_mutex [ffffffffa06eca60] driver_data.ipmi_lock irq_context: 0 [ffffffff82767b40] &buffer->mutex [ffffffffa00a6678] s_active#103 [ffffffffa06eca60] driver_data.ipmi_lock 2. without this patch applied, lock used by complete() is held after holding tx_msg_lock: irq_context: 0 [ffffffff82767b40] &buffer->mutex [ffffffffa00a6678] s_active#103 [ffffffffa06ecce8] &(&ipmi_device->tx_msg_lock)->rlock irq_context: 1 [ffffffffa06ecce8] &(&ipmi_device->tx_msg_lock)->rlock irq_context: 1 [ffffffffa06ecce8] &(&ipmi_device->tx_msg_lock)->rlock [ffffffffa06eccf0] &x->wait#25 irq_context: 1 [ffffffffa06ecce8] &(&ipmi_device->tx_msg_lock)->rlock [ffffffffa06eccf0] &x->wait#25 [ffffffff81e36620] &p->pi_lock irq_context: 1 [ffffffffa06ecce8] &(&ipmi_device->tx_msg_lock)->rlock [ffffffffa06eccf0] &x->wait#25 [ffffffff81e36620] &p->pi_lock [ffffffff81e5d0a8] &rq->lock 3. with this patch applied, tx_msg_lock is always leaf: irq_context: 0 [ffffffff82767b40] &buffer->mutex [ffffffffa00a66d8] s_active#107 [ffffffffa07ecdc8] &(&ipmi_device->tx_msg_lock)->rlock irq_context: 1 [ffffffffa07ecdc8] &(&ipmi_device->tx_msg_lock)->rlock Signed-off-by: Lv Zheng <lv.zheng@intel.com> Reviewed-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-09-13 01:14:11 -04:00
if (msg_found)
acpi_ipmi_msg_put(tx_msg);
}
static void ipmi_msg_handler(struct ipmi_recv_msg *msg, void *user_msg_data)
{
struct acpi_ipmi_device *ipmi_device = user_msg_data;
ACPI / IPMI: Add reference counting for ACPI IPMI transfers This patch adds reference counting for ACPI IPMI transfers to tune the locking granularity of tx_msg_lock. This patch also makes the whole acpi_ipmi module's coding style consistent by using reference counting for all its objects (i.e., acpi_ipmi_device and acpi_ipmi_msg). The acpi_ipmi_msg handling is re-designed using referece counting. 1. tx_msg is always unlinked before complete(), so that it is safe to put complete() out side of tx_msg_lock. 2. tx_msg reference counters are incremented before calling ipmi_request_settime() and tx_msg_lock protection is added to ipmi_cancel_tx_msg() so that a complete() can be safely called in parellel with tx_msg unlinking in failure cases. 3. tx_msg holds a reference to acpi_ipmi_device so that it can be flushed and freed in the contexts other than acpi_ipmi_space_handler(). The lockdep_chains shows all acpi_ipmi locks are leaf locks after the tuning: 1. ipmi_lock is always leaf: irq_context: 0 [ffffffff81a943f8] smi_watchers_mutex [ffffffffa06eca60] driver_data.ipmi_lock irq_context: 0 [ffffffff82767b40] &buffer->mutex [ffffffffa00a6678] s_active#103 [ffffffffa06eca60] driver_data.ipmi_lock 2. without this patch applied, lock used by complete() is held after holding tx_msg_lock: irq_context: 0 [ffffffff82767b40] &buffer->mutex [ffffffffa00a6678] s_active#103 [ffffffffa06ecce8] &(&ipmi_device->tx_msg_lock)->rlock irq_context: 1 [ffffffffa06ecce8] &(&ipmi_device->tx_msg_lock)->rlock irq_context: 1 [ffffffffa06ecce8] &(&ipmi_device->tx_msg_lock)->rlock [ffffffffa06eccf0] &x->wait#25 irq_context: 1 [ffffffffa06ecce8] &(&ipmi_device->tx_msg_lock)->rlock [ffffffffa06eccf0] &x->wait#25 [ffffffff81e36620] &p->pi_lock irq_context: 1 [ffffffffa06ecce8] &(&ipmi_device->tx_msg_lock)->rlock [ffffffffa06eccf0] &x->wait#25 [ffffffff81e36620] &p->pi_lock [ffffffff81e5d0a8] &rq->lock 3. with this patch applied, tx_msg_lock is always leaf: irq_context: 0 [ffffffff82767b40] &buffer->mutex [ffffffffa00a66d8] s_active#107 [ffffffffa07ecdc8] &(&ipmi_device->tx_msg_lock)->rlock irq_context: 1 [ffffffffa07ecdc8] &(&ipmi_device->tx_msg_lock)->rlock Signed-off-by: Lv Zheng <lv.zheng@intel.com> Reviewed-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-09-13 01:14:11 -04:00
bool msg_found = false;
struct acpi_ipmi_msg *tx_msg, *temp;
struct device *dev = ipmi_device->dev;
ACPI / IPMI: Fix atomic context requirement of ipmi_msg_handler() This patch fixes the issues indicated by the test results that ipmi_msg_handler() is invoked in atomic context. BUG: scheduling while atomic: kipmi0/18933/0x10000100 Modules linked in: ipmi_si acpi_ipmi ... CPU: 3 PID: 18933 Comm: kipmi0 Tainted: G AW 3.10.0-rc7+ #2 Hardware name: QCI QSSC-S4R/QSSC-S4R, BIOS QSSC-S4R.QCI.01.00.0027.070120100606 07/01/2010 ffff8838245eea00 ffff88103fc63c98 ffffffff814c4a1e ffff88103fc63ca8 ffffffff814bfbab ffff88103fc63d28 ffffffff814c73e0 ffff88103933cbd4 0000000000000096 ffff88103fc63ce8 ffff88102f618000 ffff881035c01fd8 Call Trace: <IRQ> [<ffffffff814c4a1e>] dump_stack+0x19/0x1b [<ffffffff814bfbab>] __schedule_bug+0x46/0x54 [<ffffffff814c73e0>] __schedule+0x83/0x59c [<ffffffff81058853>] __cond_resched+0x22/0x2d [<ffffffff814c794b>] _cond_resched+0x14/0x1d [<ffffffff814c6d82>] mutex_lock+0x11/0x32 [<ffffffff8101e1e9>] ? __default_send_IPI_dest_field.constprop.0+0x53/0x58 [<ffffffffa09e3f9c>] ipmi_msg_handler+0x23/0x166 [ipmi_si] [<ffffffff812bf6e4>] deliver_response+0x55/0x5a [<ffffffff812c0fd4>] handle_new_recv_msgs+0xb67/0xc65 [<ffffffff81007ad1>] ? read_tsc+0x9/0x19 [<ffffffff814c8620>] ? _raw_spin_lock_irq+0xa/0xc [<ffffffffa09e1128>] ipmi_thread+0x5c/0x146 [ipmi_si] ... Also Tony Camuso says: We were getting occasional "Scheduling while atomic" call traces during boot on some systems. Problem was first seen on a Cisco C210 but we were able to reproduce it on a Cisco c220m3. Setting CONFIG_LOCKDEP and LOCKDEP_SUPPORT to 'y' exposed a lockdep around tx_msg_lock in acpi_ipmi.c struct acpi_ipmi_device. ================================= [ INFO: inconsistent lock state ] 2.6.32-415.el6.x86_64-debug-splck #1 --------------------------------- inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage. ksoftirqd/3/17 [HC0[0]:SC1[1]:HE1:SE0] takes: (&ipmi_device->tx_msg_lock){+.?...}, at: [<ffffffff81337a27>] ipmi_msg_handler+0x71/0x126 {SOFTIRQ-ON-W} state was registered at: [<ffffffff810ba11c>] __lock_acquire+0x63c/0x1570 [<ffffffff810bb0f4>] lock_acquire+0xa4/0x120 [<ffffffff815581cc>] __mutex_lock_common+0x4c/0x400 [<ffffffff815586ea>] mutex_lock_nested+0x4a/0x60 [<ffffffff8133789d>] acpi_ipmi_space_handler+0x11b/0x234 [<ffffffff81321c62>] acpi_ev_address_space_dispatch+0x170/0x1be The fix implemented by this change has been tested by Tony: Tested the patch in a boot loop with lockdep debug enabled and never saw the problem in over 400 reboots. Reported-and-tested-by: Tony Camuso <tcamuso@redhat.com> Signed-off-by: Lv Zheng <lv.zheng@intel.com> Reviewed-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-09-13 01:13:23 -04:00
unsigned long flags;
if (msg->user != ipmi_device->user_interface) {
dev_warn(dev,
"Unexpected response is returned. returned user %p, expected user %p\n",
msg->user, ipmi_device->user_interface);
goto out_msg;
}
ACPI / IPMI: Fix atomic context requirement of ipmi_msg_handler() This patch fixes the issues indicated by the test results that ipmi_msg_handler() is invoked in atomic context. BUG: scheduling while atomic: kipmi0/18933/0x10000100 Modules linked in: ipmi_si acpi_ipmi ... CPU: 3 PID: 18933 Comm: kipmi0 Tainted: G AW 3.10.0-rc7+ #2 Hardware name: QCI QSSC-S4R/QSSC-S4R, BIOS QSSC-S4R.QCI.01.00.0027.070120100606 07/01/2010 ffff8838245eea00 ffff88103fc63c98 ffffffff814c4a1e ffff88103fc63ca8 ffffffff814bfbab ffff88103fc63d28 ffffffff814c73e0 ffff88103933cbd4 0000000000000096 ffff88103fc63ce8 ffff88102f618000 ffff881035c01fd8 Call Trace: <IRQ> [<ffffffff814c4a1e>] dump_stack+0x19/0x1b [<ffffffff814bfbab>] __schedule_bug+0x46/0x54 [<ffffffff814c73e0>] __schedule+0x83/0x59c [<ffffffff81058853>] __cond_resched+0x22/0x2d [<ffffffff814c794b>] _cond_resched+0x14/0x1d [<ffffffff814c6d82>] mutex_lock+0x11/0x32 [<ffffffff8101e1e9>] ? __default_send_IPI_dest_field.constprop.0+0x53/0x58 [<ffffffffa09e3f9c>] ipmi_msg_handler+0x23/0x166 [ipmi_si] [<ffffffff812bf6e4>] deliver_response+0x55/0x5a [<ffffffff812c0fd4>] handle_new_recv_msgs+0xb67/0xc65 [<ffffffff81007ad1>] ? read_tsc+0x9/0x19 [<ffffffff814c8620>] ? _raw_spin_lock_irq+0xa/0xc [<ffffffffa09e1128>] ipmi_thread+0x5c/0x146 [ipmi_si] ... Also Tony Camuso says: We were getting occasional "Scheduling while atomic" call traces during boot on some systems. Problem was first seen on a Cisco C210 but we were able to reproduce it on a Cisco c220m3. Setting CONFIG_LOCKDEP and LOCKDEP_SUPPORT to 'y' exposed a lockdep around tx_msg_lock in acpi_ipmi.c struct acpi_ipmi_device. ================================= [ INFO: inconsistent lock state ] 2.6.32-415.el6.x86_64-debug-splck #1 --------------------------------- inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage. ksoftirqd/3/17 [HC0[0]:SC1[1]:HE1:SE0] takes: (&ipmi_device->tx_msg_lock){+.?...}, at: [<ffffffff81337a27>] ipmi_msg_handler+0x71/0x126 {SOFTIRQ-ON-W} state was registered at: [<ffffffff810ba11c>] __lock_acquire+0x63c/0x1570 [<ffffffff810bb0f4>] lock_acquire+0xa4/0x120 [<ffffffff815581cc>] __mutex_lock_common+0x4c/0x400 [<ffffffff815586ea>] mutex_lock_nested+0x4a/0x60 [<ffffffff8133789d>] acpi_ipmi_space_handler+0x11b/0x234 [<ffffffff81321c62>] acpi_ev_address_space_dispatch+0x170/0x1be The fix implemented by this change has been tested by Tony: Tested the patch in a boot loop with lockdep debug enabled and never saw the problem in over 400 reboots. Reported-and-tested-by: Tony Camuso <tcamuso@redhat.com> Signed-off-by: Lv Zheng <lv.zheng@intel.com> Reviewed-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-09-13 01:13:23 -04:00
spin_lock_irqsave(&ipmi_device->tx_msg_lock, flags);
ACPI / IPMI: Add reference counting for ACPI IPMI transfers This patch adds reference counting for ACPI IPMI transfers to tune the locking granularity of tx_msg_lock. This patch also makes the whole acpi_ipmi module's coding style consistent by using reference counting for all its objects (i.e., acpi_ipmi_device and acpi_ipmi_msg). The acpi_ipmi_msg handling is re-designed using referece counting. 1. tx_msg is always unlinked before complete(), so that it is safe to put complete() out side of tx_msg_lock. 2. tx_msg reference counters are incremented before calling ipmi_request_settime() and tx_msg_lock protection is added to ipmi_cancel_tx_msg() so that a complete() can be safely called in parellel with tx_msg unlinking in failure cases. 3. tx_msg holds a reference to acpi_ipmi_device so that it can be flushed and freed in the contexts other than acpi_ipmi_space_handler(). The lockdep_chains shows all acpi_ipmi locks are leaf locks after the tuning: 1. ipmi_lock is always leaf: irq_context: 0 [ffffffff81a943f8] smi_watchers_mutex [ffffffffa06eca60] driver_data.ipmi_lock irq_context: 0 [ffffffff82767b40] &buffer->mutex [ffffffffa00a6678] s_active#103 [ffffffffa06eca60] driver_data.ipmi_lock 2. without this patch applied, lock used by complete() is held after holding tx_msg_lock: irq_context: 0 [ffffffff82767b40] &buffer->mutex [ffffffffa00a6678] s_active#103 [ffffffffa06ecce8] &(&ipmi_device->tx_msg_lock)->rlock irq_context: 1 [ffffffffa06ecce8] &(&ipmi_device->tx_msg_lock)->rlock irq_context: 1 [ffffffffa06ecce8] &(&ipmi_device->tx_msg_lock)->rlock [ffffffffa06eccf0] &x->wait#25 irq_context: 1 [ffffffffa06ecce8] &(&ipmi_device->tx_msg_lock)->rlock [ffffffffa06eccf0] &x->wait#25 [ffffffff81e36620] &p->pi_lock irq_context: 1 [ffffffffa06ecce8] &(&ipmi_device->tx_msg_lock)->rlock [ffffffffa06eccf0] &x->wait#25 [ffffffff81e36620] &p->pi_lock [ffffffff81e5d0a8] &rq->lock 3. with this patch applied, tx_msg_lock is always leaf: irq_context: 0 [ffffffff82767b40] &buffer->mutex [ffffffffa00a66d8] s_active#107 [ffffffffa07ecdc8] &(&ipmi_device->tx_msg_lock)->rlock irq_context: 1 [ffffffffa07ecdc8] &(&ipmi_device->tx_msg_lock)->rlock Signed-off-by: Lv Zheng <lv.zheng@intel.com> Reviewed-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-09-13 01:14:11 -04:00
list_for_each_entry_safe(tx_msg, temp, &ipmi_device->tx_msg_list, head) {
if (msg->msgid == tx_msg->tx_msgid) {
ACPI / IPMI: Add reference counting for ACPI IPMI transfers This patch adds reference counting for ACPI IPMI transfers to tune the locking granularity of tx_msg_lock. This patch also makes the whole acpi_ipmi module's coding style consistent by using reference counting for all its objects (i.e., acpi_ipmi_device and acpi_ipmi_msg). The acpi_ipmi_msg handling is re-designed using referece counting. 1. tx_msg is always unlinked before complete(), so that it is safe to put complete() out side of tx_msg_lock. 2. tx_msg reference counters are incremented before calling ipmi_request_settime() and tx_msg_lock protection is added to ipmi_cancel_tx_msg() so that a complete() can be safely called in parellel with tx_msg unlinking in failure cases. 3. tx_msg holds a reference to acpi_ipmi_device so that it can be flushed and freed in the contexts other than acpi_ipmi_space_handler(). The lockdep_chains shows all acpi_ipmi locks are leaf locks after the tuning: 1. ipmi_lock is always leaf: irq_context: 0 [ffffffff81a943f8] smi_watchers_mutex [ffffffffa06eca60] driver_data.ipmi_lock irq_context: 0 [ffffffff82767b40] &buffer->mutex [ffffffffa00a6678] s_active#103 [ffffffffa06eca60] driver_data.ipmi_lock 2. without this patch applied, lock used by complete() is held after holding tx_msg_lock: irq_context: 0 [ffffffff82767b40] &buffer->mutex [ffffffffa00a6678] s_active#103 [ffffffffa06ecce8] &(&ipmi_device->tx_msg_lock)->rlock irq_context: 1 [ffffffffa06ecce8] &(&ipmi_device->tx_msg_lock)->rlock irq_context: 1 [ffffffffa06ecce8] &(&ipmi_device->tx_msg_lock)->rlock [ffffffffa06eccf0] &x->wait#25 irq_context: 1 [ffffffffa06ecce8] &(&ipmi_device->tx_msg_lock)->rlock [ffffffffa06eccf0] &x->wait#25 [ffffffff81e36620] &p->pi_lock irq_context: 1 [ffffffffa06ecce8] &(&ipmi_device->tx_msg_lock)->rlock [ffffffffa06eccf0] &x->wait#25 [ffffffff81e36620] &p->pi_lock [ffffffff81e5d0a8] &rq->lock 3. with this patch applied, tx_msg_lock is always leaf: irq_context: 0 [ffffffff82767b40] &buffer->mutex [ffffffffa00a66d8] s_active#107 [ffffffffa07ecdc8] &(&ipmi_device->tx_msg_lock)->rlock irq_context: 1 [ffffffffa07ecdc8] &(&ipmi_device->tx_msg_lock)->rlock Signed-off-by: Lv Zheng <lv.zheng@intel.com> Reviewed-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-09-13 01:14:11 -04:00
msg_found = true;
list_del(&tx_msg->head);
break;
}
}
ACPI / IPMI: Add reference counting for ACPI IPMI transfers This patch adds reference counting for ACPI IPMI transfers to tune the locking granularity of tx_msg_lock. This patch also makes the whole acpi_ipmi module's coding style consistent by using reference counting for all its objects (i.e., acpi_ipmi_device and acpi_ipmi_msg). The acpi_ipmi_msg handling is re-designed using referece counting. 1. tx_msg is always unlinked before complete(), so that it is safe to put complete() out side of tx_msg_lock. 2. tx_msg reference counters are incremented before calling ipmi_request_settime() and tx_msg_lock protection is added to ipmi_cancel_tx_msg() so that a complete() can be safely called in parellel with tx_msg unlinking in failure cases. 3. tx_msg holds a reference to acpi_ipmi_device so that it can be flushed and freed in the contexts other than acpi_ipmi_space_handler(). The lockdep_chains shows all acpi_ipmi locks are leaf locks after the tuning: 1. ipmi_lock is always leaf: irq_context: 0 [ffffffff81a943f8] smi_watchers_mutex [ffffffffa06eca60] driver_data.ipmi_lock irq_context: 0 [ffffffff82767b40] &buffer->mutex [ffffffffa00a6678] s_active#103 [ffffffffa06eca60] driver_data.ipmi_lock 2. without this patch applied, lock used by complete() is held after holding tx_msg_lock: irq_context: 0 [ffffffff82767b40] &buffer->mutex [ffffffffa00a6678] s_active#103 [ffffffffa06ecce8] &(&ipmi_device->tx_msg_lock)->rlock irq_context: 1 [ffffffffa06ecce8] &(&ipmi_device->tx_msg_lock)->rlock irq_context: 1 [ffffffffa06ecce8] &(&ipmi_device->tx_msg_lock)->rlock [ffffffffa06eccf0] &x->wait#25 irq_context: 1 [ffffffffa06ecce8] &(&ipmi_device->tx_msg_lock)->rlock [ffffffffa06eccf0] &x->wait#25 [ffffffff81e36620] &p->pi_lock irq_context: 1 [ffffffffa06ecce8] &(&ipmi_device->tx_msg_lock)->rlock [ffffffffa06eccf0] &x->wait#25 [ffffffff81e36620] &p->pi_lock [ffffffff81e5d0a8] &rq->lock 3. with this patch applied, tx_msg_lock is always leaf: irq_context: 0 [ffffffff82767b40] &buffer->mutex [ffffffffa00a66d8] s_active#107 [ffffffffa07ecdc8] &(&ipmi_device->tx_msg_lock)->rlock irq_context: 1 [ffffffffa07ecdc8] &(&ipmi_device->tx_msg_lock)->rlock Signed-off-by: Lv Zheng <lv.zheng@intel.com> Reviewed-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-09-13 01:14:11 -04:00
spin_unlock_irqrestore(&ipmi_device->tx_msg_lock, flags);
if (!msg_found) {
dev_warn(dev,
"Unexpected response (msg id %ld) is returned.\n",
msg->msgid);
ACPI / IPMI: Add reference counting for ACPI IPMI transfers This patch adds reference counting for ACPI IPMI transfers to tune the locking granularity of tx_msg_lock. This patch also makes the whole acpi_ipmi module's coding style consistent by using reference counting for all its objects (i.e., acpi_ipmi_device and acpi_ipmi_msg). The acpi_ipmi_msg handling is re-designed using referece counting. 1. tx_msg is always unlinked before complete(), so that it is safe to put complete() out side of tx_msg_lock. 2. tx_msg reference counters are incremented before calling ipmi_request_settime() and tx_msg_lock protection is added to ipmi_cancel_tx_msg() so that a complete() can be safely called in parellel with tx_msg unlinking in failure cases. 3. tx_msg holds a reference to acpi_ipmi_device so that it can be flushed and freed in the contexts other than acpi_ipmi_space_handler(). The lockdep_chains shows all acpi_ipmi locks are leaf locks after the tuning: 1. ipmi_lock is always leaf: irq_context: 0 [ffffffff81a943f8] smi_watchers_mutex [ffffffffa06eca60] driver_data.ipmi_lock irq_context: 0 [ffffffff82767b40] &buffer->mutex [ffffffffa00a6678] s_active#103 [ffffffffa06eca60] driver_data.ipmi_lock 2. without this patch applied, lock used by complete() is held after holding tx_msg_lock: irq_context: 0 [ffffffff82767b40] &buffer->mutex [ffffffffa00a6678] s_active#103 [ffffffffa06ecce8] &(&ipmi_device->tx_msg_lock)->rlock irq_context: 1 [ffffffffa06ecce8] &(&ipmi_device->tx_msg_lock)->rlock irq_context: 1 [ffffffffa06ecce8] &(&ipmi_device->tx_msg_lock)->rlock [ffffffffa06eccf0] &x->wait#25 irq_context: 1 [ffffffffa06ecce8] &(&ipmi_device->tx_msg_lock)->rlock [ffffffffa06eccf0] &x->wait#25 [ffffffff81e36620] &p->pi_lock irq_context: 1 [ffffffffa06ecce8] &(&ipmi_device->tx_msg_lock)->rlock [ffffffffa06eccf0] &x->wait#25 [ffffffff81e36620] &p->pi_lock [ffffffff81e5d0a8] &rq->lock 3. with this patch applied, tx_msg_lock is always leaf: irq_context: 0 [ffffffff82767b40] &buffer->mutex [ffffffffa00a66d8] s_active#107 [ffffffffa07ecdc8] &(&ipmi_device->tx_msg_lock)->rlock irq_context: 1 [ffffffffa07ecdc8] &(&ipmi_device->tx_msg_lock)->rlock Signed-off-by: Lv Zheng <lv.zheng@intel.com> Reviewed-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-09-13 01:14:11 -04:00
goto out_msg;
}
/* copy the response data to Rx_data buffer */
if (msg->msg.data_len > ACPI_IPMI_MAX_MSG_LENGTH) {
dev_WARN_ONCE(dev, true,
"Unexpected response (msg len %d).\n",
msg->msg.data_len);
goto out_comp;
}
/* response msg is an error msg */
msg->recv_type = IPMI_RESPONSE_RECV_TYPE;
if (msg->recv_type == IPMI_RESPONSE_RECV_TYPE &&
msg->msg.data_len == 1) {
if (msg->msg.data[0] == IPMI_TIMEOUT_COMPLETION_CODE) {
dev_dbg_once(dev, "Unexpected response (timeout).\n");
tx_msg->msg_done = ACPI_IPMI_TIMEOUT;
}
goto out_comp;
}
tx_msg->rx_len = msg->msg.data_len;
memcpy(tx_msg->data, msg->msg.data, tx_msg->rx_len);
tx_msg->msg_done = ACPI_IPMI_OK;
out_comp:
complete(&tx_msg->tx_complete);
ACPI / IPMI: Add reference counting for ACPI IPMI transfers This patch adds reference counting for ACPI IPMI transfers to tune the locking granularity of tx_msg_lock. This patch also makes the whole acpi_ipmi module's coding style consistent by using reference counting for all its objects (i.e., acpi_ipmi_device and acpi_ipmi_msg). The acpi_ipmi_msg handling is re-designed using referece counting. 1. tx_msg is always unlinked before complete(), so that it is safe to put complete() out side of tx_msg_lock. 2. tx_msg reference counters are incremented before calling ipmi_request_settime() and tx_msg_lock protection is added to ipmi_cancel_tx_msg() so that a complete() can be safely called in parellel with tx_msg unlinking in failure cases. 3. tx_msg holds a reference to acpi_ipmi_device so that it can be flushed and freed in the contexts other than acpi_ipmi_space_handler(). The lockdep_chains shows all acpi_ipmi locks are leaf locks after the tuning: 1. ipmi_lock is always leaf: irq_context: 0 [ffffffff81a943f8] smi_watchers_mutex [ffffffffa06eca60] driver_data.ipmi_lock irq_context: 0 [ffffffff82767b40] &buffer->mutex [ffffffffa00a6678] s_active#103 [ffffffffa06eca60] driver_data.ipmi_lock 2. without this patch applied, lock used by complete() is held after holding tx_msg_lock: irq_context: 0 [ffffffff82767b40] &buffer->mutex [ffffffffa00a6678] s_active#103 [ffffffffa06ecce8] &(&ipmi_device->tx_msg_lock)->rlock irq_context: 1 [ffffffffa06ecce8] &(&ipmi_device->tx_msg_lock)->rlock irq_context: 1 [ffffffffa06ecce8] &(&ipmi_device->tx_msg_lock)->rlock [ffffffffa06eccf0] &x->wait#25 irq_context: 1 [ffffffffa06ecce8] &(&ipmi_device->tx_msg_lock)->rlock [ffffffffa06eccf0] &x->wait#25 [ffffffff81e36620] &p->pi_lock irq_context: 1 [ffffffffa06ecce8] &(&ipmi_device->tx_msg_lock)->rlock [ffffffffa06eccf0] &x->wait#25 [ffffffff81e36620] &p->pi_lock [ffffffff81e5d0a8] &rq->lock 3. with this patch applied, tx_msg_lock is always leaf: irq_context: 0 [ffffffff82767b40] &buffer->mutex [ffffffffa00a66d8] s_active#107 [ffffffffa07ecdc8] &(&ipmi_device->tx_msg_lock)->rlock irq_context: 1 [ffffffffa07ecdc8] &(&ipmi_device->tx_msg_lock)->rlock Signed-off-by: Lv Zheng <lv.zheng@intel.com> Reviewed-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-09-13 01:14:11 -04:00
acpi_ipmi_msg_put(tx_msg);
out_msg:
ipmi_free_recv_msg(msg);
}
static void ipmi_register_bmc(int iface, struct device *dev)
{
struct acpi_ipmi_device *ipmi_device, *temp;
int err;
struct ipmi_smi_info smi_data;
acpi_handle handle;
err = ipmi_get_smi_info(iface, &smi_data);
if (err)
return;
ACPI / IPMI: Fix race caused by the unprotected ACPI IPMI user This patch uses reference counting to fix the race caused by the unprotected ACPI IPMI user. There are two rules for using the ipmi_si APIs: 1. In ipmi_si, ipmi_destroy_user() can ensure that no ipmi_recv_msg will be passed to ipmi_msg_handler(), but ipmi_request_settime() can not use an invalid ipmi_user_t. This means the ipmi_si users must ensure that there won't be any local references on ipmi_user_t before invoking ipmi_destroy_user(). 2. In ipmi_si, the smi_gone()/new_smi() callbacks are protected by smi_watchers_mutex, so their execution is serialized. But as a new smi can re-use a freed intf_num, it requires that the callback implementation must not use intf_num as an identification mean or it must ensure all references to the previous smi are all dropped before exiting smi_gone() callback. As the acpi_ipmi_device->user_interface check in acpi_ipmi_space_handler() can happen before setting user_interface to NULL and codes after the check in acpi_ipmi_space_handler() can happen after user_interface becomes NULL, the on-going acpi_ipmi_space_handler() still can pass an invalid acpi_ipmi_device->user_interface to ipmi_request_settime(). Such race conditions are not allowed by the IPMI layer's API design as a crash will happen in ipmi_request_settime() if something like that happens. This patch follows the ipmi_devintf.c design: 1. Invoke ipmi_destroy_user() after the reference count of acpi_ipmi_device drops to 0. References of acpi_ipmi_device dropping to 0 also means tx_msg related to this acpi_ipmi_device are all freed. This matches the IPMI layer's API calling rule on ipmi_destroy_user() and ipmi_request_settime(). 2. ipmi_flush_tx_msg() is performed so that no on-going tx_msg can still be running in acpi_ipmi_space_handler(). And it is invoked after invoking __ipmi_dev_kill() where acpi_ipmi_device is deleted from the list with a "dead" flag set, and the "dead" flag check is also introduced to the point where a tx_msg is going to be added to the tx_msg_list so that no new tx_msg can be created after returning from the __ipmi_dev_kill(). 3. The waiting codes in ipmi_flush_tx_msg() is deleted because it is not required since this patch ensures no acpi_ipmi reference is still held for ipmi_user_t before calling ipmi_destroy_user() and ipmi_destroy_user() can ensure no more ipmi_msg_handler() can happen after returning from ipmi_destroy_user(). 4. The flushing of tx_msg is also moved out of ipmi_lock in this patch. The forthcoming IPMI operation region handler installation changes also requires acpi_ipmi_device be handled in this style. The header comment of the file is also updated due to this design change. Signed-off-by: Lv Zheng <lv.zheng@intel.com> Reviewed-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-09-13 01:13:54 -04:00
if (smi_data.addr_src != SI_ACPI)
goto err_ref;
handle = smi_data.addr_info.acpi_info.acpi_handle;
ACPI / IPMI: Fix race caused by the unprotected ACPI IPMI user This patch uses reference counting to fix the race caused by the unprotected ACPI IPMI user. There are two rules for using the ipmi_si APIs: 1. In ipmi_si, ipmi_destroy_user() can ensure that no ipmi_recv_msg will be passed to ipmi_msg_handler(), but ipmi_request_settime() can not use an invalid ipmi_user_t. This means the ipmi_si users must ensure that there won't be any local references on ipmi_user_t before invoking ipmi_destroy_user(). 2. In ipmi_si, the smi_gone()/new_smi() callbacks are protected by smi_watchers_mutex, so their execution is serialized. But as a new smi can re-use a freed intf_num, it requires that the callback implementation must not use intf_num as an identification mean or it must ensure all references to the previous smi are all dropped before exiting smi_gone() callback. As the acpi_ipmi_device->user_interface check in acpi_ipmi_space_handler() can happen before setting user_interface to NULL and codes after the check in acpi_ipmi_space_handler() can happen after user_interface becomes NULL, the on-going acpi_ipmi_space_handler() still can pass an invalid acpi_ipmi_device->user_interface to ipmi_request_settime(). Such race conditions are not allowed by the IPMI layer's API design as a crash will happen in ipmi_request_settime() if something like that happens. This patch follows the ipmi_devintf.c design: 1. Invoke ipmi_destroy_user() after the reference count of acpi_ipmi_device drops to 0. References of acpi_ipmi_device dropping to 0 also means tx_msg related to this acpi_ipmi_device are all freed. This matches the IPMI layer's API calling rule on ipmi_destroy_user() and ipmi_request_settime(). 2. ipmi_flush_tx_msg() is performed so that no on-going tx_msg can still be running in acpi_ipmi_space_handler(). And it is invoked after invoking __ipmi_dev_kill() where acpi_ipmi_device is deleted from the list with a "dead" flag set, and the "dead" flag check is also introduced to the point where a tx_msg is going to be added to the tx_msg_list so that no new tx_msg can be created after returning from the __ipmi_dev_kill(). 3. The waiting codes in ipmi_flush_tx_msg() is deleted because it is not required since this patch ensures no acpi_ipmi reference is still held for ipmi_user_t before calling ipmi_destroy_user() and ipmi_destroy_user() can ensure no more ipmi_msg_handler() can happen after returning from ipmi_destroy_user(). 4. The flushing of tx_msg is also moved out of ipmi_lock in this patch. The forthcoming IPMI operation region handler installation changes also requires acpi_ipmi_device be handled in this style. The header comment of the file is also updated due to this design change. Signed-off-by: Lv Zheng <lv.zheng@intel.com> Reviewed-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-09-13 01:13:54 -04:00
if (!handle)
goto err_ref;
ipmi_device = ipmi_dev_alloc(iface, smi_data.dev, handle);
ACPI / IPMI: Fix race caused by the unprotected ACPI IPMI user This patch uses reference counting to fix the race caused by the unprotected ACPI IPMI user. There are two rules for using the ipmi_si APIs: 1. In ipmi_si, ipmi_destroy_user() can ensure that no ipmi_recv_msg will be passed to ipmi_msg_handler(), but ipmi_request_settime() can not use an invalid ipmi_user_t. This means the ipmi_si users must ensure that there won't be any local references on ipmi_user_t before invoking ipmi_destroy_user(). 2. In ipmi_si, the smi_gone()/new_smi() callbacks are protected by smi_watchers_mutex, so their execution is serialized. But as a new smi can re-use a freed intf_num, it requires that the callback implementation must not use intf_num as an identification mean or it must ensure all references to the previous smi are all dropped before exiting smi_gone() callback. As the acpi_ipmi_device->user_interface check in acpi_ipmi_space_handler() can happen before setting user_interface to NULL and codes after the check in acpi_ipmi_space_handler() can happen after user_interface becomes NULL, the on-going acpi_ipmi_space_handler() still can pass an invalid acpi_ipmi_device->user_interface to ipmi_request_settime(). Such race conditions are not allowed by the IPMI layer's API design as a crash will happen in ipmi_request_settime() if something like that happens. This patch follows the ipmi_devintf.c design: 1. Invoke ipmi_destroy_user() after the reference count of acpi_ipmi_device drops to 0. References of acpi_ipmi_device dropping to 0 also means tx_msg related to this acpi_ipmi_device are all freed. This matches the IPMI layer's API calling rule on ipmi_destroy_user() and ipmi_request_settime(). 2. ipmi_flush_tx_msg() is performed so that no on-going tx_msg can still be running in acpi_ipmi_space_handler(). And it is invoked after invoking __ipmi_dev_kill() where acpi_ipmi_device is deleted from the list with a "dead" flag set, and the "dead" flag check is also introduced to the point where a tx_msg is going to be added to the tx_msg_list so that no new tx_msg can be created after returning from the __ipmi_dev_kill(). 3. The waiting codes in ipmi_flush_tx_msg() is deleted because it is not required since this patch ensures no acpi_ipmi reference is still held for ipmi_user_t before calling ipmi_destroy_user() and ipmi_destroy_user() can ensure no more ipmi_msg_handler() can happen after returning from ipmi_destroy_user(). 4. The flushing of tx_msg is also moved out of ipmi_lock in this patch. The forthcoming IPMI operation region handler installation changes also requires acpi_ipmi_device be handled in this style. The header comment of the file is also updated due to this design change. Signed-off-by: Lv Zheng <lv.zheng@intel.com> Reviewed-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-09-13 01:13:54 -04:00
if (!ipmi_device) {
dev_warn(smi_data.dev, "Can't create IPMI user interface\n");
ACPI / IPMI: Fix race caused by the unprotected ACPI IPMI user This patch uses reference counting to fix the race caused by the unprotected ACPI IPMI user. There are two rules for using the ipmi_si APIs: 1. In ipmi_si, ipmi_destroy_user() can ensure that no ipmi_recv_msg will be passed to ipmi_msg_handler(), but ipmi_request_settime() can not use an invalid ipmi_user_t. This means the ipmi_si users must ensure that there won't be any local references on ipmi_user_t before invoking ipmi_destroy_user(). 2. In ipmi_si, the smi_gone()/new_smi() callbacks are protected by smi_watchers_mutex, so their execution is serialized. But as a new smi can re-use a freed intf_num, it requires that the callback implementation must not use intf_num as an identification mean or it must ensure all references to the previous smi are all dropped before exiting smi_gone() callback. As the acpi_ipmi_device->user_interface check in acpi_ipmi_space_handler() can happen before setting user_interface to NULL and codes after the check in acpi_ipmi_space_handler() can happen after user_interface becomes NULL, the on-going acpi_ipmi_space_handler() still can pass an invalid acpi_ipmi_device->user_interface to ipmi_request_settime(). Such race conditions are not allowed by the IPMI layer's API design as a crash will happen in ipmi_request_settime() if something like that happens. This patch follows the ipmi_devintf.c design: 1. Invoke ipmi_destroy_user() after the reference count of acpi_ipmi_device drops to 0. References of acpi_ipmi_device dropping to 0 also means tx_msg related to this acpi_ipmi_device are all freed. This matches the IPMI layer's API calling rule on ipmi_destroy_user() and ipmi_request_settime(). 2. ipmi_flush_tx_msg() is performed so that no on-going tx_msg can still be running in acpi_ipmi_space_handler(). And it is invoked after invoking __ipmi_dev_kill() where acpi_ipmi_device is deleted from the list with a "dead" flag set, and the "dead" flag check is also introduced to the point where a tx_msg is going to be added to the tx_msg_list so that no new tx_msg can be created after returning from the __ipmi_dev_kill(). 3. The waiting codes in ipmi_flush_tx_msg() is deleted because it is not required since this patch ensures no acpi_ipmi reference is still held for ipmi_user_t before calling ipmi_destroy_user() and ipmi_destroy_user() can ensure no more ipmi_msg_handler() can happen after returning from ipmi_destroy_user(). 4. The flushing of tx_msg is also moved out of ipmi_lock in this patch. The forthcoming IPMI operation region handler installation changes also requires acpi_ipmi_device be handled in this style. The header comment of the file is also updated due to this design change. Signed-off-by: Lv Zheng <lv.zheng@intel.com> Reviewed-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-09-13 01:13:54 -04:00
goto err_ref;
}
mutex_lock(&driver_data.ipmi_lock);
list_for_each_entry(temp, &driver_data.ipmi_devices, head) {
/*
* if the corresponding ACPI handle is already added
* to the device list, don't add it again.
*/
if (temp->handle == handle)
ACPI / IPMI: Fix race caused by the unprotected ACPI IPMI user This patch uses reference counting to fix the race caused by the unprotected ACPI IPMI user. There are two rules for using the ipmi_si APIs: 1. In ipmi_si, ipmi_destroy_user() can ensure that no ipmi_recv_msg will be passed to ipmi_msg_handler(), but ipmi_request_settime() can not use an invalid ipmi_user_t. This means the ipmi_si users must ensure that there won't be any local references on ipmi_user_t before invoking ipmi_destroy_user(). 2. In ipmi_si, the smi_gone()/new_smi() callbacks are protected by smi_watchers_mutex, so their execution is serialized. But as a new smi can re-use a freed intf_num, it requires that the callback implementation must not use intf_num as an identification mean or it must ensure all references to the previous smi are all dropped before exiting smi_gone() callback. As the acpi_ipmi_device->user_interface check in acpi_ipmi_space_handler() can happen before setting user_interface to NULL and codes after the check in acpi_ipmi_space_handler() can happen after user_interface becomes NULL, the on-going acpi_ipmi_space_handler() still can pass an invalid acpi_ipmi_device->user_interface to ipmi_request_settime(). Such race conditions are not allowed by the IPMI layer's API design as a crash will happen in ipmi_request_settime() if something like that happens. This patch follows the ipmi_devintf.c design: 1. Invoke ipmi_destroy_user() after the reference count of acpi_ipmi_device drops to 0. References of acpi_ipmi_device dropping to 0 also means tx_msg related to this acpi_ipmi_device are all freed. This matches the IPMI layer's API calling rule on ipmi_destroy_user() and ipmi_request_settime(). 2. ipmi_flush_tx_msg() is performed so that no on-going tx_msg can still be running in acpi_ipmi_space_handler(). And it is invoked after invoking __ipmi_dev_kill() where acpi_ipmi_device is deleted from the list with a "dead" flag set, and the "dead" flag check is also introduced to the point where a tx_msg is going to be added to the tx_msg_list so that no new tx_msg can be created after returning from the __ipmi_dev_kill(). 3. The waiting codes in ipmi_flush_tx_msg() is deleted because it is not required since this patch ensures no acpi_ipmi reference is still held for ipmi_user_t before calling ipmi_destroy_user() and ipmi_destroy_user() can ensure no more ipmi_msg_handler() can happen after returning from ipmi_destroy_user(). 4. The flushing of tx_msg is also moved out of ipmi_lock in this patch. The forthcoming IPMI operation region handler installation changes also requires acpi_ipmi_device be handled in this style. The header comment of the file is also updated due to this design change. Signed-off-by: Lv Zheng <lv.zheng@intel.com> Reviewed-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-09-13 01:13:54 -04:00
goto err_lock;
}
ACPI / IPMI: Use global IPMI operation region handler It is found on a real machine, in its ACPI namespace, the IPMI OperationRegions (in the ACPI000D - ACPI power meter) are not defined under the IPMI system interface device (the IPI0001 with KCS type returned from _IFT control method): Device (PMI0) { Name (_HID, "ACPI000D") // _HID: Hardware ID OperationRegion (SYSI, IPMI, 0x0600, 0x0100) Field (SYSI, BufferAcc, Lock, Preserve) { AccessAs (BufferAcc, 0x01), Offset (0x58), SCMD, 8, GCMD, 8 } OperationRegion (POWR, IPMI, 0x3000, 0x0100) Field (POWR, BufferAcc, Lock, Preserve) { AccessAs (BufferAcc, 0x01), Offset (0xB3), GPMM, 8 } } Device (PCI0) { Device (ISA) { Device (NIPM) { Name (_HID, EisaId ("IPI0001")) // _HID: Hardware ID Method (_IFT, 0, NotSerialized) // _IFT: IPMI Interface Type { Return (0x01) } } } } Current ACPI_IPMI code registers IPMI operation region handler on a per-device basis, so for the above namespace the IPMI operation region handler is registered only under the scope of \_SB.PCI0.ISA.NIPM. Thus when an IPMI operation region field of \PMI0 is accessed, there are errors reported on such platform: ACPI Error: No handlers for Region [IPMI] ACPI Error: Region IPMI(7) has no handler The solution is to install an IPMI operation region handler from root node so that every object that defines IPMI OperationRegion can get an address space handler registered. When an IPMI operation region field is accessed, the Network Function (0x06 for SYSI and 0x30 for POWR) and the Command (SCMD, GCMD, GPMM) are passed to the operation region handler, there is no system interface specified by the BIOS. The patch tries to select one system interface by monitoring the system interface notification. IPMI messages passed from the ACPI codes are sent to this selected global IPMI system interface. The ACPI_IPMI will always select the first registered IPMI interface with an ACPI handle (i.e., defined in the ACPI namespace). It's hard to determine the selection when there are multiple IPMI system interfaces defined in the ACPI namespace. According to the IPMI specification: A BMC device may make available multiple system interfaces, but only one management controller is allowed to be 'active' BMC that provides BMC functionality for the system (in case of a 'partitioned' system, there can be only one active BMC per partition). Only the system interface(s) for the active BMC allowed to respond to the 'Get Device Id' command. According to the ipmi_si desigin: The ipmi_si registeration notifications can only happen after a successful "Get Device ID" command. Thus it should be OK for non-partitioned systems to do such selection. However, we do not have much knowledge on 'partitioned' systems. References: https://bugzilla.kernel.org/show_bug.cgi?id=46741 Signed-off-by: Lv Zheng <lv.zheng@intel.com> Reviewed-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-09-13 01:14:02 -04:00
if (!driver_data.selected_smi)
driver_data.selected_smi = ipmi_device;
ACPI / IPMI: Fix race caused by the unprotected ACPI IPMI user This patch uses reference counting to fix the race caused by the unprotected ACPI IPMI user. There are two rules for using the ipmi_si APIs: 1. In ipmi_si, ipmi_destroy_user() can ensure that no ipmi_recv_msg will be passed to ipmi_msg_handler(), but ipmi_request_settime() can not use an invalid ipmi_user_t. This means the ipmi_si users must ensure that there won't be any local references on ipmi_user_t before invoking ipmi_destroy_user(). 2. In ipmi_si, the smi_gone()/new_smi() callbacks are protected by smi_watchers_mutex, so their execution is serialized. But as a new smi can re-use a freed intf_num, it requires that the callback implementation must not use intf_num as an identification mean or it must ensure all references to the previous smi are all dropped before exiting smi_gone() callback. As the acpi_ipmi_device->user_interface check in acpi_ipmi_space_handler() can happen before setting user_interface to NULL and codes after the check in acpi_ipmi_space_handler() can happen after user_interface becomes NULL, the on-going acpi_ipmi_space_handler() still can pass an invalid acpi_ipmi_device->user_interface to ipmi_request_settime(). Such race conditions are not allowed by the IPMI layer's API design as a crash will happen in ipmi_request_settime() if something like that happens. This patch follows the ipmi_devintf.c design: 1. Invoke ipmi_destroy_user() after the reference count of acpi_ipmi_device drops to 0. References of acpi_ipmi_device dropping to 0 also means tx_msg related to this acpi_ipmi_device are all freed. This matches the IPMI layer's API calling rule on ipmi_destroy_user() and ipmi_request_settime(). 2. ipmi_flush_tx_msg() is performed so that no on-going tx_msg can still be running in acpi_ipmi_space_handler(). And it is invoked after invoking __ipmi_dev_kill() where acpi_ipmi_device is deleted from the list with a "dead" flag set, and the "dead" flag check is also introduced to the point where a tx_msg is going to be added to the tx_msg_list so that no new tx_msg can be created after returning from the __ipmi_dev_kill(). 3. The waiting codes in ipmi_flush_tx_msg() is deleted because it is not required since this patch ensures no acpi_ipmi reference is still held for ipmi_user_t before calling ipmi_destroy_user() and ipmi_destroy_user() can ensure no more ipmi_msg_handler() can happen after returning from ipmi_destroy_user(). 4. The flushing of tx_msg is also moved out of ipmi_lock in this patch. The forthcoming IPMI operation region handler installation changes also requires acpi_ipmi_device be handled in this style. The header comment of the file is also updated due to this design change. Signed-off-by: Lv Zheng <lv.zheng@intel.com> Reviewed-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-09-13 01:13:54 -04:00
list_add_tail(&ipmi_device->head, &driver_data.ipmi_devices);
mutex_unlock(&driver_data.ipmi_lock);
ACPI / IPMI: Fix race caused by the unprotected ACPI IPMI user This patch uses reference counting to fix the race caused by the unprotected ACPI IPMI user. There are two rules for using the ipmi_si APIs: 1. In ipmi_si, ipmi_destroy_user() can ensure that no ipmi_recv_msg will be passed to ipmi_msg_handler(), but ipmi_request_settime() can not use an invalid ipmi_user_t. This means the ipmi_si users must ensure that there won't be any local references on ipmi_user_t before invoking ipmi_destroy_user(). 2. In ipmi_si, the smi_gone()/new_smi() callbacks are protected by smi_watchers_mutex, so their execution is serialized. But as a new smi can re-use a freed intf_num, it requires that the callback implementation must not use intf_num as an identification mean or it must ensure all references to the previous smi are all dropped before exiting smi_gone() callback. As the acpi_ipmi_device->user_interface check in acpi_ipmi_space_handler() can happen before setting user_interface to NULL and codes after the check in acpi_ipmi_space_handler() can happen after user_interface becomes NULL, the on-going acpi_ipmi_space_handler() still can pass an invalid acpi_ipmi_device->user_interface to ipmi_request_settime(). Such race conditions are not allowed by the IPMI layer's API design as a crash will happen in ipmi_request_settime() if something like that happens. This patch follows the ipmi_devintf.c design: 1. Invoke ipmi_destroy_user() after the reference count of acpi_ipmi_device drops to 0. References of acpi_ipmi_device dropping to 0 also means tx_msg related to this acpi_ipmi_device are all freed. This matches the IPMI layer's API calling rule on ipmi_destroy_user() and ipmi_request_settime(). 2. ipmi_flush_tx_msg() is performed so that no on-going tx_msg can still be running in acpi_ipmi_space_handler(). And it is invoked after invoking __ipmi_dev_kill() where acpi_ipmi_device is deleted from the list with a "dead" flag set, and the "dead" flag check is also introduced to the point where a tx_msg is going to be added to the tx_msg_list so that no new tx_msg can be created after returning from the __ipmi_dev_kill(). 3. The waiting codes in ipmi_flush_tx_msg() is deleted because it is not required since this patch ensures no acpi_ipmi reference is still held for ipmi_user_t before calling ipmi_destroy_user() and ipmi_destroy_user() can ensure no more ipmi_msg_handler() can happen after returning from ipmi_destroy_user(). 4. The flushing of tx_msg is also moved out of ipmi_lock in this patch. The forthcoming IPMI operation region handler installation changes also requires acpi_ipmi_device be handled in this style. The header comment of the file is also updated due to this design change. Signed-off-by: Lv Zheng <lv.zheng@intel.com> Reviewed-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-09-13 01:13:54 -04:00
put_device(smi_data.dev);
return;
ACPI / IPMI: Fix race caused by the unprotected ACPI IPMI user This patch uses reference counting to fix the race caused by the unprotected ACPI IPMI user. There are two rules for using the ipmi_si APIs: 1. In ipmi_si, ipmi_destroy_user() can ensure that no ipmi_recv_msg will be passed to ipmi_msg_handler(), but ipmi_request_settime() can not use an invalid ipmi_user_t. This means the ipmi_si users must ensure that there won't be any local references on ipmi_user_t before invoking ipmi_destroy_user(). 2. In ipmi_si, the smi_gone()/new_smi() callbacks are protected by smi_watchers_mutex, so their execution is serialized. But as a new smi can re-use a freed intf_num, it requires that the callback implementation must not use intf_num as an identification mean or it must ensure all references to the previous smi are all dropped before exiting smi_gone() callback. As the acpi_ipmi_device->user_interface check in acpi_ipmi_space_handler() can happen before setting user_interface to NULL and codes after the check in acpi_ipmi_space_handler() can happen after user_interface becomes NULL, the on-going acpi_ipmi_space_handler() still can pass an invalid acpi_ipmi_device->user_interface to ipmi_request_settime(). Such race conditions are not allowed by the IPMI layer's API design as a crash will happen in ipmi_request_settime() if something like that happens. This patch follows the ipmi_devintf.c design: 1. Invoke ipmi_destroy_user() after the reference count of acpi_ipmi_device drops to 0. References of acpi_ipmi_device dropping to 0 also means tx_msg related to this acpi_ipmi_device are all freed. This matches the IPMI layer's API calling rule on ipmi_destroy_user() and ipmi_request_settime(). 2. ipmi_flush_tx_msg() is performed so that no on-going tx_msg can still be running in acpi_ipmi_space_handler(). And it is invoked after invoking __ipmi_dev_kill() where acpi_ipmi_device is deleted from the list with a "dead" flag set, and the "dead" flag check is also introduced to the point where a tx_msg is going to be added to the tx_msg_list so that no new tx_msg can be created after returning from the __ipmi_dev_kill(). 3. The waiting codes in ipmi_flush_tx_msg() is deleted because it is not required since this patch ensures no acpi_ipmi reference is still held for ipmi_user_t before calling ipmi_destroy_user() and ipmi_destroy_user() can ensure no more ipmi_msg_handler() can happen after returning from ipmi_destroy_user(). 4. The flushing of tx_msg is also moved out of ipmi_lock in this patch. The forthcoming IPMI operation region handler installation changes also requires acpi_ipmi_device be handled in this style. The header comment of the file is also updated due to this design change. Signed-off-by: Lv Zheng <lv.zheng@intel.com> Reviewed-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-09-13 01:13:54 -04:00
err_lock:
mutex_unlock(&driver_data.ipmi_lock);
ACPI / IPMI: Fix race caused by the unprotected ACPI IPMI user This patch uses reference counting to fix the race caused by the unprotected ACPI IPMI user. There are two rules for using the ipmi_si APIs: 1. In ipmi_si, ipmi_destroy_user() can ensure that no ipmi_recv_msg will be passed to ipmi_msg_handler(), but ipmi_request_settime() can not use an invalid ipmi_user_t. This means the ipmi_si users must ensure that there won't be any local references on ipmi_user_t before invoking ipmi_destroy_user(). 2. In ipmi_si, the smi_gone()/new_smi() callbacks are protected by smi_watchers_mutex, so their execution is serialized. But as a new smi can re-use a freed intf_num, it requires that the callback implementation must not use intf_num as an identification mean or it must ensure all references to the previous smi are all dropped before exiting smi_gone() callback. As the acpi_ipmi_device->user_interface check in acpi_ipmi_space_handler() can happen before setting user_interface to NULL and codes after the check in acpi_ipmi_space_handler() can happen after user_interface becomes NULL, the on-going acpi_ipmi_space_handler() still can pass an invalid acpi_ipmi_device->user_interface to ipmi_request_settime(). Such race conditions are not allowed by the IPMI layer's API design as a crash will happen in ipmi_request_settime() if something like that happens. This patch follows the ipmi_devintf.c design: 1. Invoke ipmi_destroy_user() after the reference count of acpi_ipmi_device drops to 0. References of acpi_ipmi_device dropping to 0 also means tx_msg related to this acpi_ipmi_device are all freed. This matches the IPMI layer's API calling rule on ipmi_destroy_user() and ipmi_request_settime(). 2. ipmi_flush_tx_msg() is performed so that no on-going tx_msg can still be running in acpi_ipmi_space_handler(). And it is invoked after invoking __ipmi_dev_kill() where acpi_ipmi_device is deleted from the list with a "dead" flag set, and the "dead" flag check is also introduced to the point where a tx_msg is going to be added to the tx_msg_list so that no new tx_msg can be created after returning from the __ipmi_dev_kill(). 3. The waiting codes in ipmi_flush_tx_msg() is deleted because it is not required since this patch ensures no acpi_ipmi reference is still held for ipmi_user_t before calling ipmi_destroy_user() and ipmi_destroy_user() can ensure no more ipmi_msg_handler() can happen after returning from ipmi_destroy_user(). 4. The flushing of tx_msg is also moved out of ipmi_lock in this patch. The forthcoming IPMI operation region handler installation changes also requires acpi_ipmi_device be handled in this style. The header comment of the file is also updated due to this design change. Signed-off-by: Lv Zheng <lv.zheng@intel.com> Reviewed-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-09-13 01:13:54 -04:00
ipmi_dev_release(ipmi_device);
err_ref:
put_device(smi_data.dev);
return;
}
static void ipmi_bmc_gone(int iface)
{
struct acpi_ipmi_device *ipmi_device, *temp;
ACPI / IPMI: Fix race caused by the unprotected ACPI IPMI user This patch uses reference counting to fix the race caused by the unprotected ACPI IPMI user. There are two rules for using the ipmi_si APIs: 1. In ipmi_si, ipmi_destroy_user() can ensure that no ipmi_recv_msg will be passed to ipmi_msg_handler(), but ipmi_request_settime() can not use an invalid ipmi_user_t. This means the ipmi_si users must ensure that there won't be any local references on ipmi_user_t before invoking ipmi_destroy_user(). 2. In ipmi_si, the smi_gone()/new_smi() callbacks are protected by smi_watchers_mutex, so their execution is serialized. But as a new smi can re-use a freed intf_num, it requires that the callback implementation must not use intf_num as an identification mean or it must ensure all references to the previous smi are all dropped before exiting smi_gone() callback. As the acpi_ipmi_device->user_interface check in acpi_ipmi_space_handler() can happen before setting user_interface to NULL and codes after the check in acpi_ipmi_space_handler() can happen after user_interface becomes NULL, the on-going acpi_ipmi_space_handler() still can pass an invalid acpi_ipmi_device->user_interface to ipmi_request_settime(). Such race conditions are not allowed by the IPMI layer's API design as a crash will happen in ipmi_request_settime() if something like that happens. This patch follows the ipmi_devintf.c design: 1. Invoke ipmi_destroy_user() after the reference count of acpi_ipmi_device drops to 0. References of acpi_ipmi_device dropping to 0 also means tx_msg related to this acpi_ipmi_device are all freed. This matches the IPMI layer's API calling rule on ipmi_destroy_user() and ipmi_request_settime(). 2. ipmi_flush_tx_msg() is performed so that no on-going tx_msg can still be running in acpi_ipmi_space_handler(). And it is invoked after invoking __ipmi_dev_kill() where acpi_ipmi_device is deleted from the list with a "dead" flag set, and the "dead" flag check is also introduced to the point where a tx_msg is going to be added to the tx_msg_list so that no new tx_msg can be created after returning from the __ipmi_dev_kill(). 3. The waiting codes in ipmi_flush_tx_msg() is deleted because it is not required since this patch ensures no acpi_ipmi reference is still held for ipmi_user_t before calling ipmi_destroy_user() and ipmi_destroy_user() can ensure no more ipmi_msg_handler() can happen after returning from ipmi_destroy_user(). 4. The flushing of tx_msg is also moved out of ipmi_lock in this patch. The forthcoming IPMI operation region handler installation changes also requires acpi_ipmi_device be handled in this style. The header comment of the file is also updated due to this design change. Signed-off-by: Lv Zheng <lv.zheng@intel.com> Reviewed-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-09-13 01:13:54 -04:00
bool dev_found = false;
mutex_lock(&driver_data.ipmi_lock);
list_for_each_entry_safe(ipmi_device, temp,
&driver_data.ipmi_devices, head) {
ACPI / IPMI: Fix race caused by the unprotected ACPI IPMI user This patch uses reference counting to fix the race caused by the unprotected ACPI IPMI user. There are two rules for using the ipmi_si APIs: 1. In ipmi_si, ipmi_destroy_user() can ensure that no ipmi_recv_msg will be passed to ipmi_msg_handler(), but ipmi_request_settime() can not use an invalid ipmi_user_t. This means the ipmi_si users must ensure that there won't be any local references on ipmi_user_t before invoking ipmi_destroy_user(). 2. In ipmi_si, the smi_gone()/new_smi() callbacks are protected by smi_watchers_mutex, so their execution is serialized. But as a new smi can re-use a freed intf_num, it requires that the callback implementation must not use intf_num as an identification mean or it must ensure all references to the previous smi are all dropped before exiting smi_gone() callback. As the acpi_ipmi_device->user_interface check in acpi_ipmi_space_handler() can happen before setting user_interface to NULL and codes after the check in acpi_ipmi_space_handler() can happen after user_interface becomes NULL, the on-going acpi_ipmi_space_handler() still can pass an invalid acpi_ipmi_device->user_interface to ipmi_request_settime(). Such race conditions are not allowed by the IPMI layer's API design as a crash will happen in ipmi_request_settime() if something like that happens. This patch follows the ipmi_devintf.c design: 1. Invoke ipmi_destroy_user() after the reference count of acpi_ipmi_device drops to 0. References of acpi_ipmi_device dropping to 0 also means tx_msg related to this acpi_ipmi_device are all freed. This matches the IPMI layer's API calling rule on ipmi_destroy_user() and ipmi_request_settime(). 2. ipmi_flush_tx_msg() is performed so that no on-going tx_msg can still be running in acpi_ipmi_space_handler(). And it is invoked after invoking __ipmi_dev_kill() where acpi_ipmi_device is deleted from the list with a "dead" flag set, and the "dead" flag check is also introduced to the point where a tx_msg is going to be added to the tx_msg_list so that no new tx_msg can be created after returning from the __ipmi_dev_kill(). 3. The waiting codes in ipmi_flush_tx_msg() is deleted because it is not required since this patch ensures no acpi_ipmi reference is still held for ipmi_user_t before calling ipmi_destroy_user() and ipmi_destroy_user() can ensure no more ipmi_msg_handler() can happen after returning from ipmi_destroy_user(). 4. The flushing of tx_msg is also moved out of ipmi_lock in this patch. The forthcoming IPMI operation region handler installation changes also requires acpi_ipmi_device be handled in this style. The header comment of the file is also updated due to this design change. Signed-off-by: Lv Zheng <lv.zheng@intel.com> Reviewed-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-09-13 01:13:54 -04:00
if (ipmi_device->ipmi_ifnum != iface) {
dev_found = true;
__ipmi_dev_kill(ipmi_device);
break;
}
}
ACPI / IPMI: Use global IPMI operation region handler It is found on a real machine, in its ACPI namespace, the IPMI OperationRegions (in the ACPI000D - ACPI power meter) are not defined under the IPMI system interface device (the IPI0001 with KCS type returned from _IFT control method): Device (PMI0) { Name (_HID, "ACPI000D") // _HID: Hardware ID OperationRegion (SYSI, IPMI, 0x0600, 0x0100) Field (SYSI, BufferAcc, Lock, Preserve) { AccessAs (BufferAcc, 0x01), Offset (0x58), SCMD, 8, GCMD, 8 } OperationRegion (POWR, IPMI, 0x3000, 0x0100) Field (POWR, BufferAcc, Lock, Preserve) { AccessAs (BufferAcc, 0x01), Offset (0xB3), GPMM, 8 } } Device (PCI0) { Device (ISA) { Device (NIPM) { Name (_HID, EisaId ("IPI0001")) // _HID: Hardware ID Method (_IFT, 0, NotSerialized) // _IFT: IPMI Interface Type { Return (0x01) } } } } Current ACPI_IPMI code registers IPMI operation region handler on a per-device basis, so for the above namespace the IPMI operation region handler is registered only under the scope of \_SB.PCI0.ISA.NIPM. Thus when an IPMI operation region field of \PMI0 is accessed, there are errors reported on such platform: ACPI Error: No handlers for Region [IPMI] ACPI Error: Region IPMI(7) has no handler The solution is to install an IPMI operation region handler from root node so that every object that defines IPMI OperationRegion can get an address space handler registered. When an IPMI operation region field is accessed, the Network Function (0x06 for SYSI and 0x30 for POWR) and the Command (SCMD, GCMD, GPMM) are passed to the operation region handler, there is no system interface specified by the BIOS. The patch tries to select one system interface by monitoring the system interface notification. IPMI messages passed from the ACPI codes are sent to this selected global IPMI system interface. The ACPI_IPMI will always select the first registered IPMI interface with an ACPI handle (i.e., defined in the ACPI namespace). It's hard to determine the selection when there are multiple IPMI system interfaces defined in the ACPI namespace. According to the IPMI specification: A BMC device may make available multiple system interfaces, but only one management controller is allowed to be 'active' BMC that provides BMC functionality for the system (in case of a 'partitioned' system, there can be only one active BMC per partition). Only the system interface(s) for the active BMC allowed to respond to the 'Get Device Id' command. According to the ipmi_si desigin: The ipmi_si registeration notifications can only happen after a successful "Get Device ID" command. Thus it should be OK for non-partitioned systems to do such selection. However, we do not have much knowledge on 'partitioned' systems. References: https://bugzilla.kernel.org/show_bug.cgi?id=46741 Signed-off-by: Lv Zheng <lv.zheng@intel.com> Reviewed-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-09-13 01:14:02 -04:00
if (!driver_data.selected_smi)
driver_data.selected_smi = list_first_entry_or_null(
&driver_data.ipmi_devices,
struct acpi_ipmi_device, head);
mutex_unlock(&driver_data.ipmi_lock);
ACPI / IPMI: Fix race caused by the unprotected ACPI IPMI user This patch uses reference counting to fix the race caused by the unprotected ACPI IPMI user. There are two rules for using the ipmi_si APIs: 1. In ipmi_si, ipmi_destroy_user() can ensure that no ipmi_recv_msg will be passed to ipmi_msg_handler(), but ipmi_request_settime() can not use an invalid ipmi_user_t. This means the ipmi_si users must ensure that there won't be any local references on ipmi_user_t before invoking ipmi_destroy_user(). 2. In ipmi_si, the smi_gone()/new_smi() callbacks are protected by smi_watchers_mutex, so their execution is serialized. But as a new smi can re-use a freed intf_num, it requires that the callback implementation must not use intf_num as an identification mean or it must ensure all references to the previous smi are all dropped before exiting smi_gone() callback. As the acpi_ipmi_device->user_interface check in acpi_ipmi_space_handler() can happen before setting user_interface to NULL and codes after the check in acpi_ipmi_space_handler() can happen after user_interface becomes NULL, the on-going acpi_ipmi_space_handler() still can pass an invalid acpi_ipmi_device->user_interface to ipmi_request_settime(). Such race conditions are not allowed by the IPMI layer's API design as a crash will happen in ipmi_request_settime() if something like that happens. This patch follows the ipmi_devintf.c design: 1. Invoke ipmi_destroy_user() after the reference count of acpi_ipmi_device drops to 0. References of acpi_ipmi_device dropping to 0 also means tx_msg related to this acpi_ipmi_device are all freed. This matches the IPMI layer's API calling rule on ipmi_destroy_user() and ipmi_request_settime(). 2. ipmi_flush_tx_msg() is performed so that no on-going tx_msg can still be running in acpi_ipmi_space_handler(). And it is invoked after invoking __ipmi_dev_kill() where acpi_ipmi_device is deleted from the list with a "dead" flag set, and the "dead" flag check is also introduced to the point where a tx_msg is going to be added to the tx_msg_list so that no new tx_msg can be created after returning from the __ipmi_dev_kill(). 3. The waiting codes in ipmi_flush_tx_msg() is deleted because it is not required since this patch ensures no acpi_ipmi reference is still held for ipmi_user_t before calling ipmi_destroy_user() and ipmi_destroy_user() can ensure no more ipmi_msg_handler() can happen after returning from ipmi_destroy_user(). 4. The flushing of tx_msg is also moved out of ipmi_lock in this patch. The forthcoming IPMI operation region handler installation changes also requires acpi_ipmi_device be handled in this style. The header comment of the file is also updated due to this design change. Signed-off-by: Lv Zheng <lv.zheng@intel.com> Reviewed-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-09-13 01:13:54 -04:00
if (dev_found) {
ipmi_flush_tx_msg(ipmi_device);
acpi_ipmi_dev_put(ipmi_device);
}
}
/*
* This is the IPMI opregion space handler.
* @function: indicates the read/write. In fact as the IPMI message is driven
* by command, only write is meaningful.
* @address: This contains the netfn/command of IPMI request message.
* @bits : not used.
* @value : it is an in/out parameter. It points to the IPMI message buffer.
* Before the IPMI message is sent, it represents the actual request
* IPMI message. After the IPMI message is finished, it represents
* the response IPMI message returned by IPMI command.
* @handler_context: IPMI device context.
*/
static acpi_status
acpi_ipmi_space_handler(u32 function, acpi_physical_address address,
u32 bits, acpi_integer *value,
void *handler_context, void *region_context)
{
struct acpi_ipmi_msg *tx_msg;
ACPI / IPMI: Fix race caused by the unprotected ACPI IPMI user This patch uses reference counting to fix the race caused by the unprotected ACPI IPMI user. There are two rules for using the ipmi_si APIs: 1. In ipmi_si, ipmi_destroy_user() can ensure that no ipmi_recv_msg will be passed to ipmi_msg_handler(), but ipmi_request_settime() can not use an invalid ipmi_user_t. This means the ipmi_si users must ensure that there won't be any local references on ipmi_user_t before invoking ipmi_destroy_user(). 2. In ipmi_si, the smi_gone()/new_smi() callbacks are protected by smi_watchers_mutex, so their execution is serialized. But as a new smi can re-use a freed intf_num, it requires that the callback implementation must not use intf_num as an identification mean or it must ensure all references to the previous smi are all dropped before exiting smi_gone() callback. As the acpi_ipmi_device->user_interface check in acpi_ipmi_space_handler() can happen before setting user_interface to NULL and codes after the check in acpi_ipmi_space_handler() can happen after user_interface becomes NULL, the on-going acpi_ipmi_space_handler() still can pass an invalid acpi_ipmi_device->user_interface to ipmi_request_settime(). Such race conditions are not allowed by the IPMI layer's API design as a crash will happen in ipmi_request_settime() if something like that happens. This patch follows the ipmi_devintf.c design: 1. Invoke ipmi_destroy_user() after the reference count of acpi_ipmi_device drops to 0. References of acpi_ipmi_device dropping to 0 also means tx_msg related to this acpi_ipmi_device are all freed. This matches the IPMI layer's API calling rule on ipmi_destroy_user() and ipmi_request_settime(). 2. ipmi_flush_tx_msg() is performed so that no on-going tx_msg can still be running in acpi_ipmi_space_handler(). And it is invoked after invoking __ipmi_dev_kill() where acpi_ipmi_device is deleted from the list with a "dead" flag set, and the "dead" flag check is also introduced to the point where a tx_msg is going to be added to the tx_msg_list so that no new tx_msg can be created after returning from the __ipmi_dev_kill(). 3. The waiting codes in ipmi_flush_tx_msg() is deleted because it is not required since this patch ensures no acpi_ipmi reference is still held for ipmi_user_t before calling ipmi_destroy_user() and ipmi_destroy_user() can ensure no more ipmi_msg_handler() can happen after returning from ipmi_destroy_user(). 4. The flushing of tx_msg is also moved out of ipmi_lock in this patch. The forthcoming IPMI operation region handler installation changes also requires acpi_ipmi_device be handled in this style. The header comment of the file is also updated due to this design change. Signed-off-by: Lv Zheng <lv.zheng@intel.com> Reviewed-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-09-13 01:13:54 -04:00
struct acpi_ipmi_device *ipmi_device;
int err;
acpi_status status;
ACPI / IPMI: Fix atomic context requirement of ipmi_msg_handler() This patch fixes the issues indicated by the test results that ipmi_msg_handler() is invoked in atomic context. BUG: scheduling while atomic: kipmi0/18933/0x10000100 Modules linked in: ipmi_si acpi_ipmi ... CPU: 3 PID: 18933 Comm: kipmi0 Tainted: G AW 3.10.0-rc7+ #2 Hardware name: QCI QSSC-S4R/QSSC-S4R, BIOS QSSC-S4R.QCI.01.00.0027.070120100606 07/01/2010 ffff8838245eea00 ffff88103fc63c98 ffffffff814c4a1e ffff88103fc63ca8 ffffffff814bfbab ffff88103fc63d28 ffffffff814c73e0 ffff88103933cbd4 0000000000000096 ffff88103fc63ce8 ffff88102f618000 ffff881035c01fd8 Call Trace: <IRQ> [<ffffffff814c4a1e>] dump_stack+0x19/0x1b [<ffffffff814bfbab>] __schedule_bug+0x46/0x54 [<ffffffff814c73e0>] __schedule+0x83/0x59c [<ffffffff81058853>] __cond_resched+0x22/0x2d [<ffffffff814c794b>] _cond_resched+0x14/0x1d [<ffffffff814c6d82>] mutex_lock+0x11/0x32 [<ffffffff8101e1e9>] ? __default_send_IPI_dest_field.constprop.0+0x53/0x58 [<ffffffffa09e3f9c>] ipmi_msg_handler+0x23/0x166 [ipmi_si] [<ffffffff812bf6e4>] deliver_response+0x55/0x5a [<ffffffff812c0fd4>] handle_new_recv_msgs+0xb67/0xc65 [<ffffffff81007ad1>] ? read_tsc+0x9/0x19 [<ffffffff814c8620>] ? _raw_spin_lock_irq+0xa/0xc [<ffffffffa09e1128>] ipmi_thread+0x5c/0x146 [ipmi_si] ... Also Tony Camuso says: We were getting occasional "Scheduling while atomic" call traces during boot on some systems. Problem was first seen on a Cisco C210 but we were able to reproduce it on a Cisco c220m3. Setting CONFIG_LOCKDEP and LOCKDEP_SUPPORT to 'y' exposed a lockdep around tx_msg_lock in acpi_ipmi.c struct acpi_ipmi_device. ================================= [ INFO: inconsistent lock state ] 2.6.32-415.el6.x86_64-debug-splck #1 --------------------------------- inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage. ksoftirqd/3/17 [HC0[0]:SC1[1]:HE1:SE0] takes: (&ipmi_device->tx_msg_lock){+.?...}, at: [<ffffffff81337a27>] ipmi_msg_handler+0x71/0x126 {SOFTIRQ-ON-W} state was registered at: [<ffffffff810ba11c>] __lock_acquire+0x63c/0x1570 [<ffffffff810bb0f4>] lock_acquire+0xa4/0x120 [<ffffffff815581cc>] __mutex_lock_common+0x4c/0x400 [<ffffffff815586ea>] mutex_lock_nested+0x4a/0x60 [<ffffffff8133789d>] acpi_ipmi_space_handler+0x11b/0x234 [<ffffffff81321c62>] acpi_ev_address_space_dispatch+0x170/0x1be The fix implemented by this change has been tested by Tony: Tested the patch in a boot loop with lockdep debug enabled and never saw the problem in over 400 reboots. Reported-and-tested-by: Tony Camuso <tcamuso@redhat.com> Signed-off-by: Lv Zheng <lv.zheng@intel.com> Reviewed-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-09-13 01:13:23 -04:00
unsigned long flags;
/*
* IPMI opregion message.
* IPMI message is firstly written to the BMC and system software
* can get the respsonse. So it is unmeaningful for the read access
* of IPMI opregion.
*/
if ((function & ACPI_IO_MASK) == ACPI_READ)
return AE_TYPE;
ACPI / IPMI: Add reference counting for ACPI IPMI transfers This patch adds reference counting for ACPI IPMI transfers to tune the locking granularity of tx_msg_lock. This patch also makes the whole acpi_ipmi module's coding style consistent by using reference counting for all its objects (i.e., acpi_ipmi_device and acpi_ipmi_msg). The acpi_ipmi_msg handling is re-designed using referece counting. 1. tx_msg is always unlinked before complete(), so that it is safe to put complete() out side of tx_msg_lock. 2. tx_msg reference counters are incremented before calling ipmi_request_settime() and tx_msg_lock protection is added to ipmi_cancel_tx_msg() so that a complete() can be safely called in parellel with tx_msg unlinking in failure cases. 3. tx_msg holds a reference to acpi_ipmi_device so that it can be flushed and freed in the contexts other than acpi_ipmi_space_handler(). The lockdep_chains shows all acpi_ipmi locks are leaf locks after the tuning: 1. ipmi_lock is always leaf: irq_context: 0 [ffffffff81a943f8] smi_watchers_mutex [ffffffffa06eca60] driver_data.ipmi_lock irq_context: 0 [ffffffff82767b40] &buffer->mutex [ffffffffa00a6678] s_active#103 [ffffffffa06eca60] driver_data.ipmi_lock 2. without this patch applied, lock used by complete() is held after holding tx_msg_lock: irq_context: 0 [ffffffff82767b40] &buffer->mutex [ffffffffa00a6678] s_active#103 [ffffffffa06ecce8] &(&ipmi_device->tx_msg_lock)->rlock irq_context: 1 [ffffffffa06ecce8] &(&ipmi_device->tx_msg_lock)->rlock irq_context: 1 [ffffffffa06ecce8] &(&ipmi_device->tx_msg_lock)->rlock [ffffffffa06eccf0] &x->wait#25 irq_context: 1 [ffffffffa06ecce8] &(&ipmi_device->tx_msg_lock)->rlock [ffffffffa06eccf0] &x->wait#25 [ffffffff81e36620] &p->pi_lock irq_context: 1 [ffffffffa06ecce8] &(&ipmi_device->tx_msg_lock)->rlock [ffffffffa06eccf0] &x->wait#25 [ffffffff81e36620] &p->pi_lock [ffffffff81e5d0a8] &rq->lock 3. with this patch applied, tx_msg_lock is always leaf: irq_context: 0 [ffffffff82767b40] &buffer->mutex [ffffffffa00a66d8] s_active#107 [ffffffffa07ecdc8] &(&ipmi_device->tx_msg_lock)->rlock irq_context: 1 [ffffffffa07ecdc8] &(&ipmi_device->tx_msg_lock)->rlock Signed-off-by: Lv Zheng <lv.zheng@intel.com> Reviewed-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-09-13 01:14:11 -04:00
tx_msg = ipmi_msg_alloc();
if (!tx_msg)
return AE_NOT_EXIST;
ACPI / IPMI: Add reference counting for ACPI IPMI transfers This patch adds reference counting for ACPI IPMI transfers to tune the locking granularity of tx_msg_lock. This patch also makes the whole acpi_ipmi module's coding style consistent by using reference counting for all its objects (i.e., acpi_ipmi_device and acpi_ipmi_msg). The acpi_ipmi_msg handling is re-designed using referece counting. 1. tx_msg is always unlinked before complete(), so that it is safe to put complete() out side of tx_msg_lock. 2. tx_msg reference counters are incremented before calling ipmi_request_settime() and tx_msg_lock protection is added to ipmi_cancel_tx_msg() so that a complete() can be safely called in parellel with tx_msg unlinking in failure cases. 3. tx_msg holds a reference to acpi_ipmi_device so that it can be flushed and freed in the contexts other than acpi_ipmi_space_handler(). The lockdep_chains shows all acpi_ipmi locks are leaf locks after the tuning: 1. ipmi_lock is always leaf: irq_context: 0 [ffffffff81a943f8] smi_watchers_mutex [ffffffffa06eca60] driver_data.ipmi_lock irq_context: 0 [ffffffff82767b40] &buffer->mutex [ffffffffa00a6678] s_active#103 [ffffffffa06eca60] driver_data.ipmi_lock 2. without this patch applied, lock used by complete() is held after holding tx_msg_lock: irq_context: 0 [ffffffff82767b40] &buffer->mutex [ffffffffa00a6678] s_active#103 [ffffffffa06ecce8] &(&ipmi_device->tx_msg_lock)->rlock irq_context: 1 [ffffffffa06ecce8] &(&ipmi_device->tx_msg_lock)->rlock irq_context: 1 [ffffffffa06ecce8] &(&ipmi_device->tx_msg_lock)->rlock [ffffffffa06eccf0] &x->wait#25 irq_context: 1 [ffffffffa06ecce8] &(&ipmi_device->tx_msg_lock)->rlock [ffffffffa06eccf0] &x->wait#25 [ffffffff81e36620] &p->pi_lock irq_context: 1 [ffffffffa06ecce8] &(&ipmi_device->tx_msg_lock)->rlock [ffffffffa06eccf0] &x->wait#25 [ffffffff81e36620] &p->pi_lock [ffffffff81e5d0a8] &rq->lock 3. with this patch applied, tx_msg_lock is always leaf: irq_context: 0 [ffffffff82767b40] &buffer->mutex [ffffffffa00a66d8] s_active#107 [ffffffffa07ecdc8] &(&ipmi_device->tx_msg_lock)->rlock irq_context: 1 [ffffffffa07ecdc8] &(&ipmi_device->tx_msg_lock)->rlock Signed-off-by: Lv Zheng <lv.zheng@intel.com> Reviewed-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-09-13 01:14:11 -04:00
ipmi_device = tx_msg->device;
if (acpi_format_ipmi_request(tx_msg, address, value) != 0) {
ACPI / IPMI: Add reference counting for ACPI IPMI transfers This patch adds reference counting for ACPI IPMI transfers to tune the locking granularity of tx_msg_lock. This patch also makes the whole acpi_ipmi module's coding style consistent by using reference counting for all its objects (i.e., acpi_ipmi_device and acpi_ipmi_msg). The acpi_ipmi_msg handling is re-designed using referece counting. 1. tx_msg is always unlinked before complete(), so that it is safe to put complete() out side of tx_msg_lock. 2. tx_msg reference counters are incremented before calling ipmi_request_settime() and tx_msg_lock protection is added to ipmi_cancel_tx_msg() so that a complete() can be safely called in parellel with tx_msg unlinking in failure cases. 3. tx_msg holds a reference to acpi_ipmi_device so that it can be flushed and freed in the contexts other than acpi_ipmi_space_handler(). The lockdep_chains shows all acpi_ipmi locks are leaf locks after the tuning: 1. ipmi_lock is always leaf: irq_context: 0 [ffffffff81a943f8] smi_watchers_mutex [ffffffffa06eca60] driver_data.ipmi_lock irq_context: 0 [ffffffff82767b40] &buffer->mutex [ffffffffa00a6678] s_active#103 [ffffffffa06eca60] driver_data.ipmi_lock 2. without this patch applied, lock used by complete() is held after holding tx_msg_lock: irq_context: 0 [ffffffff82767b40] &buffer->mutex [ffffffffa00a6678] s_active#103 [ffffffffa06ecce8] &(&ipmi_device->tx_msg_lock)->rlock irq_context: 1 [ffffffffa06ecce8] &(&ipmi_device->tx_msg_lock)->rlock irq_context: 1 [ffffffffa06ecce8] &(&ipmi_device->tx_msg_lock)->rlock [ffffffffa06eccf0] &x->wait#25 irq_context: 1 [ffffffffa06ecce8] &(&ipmi_device->tx_msg_lock)->rlock [ffffffffa06eccf0] &x->wait#25 [ffffffff81e36620] &p->pi_lock irq_context: 1 [ffffffffa06ecce8] &(&ipmi_device->tx_msg_lock)->rlock [ffffffffa06eccf0] &x->wait#25 [ffffffff81e36620] &p->pi_lock [ffffffff81e5d0a8] &rq->lock 3. with this patch applied, tx_msg_lock is always leaf: irq_context: 0 [ffffffff82767b40] &buffer->mutex [ffffffffa00a66d8] s_active#107 [ffffffffa07ecdc8] &(&ipmi_device->tx_msg_lock)->rlock irq_context: 1 [ffffffffa07ecdc8] &(&ipmi_device->tx_msg_lock)->rlock Signed-off-by: Lv Zheng <lv.zheng@intel.com> Reviewed-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-09-13 01:14:11 -04:00
ipmi_msg_release(tx_msg);
return AE_TYPE;
}
ACPI / IPMI: Add reference counting for ACPI IPMI transfers This patch adds reference counting for ACPI IPMI transfers to tune the locking granularity of tx_msg_lock. This patch also makes the whole acpi_ipmi module's coding style consistent by using reference counting for all its objects (i.e., acpi_ipmi_device and acpi_ipmi_msg). The acpi_ipmi_msg handling is re-designed using referece counting. 1. tx_msg is always unlinked before complete(), so that it is safe to put complete() out side of tx_msg_lock. 2. tx_msg reference counters are incremented before calling ipmi_request_settime() and tx_msg_lock protection is added to ipmi_cancel_tx_msg() so that a complete() can be safely called in parellel with tx_msg unlinking in failure cases. 3. tx_msg holds a reference to acpi_ipmi_device so that it can be flushed and freed in the contexts other than acpi_ipmi_space_handler(). The lockdep_chains shows all acpi_ipmi locks are leaf locks after the tuning: 1. ipmi_lock is always leaf: irq_context: 0 [ffffffff81a943f8] smi_watchers_mutex [ffffffffa06eca60] driver_data.ipmi_lock irq_context: 0 [ffffffff82767b40] &buffer->mutex [ffffffffa00a6678] s_active#103 [ffffffffa06eca60] driver_data.ipmi_lock 2. without this patch applied, lock used by complete() is held after holding tx_msg_lock: irq_context: 0 [ffffffff82767b40] &buffer->mutex [ffffffffa00a6678] s_active#103 [ffffffffa06ecce8] &(&ipmi_device->tx_msg_lock)->rlock irq_context: 1 [ffffffffa06ecce8] &(&ipmi_device->tx_msg_lock)->rlock irq_context: 1 [ffffffffa06ecce8] &(&ipmi_device->tx_msg_lock)->rlock [ffffffffa06eccf0] &x->wait#25 irq_context: 1 [ffffffffa06ecce8] &(&ipmi_device->tx_msg_lock)->rlock [ffffffffa06eccf0] &x->wait#25 [ffffffff81e36620] &p->pi_lock irq_context: 1 [ffffffffa06ecce8] &(&ipmi_device->tx_msg_lock)->rlock [ffffffffa06eccf0] &x->wait#25 [ffffffff81e36620] &p->pi_lock [ffffffff81e5d0a8] &rq->lock 3. with this patch applied, tx_msg_lock is always leaf: irq_context: 0 [ffffffff82767b40] &buffer->mutex [ffffffffa00a66d8] s_active#107 [ffffffffa07ecdc8] &(&ipmi_device->tx_msg_lock)->rlock irq_context: 1 [ffffffffa07ecdc8] &(&ipmi_device->tx_msg_lock)->rlock Signed-off-by: Lv Zheng <lv.zheng@intel.com> Reviewed-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-09-13 01:14:11 -04:00
acpi_ipmi_msg_get(tx_msg);
ACPI / IPMI: Fix race caused by the unprotected ACPI IPMI user This patch uses reference counting to fix the race caused by the unprotected ACPI IPMI user. There are two rules for using the ipmi_si APIs: 1. In ipmi_si, ipmi_destroy_user() can ensure that no ipmi_recv_msg will be passed to ipmi_msg_handler(), but ipmi_request_settime() can not use an invalid ipmi_user_t. This means the ipmi_si users must ensure that there won't be any local references on ipmi_user_t before invoking ipmi_destroy_user(). 2. In ipmi_si, the smi_gone()/new_smi() callbacks are protected by smi_watchers_mutex, so their execution is serialized. But as a new smi can re-use a freed intf_num, it requires that the callback implementation must not use intf_num as an identification mean or it must ensure all references to the previous smi are all dropped before exiting smi_gone() callback. As the acpi_ipmi_device->user_interface check in acpi_ipmi_space_handler() can happen before setting user_interface to NULL and codes after the check in acpi_ipmi_space_handler() can happen after user_interface becomes NULL, the on-going acpi_ipmi_space_handler() still can pass an invalid acpi_ipmi_device->user_interface to ipmi_request_settime(). Such race conditions are not allowed by the IPMI layer's API design as a crash will happen in ipmi_request_settime() if something like that happens. This patch follows the ipmi_devintf.c design: 1. Invoke ipmi_destroy_user() after the reference count of acpi_ipmi_device drops to 0. References of acpi_ipmi_device dropping to 0 also means tx_msg related to this acpi_ipmi_device are all freed. This matches the IPMI layer's API calling rule on ipmi_destroy_user() and ipmi_request_settime(). 2. ipmi_flush_tx_msg() is performed so that no on-going tx_msg can still be running in acpi_ipmi_space_handler(). And it is invoked after invoking __ipmi_dev_kill() where acpi_ipmi_device is deleted from the list with a "dead" flag set, and the "dead" flag check is also introduced to the point where a tx_msg is going to be added to the tx_msg_list so that no new tx_msg can be created after returning from the __ipmi_dev_kill(). 3. The waiting codes in ipmi_flush_tx_msg() is deleted because it is not required since this patch ensures no acpi_ipmi reference is still held for ipmi_user_t before calling ipmi_destroy_user() and ipmi_destroy_user() can ensure no more ipmi_msg_handler() can happen after returning from ipmi_destroy_user(). 4. The flushing of tx_msg is also moved out of ipmi_lock in this patch. The forthcoming IPMI operation region handler installation changes also requires acpi_ipmi_device be handled in this style. The header comment of the file is also updated due to this design change. Signed-off-by: Lv Zheng <lv.zheng@intel.com> Reviewed-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-09-13 01:13:54 -04:00
mutex_lock(&driver_data.ipmi_lock);
/* Do not add a tx_msg that can not be flushed. */
if (ipmi_device->dead) {
mutex_unlock(&driver_data.ipmi_lock);
ACPI / IPMI: Add reference counting for ACPI IPMI transfers This patch adds reference counting for ACPI IPMI transfers to tune the locking granularity of tx_msg_lock. This patch also makes the whole acpi_ipmi module's coding style consistent by using reference counting for all its objects (i.e., acpi_ipmi_device and acpi_ipmi_msg). The acpi_ipmi_msg handling is re-designed using referece counting. 1. tx_msg is always unlinked before complete(), so that it is safe to put complete() out side of tx_msg_lock. 2. tx_msg reference counters are incremented before calling ipmi_request_settime() and tx_msg_lock protection is added to ipmi_cancel_tx_msg() so that a complete() can be safely called in parellel with tx_msg unlinking in failure cases. 3. tx_msg holds a reference to acpi_ipmi_device so that it can be flushed and freed in the contexts other than acpi_ipmi_space_handler(). The lockdep_chains shows all acpi_ipmi locks are leaf locks after the tuning: 1. ipmi_lock is always leaf: irq_context: 0 [ffffffff81a943f8] smi_watchers_mutex [ffffffffa06eca60] driver_data.ipmi_lock irq_context: 0 [ffffffff82767b40] &buffer->mutex [ffffffffa00a6678] s_active#103 [ffffffffa06eca60] driver_data.ipmi_lock 2. without this patch applied, lock used by complete() is held after holding tx_msg_lock: irq_context: 0 [ffffffff82767b40] &buffer->mutex [ffffffffa00a6678] s_active#103 [ffffffffa06ecce8] &(&ipmi_device->tx_msg_lock)->rlock irq_context: 1 [ffffffffa06ecce8] &(&ipmi_device->tx_msg_lock)->rlock irq_context: 1 [ffffffffa06ecce8] &(&ipmi_device->tx_msg_lock)->rlock [ffffffffa06eccf0] &x->wait#25 irq_context: 1 [ffffffffa06ecce8] &(&ipmi_device->tx_msg_lock)->rlock [ffffffffa06eccf0] &x->wait#25 [ffffffff81e36620] &p->pi_lock irq_context: 1 [ffffffffa06ecce8] &(&ipmi_device->tx_msg_lock)->rlock [ffffffffa06eccf0] &x->wait#25 [ffffffff81e36620] &p->pi_lock [ffffffff81e5d0a8] &rq->lock 3. with this patch applied, tx_msg_lock is always leaf: irq_context: 0 [ffffffff82767b40] &buffer->mutex [ffffffffa00a66d8] s_active#107 [ffffffffa07ecdc8] &(&ipmi_device->tx_msg_lock)->rlock irq_context: 1 [ffffffffa07ecdc8] &(&ipmi_device->tx_msg_lock)->rlock Signed-off-by: Lv Zheng <lv.zheng@intel.com> Reviewed-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-09-13 01:14:11 -04:00
ipmi_msg_release(tx_msg);
return AE_NOT_EXIST;
ACPI / IPMI: Fix race caused by the unprotected ACPI IPMI user This patch uses reference counting to fix the race caused by the unprotected ACPI IPMI user. There are two rules for using the ipmi_si APIs: 1. In ipmi_si, ipmi_destroy_user() can ensure that no ipmi_recv_msg will be passed to ipmi_msg_handler(), but ipmi_request_settime() can not use an invalid ipmi_user_t. This means the ipmi_si users must ensure that there won't be any local references on ipmi_user_t before invoking ipmi_destroy_user(). 2. In ipmi_si, the smi_gone()/new_smi() callbacks are protected by smi_watchers_mutex, so their execution is serialized. But as a new smi can re-use a freed intf_num, it requires that the callback implementation must not use intf_num as an identification mean or it must ensure all references to the previous smi are all dropped before exiting smi_gone() callback. As the acpi_ipmi_device->user_interface check in acpi_ipmi_space_handler() can happen before setting user_interface to NULL and codes after the check in acpi_ipmi_space_handler() can happen after user_interface becomes NULL, the on-going acpi_ipmi_space_handler() still can pass an invalid acpi_ipmi_device->user_interface to ipmi_request_settime(). Such race conditions are not allowed by the IPMI layer's API design as a crash will happen in ipmi_request_settime() if something like that happens. This patch follows the ipmi_devintf.c design: 1. Invoke ipmi_destroy_user() after the reference count of acpi_ipmi_device drops to 0. References of acpi_ipmi_device dropping to 0 also means tx_msg related to this acpi_ipmi_device are all freed. This matches the IPMI layer's API calling rule on ipmi_destroy_user() and ipmi_request_settime(). 2. ipmi_flush_tx_msg() is performed so that no on-going tx_msg can still be running in acpi_ipmi_space_handler(). And it is invoked after invoking __ipmi_dev_kill() where acpi_ipmi_device is deleted from the list with a "dead" flag set, and the "dead" flag check is also introduced to the point where a tx_msg is going to be added to the tx_msg_list so that no new tx_msg can be created after returning from the __ipmi_dev_kill(). 3. The waiting codes in ipmi_flush_tx_msg() is deleted because it is not required since this patch ensures no acpi_ipmi reference is still held for ipmi_user_t before calling ipmi_destroy_user() and ipmi_destroy_user() can ensure no more ipmi_msg_handler() can happen after returning from ipmi_destroy_user(). 4. The flushing of tx_msg is also moved out of ipmi_lock in this patch. The forthcoming IPMI operation region handler installation changes also requires acpi_ipmi_device be handled in this style. The header comment of the file is also updated due to this design change. Signed-off-by: Lv Zheng <lv.zheng@intel.com> Reviewed-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-09-13 01:13:54 -04:00
}
ACPI / IPMI: Fix atomic context requirement of ipmi_msg_handler() This patch fixes the issues indicated by the test results that ipmi_msg_handler() is invoked in atomic context. BUG: scheduling while atomic: kipmi0/18933/0x10000100 Modules linked in: ipmi_si acpi_ipmi ... CPU: 3 PID: 18933 Comm: kipmi0 Tainted: G AW 3.10.0-rc7+ #2 Hardware name: QCI QSSC-S4R/QSSC-S4R, BIOS QSSC-S4R.QCI.01.00.0027.070120100606 07/01/2010 ffff8838245eea00 ffff88103fc63c98 ffffffff814c4a1e ffff88103fc63ca8 ffffffff814bfbab ffff88103fc63d28 ffffffff814c73e0 ffff88103933cbd4 0000000000000096 ffff88103fc63ce8 ffff88102f618000 ffff881035c01fd8 Call Trace: <IRQ> [<ffffffff814c4a1e>] dump_stack+0x19/0x1b [<ffffffff814bfbab>] __schedule_bug+0x46/0x54 [<ffffffff814c73e0>] __schedule+0x83/0x59c [<ffffffff81058853>] __cond_resched+0x22/0x2d [<ffffffff814c794b>] _cond_resched+0x14/0x1d [<ffffffff814c6d82>] mutex_lock+0x11/0x32 [<ffffffff8101e1e9>] ? __default_send_IPI_dest_field.constprop.0+0x53/0x58 [<ffffffffa09e3f9c>] ipmi_msg_handler+0x23/0x166 [ipmi_si] [<ffffffff812bf6e4>] deliver_response+0x55/0x5a [<ffffffff812c0fd4>] handle_new_recv_msgs+0xb67/0xc65 [<ffffffff81007ad1>] ? read_tsc+0x9/0x19 [<ffffffff814c8620>] ? _raw_spin_lock_irq+0xa/0xc [<ffffffffa09e1128>] ipmi_thread+0x5c/0x146 [ipmi_si] ... Also Tony Camuso says: We were getting occasional "Scheduling while atomic" call traces during boot on some systems. Problem was first seen on a Cisco C210 but we were able to reproduce it on a Cisco c220m3. Setting CONFIG_LOCKDEP and LOCKDEP_SUPPORT to 'y' exposed a lockdep around tx_msg_lock in acpi_ipmi.c struct acpi_ipmi_device. ================================= [ INFO: inconsistent lock state ] 2.6.32-415.el6.x86_64-debug-splck #1 --------------------------------- inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage. ksoftirqd/3/17 [HC0[0]:SC1[1]:HE1:SE0] takes: (&ipmi_device->tx_msg_lock){+.?...}, at: [<ffffffff81337a27>] ipmi_msg_handler+0x71/0x126 {SOFTIRQ-ON-W} state was registered at: [<ffffffff810ba11c>] __lock_acquire+0x63c/0x1570 [<ffffffff810bb0f4>] lock_acquire+0xa4/0x120 [<ffffffff815581cc>] __mutex_lock_common+0x4c/0x400 [<ffffffff815586ea>] mutex_lock_nested+0x4a/0x60 [<ffffffff8133789d>] acpi_ipmi_space_handler+0x11b/0x234 [<ffffffff81321c62>] acpi_ev_address_space_dispatch+0x170/0x1be The fix implemented by this change has been tested by Tony: Tested the patch in a boot loop with lockdep debug enabled and never saw the problem in over 400 reboots. Reported-and-tested-by: Tony Camuso <tcamuso@redhat.com> Signed-off-by: Lv Zheng <lv.zheng@intel.com> Reviewed-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-09-13 01:13:23 -04:00
spin_lock_irqsave(&ipmi_device->tx_msg_lock, flags);
list_add_tail(&tx_msg->head, &ipmi_device->tx_msg_list);
ACPI / IPMI: Fix atomic context requirement of ipmi_msg_handler() This patch fixes the issues indicated by the test results that ipmi_msg_handler() is invoked in atomic context. BUG: scheduling while atomic: kipmi0/18933/0x10000100 Modules linked in: ipmi_si acpi_ipmi ... CPU: 3 PID: 18933 Comm: kipmi0 Tainted: G AW 3.10.0-rc7+ #2 Hardware name: QCI QSSC-S4R/QSSC-S4R, BIOS QSSC-S4R.QCI.01.00.0027.070120100606 07/01/2010 ffff8838245eea00 ffff88103fc63c98 ffffffff814c4a1e ffff88103fc63ca8 ffffffff814bfbab ffff88103fc63d28 ffffffff814c73e0 ffff88103933cbd4 0000000000000096 ffff88103fc63ce8 ffff88102f618000 ffff881035c01fd8 Call Trace: <IRQ> [<ffffffff814c4a1e>] dump_stack+0x19/0x1b [<ffffffff814bfbab>] __schedule_bug+0x46/0x54 [<ffffffff814c73e0>] __schedule+0x83/0x59c [<ffffffff81058853>] __cond_resched+0x22/0x2d [<ffffffff814c794b>] _cond_resched+0x14/0x1d [<ffffffff814c6d82>] mutex_lock+0x11/0x32 [<ffffffff8101e1e9>] ? __default_send_IPI_dest_field.constprop.0+0x53/0x58 [<ffffffffa09e3f9c>] ipmi_msg_handler+0x23/0x166 [ipmi_si] [<ffffffff812bf6e4>] deliver_response+0x55/0x5a [<ffffffff812c0fd4>] handle_new_recv_msgs+0xb67/0xc65 [<ffffffff81007ad1>] ? read_tsc+0x9/0x19 [<ffffffff814c8620>] ? _raw_spin_lock_irq+0xa/0xc [<ffffffffa09e1128>] ipmi_thread+0x5c/0x146 [ipmi_si] ... Also Tony Camuso says: We were getting occasional "Scheduling while atomic" call traces during boot on some systems. Problem was first seen on a Cisco C210 but we were able to reproduce it on a Cisco c220m3. Setting CONFIG_LOCKDEP and LOCKDEP_SUPPORT to 'y' exposed a lockdep around tx_msg_lock in acpi_ipmi.c struct acpi_ipmi_device. ================================= [ INFO: inconsistent lock state ] 2.6.32-415.el6.x86_64-debug-splck #1 --------------------------------- inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage. ksoftirqd/3/17 [HC0[0]:SC1[1]:HE1:SE0] takes: (&ipmi_device->tx_msg_lock){+.?...}, at: [<ffffffff81337a27>] ipmi_msg_handler+0x71/0x126 {SOFTIRQ-ON-W} state was registered at: [<ffffffff810ba11c>] __lock_acquire+0x63c/0x1570 [<ffffffff810bb0f4>] lock_acquire+0xa4/0x120 [<ffffffff815581cc>] __mutex_lock_common+0x4c/0x400 [<ffffffff815586ea>] mutex_lock_nested+0x4a/0x60 [<ffffffff8133789d>] acpi_ipmi_space_handler+0x11b/0x234 [<ffffffff81321c62>] acpi_ev_address_space_dispatch+0x170/0x1be The fix implemented by this change has been tested by Tony: Tested the patch in a boot loop with lockdep debug enabled and never saw the problem in over 400 reboots. Reported-and-tested-by: Tony Camuso <tcamuso@redhat.com> Signed-off-by: Lv Zheng <lv.zheng@intel.com> Reviewed-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-09-13 01:13:23 -04:00
spin_unlock_irqrestore(&ipmi_device->tx_msg_lock, flags);
ACPI / IPMI: Fix race caused by the unprotected ACPI IPMI user This patch uses reference counting to fix the race caused by the unprotected ACPI IPMI user. There are two rules for using the ipmi_si APIs: 1. In ipmi_si, ipmi_destroy_user() can ensure that no ipmi_recv_msg will be passed to ipmi_msg_handler(), but ipmi_request_settime() can not use an invalid ipmi_user_t. This means the ipmi_si users must ensure that there won't be any local references on ipmi_user_t before invoking ipmi_destroy_user(). 2. In ipmi_si, the smi_gone()/new_smi() callbacks are protected by smi_watchers_mutex, so their execution is serialized. But as a new smi can re-use a freed intf_num, it requires that the callback implementation must not use intf_num as an identification mean or it must ensure all references to the previous smi are all dropped before exiting smi_gone() callback. As the acpi_ipmi_device->user_interface check in acpi_ipmi_space_handler() can happen before setting user_interface to NULL and codes after the check in acpi_ipmi_space_handler() can happen after user_interface becomes NULL, the on-going acpi_ipmi_space_handler() still can pass an invalid acpi_ipmi_device->user_interface to ipmi_request_settime(). Such race conditions are not allowed by the IPMI layer's API design as a crash will happen in ipmi_request_settime() if something like that happens. This patch follows the ipmi_devintf.c design: 1. Invoke ipmi_destroy_user() after the reference count of acpi_ipmi_device drops to 0. References of acpi_ipmi_device dropping to 0 also means tx_msg related to this acpi_ipmi_device are all freed. This matches the IPMI layer's API calling rule on ipmi_destroy_user() and ipmi_request_settime(). 2. ipmi_flush_tx_msg() is performed so that no on-going tx_msg can still be running in acpi_ipmi_space_handler(). And it is invoked after invoking __ipmi_dev_kill() where acpi_ipmi_device is deleted from the list with a "dead" flag set, and the "dead" flag check is also introduced to the point where a tx_msg is going to be added to the tx_msg_list so that no new tx_msg can be created after returning from the __ipmi_dev_kill(). 3. The waiting codes in ipmi_flush_tx_msg() is deleted because it is not required since this patch ensures no acpi_ipmi reference is still held for ipmi_user_t before calling ipmi_destroy_user() and ipmi_destroy_user() can ensure no more ipmi_msg_handler() can happen after returning from ipmi_destroy_user(). 4. The flushing of tx_msg is also moved out of ipmi_lock in this patch. The forthcoming IPMI operation region handler installation changes also requires acpi_ipmi_device be handled in this style. The header comment of the file is also updated due to this design change. Signed-off-by: Lv Zheng <lv.zheng@intel.com> Reviewed-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-09-13 01:13:54 -04:00
mutex_unlock(&driver_data.ipmi_lock);
err = ipmi_request_settime(ipmi_device->user_interface,
&tx_msg->addr,
tx_msg->tx_msgid,
&tx_msg->tx_message,
NULL, 0, 0, IPMI_TIMEOUT);
if (err) {
status = AE_ERROR;
ACPI / IPMI: Add reference counting for ACPI IPMI transfers This patch adds reference counting for ACPI IPMI transfers to tune the locking granularity of tx_msg_lock. This patch also makes the whole acpi_ipmi module's coding style consistent by using reference counting for all its objects (i.e., acpi_ipmi_device and acpi_ipmi_msg). The acpi_ipmi_msg handling is re-designed using referece counting. 1. tx_msg is always unlinked before complete(), so that it is safe to put complete() out side of tx_msg_lock. 2. tx_msg reference counters are incremented before calling ipmi_request_settime() and tx_msg_lock protection is added to ipmi_cancel_tx_msg() so that a complete() can be safely called in parellel with tx_msg unlinking in failure cases. 3. tx_msg holds a reference to acpi_ipmi_device so that it can be flushed and freed in the contexts other than acpi_ipmi_space_handler(). The lockdep_chains shows all acpi_ipmi locks are leaf locks after the tuning: 1. ipmi_lock is always leaf: irq_context: 0 [ffffffff81a943f8] smi_watchers_mutex [ffffffffa06eca60] driver_data.ipmi_lock irq_context: 0 [ffffffff82767b40] &buffer->mutex [ffffffffa00a6678] s_active#103 [ffffffffa06eca60] driver_data.ipmi_lock 2. without this patch applied, lock used by complete() is held after holding tx_msg_lock: irq_context: 0 [ffffffff82767b40] &buffer->mutex [ffffffffa00a6678] s_active#103 [ffffffffa06ecce8] &(&ipmi_device->tx_msg_lock)->rlock irq_context: 1 [ffffffffa06ecce8] &(&ipmi_device->tx_msg_lock)->rlock irq_context: 1 [ffffffffa06ecce8] &(&ipmi_device->tx_msg_lock)->rlock [ffffffffa06eccf0] &x->wait#25 irq_context: 1 [ffffffffa06ecce8] &(&ipmi_device->tx_msg_lock)->rlock [ffffffffa06eccf0] &x->wait#25 [ffffffff81e36620] &p->pi_lock irq_context: 1 [ffffffffa06ecce8] &(&ipmi_device->tx_msg_lock)->rlock [ffffffffa06eccf0] &x->wait#25 [ffffffff81e36620] &p->pi_lock [ffffffff81e5d0a8] &rq->lock 3. with this patch applied, tx_msg_lock is always leaf: irq_context: 0 [ffffffff82767b40] &buffer->mutex [ffffffffa00a66d8] s_active#107 [ffffffffa07ecdc8] &(&ipmi_device->tx_msg_lock)->rlock irq_context: 1 [ffffffffa07ecdc8] &(&ipmi_device->tx_msg_lock)->rlock Signed-off-by: Lv Zheng <lv.zheng@intel.com> Reviewed-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-09-13 01:14:11 -04:00
goto out_msg;
}
wait_for_completion(&tx_msg->tx_complete);
acpi_format_ipmi_response(tx_msg, value);
status = AE_OK;
out_msg:
ACPI / IPMI: Add reference counting for ACPI IPMI transfers This patch adds reference counting for ACPI IPMI transfers to tune the locking granularity of tx_msg_lock. This patch also makes the whole acpi_ipmi module's coding style consistent by using reference counting for all its objects (i.e., acpi_ipmi_device and acpi_ipmi_msg). The acpi_ipmi_msg handling is re-designed using referece counting. 1. tx_msg is always unlinked before complete(), so that it is safe to put complete() out side of tx_msg_lock. 2. tx_msg reference counters are incremented before calling ipmi_request_settime() and tx_msg_lock protection is added to ipmi_cancel_tx_msg() so that a complete() can be safely called in parellel with tx_msg unlinking in failure cases. 3. tx_msg holds a reference to acpi_ipmi_device so that it can be flushed and freed in the contexts other than acpi_ipmi_space_handler(). The lockdep_chains shows all acpi_ipmi locks are leaf locks after the tuning: 1. ipmi_lock is always leaf: irq_context: 0 [ffffffff81a943f8] smi_watchers_mutex [ffffffffa06eca60] driver_data.ipmi_lock irq_context: 0 [ffffffff82767b40] &buffer->mutex [ffffffffa00a6678] s_active#103 [ffffffffa06eca60] driver_data.ipmi_lock 2. without this patch applied, lock used by complete() is held after holding tx_msg_lock: irq_context: 0 [ffffffff82767b40] &buffer->mutex [ffffffffa00a6678] s_active#103 [ffffffffa06ecce8] &(&ipmi_device->tx_msg_lock)->rlock irq_context: 1 [ffffffffa06ecce8] &(&ipmi_device->tx_msg_lock)->rlock irq_context: 1 [ffffffffa06ecce8] &(&ipmi_device->tx_msg_lock)->rlock [ffffffffa06eccf0] &x->wait#25 irq_context: 1 [ffffffffa06ecce8] &(&ipmi_device->tx_msg_lock)->rlock [ffffffffa06eccf0] &x->wait#25 [ffffffff81e36620] &p->pi_lock irq_context: 1 [ffffffffa06ecce8] &(&ipmi_device->tx_msg_lock)->rlock [ffffffffa06eccf0] &x->wait#25 [ffffffff81e36620] &p->pi_lock [ffffffff81e5d0a8] &rq->lock 3. with this patch applied, tx_msg_lock is always leaf: irq_context: 0 [ffffffff82767b40] &buffer->mutex [ffffffffa00a66d8] s_active#107 [ffffffffa07ecdc8] &(&ipmi_device->tx_msg_lock)->rlock irq_context: 1 [ffffffffa07ecdc8] &(&ipmi_device->tx_msg_lock)->rlock Signed-off-by: Lv Zheng <lv.zheng@intel.com> Reviewed-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-09-13 01:14:11 -04:00
ipmi_cancel_tx_msg(ipmi_device, tx_msg);
acpi_ipmi_msg_put(tx_msg);
return status;
}
static int __init acpi_ipmi_init(void)
{
int result;
ACPI / IPMI: Use global IPMI operation region handler It is found on a real machine, in its ACPI namespace, the IPMI OperationRegions (in the ACPI000D - ACPI power meter) are not defined under the IPMI system interface device (the IPI0001 with KCS type returned from _IFT control method): Device (PMI0) { Name (_HID, "ACPI000D") // _HID: Hardware ID OperationRegion (SYSI, IPMI, 0x0600, 0x0100) Field (SYSI, BufferAcc, Lock, Preserve) { AccessAs (BufferAcc, 0x01), Offset (0x58), SCMD, 8, GCMD, 8 } OperationRegion (POWR, IPMI, 0x3000, 0x0100) Field (POWR, BufferAcc, Lock, Preserve) { AccessAs (BufferAcc, 0x01), Offset (0xB3), GPMM, 8 } } Device (PCI0) { Device (ISA) { Device (NIPM) { Name (_HID, EisaId ("IPI0001")) // _HID: Hardware ID Method (_IFT, 0, NotSerialized) // _IFT: IPMI Interface Type { Return (0x01) } } } } Current ACPI_IPMI code registers IPMI operation region handler on a per-device basis, so for the above namespace the IPMI operation region handler is registered only under the scope of \_SB.PCI0.ISA.NIPM. Thus when an IPMI operation region field of \PMI0 is accessed, there are errors reported on such platform: ACPI Error: No handlers for Region [IPMI] ACPI Error: Region IPMI(7) has no handler The solution is to install an IPMI operation region handler from root node so that every object that defines IPMI OperationRegion can get an address space handler registered. When an IPMI operation region field is accessed, the Network Function (0x06 for SYSI and 0x30 for POWR) and the Command (SCMD, GCMD, GPMM) are passed to the operation region handler, there is no system interface specified by the BIOS. The patch tries to select one system interface by monitoring the system interface notification. IPMI messages passed from the ACPI codes are sent to this selected global IPMI system interface. The ACPI_IPMI will always select the first registered IPMI interface with an ACPI handle (i.e., defined in the ACPI namespace). It's hard to determine the selection when there are multiple IPMI system interfaces defined in the ACPI namespace. According to the IPMI specification: A BMC device may make available multiple system interfaces, but only one management controller is allowed to be 'active' BMC that provides BMC functionality for the system (in case of a 'partitioned' system, there can be only one active BMC per partition). Only the system interface(s) for the active BMC allowed to respond to the 'Get Device Id' command. According to the ipmi_si desigin: The ipmi_si registeration notifications can only happen after a successful "Get Device ID" command. Thus it should be OK for non-partitioned systems to do such selection. However, we do not have much knowledge on 'partitioned' systems. References: https://bugzilla.kernel.org/show_bug.cgi?id=46741 Signed-off-by: Lv Zheng <lv.zheng@intel.com> Reviewed-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-09-13 01:14:02 -04:00
acpi_status status;
if (acpi_disabled)
return 0;
ACPI / IPMI: Use global IPMI operation region handler It is found on a real machine, in its ACPI namespace, the IPMI OperationRegions (in the ACPI000D - ACPI power meter) are not defined under the IPMI system interface device (the IPI0001 with KCS type returned from _IFT control method): Device (PMI0) { Name (_HID, "ACPI000D") // _HID: Hardware ID OperationRegion (SYSI, IPMI, 0x0600, 0x0100) Field (SYSI, BufferAcc, Lock, Preserve) { AccessAs (BufferAcc, 0x01), Offset (0x58), SCMD, 8, GCMD, 8 } OperationRegion (POWR, IPMI, 0x3000, 0x0100) Field (POWR, BufferAcc, Lock, Preserve) { AccessAs (BufferAcc, 0x01), Offset (0xB3), GPMM, 8 } } Device (PCI0) { Device (ISA) { Device (NIPM) { Name (_HID, EisaId ("IPI0001")) // _HID: Hardware ID Method (_IFT, 0, NotSerialized) // _IFT: IPMI Interface Type { Return (0x01) } } } } Current ACPI_IPMI code registers IPMI operation region handler on a per-device basis, so for the above namespace the IPMI operation region handler is registered only under the scope of \_SB.PCI0.ISA.NIPM. Thus when an IPMI operation region field of \PMI0 is accessed, there are errors reported on such platform: ACPI Error: No handlers for Region [IPMI] ACPI Error: Region IPMI(7) has no handler The solution is to install an IPMI operation region handler from root node so that every object that defines IPMI OperationRegion can get an address space handler registered. When an IPMI operation region field is accessed, the Network Function (0x06 for SYSI and 0x30 for POWR) and the Command (SCMD, GCMD, GPMM) are passed to the operation region handler, there is no system interface specified by the BIOS. The patch tries to select one system interface by monitoring the system interface notification. IPMI messages passed from the ACPI codes are sent to this selected global IPMI system interface. The ACPI_IPMI will always select the first registered IPMI interface with an ACPI handle (i.e., defined in the ACPI namespace). It's hard to determine the selection when there are multiple IPMI system interfaces defined in the ACPI namespace. According to the IPMI specification: A BMC device may make available multiple system interfaces, but only one management controller is allowed to be 'active' BMC that provides BMC functionality for the system (in case of a 'partitioned' system, there can be only one active BMC per partition). Only the system interface(s) for the active BMC allowed to respond to the 'Get Device Id' command. According to the ipmi_si desigin: The ipmi_si registeration notifications can only happen after a successful "Get Device ID" command. Thus it should be OK for non-partitioned systems to do such selection. However, we do not have much knowledge on 'partitioned' systems. References: https://bugzilla.kernel.org/show_bug.cgi?id=46741 Signed-off-by: Lv Zheng <lv.zheng@intel.com> Reviewed-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-09-13 01:14:02 -04:00
status = acpi_install_address_space_handler(ACPI_ROOT_OBJECT,
ACPI_ADR_SPACE_IPMI,
&acpi_ipmi_space_handler,
NULL, NULL);
ACPI / IPMI: Use global IPMI operation region handler It is found on a real machine, in its ACPI namespace, the IPMI OperationRegions (in the ACPI000D - ACPI power meter) are not defined under the IPMI system interface device (the IPI0001 with KCS type returned from _IFT control method): Device (PMI0) { Name (_HID, "ACPI000D") // _HID: Hardware ID OperationRegion (SYSI, IPMI, 0x0600, 0x0100) Field (SYSI, BufferAcc, Lock, Preserve) { AccessAs (BufferAcc, 0x01), Offset (0x58), SCMD, 8, GCMD, 8 } OperationRegion (POWR, IPMI, 0x3000, 0x0100) Field (POWR, BufferAcc, Lock, Preserve) { AccessAs (BufferAcc, 0x01), Offset (0xB3), GPMM, 8 } } Device (PCI0) { Device (ISA) { Device (NIPM) { Name (_HID, EisaId ("IPI0001")) // _HID: Hardware ID Method (_IFT, 0, NotSerialized) // _IFT: IPMI Interface Type { Return (0x01) } } } } Current ACPI_IPMI code registers IPMI operation region handler on a per-device basis, so for the above namespace the IPMI operation region handler is registered only under the scope of \_SB.PCI0.ISA.NIPM. Thus when an IPMI operation region field of \PMI0 is accessed, there are errors reported on such platform: ACPI Error: No handlers for Region [IPMI] ACPI Error: Region IPMI(7) has no handler The solution is to install an IPMI operation region handler from root node so that every object that defines IPMI OperationRegion can get an address space handler registered. When an IPMI operation region field is accessed, the Network Function (0x06 for SYSI and 0x30 for POWR) and the Command (SCMD, GCMD, GPMM) are passed to the operation region handler, there is no system interface specified by the BIOS. The patch tries to select one system interface by monitoring the system interface notification. IPMI messages passed from the ACPI codes are sent to this selected global IPMI system interface. The ACPI_IPMI will always select the first registered IPMI interface with an ACPI handle (i.e., defined in the ACPI namespace). It's hard to determine the selection when there are multiple IPMI system interfaces defined in the ACPI namespace. According to the IPMI specification: A BMC device may make available multiple system interfaces, but only one management controller is allowed to be 'active' BMC that provides BMC functionality for the system (in case of a 'partitioned' system, there can be only one active BMC per partition). Only the system interface(s) for the active BMC allowed to respond to the 'Get Device Id' command. According to the ipmi_si desigin: The ipmi_si registeration notifications can only happen after a successful "Get Device ID" command. Thus it should be OK for non-partitioned systems to do such selection. However, we do not have much knowledge on 'partitioned' systems. References: https://bugzilla.kernel.org/show_bug.cgi?id=46741 Signed-off-by: Lv Zheng <lv.zheng@intel.com> Reviewed-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-09-13 01:14:02 -04:00
if (ACPI_FAILURE(status)) {
pr_warn("Can't register IPMI opregion space handle\n");
return -EINVAL;
}
result = ipmi_smi_watcher_register(&driver_data.bmc_events);
ACPI / IPMI: Use global IPMI operation region handler It is found on a real machine, in its ACPI namespace, the IPMI OperationRegions (in the ACPI000D - ACPI power meter) are not defined under the IPMI system interface device (the IPI0001 with KCS type returned from _IFT control method): Device (PMI0) { Name (_HID, "ACPI000D") // _HID: Hardware ID OperationRegion (SYSI, IPMI, 0x0600, 0x0100) Field (SYSI, BufferAcc, Lock, Preserve) { AccessAs (BufferAcc, 0x01), Offset (0x58), SCMD, 8, GCMD, 8 } OperationRegion (POWR, IPMI, 0x3000, 0x0100) Field (POWR, BufferAcc, Lock, Preserve) { AccessAs (BufferAcc, 0x01), Offset (0xB3), GPMM, 8 } } Device (PCI0) { Device (ISA) { Device (NIPM) { Name (_HID, EisaId ("IPI0001")) // _HID: Hardware ID Method (_IFT, 0, NotSerialized) // _IFT: IPMI Interface Type { Return (0x01) } } } } Current ACPI_IPMI code registers IPMI operation region handler on a per-device basis, so for the above namespace the IPMI operation region handler is registered only under the scope of \_SB.PCI0.ISA.NIPM. Thus when an IPMI operation region field of \PMI0 is accessed, there are errors reported on such platform: ACPI Error: No handlers for Region [IPMI] ACPI Error: Region IPMI(7) has no handler The solution is to install an IPMI operation region handler from root node so that every object that defines IPMI OperationRegion can get an address space handler registered. When an IPMI operation region field is accessed, the Network Function (0x06 for SYSI and 0x30 for POWR) and the Command (SCMD, GCMD, GPMM) are passed to the operation region handler, there is no system interface specified by the BIOS. The patch tries to select one system interface by monitoring the system interface notification. IPMI messages passed from the ACPI codes are sent to this selected global IPMI system interface. The ACPI_IPMI will always select the first registered IPMI interface with an ACPI handle (i.e., defined in the ACPI namespace). It's hard to determine the selection when there are multiple IPMI system interfaces defined in the ACPI namespace. According to the IPMI specification: A BMC device may make available multiple system interfaces, but only one management controller is allowed to be 'active' BMC that provides BMC functionality for the system (in case of a 'partitioned' system, there can be only one active BMC per partition). Only the system interface(s) for the active BMC allowed to respond to the 'Get Device Id' command. According to the ipmi_si desigin: The ipmi_si registeration notifications can only happen after a successful "Get Device ID" command. Thus it should be OK for non-partitioned systems to do such selection. However, we do not have much knowledge on 'partitioned' systems. References: https://bugzilla.kernel.org/show_bug.cgi?id=46741 Signed-off-by: Lv Zheng <lv.zheng@intel.com> Reviewed-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-09-13 01:14:02 -04:00
if (result)
pr_err("Can't register IPMI system interface watcher\n");
return result;
}
static void __exit acpi_ipmi_exit(void)
{
ACPI / IPMI: Fix race caused by the unprotected ACPI IPMI user This patch uses reference counting to fix the race caused by the unprotected ACPI IPMI user. There are two rules for using the ipmi_si APIs: 1. In ipmi_si, ipmi_destroy_user() can ensure that no ipmi_recv_msg will be passed to ipmi_msg_handler(), but ipmi_request_settime() can not use an invalid ipmi_user_t. This means the ipmi_si users must ensure that there won't be any local references on ipmi_user_t before invoking ipmi_destroy_user(). 2. In ipmi_si, the smi_gone()/new_smi() callbacks are protected by smi_watchers_mutex, so their execution is serialized. But as a new smi can re-use a freed intf_num, it requires that the callback implementation must not use intf_num as an identification mean or it must ensure all references to the previous smi are all dropped before exiting smi_gone() callback. As the acpi_ipmi_device->user_interface check in acpi_ipmi_space_handler() can happen before setting user_interface to NULL and codes after the check in acpi_ipmi_space_handler() can happen after user_interface becomes NULL, the on-going acpi_ipmi_space_handler() still can pass an invalid acpi_ipmi_device->user_interface to ipmi_request_settime(). Such race conditions are not allowed by the IPMI layer's API design as a crash will happen in ipmi_request_settime() if something like that happens. This patch follows the ipmi_devintf.c design: 1. Invoke ipmi_destroy_user() after the reference count of acpi_ipmi_device drops to 0. References of acpi_ipmi_device dropping to 0 also means tx_msg related to this acpi_ipmi_device are all freed. This matches the IPMI layer's API calling rule on ipmi_destroy_user() and ipmi_request_settime(). 2. ipmi_flush_tx_msg() is performed so that no on-going tx_msg can still be running in acpi_ipmi_space_handler(). And it is invoked after invoking __ipmi_dev_kill() where acpi_ipmi_device is deleted from the list with a "dead" flag set, and the "dead" flag check is also introduced to the point where a tx_msg is going to be added to the tx_msg_list so that no new tx_msg can be created after returning from the __ipmi_dev_kill(). 3. The waiting codes in ipmi_flush_tx_msg() is deleted because it is not required since this patch ensures no acpi_ipmi reference is still held for ipmi_user_t before calling ipmi_destroy_user() and ipmi_destroy_user() can ensure no more ipmi_msg_handler() can happen after returning from ipmi_destroy_user(). 4. The flushing of tx_msg is also moved out of ipmi_lock in this patch. The forthcoming IPMI operation region handler installation changes also requires acpi_ipmi_device be handled in this style. The header comment of the file is also updated due to this design change. Signed-off-by: Lv Zheng <lv.zheng@intel.com> Reviewed-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-09-13 01:13:54 -04:00
struct acpi_ipmi_device *ipmi_device;
if (acpi_disabled)
return;
ipmi_smi_watcher_unregister(&driver_data.bmc_events);
/*
* When one smi_watcher is unregistered, it is only deleted
* from the smi_watcher list. But the smi_gone callback function
* is not called. So explicitly uninstall the ACPI IPMI oregion
* handler and free it.
*/
mutex_lock(&driver_data.ipmi_lock);
ACPI / IPMI: Fix race caused by the unprotected ACPI IPMI user This patch uses reference counting to fix the race caused by the unprotected ACPI IPMI user. There are two rules for using the ipmi_si APIs: 1. In ipmi_si, ipmi_destroy_user() can ensure that no ipmi_recv_msg will be passed to ipmi_msg_handler(), but ipmi_request_settime() can not use an invalid ipmi_user_t. This means the ipmi_si users must ensure that there won't be any local references on ipmi_user_t before invoking ipmi_destroy_user(). 2. In ipmi_si, the smi_gone()/new_smi() callbacks are protected by smi_watchers_mutex, so their execution is serialized. But as a new smi can re-use a freed intf_num, it requires that the callback implementation must not use intf_num as an identification mean or it must ensure all references to the previous smi are all dropped before exiting smi_gone() callback. As the acpi_ipmi_device->user_interface check in acpi_ipmi_space_handler() can happen before setting user_interface to NULL and codes after the check in acpi_ipmi_space_handler() can happen after user_interface becomes NULL, the on-going acpi_ipmi_space_handler() still can pass an invalid acpi_ipmi_device->user_interface to ipmi_request_settime(). Such race conditions are not allowed by the IPMI layer's API design as a crash will happen in ipmi_request_settime() if something like that happens. This patch follows the ipmi_devintf.c design: 1. Invoke ipmi_destroy_user() after the reference count of acpi_ipmi_device drops to 0. References of acpi_ipmi_device dropping to 0 also means tx_msg related to this acpi_ipmi_device are all freed. This matches the IPMI layer's API calling rule on ipmi_destroy_user() and ipmi_request_settime(). 2. ipmi_flush_tx_msg() is performed so that no on-going tx_msg can still be running in acpi_ipmi_space_handler(). And it is invoked after invoking __ipmi_dev_kill() where acpi_ipmi_device is deleted from the list with a "dead" flag set, and the "dead" flag check is also introduced to the point where a tx_msg is going to be added to the tx_msg_list so that no new tx_msg can be created after returning from the __ipmi_dev_kill(). 3. The waiting codes in ipmi_flush_tx_msg() is deleted because it is not required since this patch ensures no acpi_ipmi reference is still held for ipmi_user_t before calling ipmi_destroy_user() and ipmi_destroy_user() can ensure no more ipmi_msg_handler() can happen after returning from ipmi_destroy_user(). 4. The flushing of tx_msg is also moved out of ipmi_lock in this patch. The forthcoming IPMI operation region handler installation changes also requires acpi_ipmi_device be handled in this style. The header comment of the file is also updated due to this design change. Signed-off-by: Lv Zheng <lv.zheng@intel.com> Reviewed-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-09-13 01:13:54 -04:00
while (!list_empty(&driver_data.ipmi_devices)) {
ipmi_device = list_first_entry(&driver_data.ipmi_devices,
struct acpi_ipmi_device,
head);
__ipmi_dev_kill(ipmi_device);
mutex_unlock(&driver_data.ipmi_lock);
ipmi_flush_tx_msg(ipmi_device);
acpi_ipmi_dev_put(ipmi_device);
mutex_lock(&driver_data.ipmi_lock);
}
mutex_unlock(&driver_data.ipmi_lock);
ACPI / IPMI: Use global IPMI operation region handler It is found on a real machine, in its ACPI namespace, the IPMI OperationRegions (in the ACPI000D - ACPI power meter) are not defined under the IPMI system interface device (the IPI0001 with KCS type returned from _IFT control method): Device (PMI0) { Name (_HID, "ACPI000D") // _HID: Hardware ID OperationRegion (SYSI, IPMI, 0x0600, 0x0100) Field (SYSI, BufferAcc, Lock, Preserve) { AccessAs (BufferAcc, 0x01), Offset (0x58), SCMD, 8, GCMD, 8 } OperationRegion (POWR, IPMI, 0x3000, 0x0100) Field (POWR, BufferAcc, Lock, Preserve) { AccessAs (BufferAcc, 0x01), Offset (0xB3), GPMM, 8 } } Device (PCI0) { Device (ISA) { Device (NIPM) { Name (_HID, EisaId ("IPI0001")) // _HID: Hardware ID Method (_IFT, 0, NotSerialized) // _IFT: IPMI Interface Type { Return (0x01) } } } } Current ACPI_IPMI code registers IPMI operation region handler on a per-device basis, so for the above namespace the IPMI operation region handler is registered only under the scope of \_SB.PCI0.ISA.NIPM. Thus when an IPMI operation region field of \PMI0 is accessed, there are errors reported on such platform: ACPI Error: No handlers for Region [IPMI] ACPI Error: Region IPMI(7) has no handler The solution is to install an IPMI operation region handler from root node so that every object that defines IPMI OperationRegion can get an address space handler registered. When an IPMI operation region field is accessed, the Network Function (0x06 for SYSI and 0x30 for POWR) and the Command (SCMD, GCMD, GPMM) are passed to the operation region handler, there is no system interface specified by the BIOS. The patch tries to select one system interface by monitoring the system interface notification. IPMI messages passed from the ACPI codes are sent to this selected global IPMI system interface. The ACPI_IPMI will always select the first registered IPMI interface with an ACPI handle (i.e., defined in the ACPI namespace). It's hard to determine the selection when there are multiple IPMI system interfaces defined in the ACPI namespace. According to the IPMI specification: A BMC device may make available multiple system interfaces, but only one management controller is allowed to be 'active' BMC that provides BMC functionality for the system (in case of a 'partitioned' system, there can be only one active BMC per partition). Only the system interface(s) for the active BMC allowed to respond to the 'Get Device Id' command. According to the ipmi_si desigin: The ipmi_si registeration notifications can only happen after a successful "Get Device ID" command. Thus it should be OK for non-partitioned systems to do such selection. However, we do not have much knowledge on 'partitioned' systems. References: https://bugzilla.kernel.org/show_bug.cgi?id=46741 Signed-off-by: Lv Zheng <lv.zheng@intel.com> Reviewed-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-09-13 01:14:02 -04:00
acpi_remove_address_space_handler(ACPI_ROOT_OBJECT,
ACPI_ADR_SPACE_IPMI,
&acpi_ipmi_space_handler);
}
module_init(acpi_ipmi_init);
module_exit(acpi_ipmi_exit);