Merge tag 'ASB-2023-10-06_11-5.4' of https://android.googlesource.com/kernel/common into android13-5.4-lahaina

https://source.android.com/docs/security/bulletin/2023-10-01

* tag 'ASB-2023-10-06_11-5.4' of https://android.googlesource.com/kernel/common:
  UPSTREAM: arm64: efi: Make efi_rt_lock a raw_spinlock
  UPSTREAM: net: sched: sch_qfq: Fix UAF in qfq_dequeue()
  UPSTREAM: net/sched: sch_hfsc: Ensure inner classes have fsc curve
  UPSTREAM: net/sched: sch_qfq: account for stab overhead in qfq_enqueue
  UPSTREAM: netfilter: nf_tables: prevent OOB access in nft_byteorder_eval
  UPSTREAM: af_unix: Fix null-ptr-deref in unix_stream_sendpage().
  Linux 5.4.254
  sch_netem: fix issues in netem_change() vs get_dist_table()
  alpha: remove __init annotation from exported page_is_ram()
  scsi: core: Fix possible memory leak if device_add() fails
  scsi: snic: Fix possible memory leak if device_add() fails
  scsi: 53c700: Check that command slot is not NULL
  scsi: storvsc: Fix handling of virtual Fibre Channel timeouts
  scsi: core: Fix legacy /proc parsing buffer overflow
  netfilter: nf_tables: report use refcount overflow
  nvme-rdma: fix potential unbalanced freeze & unfreeze
  nvme-tcp: fix potential unbalanced freeze & unfreeze
  btrfs: set cache_block_group_error if we find an error
  btrfs: don't stop integrity writeback too early
  ibmvnic: Handle DMA unmapping of login buffs in release functions
  net/mlx5: Allow 0 for total host VFs
  dmaengine: mcf-edma: Fix a potential un-allocated memory access
  wifi: cfg80211: fix sband iftype data lookup for AP_VLAN
  IB/hfi1: Fix possible panic during hotplug remove
  drivers: net: prevent tun_build_skb() to exceed the packet size limit
  dccp: fix data-race around dp->dccps_mss_cache
  bonding: Fix incorrect deletion of ETH_P_8021AD protocol vid from slaves
  net/packet: annotate data-races around tp->status
  mISDN: Update parameter type of dsp_cmx_send()
  selftests/rseq: Fix build with undefined __weak
  drm/nouveau/disp: Revert a NULL check inside nouveau_connector_get_modes
  x86: Move gds_ucode_mitigated() declaration to header
  x86/mm: Fix VDSO and VVAR placement on 5-level paging machines
  x86/cpu/amd: Enable Zenbleed fix for AMD Custom APU 0405
  usb: common: usb-conn-gpio: Prevent bailing out if initial role is none
  usb: dwc3: Properly handle processing of pending events
  usb-storage: alauda: Fix uninit-value in alauda_check_media()
  binder: fix memory leak in binder_init()
  iio: cros_ec: Fix the allocation size for cros_ec_command
  nilfs2: fix use-after-free of nilfs_root in dirtying inodes via iput
  x86/pkeys: Revert a5eff72597 ("x86/pkeys: Add PKRU value to init_fpstate")
  radix tree test suite: fix incorrect allocation size for pthreads
  drm/nouveau/gr: enable memory loads on helper invocation on all channels
  dmaengine: pl330: Return DMA_PAUSED when transaction is paused
  ipv6: adjust ndisc_is_useropt() to also return true for PIO
  mmc: moxart: read scr register without changing byte order
  Linux 5.4.253
  Revert "driver core: Annotate dev_err_probe() with __must_check"
  drivers: core: fix kernel-doc markup for dev_err_probe()
  driver code: print symbolic error code
  driver core: Annotate dev_err_probe() with __must_check
  ARM: dts: nxp/imx6sll: fix wrong property name in usbphy node
  ARM: dts: imx6sll: fixup of operating points
  ARM: dts: imx: add usb alias
  ARM: dts: imx: Align L2 cache-controller nodename with dtschema
  ARM: dts: imx6sll: Make ssi node name same as other platforms
  arm64: dts: stratix10: fix incorrect I2C property for SCL signal
  ceph: defer stopping mdsc delayed_work
  ceph: use kill_anon_super helper
  ceph: show tasks waiting on caps in debugfs caps file
  PM: sleep: wakeirq: fix wake irq arming
  PM / wakeirq: support enabling wake-up irq after runtime_suspend called
  selftests/rseq: Play nice with binaries statically linked against glibc 2.35+
  selftests/rseq: check if libc rseq support is registered
  powerpc/mm/altmap: Fix altmap boundary check
  mtd: rawnand: omap_elm: Fix incorrect type in assignment
  test_firmware: return ENOMEM instead of ENOSPC on failed memory allocation
  test_firmware: prevent race conditions by a correct implementation of locking
  ext2: Drop fragment support
  fs: Protect reconfiguration of sb read-write from racing writes
  net: usbnet: Fix WARNING in usbnet_start_xmit/usb_submit_urb
  Bluetooth: L2CAP: Fix use-after-free in l2cap_sock_ready_cb
  fs/sysv: Null check to prevent null-ptr-deref bug
  net: tap_open(): set sk_uid from current_fsuid()
  net: tun_chr_open(): set sk_uid from current_fsuid()
  mtd: rawnand: meson: fix OOB available bytes for ECC
  mtd: spinand: toshiba: Fix ecc_get_status
  USB: zaurus: Add ID for A-300/B-500/C-700
  libceph: fix potential hang in ceph_osdc_notify()
  scsi: zfcp: Defer fc_rport blocking until after ADISC response
  tcp_metrics: fix data-race in tcpm_suck_dst() vs fastopen
  tcp_metrics: annotate data-races around tm->tcpm_net
  tcp_metrics: annotate data-races around tm->tcpm_vals[]
  tcp_metrics: annotate data-races around tm->tcpm_lock
  tcp_metrics: annotate data-races around tm->tcpm_stamp
  tcp_metrics: fix addr_same() helper
  ip6mr: Fix skb_under_panic in ip6mr_cache_report()
  net: dcb: choose correct policy to parse DCB_ATTR_BCN
  net: ll_temac: fix error checking of irq_of_parse_and_map()
  net: ll_temac: Switch to use dev_err_probe() helper
  driver core: add device probe log helper
  bpf: sockmap: Remove preempt_disable in sock_map_sk_acquire
  net/sched: cls_route: No longer copy tcf_result on update to avoid use-after-free
  net/sched: cls_fw: No longer copy tcf_result on update to avoid use-after-free
  net/sched: cls_u32: No longer copy tcf_result on update to avoid use-after-free
  net: add missing data-race annotation for sk_ll_usec
  net: add missing data-race annotations around sk->sk_peek_off
  net: add missing READ_ONCE(sk->sk_rcvbuf) annotation
  net: add missing READ_ONCE(sk->sk_sndbuf) annotation
  net: add missing READ_ONCE(sk->sk_rcvlowat) annotation
  net: annotate data-races around sk->sk_max_pacing_rate
  mISDN: hfcpci: Fix potential deadlock on &hc->lock
  net: sched: cls_u32: Fix match key mis-addressing
  perf test uprobe_from_different_cu: Skip if there is no gcc
  rtnetlink: let rtnl_bridge_setlink checks IFLA_BRIDGE_MODE length
  net/mlx5e: fix return value check in mlx5e_ipsec_remove_trailer()
  net/mlx5: DR, fix memory leak in mlx5dr_cmd_create_reformat_ctx
  KVM: s390: fix sthyi error handling
  word-at-a-time: use the same return type for has_zero regardless of endianness
  loop: Select I/O scheduler 'none' from inside add_disk()
  perf: Fix function pointer case
  arm64: Fix bit-shifting UB in the MIDR_CPU_MODEL() macro
  arm64: Add AMPERE1 to the Spectre-BHB affected list
  ASoC: cs42l51: fix driver to properly autoload with automatic module loading
  net/sched: sch_qfq: account for stab overhead in qfq_enqueue
  btrfs: fix race between quota disable and quota assign ioctls
  btrfs: qgroup: return ENOTCONN instead of EINVAL when quotas are not enabled
  btrfs: qgroup: remove one-time use variables for quota_root checks
  cpufreq: intel_pstate: Drop ACPI _PSS states table patching
  ACPI: processor: perflib: Avoid updating frequency QoS unnecessarily
  ACPI: processor: perflib: Use the "no limit" frequency QoS
  dm cache policy smq: ensure IO doesn't prevent cleaner policy progress
  ASoC: wm8904: Fill the cache for WM8904_ADC_TEST_0 register
  s390/dasd: fix hanging device after quiesce/resume
  virtio-net: fix race between set queues and probe
  btrfs: check if the transaction was aborted at btrfs_wait_for_commit()
  irq-bcm6345-l1: Do not assume a fixed block to cpu mapping
  tpm_tis: Explicitly check for error code
  btrfs: check for commit error at btrfs_attach_transaction_barrier()
  hwmon: (nct7802) Fix for temp6 (PECI1) processed even if PECI1 disabled
  staging: ks7010: potential buffer overflow in ks_wlan_set_encode_ext()
  Documentation: security-bugs.rst: clarify CVE handling
  Documentation: security-bugs.rst: update preferences when dealing with the linux-distros group
  Revert "usb: xhci: tegra: Fix error check"
  usb: xhci-mtk: set the dma max_seg_size
  USB: quirks: add quirk for Focusrite Scarlett
  usb: ohci-at91: Fix the unhandle interrupt when resume
  usb: dwc3: don't reset device side if dwc3 was configured as host-only
  usb: dwc3: pci: skip BYT GPIO lookup table for hardwired phy
  Revert "usb: dwc3: core: Enable AutoRetry feature in the controller"
  can: gs_usb: gs_can_close(): add missing set of CAN state to CAN_STATE_STOPPED
  USB: serial: simple: sort driver entries
  USB: serial: simple: add Kaufmann RKS+CAN VCP
  USB: serial: option: add Quectel EC200A module support
  USB: serial: option: support Quectel EM060K_128
  serial: sifive: Fix sifive_serial_console_setup() section
  serial: 8250_dw: Preserve original value of DLF register
  tracing: Fix warning in trace_buffered_event_disable()
  ring-buffer: Fix wrong stat of cpu_buffer->read
  ata: pata_ns87415: mark ns87560_tf_read static
  dm raid: fix missing reconfig_mutex unlock in raid_ctr() error paths
  block: Fix a source code comment in include/uapi/linux/blkzoned.h
  ASoC: fsl_spdif: Silence output on stop
  drm/msm: Fix IS_ERR_OR_NULL() vs NULL check in a5xx_submit_in_rb()
  drm/msm/adreno: Fix snapshot BINDLESS_DATA size
  drm/msm/dpu: drop enum dpu_core_perf_data_bus_id
  RDMA/mlx4: Make check for invalid flags stricter
  benet: fix return value check in be_lancer_xmit_workarounds()
  net/sched: mqprio: Add length check for TCA_MQPRIO_{MAX/MIN}_RATE64
  net/sched: mqprio: add extack to mqprio_parse_nlattr()
  net/sched: mqprio: refactor nlattr parsing to a separate function
  platform/x86: msi-laptop: Fix rfkill out-of-sync on MSI Wind U100
  team: reset team's flags when down link is P2P device
  bonding: reset bond's flags when down link is P2P device
  tcp: Reduce chance of collisions in inet6_hashfn().
  ipv6 addrconf: fix bug where deleting a mngtmpaddr can create a new temporary address
  ethernet: atheros: fix return value check in atl1e_tso_csum()
  phy: hisilicon: Fix an out of bounds check in hisi_inno_phy_probe()
  vxlan: calculate correct header length for GPE
  i40e: Fix an NULL vs IS_ERR() bug for debugfs_create_dir()
  ext4: fix to check return value of freeze_bdev() in ext4_shutdown()
  keys: Fix linking a duplicate key to a keyring's assoc_array
  uapi: General notification queue definitions
  scsi: qla2xxx: Array index may go out of bound
  scsi: qla2xxx: Fix inconsistent format argument type in qla_os.c
  pwm: meson: fix handling of period/duty if greater than UINT_MAX
  pwm: meson: Simplify duplicated per-channel tracking
  pwm: meson: Remove redundant assignment to variable fin_freq
  ftrace: Fix possible warning on checking all pages used in ftrace_process_locs()
  ftrace: Store the order of pages allocated in ftrace_page
  ftrace: Check if pages were allocated before calling free_pages()
  ftrace: Add information on number of page groups allocated
  fs: dlm: interrupt posix locks only when process is killed
  dlm: rearrange async condition return
  dlm: cleanup plock_op vs plock_xop
  PCI/ASPM: Avoid link retraining race
  PCI/ASPM: Factor out pcie_wait_for_retrain()
  PCI/ASPM: Return 0 or -ETIMEDOUT from pcie_retrain_link()
  ext4: Fix reusing stale buffer heads from last failed mounting
  ext4: rename journal_dev to s_journal_dev inside ext4_sb_info
  btrfs: fix extent buffer leak after tree mod log failure at split_node()
  btrfs: fix race between quota disable and relocation
  btrfs: qgroup: catch reserved space leaks at unmount time
  bcache: Fix __bch_btree_node_alloc to make the failure behavior consistent
  bcache: remove 'int n' from parameter list of bch_bucket_alloc_set()
  gpio: tps68470: Make tps68470_gpio_output() always set the initial value
  jbd2: Fix wrongly judgement for buffer head removing while doing checkpoint
  jbd2: recheck chechpointing non-dirty buffer
  jbd2: remove redundant buffer io error checks
  jbd2: fix kernel-doc markups
  jbd2: fix incorrect code style
  Linux 5.4.252
  x86: fix backwards merge of GDS/SRSO bit
  xen/netback: Fix buffer overrun triggered by unusual packet
  x86/cpu, kvm: Add support for CPUID_80000021_EAX
  x86/bugs: Increase the x86 bugs vector size to two u32s
  tools headers cpufeatures: Sync with the kernel sources
  x86/cpufeatures: Assign dedicated feature word for CPUID_0x8000001F[EAX]
  x86/cpu: Add VM page flush MSR availablility as a CPUID feature
  x86/cpufeatures: Add SEV-ES CPU feature
  Documentation/x86: Fix backwards on/off logic about YMM support
  x86/mm: Initialize text poking earlier
  mm: Move mm_cachep initialization to mm_init()
  x86/mm: Use mm_alloc() in poking_init()
  x86/mm: fix poking_init() for Xen PV guests
  x86/xen: Fix secondary processors' FPU initialization
  KVM: Add GDS_NO support to KVM
  x86/speculation: Add Kconfig option for GDS
  x86/speculation: Add force option to GDS mitigation
  x86/speculation: Add Gather Data Sampling mitigation
  x86/fpu: Move FPU initialization into arch_cpu_finalize_init()
  x86/fpu: Mark init functions __init
  x86/fpu: Remove cpuinfo argument from init functions
  init, x86: Move mem_encrypt_init() into arch_cpu_finalize_init()
  init: Invoke arch_cpu_finalize_init() earlier
  init: Remove check_bugs() leftovers
  um/cpu: Switch to arch_cpu_finalize_init()
  sparc/cpu: Switch to arch_cpu_finalize_init()
  sh/cpu: Switch to arch_cpu_finalize_init()
  mips/cpu: Switch to arch_cpu_finalize_init()
  m68k/cpu: Switch to arch_cpu_finalize_init()
  ia64/cpu: Switch to arch_cpu_finalize_init()
  ARM: cpu: Switch to arch_cpu_finalize_init()
  x86/cpu: Switch to arch_cpu_finalize_init()
  init: Provide arch_cpu_finalize_init()
  Revert "posix-timers: Ensure timer ID search-loop limit is valid"
  Revert "drm/panel: Initialise panel dev and funcs through drm_panel_init()"
  Revert "drm/panel: Add and fill drm_panel type field"
  Revert "drm/panel: simple: Add connector_type for innolux_at043tn24"
  Revert "Revert "8250: add support for ASIX devices with a FIFO bug""
  Linux 5.4.251
  tracing/histograms: Return an error if we fail to add histogram to hist_vars list
  tcp: annotate data-races around fastopenq.max_qlen
  tcp: annotate data-races around tp->notsent_lowat
  tcp: annotate data-races around rskq_defer_accept
  tcp: annotate data-races around tp->linger2
  net: Replace the limit of TCP_LINGER2 with TCP_FIN_TIMEOUT_MAX
  tcp: annotate data-races around tp->tcp_tx_delay
  netfilter: nf_tables: can't schedule in nft_chain_validate
  netfilter: nf_tables: fix spurious set element insertion failure
  llc: Don't drop packet from non-root netns.
  fbdev: au1200fb: Fix missing IRQ check in au1200fb_drv_probe
  Revert "tcp: avoid the lookup process failing to get sk in ehash table"
  net:ipv6: check return value of pskb_trim()
  iavf: Fix use-after-free in free_netdev
  net: ethernet: ti: cpsw_ale: Fix cpsw_ale_get_field()/cpsw_ale_set_field()
  pinctrl: amd: Use amd_pinconf_set() for all config options
  fbdev: imxfb: warn about invalid left/right margin
  spi: bcm63xx: fix max prepend length
  igb: Fix igb_down hung on surprise removal
  wifi: iwlwifi: mvm: avoid baid size integer overflow
  wifi: wext-core: Fix -Wstringop-overflow warning in ioctl_standard_iw_point()
  devlink: report devlink_port_type_warn source device
  bpf: Address KCSAN report on bpf_lru_list
  sched/fair: Don't balance task to its current running CPU
  arm64: mm: fix VA-range sanity check
  posix-timers: Ensure timer ID search-loop limit is valid
  md/raid10: prevent soft lockup while flush writes
  md: fix data corruption for raid456 when reshape restart while grow up
  nbd: Add the maximum limit of allocated index in nbd_dev_add
  debugobjects: Recheck debug_objects_enabled before reporting
  ext4: correct inline offset when handling xattrs in inode body
  drm/client: Fix memory leak in drm_client_modeset_probe
  drm/client: Fix memory leak in drm_client_target_cloned
  can: bcm: Fix UAF in bcm_proc_show()
  selftests: tc: set timeout to 15 minutes
  fuse: revalidate: don't invalidate if interrupted
  btrfs: fix warning when putting transaction with qgroups enabled after abort
  perf probe: Add test for regression introduced by switch to die_get_decl_file()
  drm/atomic: Fix potential use-after-free in nonblocking commits
  scsi: qla2xxx: Remove unused nvme_ls_waitq wait queue
  scsi: qla2xxx: Pointer may be dereferenced
  scsi: qla2xxx: Correct the index of array
  scsi: qla2xxx: Check valid rport returned by fc_bsg_to_rport()
  scsi: qla2xxx: Fix potential NULL pointer dereference
  scsi: qla2xxx: Wait for io return on terminate rport
  tracing/probes: Fix not to count error code to total length
  tracing: Fix null pointer dereference in tracing_err_log_open()
  xtensa: ISS: fix call to split_if_spec
  ring-buffer: Fix deadloop issue on reading trace_pipe
  tracing/histograms: Add histograms to hist_vars if they have referenced variables
  tty: serial: samsung_tty: Fix a memory leak in s3c24xx_serial_getclk() when iterating clk
  tty: serial: samsung_tty: Fix a memory leak in s3c24xx_serial_getclk() in case of error
  Revert "8250: add support for ASIX devices with a FIFO bug"
  meson saradc: fix clock divider mask length
  ceph: don't let check_caps skip sending responses for revoke msgs
  hwrng: imx-rngc - fix the timeout for init and self check
  firmware: stratix10-svc: Fix a potential resource leak in svc_create_memory_pool()
  serial: atmel: don't enable IRQs prematurely
  drm/rockchip: vop: Leave vblank enabled in self-refresh
  drm/atomic: Allow vblank-enabled + self-refresh "disable"
  fs: dlm: return positive pid value for F_GETLK
  md/raid0: add discard support for the 'original' layout
  misc: pci_endpoint_test: Re-init completion for every test
  misc: pci_endpoint_test: Free IRQs before removing the device
  PCI: rockchip: Set address alignment for endpoint mode
  PCI: rockchip: Use u32 variable to access 32-bit registers
  PCI: rockchip: Fix legacy IRQ generation for RK3399 PCIe endpoint core
  PCI: rockchip: Add poll and timeout to wait for PHY PLLs to be locked
  PCI: rockchip: Write PCI Device ID to correct register
  PCI: rockchip: Assert PCI Configuration Enable bit after probe
  PCI: qcom: Disable write access to read only registers for IP v2.3.3
  PCI: Add function 1 DMA alias quirk for Marvell 88SE9235
  PCI/PM: Avoid putting EloPOS E2/S2/H2 PCIe Ports in D3cold
  jfs: jfs_dmap: Validate db_l2nbperpage while mounting
  ext4: only update i_reserved_data_blocks on successful block allocation
  ext4: fix wrong unit use in ext4_mb_clear_bb
  erofs: fix compact 4B support for 16k block size
  SUNRPC: Fix UAF in svc_tcp_listen_data_ready()
  misc: fastrpc: Create fastrpc scalar with correct buffer count
  powerpc: Fail build if using recordmcount with binutils v2.37
  net: bcmgenet: Ensure MDIO unregistration has clocks enabled
  mtd: rawnand: meson: fix unaligned DMA buffers handling
  tpm: tpm_vtpm_proxy: fix a race condition in /dev/vtpmx creation
  pinctrl: amd: Only use special debounce behavior for GPIO 0
  pinctrl: amd: Detect internal GPIO0 debounce handling
  pinctrl: amd: Fix mistake in handling clearing pins at startup
  net/sched: make psched_mtu() RTNL-less safe
  net/sched: flower: Ensure both minimum and maximum ports are specified
  cls_flower: Add extack support for src and dst port range options
  wifi: airo: avoid uninitialized warning in airo_get_rate()
  erofs: avoid infinite loop in z_erofs_do_read_page() when reading beyond EOF
  platform/x86: wmi: Break possible infinite loop when parsing GUID
  platform/x86: wmi: move variables
  platform/x86: wmi: use guid_t and guid_equal()
  platform/x86: wmi: remove unnecessary argument
  platform/x86: wmi: Fix indentation in some cases
  platform/x86: wmi: Replace UUID redefinitions by their originals
  ipv6/addrconf: fix a potential refcount underflow for idev
  NTB: ntb_tool: Add check for devm_kcalloc
  NTB: ntb_transport: fix possible memory leak while device_register() fails
  ntb: intel: Fix error handling in intel_ntb_pci_driver_init()
  NTB: amd: Fix error handling in amd_ntb_pci_driver_init()
  ntb: idt: Fix error handling in idt_pci_driver_init()
  udp6: fix udp6_ehashfn() typo
  icmp6: Fix null-ptr-deref of ip6_null_entry->rt6i_idev in icmp6_dev().
  ionic: remove WARN_ON to prevent panic_on_warn
  ionic: ionic_intr_free parameter change
  ionic: move irq request to qcq alloc
  ionic: clean irq affinity on queue deinit
  ionic: improve irq numa locality
  net/sched: cls_fw: Fix improper refcount update leads to use-after-free
  net: mvneta: fix txq_map in case of txq_number==1
  scsi: qla2xxx: Fix error code in qla2x00_start_sp()
  igc: set TP bit in 'supported' and 'advertising' fields of ethtool_link_ksettings
  igc: Remove delay during TX ring configuration
  drm/panel: simple: Add connector_type for innolux_at043tn24
  drm/panel: Add and fill drm_panel type field
  drm/panel: Initialise panel dev and funcs through drm_panel_init()
  workqueue: clean up WORK_* constant types, clarify masking
  net: lan743x: Don't sleep in atomic context
  block/partition: fix signedness issue for Amiga partitions
  tty: serial: fsl_lpuart: add earlycon for imx8ulp platform
  netfilter: nf_tables: prevent OOB access in nft_byteorder_eval
  netfilter: conntrack: Avoid nf_ct_helper_hash uses after free
  netfilter: nf_tables: fix scheduling-while-atomic splat
  netfilter: nf_tables: unbind non-anonymous set if rule construction fails
  netfilter: nf_tables: reject unbound anonymous set before commit phase
  netfilter: nf_tables: add NFT_TRANS_PREPARE_ERROR to deal with bound set/chain
  netfilter: nf_tables: incorrect error path handling with NFT_MSG_NEWRULE
  netfilter: nf_tables: add rescheduling points during loop detection walks
  netfilter: nf_tables: use net_generic infra for transaction data
  netfilter: add helper function to set up the nfnetlink header and use it
  netfilter: nftables: add helper function to set the base sequence number
  netfilter: nf_tables: fix nat hook table deletion
  block: add overflow checks for Amiga partition support
  fanotify: disallow mount/sb marks on kernel internal pseudo fs
  fs: no need to check source
  ARM: orion5x: fix d2net gpio initialization
  btrfs: fix race when deleting quota root from the dirty cow roots list
  fs: Lock moved directories
  fs: Establish locking order for unrelated directories
  Revert "f2fs: fix potential corruption when moving a directory"
  ext4: Remove ext4 locking of moved directory
  fs: avoid empty option when generating legacy mount string
  jffs2: reduce stack usage in jffs2_build_xattr_subsystem()
  integrity: Fix possible multiple allocation in integrity_inode_get()
  bcache: Remove unnecessary NULL point check in node allocations
  mmc: sdhci: fix DMA configure compatibility issue when 64bit DMA mode is used.
  mmc: core: disable TRIM on Micron MTFC4GACAJCN-1M
  mmc: core: disable TRIM on Kingston EMMC04G-M627
  NFSD: add encoding of op_recall flag for write delegation
  ALSA: jack: Fix mutex call in snd_jack_report()
  i2c: xiic: Don't try to handle more interrupt events after error
  i2c: xiic: Defer xiic_wakeup() and __xiic_start_xfer() in xiic_process()
  sh: dma: Fix DMA channel offset calculation
  net: dsa: tag_sja1105: fix MAC DA patching from meta frames
  net/sched: act_pedit: Add size check for TCA_PEDIT_PARMS_EX
  xsk: Honor SO_BINDTODEVICE on bind
  xsk: Improve documentation for AF_XDP
  tcp: annotate data races in __tcp_oow_rate_limited()
  net: bridge: keep ports without IFF_UNICAST_FLT in BR_PROMISC mode
  powerpc: allow PPC_EARLY_DEBUG_CPM only when SERIAL_CPM=y
  f2fs: fix error path handling in truncate_dnode()
  mailbox: ti-msgmgr: Fill non-message tx data fields with 0x0
  spi: bcm-qspi: return error if neither hif_mspi nor mspi is available
  Add MODULE_FIRMWARE() for FIRMWARE_TG357766.
  sctp: fix potential deadlock on &net->sctp.addr_wq_lock
  rtc: st-lpc: Release some resources in st_rtc_probe() in case of error
  pwm: sysfs: Do not apply state to already disabled PWMs
  pwm: imx-tpm: force 'real_period' to be zero in suspend
  mfd: stmpe: Only disable the regulators if they are enabled
  KVM: s390: vsie: fix the length of APCB bitmap
  mfd: stmfx: Fix error path in stmfx_chip_init
  serial: 8250_omap: Use force_suspend and resume for system suspend
  mfd: intel-lpss: Add missing check for platform_get_resource
  usb: dwc3: qcom: Release the correct resources in dwc3_qcom_remove()
  KVM: s390: fix KVM_S390_GET_CMMA_BITS for GFNs in memslot holes
  mfd: rt5033: Drop rt5033-battery sub-device
  usb: hide unused usbfs_notify_suspend/resume functions
  usb: phy: phy-tahvo: fix memory leak in tahvo_usb_probe()
  extcon: Fix kernel doc of property capability fields to avoid warnings
  extcon: Fix kernel doc of property fields to avoid warnings
  usb: dwc3: qcom: Fix potential memory leak
  media: usb: siano: Fix warning due to null work_func_t function pointer
  media: videodev2.h: Fix struct v4l2_input tuner index comment
  media: usb: Check az6007_read() return value
  sh: j2: Use ioremap() to translate device tree address into kernel memory
  w1: fix loop in w1_fini()
  block: change all __u32 annotations to __be32 in affs_hardblocks.h
  block: fix signed int overflow in Amiga partition support
  usb: dwc3: gadget: Propagate core init errors to UDC during pullup
  USB: serial: option: add LARA-R6 01B PIDs
  hwrng: st - keep clock enabled while hwrng is registered
  hwrng: st - Fix W=1 unused variable warning
  NFSv4.1: freeze the session table upon receiving NFS4ERR_BADSESSION
  ARC: define ASM_NL and __ALIGN(_STR) outside #ifdef __ASSEMBLY__ guard
  modpost: fix off by one in is_executable_section()
  crypto: marvell/cesa - Fix type mismatch warning
  modpost: fix section mismatch message for R_ARM_{PC24,CALL,JUMP24}
  modpost: fix section mismatch message for R_ARM_ABS32
  crypto: nx - fix build warnings when DEBUG_FS is not enabled
  hwrng: virtio - Fix race on data_avail and actual data
  hwrng: virtio - always add a pending request
  hwrng: virtio - don't waste entropy
  hwrng: virtio - don't wait on cleanup
  hwrng: virtio - add an internal buffer
  powerpc/mm/dax: Fix the condition when checking if altmap vmemap can cross-boundary
  pinctrl: at91-pio4: check return value of devm_kasprintf()
  perf dwarf-aux: Fix off-by-one in die_get_varname()
  pinctrl: cherryview: Return correct value if pin in push-pull mode
  PCI: Add pci_clear_master() stub for non-CONFIG_PCI
  PCI: ftpci100: Release the clock resources
  PCI: pciehp: Cancel bringup sequence if card is not present
  scsi: 3w-xxxx: Add error handling for initialization failure in tw_probe()
  PCI/ASPM: Disable ASPM on MFD function removal to avoid use-after-free
  scsi: qedf: Fix NULL dereference in error handling
  ASoC: imx-audmix: check return value of devm_kasprintf()
  clk: keystone: sci-clk: check return value of kasprintf()
  clk: cdce925: check return value of kasprintf()
  ALSA: ac97: Fix possible NULL dereference in snd_ac97_mixer
  clk: tegra: tegra124-emc: Fix potential memory leak
  drm/radeon: fix possible division-by-zero errors
  drm/amdkfd: Fix potential deallocation of previously deallocated memory.
  fbdev: omapfb: lcd_mipid: Fix an error handling path in mipid_spi_probe()
  arm64: dts: renesas: ulcb-kf: Remove flow control for SCIF1
  IB/hfi1: Fix sdma.h tx->num_descs off-by-one errors
  soc/fsl/qe: fix usb.c build errors
  ASoC: es8316: Do not set rate constraints for unsupported MCLKs
  ASoC: es8316: Increment max value for ALC Capture Target Volume control
  memory: brcmstb_dpfe: fix testing array offset after use
  ARM: ep93xx: fix missing-prototype warnings
  drm/panel: simple: fix active size for Ampire AM-480272H3TMQW-T01H
  arm64: dts: qcom: msm8916: correct camss unit address
  ARM: dts: gta04: Move model property out of pinctrl node
  RDMA/bnxt_re: Fix to remove an unnecessary log
  drm: sun4i_tcon: use devm_clk_get_enabled in `sun4i_tcon_init_clocks`
  Input: adxl34x - do not hardcode interrupt trigger type
  ARM: dts: BCM5301X: Drop "clock-names" from the SPI node
  Input: drv260x - sleep between polling GO bit
  radeon: avoid double free in ci_dpm_init()
  netlink: Add __sock_i_ino() for __netlink_diag_dump().
  ipvlan: Fix return value of ipvlan_queue_xmit()
  netfilter: nf_conntrack_sip: fix the ct_sip_parse_numerical_param() return value.
  netfilter: conntrack: dccp: copy entire header to stack buffer, not just basic one
  lib/ts_bm: reset initial match offset for every block of text
  net: nfc: Fix use-after-free caused by nfc_llcp_find_local
  nfc: llcp: simplify llcp_sock_connect() error paths
  gtp: Fix use-after-free in __gtp_encap_destroy().
  selftests: rtnetlink: remove netdevsim device after ipsec offload test
  netlink: do not hard code device address lenth in fdb dumps
  netlink: fix potential deadlock in netlink_set_err()
  wifi: ath9k: convert msecs to jiffies where needed
  wifi: cfg80211: rewrite merging of inherited elements
  wifi: iwlwifi: pull from TXQs with softirqs disabled
  rtnetlink: extend RTEXT_FILTER_SKIP_STATS to IFLA_VF_INFO
  wifi: ath9k: Fix possible stall on ath9k_txq_list_has_key()
  memstick r592: make memstick_debug_get_tpc_name() static
  kexec: fix a memory leak in crash_shrink_memory()
  watchdog/perf: more properly prevent false positives with turbo modes
  watchdog/perf: define dummy watchdog_update_hrtimer_threshold() on correct config
  wifi: rsi: Do not set MMC_PM_KEEP_POWER in shutdown
  wifi: ath9k: don't allow to overwrite ENDPOINT0 attributes
  wifi: ray_cs: Fix an error handling path in ray_probe()
  wifi: ray_cs: Drop useless status variable in parse_addr()
  wifi: ray_cs: Utilize strnlen() in parse_addr()
  wifi: wl3501_cs: Fix an error handling path in wl3501_probe()
  wl3501_cs: use eth_hw_addr_set()
  net: create netdev->dev_addr assignment helpers
  wl3501_cs: Fix misspelling and provide missing documentation
  wl3501_cs: Remove unnecessary NULL check
  wl3501_cs: Fix a bunch of formatting issues related to function docs
  wifi: atmel: Fix an error handling path in atmel_probe()
  wifi: orinoco: Fix an error handling path in orinoco_cs_probe()
  wifi: orinoco: Fix an error handling path in spectrum_cs_probe()
  regulator: core: Streamline debugfs operations
  regulator: core: Fix more error checking for debugfs_create_dir()
  nfc: llcp: fix possible use of uninitialized variable in nfc_llcp_send_connect()
  nfc: constify several pointers to u8, char and sk_buff
  wifi: mwifiex: Fix the size of a memory allocation in mwifiex_ret_802_11_scan()
  spi: spi-geni-qcom: Correct CS_TOGGLE bit in SPI_TRANS_CFG
  samples/bpf: Fix buffer overflow in tcp_basertt
  wifi: ath9k: avoid referencing uninit memory in ath9k_wmi_ctrl_rx
  wifi: ath9k: fix AR9003 mac hardware hang check register offset calculation
  ima: Fix build warnings
  pstore/ram: Add check for kstrdup
  evm: Complete description of evm_inode_setattr()
  ARM: 9303/1: kprobes: avoid missing-declaration warnings
  powercap: RAPL: Fix CONFIG_IOSF_MBI dependency
  PM: domains: fix integer overflow issues in genpd_parse_state()
  clocksource/drivers/cadence-ttc: Fix memory leak in ttc_timer_probe
  clocksource/drivers/cadence-ttc: Use ttc driver as platform driver
  tracing/timer: Add missing hrtimer modes to decode_hrtimer_mode().
  irqchip/jcore-aic: Fix missing allocation of IRQ descriptors
  irqchip/jcore-aic: Kill use of irq_create_strict_mappings()
  md/raid10: fix io loss while replacement replace rdev
  md/raid10: fix null-ptr-deref of mreplace in raid10_sync_request
  md/raid10: fix wrong setting of max_corr_read_errors
  md/raid10: fix overflow of md/safe_mode_delay
  md/raid10: check slab-out-of-bounds in md_bitmap_get_counter
  x86/resctrl: Only show tasks' pid in current pid namespace
  x86/resctrl: Use is_closid_match() in more places
  bgmac: fix *initial* chip reset to support BCM5358
  drm/amdgpu: Validate VM ioctl flags.
  scripts/tags.sh: Resolve gtags empty index generation
  drm/i915: Initialise outparam for error return from wait_for_register
  HID: wacom: Use ktime_t rather than int when dealing with timestamps
  fbdev: imsttfb: Fix use after free bug in imsttfb_probe
  video: imsttfb: check for ioremap() failures
  x86/smp: Use dedicated cache-line for mwait_play_dead()
  gfs2: Don't deref jdesc in evict
  Linux 5.4.250
  x86/cpu/amd: Add a Zenbleed fix
  x86/cpu/amd: Move the errata checking functionality up
  x86/microcode/AMD: Load late on both threads too

 Conflicts:
	drivers/usb/dwc3/gadget.c

Change-Id: Ibd4bab8255496e4640f2eaf4eb7836209dd7cbfb
Michael Bestas 2023-10-16 15:49:50 +03:00
commit 6ef34c09c6
520 changed files with 5352 additions and 3028 deletions


@@ -480,16 +480,17 @@ Description: information about CPUs heterogeneity.
cpu_capacity: capacity of cpu#.
What: /sys/devices/system/cpu/vulnerabilities
/sys/devices/system/cpu/vulnerabilities/meltdown
/sys/devices/system/cpu/vulnerabilities/spectre_v1
/sys/devices/system/cpu/vulnerabilities/spectre_v2
/sys/devices/system/cpu/vulnerabilities/spec_store_bypass
/sys/devices/system/cpu/vulnerabilities/gather_data_sampling
/sys/devices/system/cpu/vulnerabilities/itlb_multihit
/sys/devices/system/cpu/vulnerabilities/l1tf
/sys/devices/system/cpu/vulnerabilities/mds
/sys/devices/system/cpu/vulnerabilities/meltdown
/sys/devices/system/cpu/vulnerabilities/mmio_stale_data
/sys/devices/system/cpu/vulnerabilities/spec_store_bypass
/sys/devices/system/cpu/vulnerabilities/spectre_v1
/sys/devices/system/cpu/vulnerabilities/spectre_v2
/sys/devices/system/cpu/vulnerabilities/srbds
/sys/devices/system/cpu/vulnerabilities/tsx_async_abort
/sys/devices/system/cpu/vulnerabilities/itlb_multihit
/sys/devices/system/cpu/vulnerabilities/mmio_stale_data
Date: January 2018
Contact: Linux kernel mailing list <linux-kernel@vger.kernel.org>
Description: Information about CPU vulnerabilities


@@ -0,0 +1,109 @@
.. SPDX-License-Identifier: GPL-2.0
GDS - Gather Data Sampling
==========================
Gather Data Sampling is a hardware vulnerability which allows unprivileged
speculative access to data which was previously stored in vector registers.
Problem
-------
When a gather instruction performs loads from memory, different data elements
are merged into the destination vector register. However, when a gather
instruction that is transiently executed encounters a fault, stale data from
architectural or internal vector registers may get transiently forwarded to the
destination vector register instead. This will allow a malicious attacker to
infer stale data using typical side channel techniques like cache timing
attacks. GDS is a purely sampling-based attack.
The attacker uses gather instructions to infer the stale vector register data.
The victim does not need to do anything special other than use the vector
registers. The victim does not need to use gather instructions to be
vulnerable.
Because the buffers are shared between Hyper-Threads cross Hyper-Thread attacks
are possible.
Attack scenarios
----------------
Without mitigation, GDS can infer stale data across virtually all
permission boundaries:
Non-enclaves can infer SGX enclave data
Userspace can infer kernel data
Guests can infer data from hosts
Guest can infer guest from other guests
Users can infer data from other users
Because of this, it is important to ensure that the mitigation stays enabled in
lower-privilege contexts like guests and when running outside SGX enclaves.
The hardware enforces the mitigation for SGX. Likewise, VMMs should ensure
that guests are not allowed to disable the GDS mitigation. If a host erred and
allowed this, a guest could theoretically disable GDS mitigation, mount an
attack, and re-enable it.
Mitigation mechanism
--------------------
This issue is mitigated in microcode. The microcode defines the following new
bits:
================================ === ============================
IA32_ARCH_CAPABILITIES[GDS_CTRL] R/O Enumerates GDS vulnerability
and mitigation support.
IA32_ARCH_CAPABILITIES[GDS_NO] R/O Processor is not vulnerable.
IA32_MCU_OPT_CTRL[GDS_MITG_DIS] R/W Disables the mitigation
0 by default.
IA32_MCU_OPT_CTRL[GDS_MITG_LOCK] R/W Locks GDS_MITG_DIS=0. Writes
to GDS_MITG_DIS are ignored
Can't be cleared once set.
================================ === ============================
GDS can also be mitigated on systems that don't have updated microcode by
disabling AVX. This can be done by setting gather_data_sampling="force" or
"clearcpuid=avx" on the kernel command-line.
If used, these options will disable AVX use by turning off XSAVE YMM support.
However, the processor will still enumerate AVX support. Userspace that
does not follow proper AVX enumeration to check both AVX *and* XSAVE YMM
support will break.
Mitigation control on the kernel command line
---------------------------------------------
The mitigation can be disabled by setting "gather_data_sampling=off" or
"mitigations=off" on the kernel command line. Not specifying either will default
to the mitigation being enabled. Specifying "gather_data_sampling=force" will
use the microcode mitigation when available or disable AVX on affected systems
where the microcode hasn't been updated to include the mitigation.
GDS System Information
------------------------
The kernel provides vulnerability status information through sysfs. For
GDS this can be accessed by the following sysfs file:
/sys/devices/system/cpu/vulnerabilities/gather_data_sampling
The possible values contained in this file are:
============================== =============================================
Not affected Processor not vulnerable.
Vulnerable Processor vulnerable and mitigation disabled.
Vulnerable: No microcode Processor vulnerable and microcode is missing
mitigation.
Mitigation: AVX disabled,
no microcode Processor is vulnerable and microcode is missing
mitigation. AVX disabled as mitigation.
Mitigation: Microcode Processor is vulnerable and mitigation is in
effect.
Mitigation: Microcode (locked) Processor is vulnerable and mitigation is in
effect and cannot be disabled.
Unknown: Dependent on
hypervisor status Running on a virtual guest processor that is
affected but with no way to know if host
processor is mitigated or vulnerable.
============================== =============================================
GDS Default mitigation
----------------------
The updated microcode will enable the mitigation by default. The kernel's
default action is to leave the mitigation enabled.
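For illustration only (an editor's aside, not part of the patch): userspace can check the reported state by reading the sysfs file described above, for example with a small C program:

.. code-block:: c

	/* Sketch, not part of the patch: print the GDS status string
	 * exposed by the sysfs file documented above. */
	#include <stdio.h>

	int main(void)
	{
		const char *path =
			"/sys/devices/system/cpu/vulnerabilities/gather_data_sampling";
		char status[128];
		FILE *f = fopen(path, "r");

		if (!f || !fgets(status, sizeof(status), f)) {
			perror(path);
			if (f)
				fclose(f);
			return 1;
		}
		fclose(f);
		printf("GDS: %s", status);	/* the file already ends in '\n' */
		return 0;
	}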


@@ -16,3 +16,4 @@ are configurable at compile, boot or run time.
multihit.rst
special-register-buffer-data-sampling.rst
processor_mmio_stale_data.rst
gather_data_sampling.rst


@@ -1345,6 +1345,26 @@
Format: off | on
default: on
gather_data_sampling=
[X86,INTEL] Control the Gather Data Sampling (GDS)
mitigation.
Gather Data Sampling is a hardware vulnerability which
allows unprivileged speculative access to data which was
previously stored in vector registers.
This issue is mitigated by default in updated microcode.
The mitigation may have a performance impact but can be
disabled. On systems without the microcode mitigation
disabling AVX serves as a mitigation.
force: Disable AVX to mitigate systems without
microcode mitigation. No effect if the microcode
mitigation is present. Known to cause crashes in
userspace with buggy AVX enumeration.
off: Disable GDS mitigation.
gcov_persist= [GCOV] When non-zero (default), profiling data for
kernel modules is saved and remains accessible via
debugfs, even when the module is unloaded/reloaded.
@@ -2705,21 +2725,22 @@
Disable all optional CPU mitigations. This
improves system performance, but it may also
expose users to several CPU vulnerabilities.
Equivalent to: nopti [X86,PPC]
Equivalent to: gather_data_sampling=off [X86]
kpti=0 [ARM64]
nospectre_v1 [X86,PPC]
nobp=0 [S390]
nospectre_v2 [X86,PPC,S390,ARM64]
spectre_v2_user=off [X86]
spec_store_bypass_disable=off [X86,PPC]
ssbd=force-off [ARM64]
kvm.nx_huge_pages=off [X86]
l1tf=off [X86]
mds=off [X86]
tsx_async_abort=off [X86]
kvm.nx_huge_pages=off [X86]
mmio_stale_data=off [X86]
no_entry_flush [PPC]
no_uaccess_flush [PPC]
mmio_stale_data=off [X86]
nobp=0 [S390]
nopti [X86,PPC]
nospectre_v1 [X86,PPC]
nospectre_v2 [X86,PPC,S390,ARM64]
spec_store_bypass_disable=off [X86,PPC]
spectre_v2_user=off [X86]
ssbd=force-off [ARM64]
tsx_async_abort=off [X86]
Exceptions:
This does not have any effect on


@@ -56,31 +56,28 @@ information submitted to the security list and any followup discussions
of the report are treated confidentially even after the embargo has been
lifted, in perpetuity.
Coordination
------------
Coordination with other groups
------------------------------
Fixes for sensitive bugs, such as those that might lead to privilege
escalations, may need to be coordinated with the private
<linux-distros@vs.openwall.org> mailing list so that distribution vendors
are well prepared to issue a fixed kernel upon public disclosure of the
upstream fix. Distros will need some time to test the proposed patch and
will generally request at least a few days of embargo, and vendor update
publication prefers to happen Tuesday through Thursday. When appropriate,
the security team can assist with this coordination, or the reporter can
include linux-distros from the start. In this case, remember to prefix
the email Subject line with "[vs]" as described in the linux-distros wiki:
<http://oss-security.openwall.org/wiki/mailing-lists/distros#how-to-use-the-lists>
The kernel security team strongly recommends that reporters of potential
security issues NEVER contact the "linux-distros" mailing list until
AFTER discussing it with the kernel security team. Do not Cc: both
lists at once. You may contact the linux-distros mailing list after a
fix has been agreed on and you fully understand the requirements that
doing so will impose on you and the kernel community.
The different lists have different goals and the linux-distros rules do
not contribute to actually fixing any potential security problems.
CVE assignment
--------------
The security team does not normally assign CVEs, nor do we require them
for reports or fixes, as this can needlessly complicate the process and
may delay the bug handling. If a reporter wishes to have a CVE identifier
assigned ahead of public disclosure, they will need to contact the private
linux-distros list, described above. When such a CVE identifier is known
before a patch is provided, it is desirable to mention it in the commit
message if the reporter agrees.
The security team does not assign CVEs, nor do we require them for
reports or fixes, as this can needlessly complicate the process and may
delay the bug handling. If a reporter wishes to have a CVE identifier
assigned, they should find one by themselves, for example by contacting
MITRE directly. However under no circumstances will a patch inclusion
be delayed to wait for a CVE identifier to arrive.
Non-disclosure agreements
-------------------------


@@ -22,12 +22,11 @@ exclusive.
3) object removal. Locking rules: caller locks parent, finds victim,
locks victim and calls the method. Locks are exclusive.
4) rename() that is _not_ cross-directory. Locking rules: caller locks
the parent and finds source and target. In case of exchange (with
RENAME_EXCHANGE in flags argument) lock both. In any case,
if the target already exists, lock it. If the source is a non-directory,
lock it. If we need to lock both, lock them in inode pointer order.
Then call the method. All locks are exclusive.
4) rename() that is _not_ cross-directory. Locking rules: caller locks the
parent and finds source and target. We lock both (provided they exist). If we
need to lock two inodes of different type (dir vs non-dir), we lock directory
first. If we need to lock two inodes of the same type, lock them in inode
pointer order. Then call the method. All locks are exclusive.
NB: we might get away with locking the the source (and target in exchange
case) shared.
@@ -44,15 +43,17 @@ All locks are exclusive.
rules:
* lock the filesystem
* lock parents in "ancestors first" order.
* lock parents in "ancestors first" order. If one is not ancestor of
the other, lock them in inode pointer order.
* find source and target.
* if old parent is equal to or is a descendent of target
fail with -ENOTEMPTY
* if new parent is equal to or is a descendent of source
fail with -ELOOP
* If it's an exchange, lock both the source and the target.
* If the target exists, lock it. If the source is a non-directory,
lock it. If we need to lock both, do so in inode pointer order.
* Lock both the source and the target provided they exist. If we
need to lock two inodes of different type (dir vs non-dir), we lock
the directory first. If we need to lock two inodes of the same type,
lock them in inode pointer order.
* call the method.
All ->i_rwsem are taken exclusive. Again, we might get away with locking
@@ -66,8 +67,9 @@ If no directory is its own ancestor, the scheme above is deadlock-free.
Proof:
First of all, at any moment we have a partial ordering of the
objects - A < B iff A is an ancestor of B.
First of all, at any moment we have a linear ordering of the
objects - A < B iff (A is an ancestor of B) or (B is not an ancestor
of A and ptr(A) < ptr(B)).
That ordering can change. However, the following is true:


@@ -40,13 +40,13 @@ allocates memory for this UMEM using whatever means it feels is most
appropriate (malloc, mmap, huge pages, etc). This memory area is then
registered with the kernel using the new setsockopt XDP_UMEM_REG. The
UMEM also has two rings: the FILL ring and the COMPLETION ring. The
fill ring is used by the application to send down addr for the kernel
FILL ring is used by the application to send down addr for the kernel
to fill in with RX packet data. References to these frames will then
appear in the RX ring once each packet has been received. The
completion ring, on the other hand, contains frame addr that the
COMPLETION ring, on the other hand, contains frame addr that the
kernel has transmitted completely and can now be used again by user
space, for either TX or RX. Thus, the frame addrs appearing in the
completion ring are addrs that were previously transmitted using the
COMPLETION ring are addrs that were previously transmitted using the
TX ring. In summary, the RX and FILL rings are used for the RX path
and the TX and COMPLETION rings are used for the TX path.
@@ -91,11 +91,16 @@ Concepts
========
In order to use an AF_XDP socket, a number of associated objects need
to be setup.
to be setup. These objects and their options are explained in the
following sections.
Jonathan Corbet has also written an excellent article on LWN,
"Accelerating networking with AF_XDP". It can be found at
https://lwn.net/Articles/750845/.
For an overview on how AF_XDP works, you can also take a look at the
Linux Plumbers paper from 2018 on the subject:
http://vger.kernel.org/lpc_net2018_talks/lpc18_paper_af_xdp_perf-v2.pdf. Do
NOT consult the paper from 2017 on "AF_PACKET v4", the first attempt
at AF_XDP. Nearly everything changed since then. Jonathan Corbet has
also written an excellent article on LWN, "Accelerating networking
with AF_XDP". It can be found at https://lwn.net/Articles/750845/.
UMEM
----
@@ -113,22 +118,22 @@ the next socket B can do this by setting the XDP_SHARED_UMEM flag in
struct sockaddr_xdp member sxdp_flags, and passing the file descriptor
of A to struct sockaddr_xdp member sxdp_shared_umem_fd.
The UMEM has two single-producer/single-consumer rings, that are used
The UMEM has two single-producer/single-consumer rings that are used
to transfer ownership of UMEM frames between the kernel and the
user-space application.
Rings
-----
There are a four different kind of rings: Fill, Completion, RX and
There are a four different kind of rings: FILL, COMPLETION, RX and
TX. All rings are single-producer/single-consumer, so the user-space
application need explicit synchronization of multiple
processes/threads are reading/writing to them.
The UMEM uses two rings: Fill and Completion. Each socket associated
The UMEM uses two rings: FILL and COMPLETION. Each socket associated
with the UMEM must have an RX queue, TX queue or both. Say, that there
is a setup with four sockets (all doing TX and RX). Then there will be
one Fill ring, one Completion ring, four TX rings and four RX rings.
one FILL ring, one COMPLETION ring, four TX rings and four RX rings.
The rings are head(producer)/tail(consumer) based rings. A producer
writes the data ring at the index pointed out by struct xdp_ring
@@ -146,7 +151,7 @@ The size of the rings need to be of size power of two.
UMEM Fill Ring
~~~~~~~~~~~~~~
The Fill ring is used to transfer ownership of UMEM frames from
The FILL ring is used to transfer ownership of UMEM frames from
user-space to kernel-space. The UMEM addrs are passed in the ring. As
an example, if the UMEM is 64k and each chunk is 4k, then the UMEM has
16 chunks and can pass addrs between 0 and 64k.
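As a sketch of what this looks like in practice (not part of the patch; it assumes the libbpf xsk.h helpers described later in this document and an already-created FILL ring), handing one frame address to the kernel could be done like this:

.. code-block:: c

	/* Sketch, not part of the patch: give one UMEM frame address to the
	 * kernel on the FILL ring using the helpers from tools/lib/bpf/xsk.h.
	 * 'fill_ring' is assumed to have been set up elsewhere. */
	#include <bpf/xsk.h>

	static int give_frame_to_kernel(struct xsk_ring_prod *fill_ring, __u64 addr)
	{
		__u32 idx;

		if (xsk_ring_prod__reserve(fill_ring, 1, &idx) != 1)
			return -1;			/* ring currently full */

		*xsk_ring_prod__fill_addr(fill_ring, idx) = addr;
		xsk_ring_prod__submit(fill_ring, 1);
		return 0;
	}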
@@ -164,8 +169,8 @@ chunks mode, then the incoming addr will be left untouched.
UMEM Completion Ring
~~~~~~~~~~~~~~~~~~~~
The Completion Ring is used transfer ownership of UMEM frames from
kernel-space to user-space. Just like the Fill ring, UMEM indicies are
The COMPLETION Ring is used transfer ownership of UMEM frames from
kernel-space to user-space. Just like the FILL ring, UMEM indices are
used.
Frames passed from the kernel to user-space are frames that has been
@@ -181,7 +186,7 @@ The RX ring is the receiving side of a socket. Each entry in the ring
is a struct xdp_desc descriptor. The descriptor contains UMEM offset
(addr) and the length of the data (len).
If no frames have been passed to kernel via the Fill ring, no
If no frames have been passed to kernel via the FILL ring, no
descriptors will (or can) appear on the RX ring.
The user application consumes struct xdp_desc descriptors from this
@@ -199,8 +204,24 @@ be relaxed in the future.
The user application produces struct xdp_desc descriptors to this
ring.
Libbpf
======
Libbpf is a helper library for eBPF and XDP that makes using these
technologies a lot simpler. It also contains specific helper functions
in tools/lib/bpf/xsk.h for facilitating the use of AF_XDP. It
contains two types of functions: those that can be used to make the
setup of AF_XDP socket easier and ones that can be used in the data
plane to access the rings safely and quickly. To see an example on how
to use this API, please take a look at the sample application in
samples/bpf/xdpsock_usr.c which uses libbpf for both setup and data
plane operations.
We recommend that you use this library unless you have become a power
user. It will make your program a lot simpler.
XSKMAP / BPF_MAP_TYPE_XSKMAP
----------------------------
============================
On XDP side there is a BPF map type BPF_MAP_TYPE_XSKMAP (XSKMAP) that
is used in conjunction with bpf_redirect_map() to pass the ingress
@@ -216,21 +237,193 @@ queue 17. Only the XDP program executing for eth0 and queue 17 will
successfully pass data to the socket. Please refer to the sample
application (samples/bpf/) in for an example.
Configuration Flags and Socket Options
======================================
These are the various configuration flags that can be used to control
and monitor the behavior of AF_XDP sockets.
XDP_COPY and XDP_ZERO_COPY bind flags
-------------------------------------
When you bind to a socket, the kernel will first try to use zero-copy
copy. If zero-copy is not supported, it will fall back on using copy
mode, i.e. copying all packets out to user space. But if you would
like to force a certain mode, you can use the following flags. If you
pass the XDP_COPY flag to the bind call, the kernel will force the
socket into copy mode. If it cannot use copy mode, the bind call will
fail with an error. Conversely, the XDP_ZERO_COPY flag will force the
socket into zero-copy mode or fail.
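A minimal sketch of such a bind call (not part of the patch; ifindex and queue_id are placeholders for the device and queue to attach to) could look like this:

.. code-block:: c

	/* Sketch, not part of the patch: force copy mode at bind time.
	 * AF_XDP (44) is provided by recent libc socket headers. */
	#include <linux/if_xdp.h>
	#include <sys/socket.h>
	#include <string.h>

	static int bind_xsk_copy_mode(int xsk_fd, int ifindex, __u32 queue_id)
	{
		struct sockaddr_xdp sxdp;

		memset(&sxdp, 0, sizeof(sxdp));
		sxdp.sxdp_family = AF_XDP;
		sxdp.sxdp_ifindex = ifindex;
		sxdp.sxdp_queue_id = queue_id;
		sxdp.sxdp_flags = XDP_COPY;	/* or XDP_ZERO_COPY to require it */

		return bind(xsk_fd, (struct sockaddr *)&sxdp, sizeof(sxdp));
	}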
XDP_SHARED_UMEM bind flag
-------------------------
This flag enables you to bind multiple sockets to the same UMEM, but
only if they share the same queue id. In this mode, each socket has
their own RX and TX rings, but the UMEM (tied to the fist socket
created) only has a single FILL ring and a single COMPLETION
ring. To use this mode, create the first socket and bind it in the normal
way. Create a second socket and create an RX and a TX ring, or at
least one of them, but no FILL or COMPLETION rings as the ones from
the first socket will be used. In the bind call, set he
XDP_SHARED_UMEM option and provide the initial socket's fd in the
sxdp_shared_umem_fd field. You can attach an arbitrary number of extra
sockets this way.
What socket will then a packet arrive on? This is decided by the XDP
program. Put all the sockets in the XSK_MAP and just indicate which
index in the array you would like to send each packet to. A simple
round-robin example of distributing packets is shown below:
.. code-block:: c
#include <linux/bpf.h>
#include "bpf_helpers.h"
#define MAX_SOCKS 16
struct {
__uint(type, BPF_MAP_TYPE_XSKMAP);
__uint(max_entries, MAX_SOCKS);
__uint(key_size, sizeof(int));
__uint(value_size, sizeof(int));
} xsks_map SEC(".maps");
static unsigned int rr;
SEC("xdp_sock") int xdp_sock_prog(struct xdp_md *ctx)
{
rr = (rr + 1) & (MAX_SOCKS - 1);
return bpf_redirect_map(&xsks_map, rr, 0);
}
Note, that since there is only a single set of FILL and COMPLETION
rings, and they are single producer, single consumer rings, you need
to make sure that multiple processes or threads do not use these rings
concurrently. There are no synchronization primitives in the
libbpf code that protects multiple users at this point in time.
XDP_USE_NEED_WAKEUP bind flag
-----------------------------
This option adds support for a new flag called need_wakeup that is
present in the FILL ring and the TX ring, the rings for which user
space is a producer. When this option is set in the bind call, the
need_wakeup flag will be set if the kernel needs to be explicitly
woken up by a syscall to continue processing packets. If the flag is
zero, no syscall is needed.
If the flag is set on the FILL ring, the application needs to call
poll() to be able to continue to receive packets on the RX ring. This
can happen, for example, when the kernel has detected that there are no
more buffers on the FILL ring and no buffers left on the RX HW ring of
the NIC. In this case, interrupts are turned off as the NIC cannot
receive any packets (as there are no buffers to put them in), and the
need_wakeup flag is set so that user space can put buffers on the
FILL ring and then call poll() so that the kernel driver can put these
buffers on the HW ring and start to receive packets.
If the flag is set for the TX ring, it means that the application
needs to explicitly notify the kernel to send any packets put on the
TX ring. This can be accomplished either by a poll() call, as in the
RX path, or by calling sendto().
An example of how to use this flag can be found in
samples/bpf/xdpsock_user.c. An example with the use of libbpf helpers
would look like this for the TX path:
.. code-block:: c
if (xsk_ring_prod__needs_wakeup(&my_tx_ring))
sendto(xsk_socket__fd(xsk_handle), NULL, 0, MSG_DONTWAIT, NULL, 0);
I.e., only use the syscall if the flag is set.
We recommend that you always enable this mode as it usually leads to
better performance especially if you run the application and the
driver on the same core, but also if you use different cores for the
application and the kernel driver, as it reduces the number of
syscalls needed for the TX path.
XDP_{RX|TX|UMEM_FILL|UMEM_COMPLETION}_RING setsockopts
------------------------------------------------------
These setsockopts sets the number of descriptors that the RX, TX,
FILL, and COMPLETION rings respectively should have. It is mandatory
to set the size of at least one of the RX and TX rings. If you set
both, you will be able to both receive and send traffic from your
application, but if you only want to do one of them, you can save
resources by only setting up one of them. Both the FILL ring and the
COMPLETION ring are mandatory if you have a UMEM tied to your socket,
which is the normal case. But if the XDP_SHARED_UMEM flag is used, any
socket after the first one does not have a UMEM and should in that
case not have any FILL or COMPLETION rings created.
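As a sketch (not part of the patch), sizing the RX and FILL rings directly with setsockopt() could look like this; the descriptor counts must be powers of two as noted earlier:

.. code-block:: c

	/* Sketch, not part of the patch: size the RX and FILL rings.
	 * SOL_XDP (283) is provided by recent libc socket headers. */
	#include <linux/if_xdp.h>
	#include <sys/socket.h>

	static int size_rings(int xsk_fd, int rx_descs, int fill_descs)
	{
		if (setsockopt(xsk_fd, SOL_XDP, XDP_RX_RING,
			       &rx_descs, sizeof(rx_descs)))
			return -1;
		return setsockopt(xsk_fd, SOL_XDP, XDP_UMEM_FILL_RING,
				  &fill_descs, sizeof(fill_descs));
	}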
XDP_UMEM_REG setsockopt
-----------------------
This setsockopt registers a UMEM to a socket. This is the area that
contains all the buffers that packets can reside in. The call takes a
pointer to the beginning of this area and the size of it. Moreover, it
also has a parameter called chunk_size that is the size that the UMEM is
divided into. It can only be 2K or 4K at the moment. If you have a
UMEM area that is 128K and a chunk size of 2K, this means that you
will be able to hold a maximum of 128K / 2K = 64 packets in your UMEM
area and that your largest packet size can be 2K.
There is also an option to set the headroom of each single buffer in
the UMEM. If you set this to N bytes, it means that the packet will
start N bytes into the buffer leaving the first N bytes for the
application to use. The final option is the flags field, but it will
be dealt with in separate sections for each UMEM flag.
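A minimal sketch of registering a UMEM through the raw uapi is shown
below; the register_umem() helper, the anonymous mmap() area, and the
128K/2K values (matching the arithmetic above, 128K / 2K = 64 buffers)
are illustrative assumptions:

.. code-block:: c

    #include <sys/mman.h>
    #include <sys/socket.h>
    #include <linux/if_xdp.h>

    /* xsk_fd is an already created AF_XDP socket. */
    static int register_umem(int xsk_fd)
    {
        size_t size = 128 * 1024;
        struct xdp_umem_reg reg = { 0 };
        void *area;

        area = mmap(NULL, size, PROT_READ | PROT_WRITE,
                    MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (area == MAP_FAILED)
            return -1;

        reg.addr = (unsigned long)area;
        reg.len = size;
        reg.chunk_size = 2048;
        reg.headroom = 0;

        return setsockopt(xsk_fd, SOL_XDP, XDP_UMEM_REG, &reg, sizeof(reg));
    }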
SO_BINDTODEVICE setsockopt
--------------------------
This is a generic SOL_SOCKET option that can be used to tie an AF_XDP
socket to a particular network interface. It is useful when a socket
is created by a privileged process and passed to a non-privileged one.
Once the option is set, the kernel will refuse attempts to bind that
socket to a different interface. Updating the value requires CAP_NET_RAW.
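A short sketch of the privileged side; the tie_to_netdev() helper and
the "eth0" interface name are illustrative:

.. code-block:: c

    #include <string.h>
    #include <sys/socket.h>

    static int tie_to_netdev(int xsk_fd)
    {
        static const char ifname[] = "eth0"; /* example interface name */

        return setsockopt(xsk_fd, SOL_SOCKET, SO_BINDTODEVICE,
                          ifname, strlen(ifname));
    }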
XDP_STATISTICS getsockopt
-------------------------
Gets drop statistics of a socket that can be useful for debug
purposes. The supported statistics are shown below:
.. code-block:: c
struct xdp_statistics {
    __u64 rx_dropped; /* Dropped for reasons other than invalid desc */
    __u64 rx_invalid_descs; /* Dropped due to invalid descriptor */
    __u64 tx_invalid_descs; /* Dropped due to invalid descriptor */
};
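Reading these counters with getsockopt() might look like the sketch
below; xsk_fd and the print_xdp_stats() helper are illustrative:

.. code-block:: c

    #include <stdio.h>
    #include <sys/socket.h>
    #include <linux/if_xdp.h>

    /* Dump the drop counters of an AF_XDP socket given its fd. */
    static void print_xdp_stats(int xsk_fd)
    {
        struct xdp_statistics stats;
        socklen_t optlen = sizeof(stats);

        if (getsockopt(xsk_fd, SOL_XDP, XDP_STATISTICS, &stats, &optlen))
            return;

        printf("rx_dropped: %llu\n", (unsigned long long)stats.rx_dropped);
        printf("rx_invalid_descs: %llu\n", (unsigned long long)stats.rx_invalid_descs);
        printf("tx_invalid_descs: %llu\n", (unsigned long long)stats.tx_invalid_descs);
    }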
XDP_OPTIONS getsockopt
----------------------
Gets options from an XDP socket. The only one supported so far is
XDP_OPTIONS_ZEROCOPY which tells you if zero-copy is on or not.
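A minimal sketch of querying it; the is_zero_copy() helper and xsk_fd
are illustrative names:

.. code-block:: c

    #include <stdbool.h>
    #include <sys/socket.h>
    #include <linux/if_xdp.h>

    /* Report whether zero-copy ended up being used on this socket. */
    static bool is_zero_copy(int xsk_fd)
    {
        struct xdp_options opts;
        socklen_t optlen = sizeof(opts);

        if (getsockopt(xsk_fd, SOL_XDP, XDP_OPTIONS, &opts, &optlen))
            return false;

        return opts.flags & XDP_OPTIONS_ZEROCOPY;
    }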
Usage
=====
In order to use AF_XDP sockets two parts are needed. The
user-space application and the XDP program. For a complete setup and
usage example, please refer to the sample application. The user-space
side is xdpsock_user.c and the XDP side is part of libbpf.
The XDP code sample included in tools/lib/bpf/xsk.c is the following:
.. code-block:: c
SEC("xdp_sock") int xdp_sock_prog(struct xdp_md *ctx)
{
int index = ctx->rx_queue_index;
// A set entry here means that the correspnding queue_id
// A set entry here means that the corresponding queue_id
// has an active AF_XDP socket bound to it.
if (bpf_map_lookup_elem(&xsks_map, &index))
return bpf_redirect_map(&xsks_map, index, 0);
@ -238,7 +431,10 @@ The XDP code sample included in tools/lib/bpf/xsk.c is the following::
    return XDP_PASS;
}
A simple but not so performant ring dequeue and enqueue could look
like this:
.. code-block:: c
// struct xdp_rxtx_ring {
// __u32 *producer;
@ -287,17 +483,16 @@ Naive ring dequeue and enqueue could look like this::
    return 0;
}
For a more optimized version, please refer to the sample application.
But please use the libbpf functions as they are optimized and ready to
use. They will make your life easier.
Sample application
==================
There is a xdpsock benchmarking/test application included that
demonstrates how to use AF_XDP sockets with private UMEMs. Say that
you would like your UDP traffic from port 4242 to end up in queue 16,
that we will enable AF_XDP on. Here, we use ethtool for this::
ethtool -N p3p2 rx-flow-hash udp4 fn
ethtool -N p3p2 flow-type udp4 src-port 4242 dst-port 4242 \
@ -311,13 +506,18 @@ using::
For XDP_SKB mode, use the switch "-S" instead of "-N" and all options
can be displayed with "-h", as usual.
This sample application uses libbpf to make the setup and usage of
AF_XDP simpler. If you want to know how the raw uapi of AF_XDP is
really used to make something more advanced, take a look at the libbpf
code in tools/lib/bpf/xsk.[ch].
FAQ
=======
Q: I am not seeing any traffic on the socket. What am I doing wrong?
A: When a netdev of a physical NIC is initialized, Linux usually
allocates one RX and TX queue pair per core. So on an 8 core system,
queue ids 0 to 7 will be allocated, one per core. In the AF_XDP
bind call or the xsk_socket__create libbpf function call, you
specify a specific queue id to bind to and it is only the traffic
@ -343,9 +543,21 @@ A: When a netdev of a physical NIC is initialized, Linux usually
sudo ethtool -N <interface> flow-type udp4 src-port 4242 dst-port \
4242 action 2
A number of other ways are possible, all up to the capabilities of
the NIC you have.
Q: Can I use the XSKMAP to implement a switch between different UMEMs
in copy mode?
A: The short answer is no, that is not supported at the moment. The
XSKMAP can only be used to switch traffic coming in on queue id X
to sockets bound to the same queue id X. The XSKMAP can contain
sockets bound to different queue ids, for example X and Y, but only
traffic coming in from queue id Y can be directed to sockets bound
to the same queue id Y. In zero-copy mode, you should use the
switch, or other distribution mechanism, in your NIC to direct
traffic to the correct queue id and socket.
Credits
=======

View File

@ -1,7 +1,7 @@
# SPDX-License-Identifier: GPL-2.0
VERSION = 5
PATCHLEVEL = 4
SUBLEVEL = 249
SUBLEVEL = 254
EXTRAVERSION =
NAME = Kleptomaniac Octopus

View File

@ -271,6 +271,9 @@ config ARCH_HAS_UNCACHED_SEGMENT
select ARCH_HAS_DMA_PREP_COHERENT
bool
config ARCH_HAS_CPU_FINALIZE_INIT
bool
# Select if arch init_task must go in the __init_task_data section
config ARCH_TASK_STRUCT_ON_STACK
bool

View File

@ -1,20 +0,0 @@
/*
* include/asm-alpha/bugs.h
*
* Copyright (C) 1994 Linus Torvalds
*/
/*
* This is included by init/main.c to check for architecture-dependent bugs.
*
* Needs:
* void check_bugs(void);
*/
/*
* I don't know of any alpha bugs yet.. Nice chip
*/
static void check_bugs(void)
{
}

View File

@ -394,8 +394,7 @@ setup_memory(void *kernel_end)
extern void setup_memory(void *);
#endif /* !CONFIG_DISCONTIGMEM */
int __init
page_is_ram(unsigned long pfn)
int page_is_ram(unsigned long pfn)
{
struct memclust_struct * cluster;
struct memdesc_struct * memdesc;

View File

@ -8,6 +8,10 @@
#include <asm/dwarf.h>
#define ASM_NL ` /* use '`' to mark new line in macro */
#define __ALIGN .align 4
#define __ALIGN_STR __stringify(__ALIGN)
#ifdef __ASSEMBLY__
.macro ST2 e, o, off
@ -28,10 +32,6 @@
#endif
.endm
#define ASM_NL ` /* use '`' to mark new line in macro */
#define __ALIGN .align 4
#define __ALIGN_STR __stringify(__ALIGN)
/* annotation for data we want in DCCM - if enabled in .config */
.macro ARCFP_DATA nm
#ifdef CONFIG_ARC_HAS_DCCM

View File

@ -5,6 +5,7 @@ config ARM
select ARCH_32BIT_OFF_T
select ARCH_CLOCKSOURCE_DATA
select ARCH_HAS_BINFMT_FLAT
select ARCH_HAS_CPU_FINALIZE_INIT if MMU
select ARCH_HAS_DEBUG_VIRTUAL if MMU
select ARCH_HAS_DEVMEM_IS_ALLOWED
select ARCH_HAS_DMA_COHERENT_TO_PFN if SWIOTLB

View File

@ -511,7 +511,6 @@
"spi_lr_session_done",
"spi_lr_overread";
clocks = <&iprocmed>;
clock-names = "iprocmed";
num-cs = <2>;
#address-cells = <1>;
#size-cells = <0>;

View File

@ -59,7 +59,7 @@
interrupt-parent = <&avic>;
ranges;
L2: l2-cache@30000000 {
L2: cache-controller@30000000 {
compatible = "arm,l210-cache";
reg = <0x30000000 0x1000>;
cache-unified;

View File

@ -45,6 +45,10 @@
spi1 = &ecspi2;
spi2 = &ecspi3;
spi3 = &ecspi4;
usb0 = &usbotg;
usb1 = &usbh1;
usb2 = &usbh2;
usb3 = &usbh3;
usbphy0 = &usbphy1;
usbphy1 = &usbphy2;
};
@ -255,7 +259,7 @@
interrupt-parent = <&intc>;
};
L2: l2-cache@a02000 {
L2: cache-controller@a02000 {
compatible = "arm,pl310-cache";
reg = <0x00a02000 0x1000>;
interrupts = <0 92 IRQ_TYPE_LEVEL_HIGH>;

View File

@ -39,6 +39,9 @@
spi1 = &ecspi2;
spi2 = &ecspi3;
spi3 = &ecspi4;
usb0 = &usbotg1;
usb1 = &usbotg2;
usb2 = &usbh;
usbphy0 = &usbphy1;
usbphy1 = &usbphy2;
};
@ -136,7 +139,7 @@
interrupt-parent = <&intc>;
};
L2: l2-cache@a02000 {
L2: cache-controller@a02000 {
compatible = "arm,pl310-cache";
reg = <0x00a02000 0x1000>;
interrupts = <0 92 IRQ_TYPE_LEVEL_HIGH>;

View File

@ -36,6 +36,8 @@
spi1 = &ecspi2;
spi3 = &ecspi3;
spi4 = &ecspi4;
usb0 = &usbotg1;
usb1 = &usbotg2;
usbphy0 = &usbphy1;
usbphy1 = &usbphy2;
};
@ -49,20 +51,18 @@
device_type = "cpu";
reg = <0>;
next-level-cache = <&L2>;
operating-points = <
operating-points =
/* kHz uV */
996000 1275000
792000 1175000
396000 1075000
198000 975000
>;
fsl,soc-operating-points = <
<996000 1275000>,
<792000 1175000>,
<396000 1075000>,
<198000 975000>;
fsl,soc-operating-points =
/* ARM kHz SOC-PU uV */
996000 1175000
792000 1175000
396000 1175000
198000 1175000
>;
<996000 1175000>,
<792000 1175000>,
<396000 1175000>,
<198000 1175000>;
clock-latency = <61036>; /* two CLK32 periods */
#cooling-cells = <2>;
clocks = <&clks IMX6SLL_CLK_ARM>,
@ -137,7 +137,7 @@
interrupt-parent = <&intc>;
};
L2: l2-cache@a02000 {
L2: cache-controller@a02000 {
compatible = "arm,pl310-cache";
reg = <0x00a02000 0x1000>;
interrupts = <GIC_SPI 92 IRQ_TYPE_LEVEL_HIGH>;
@ -272,7 +272,7 @@
status = "disabled";
};
ssi1: ssi-controller@2028000 {
ssi1: ssi@2028000 {
compatible = "fsl,imx6sl-ssi", "fsl,imx51-ssi";
reg = <0x02028000 0x4000>;
interrupts = <GIC_SPI 46 IRQ_TYPE_LEVEL_HIGH>;
@ -285,7 +285,7 @@
status = "disabled";
};
ssi2: ssi-controller@202c000 {
ssi2: ssi@202c000 {
compatible = "fsl,imx6sl-ssi", "fsl,imx51-ssi";
reg = <0x0202c000 0x4000>;
interrupts = <GIC_SPI 47 IRQ_TYPE_LEVEL_HIGH>;
@ -298,7 +298,7 @@
status = "disabled";
};
ssi3: ssi-controller@2030000 {
ssi3: ssi@2030000 {
compatible = "fsl,imx6sl-ssi", "fsl,imx51-ssi";
reg = <0x02030000 0x4000>;
interrupts = <GIC_SPI 48 IRQ_TYPE_LEVEL_HIGH>;
@ -550,7 +550,7 @@
reg = <0x020ca000 0x1000>;
interrupts = <GIC_SPI 41 IRQ_TYPE_LEVEL_HIGH>;
clocks = <&clks IMX6SLL_CLK_USBPHY2>;
phy-reg_3p0-supply = <&reg_3p0>;
phy-3p0-supply = <&reg_3p0>;
fsl,anatop = <&anatop>;
};

View File

@ -49,6 +49,9 @@
spi2 = &ecspi3;
spi3 = &ecspi4;
spi4 = &ecspi5;
usb0 = &usbotg1;
usb1 = &usbotg2;
usb2 = &usbh;
usbphy0 = &usbphy1;
usbphy1 = &usbphy2;
};
@ -187,7 +190,7 @@
interrupt-parent = <&intc>;
};
L2: l2-cache@a02000 {
L2: cache-controller@a02000 {
compatible = "arm,pl310-cache";
reg = <0x00a02000 0x1000>;
interrupts = <GIC_SPI 92 IRQ_TYPE_LEVEL_HIGH>;

View File

@ -47,6 +47,8 @@
spi1 = &ecspi2;
spi2 = &ecspi3;
spi3 = &ecspi4;
usb0 = &usbotg1;
usb1 = &usbotg2;
usbphy0 = &usbphy1;
usbphy1 = &usbphy2;
};

View File

@ -7,6 +7,12 @@
#include <dt-bindings/reset/imx7-reset.h>
/ {
aliases {
usb0 = &usbotg1;
usb1 = &usbotg2;
usb2 = &usbh;
};
cpus {
cpu0: cpu@0 {
clock-frequency = <996000000>;

View File

@ -47,6 +47,8 @@
spi1 = &ecspi2;
spi2 = &ecspi3;
spi3 = &ecspi4;
usb0 = &usbotg1;
usb1 = &usbh;
};
cpus {

View File

@ -5,9 +5,11 @@
#include "omap3-gta04a5.dts"
&omap3_pmx_core {
/ {
model = "Goldelico GTA04A5/Letux 2804 with OneNAND";
};
&omap3_pmx_core {
gpmc_pins: pinmux_gpmc_pins {
pinctrl-single,pins = <

View File

@ -1,7 +1,5 @@
/* SPDX-License-Identifier: GPL-2.0-only */
/*
* arch/arm/include/asm/bugs.h
*
* Copyright (C) 1995-2003 Russell King
*/
#ifndef __ASM_BUGS_H
@ -10,10 +8,8 @@
extern void check_writebuffer_bugs(void);
#ifdef CONFIG_MMU
extern void check_bugs(void);
extern void check_other_bugs(void);
#else
#define check_bugs() do { } while (0)
#define check_other_bugs() do { } while (0)
#endif

View File

@ -1,5 +1,6 @@
// SPDX-License-Identifier: GPL-2.0
#include <linux/init.h>
#include <linux/cpu.h>
#include <asm/bugs.h>
#include <asm/proc-fns.h>
@ -11,7 +12,7 @@ void check_other_bugs(void)
#endif
}
void __init check_bugs(void)
void __init arch_cpu_finalize_init(void)
{
check_writebuffer_bugs();
check_other_bugs();

View File

@ -9,6 +9,7 @@
#include <linux/io.h>
#include <asm/mach/time.h>
#include "soc.h"
#include "platform.h"
/*************************************************************************
* Timer handling for EP93xx
@ -60,7 +61,7 @@ static u64 notrace ep93xx_read_sched_clock(void)
return ret;
}
u64 ep93xx_clocksource_read(struct clocksource *c)
static u64 ep93xx_clocksource_read(struct clocksource *c)
{
u64 ret;

View File

@ -63,6 +63,9 @@ static void __init orion5x_dt_init(void)
if (of_machine_is_compatible("maxtor,shared-storage-2"))
mss2_init();
if (of_machine_is_compatible("lacie,d2-network"))
d2net_init();
of_platform_default_populate(NULL, orion5x_auxdata_lookup, NULL);
}

View File

@ -75,6 +75,12 @@ extern void mss2_init(void);
static inline void mss2_init(void) {}
#endif
#ifdef CONFIG_MACH_D2NET_DT
void d2net_init(void);
#else
static inline void d2net_init(void) {}
#endif
/*****************************************************************************
* Helpers to access Orion registers
****************************************************************************/

View File

@ -40,7 +40,7 @@ enum probes_insn checker_stack_use_imm_0xx(probes_opcode_t insn,
* Different from other insn uses imm8, the real addressing offset of
* STRD in T32 encoding should be imm8 * 4. See ARMARM description.
*/
enum probes_insn checker_stack_use_t32strd(probes_opcode_t insn,
static enum probes_insn checker_stack_use_t32strd(probes_opcode_t insn,
struct arch_probes_insn *asi,
const struct decode_header *h)
{

View File

@ -231,7 +231,7 @@ singlestep(struct kprobe *p, struct pt_regs *regs, struct kprobe_ctlblk *kcb)
* kprobe, and that level is reserved for user kprobe handlers, so we can't
* risk encountering a new kprobe in an interrupt handler.
*/
void __kprobes kprobe_handler(struct pt_regs *regs)
static void __kprobes kprobe_handler(struct pt_regs *regs)
{
struct kprobe *p, *cur;
struct kprobe_ctlblk *kcb;

View File

@ -145,8 +145,6 @@ __arch_remove_optimized_kprobe(struct optimized_kprobe *op, int dirty)
}
}
extern void kprobe_handler(struct pt_regs *regs);
static void
optimized_callback(struct optimized_kprobe *op, struct pt_regs *regs)
{

View File

@ -720,7 +720,7 @@ static const char coverage_register_lookup[16] = {
[REG_TYPE_NOSPPCX] = COVERAGE_ANY_REG | COVERAGE_SP,
};
unsigned coverage_start_registers(const struct decode_header *h)
static unsigned coverage_start_registers(const struct decode_header *h)
{
unsigned regs = 0;
int i;

View File

@ -454,3 +454,7 @@ void kprobe_thumb32_test_cases(void);
#else
void kprobe_arm_test_cases(void);
#endif
void __kprobes_test_case_start(void);
void __kprobes_test_case_end_16(void);
void __kprobes_test_case_end_32(void);

View File

@ -129,7 +129,7 @@
status = "okay";
clock-frequency = <100000>;
i2c-sda-falling-time-ns = <890>; /* hcnt */
i2c-sdl-falling-time-ns = <890>; /* lcnt */
i2c-scl-falling-time-ns = <890>; /* lcnt */
adc@14 {
compatible = "lltc,ltc2497";

View File

@ -1451,7 +1451,7 @@
};
};
camss: camss@1b00000 {
camss: camss@1b0ac00 {
compatible = "qcom,msm8916-camss";
reg = <0x1b0ac00 0x200>,
<0x1b00030 0x4>,

View File

@ -269,7 +269,7 @@
};
scif1_pins: scif1 {
groups = "scif1_data_b", "scif1_ctrl";
groups = "scif1_data_b";
function = "scif1";
};
@ -329,7 +329,6 @@
&scif1 {
pinctrl-0 = <&scif1_pins>;
pinctrl-names = "default";
uart-has-rtscts;
status = "okay";
};

View File

@ -41,7 +41,7 @@
(((midr) & MIDR_IMPLEMENTOR_MASK) >> MIDR_IMPLEMENTOR_SHIFT)
#define MIDR_CPU_MODEL(imp, partnum) \
(((imp) << MIDR_IMPLEMENTOR_SHIFT) | \
((_AT(u32, imp) << MIDR_IMPLEMENTOR_SHIFT) | \
(0xf << MIDR_ARCHITECTURE_SHIFT) | \
((partnum) << MIDR_PARTNUM_SHIFT))
@ -59,6 +59,7 @@
#define ARM_CPU_IMP_NVIDIA 0x4E
#define ARM_CPU_IMP_FUJITSU 0x46
#define ARM_CPU_IMP_HISI 0x48
#define ARM_CPU_IMP_AMPERE 0xC0
#define ARM_CPU_PART_AEM_V8 0xD0F
#define ARM_CPU_PART_FOUNDATION 0xD00
@ -102,6 +103,8 @@
#define HISI_CPU_PART_TSV110 0xD01
#define AMPERE_CPU_PART_AMPERE1 0xAC3
#define MIDR_CORTEX_A53 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A53)
#define MIDR_CORTEX_A57 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A57)
#define MIDR_CORTEX_A72 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A72)
@ -133,6 +136,7 @@
#define MIDR_NVIDIA_CARMEL MIDR_CPU_MODEL(ARM_CPU_IMP_NVIDIA, NVIDIA_CPU_PART_CARMEL)
#define MIDR_FUJITSU_A64FX MIDR_CPU_MODEL(ARM_CPU_IMP_FUJITSU, FUJITSU_CPU_PART_A64FX)
#define MIDR_HISI_TSV110 MIDR_CPU_MODEL(ARM_CPU_IMP_HISI, HISI_CPU_PART_TSV110)
#define MIDR_AMPERE1 MIDR_CPU_MODEL(ARM_CPU_IMP_AMPERE, AMPERE_CPU_PART_AMPERE1)
/* Fujitsu Erratum 010001 affects A64FX 1.0 and 1.1, (v0r0 and v1r0) */
#define MIDR_FUJITSU_ERRATUM_010001 MIDR_FUJITSU_A64FX

View File

@ -25,7 +25,7 @@ int efi_set_mapping_permissions(struct mm_struct *mm, efi_memory_desc_t *md);
({ \
efi_virtmap_load(); \
__efi_fpsimd_begin(); \
spin_lock(&efi_rt_lock); \
raw_spin_lock(&efi_rt_lock); \
})
#define arch_efi_call_virt(p, f, args...) \
@ -37,12 +37,12 @@ int efi_set_mapping_permissions(struct mm_struct *mm, efi_memory_desc_t *md);
#define arch_efi_call_virt_teardown() \
({ \
spin_unlock(&efi_rt_lock); \
raw_spin_unlock(&efi_rt_lock); \
__efi_fpsimd_end(); \
efi_virtmap_unload(); \
})
extern spinlock_t efi_rt_lock;
extern raw_spinlock_t efi_rt_lock;
efi_status_t __efi_rt_asm_wrapper(void *, const char *, ...);
#define ARCH_EFI_IRQ_FLAGS_MASK (PSR_D_BIT | PSR_A_BIT | PSR_I_BIT | PSR_F_BIT)

View File

@ -1145,6 +1145,10 @@ u8 spectre_bhb_loop_affected(int scope)
MIDR_ALL_VERSIONS(MIDR_NEOVERSE_N1),
{},
};
static const struct midr_range spectre_bhb_k11_list[] = {
MIDR_ALL_VERSIONS(MIDR_AMPERE1),
{},
};
static const struct midr_range spectre_bhb_k8_list[] = {
MIDR_ALL_VERSIONS(MIDR_CORTEX_A72),
MIDR_ALL_VERSIONS(MIDR_CORTEX_A57),
@ -1155,6 +1159,8 @@ u8 spectre_bhb_loop_affected(int scope)
k = 32;
else if (is_midr_in_range_list(read_cpuid_id(), spectre_bhb_k24_list))
k = 24;
else if (is_midr_in_range_list(read_cpuid_id(), spectre_bhb_k11_list))
k = 11;
else if (is_midr_in_range_list(read_cpuid_id(), spectre_bhb_k8_list))
k = 8;

View File

@ -144,7 +144,7 @@ asmlinkage efi_status_t efi_handle_corrupted_x18(efi_status_t s, const char *f)
return s;
}
DEFINE_SPINLOCK(efi_rt_lock);
DEFINE_RAW_SPINLOCK(efi_rt_lock);
asmlinkage u64 *efi_rt_stack_top __ro_after_init;

View File

@ -407,7 +407,7 @@ void create_pgtable_mapping(phys_addr_t start, phys_addr_t end)
static void __init create_mapping_noalloc(phys_addr_t phys, unsigned long virt,
phys_addr_t size, pgprot_t prot)
{
if ((virt >= PAGE_END) && (virt < VMALLOC_START)) {
if (virt < PAGE_OFFSET) {
pr_warn("BUG: not creating mapping for %pa at 0x%016lx - outside kernel range\n",
&phys, virt);
return;
@ -434,7 +434,7 @@ void __init create_pgd_mapping(struct mm_struct *mm, phys_addr_t phys,
static void update_mapping_prot(phys_addr_t phys, unsigned long virt,
phys_addr_t size, pgprot_t prot)
{
if ((virt >= PAGE_END) && (virt < VMALLOC_START)) {
if (virt < PAGE_OFFSET) {
pr_warn("BUG: not updating mapping for %pa at 0x%016lx - outside kernel range\n",
&phys, virt);
return;

View File

@ -8,6 +8,7 @@ menu "Processor type and features"
config IA64
bool
select ARCH_HAS_CPU_FINALIZE_INIT
select ARCH_MIGHT_HAVE_PC_PARPORT
select ARCH_MIGHT_HAVE_PC_SERIO
select ACPI

View File

@ -1,20 +0,0 @@
/* SPDX-License-Identifier: GPL-2.0 */
/*
* This is included by init/main.c to check for architecture-dependent bugs.
*
* Needs:
* void check_bugs(void);
*
* Based on <asm-alpha/bugs.h>.
*
* Modified 1998, 1999, 2003
* David Mosberger-Tang <davidm@hpl.hp.com>, Hewlett-Packard Co.
*/
#ifndef _ASM_IA64_BUGS_H
#define _ASM_IA64_BUGS_H
#include <asm/processor.h>
extern void check_bugs (void);
#endif /* _ASM_IA64_BUGS_H */

View File

@ -1073,8 +1073,7 @@ cpu_init (void)
}
}
void __init
check_bugs (void)
void __init arch_cpu_finalize_init(void)
{
ia64_patch_mckinley_e9((unsigned long) __start___mckinley_e9_bundles,
(unsigned long) __end___mckinley_e9_bundles);

View File

@ -4,6 +4,7 @@ config M68K
default y
select ARCH_32BIT_OFF_T
select ARCH_HAS_BINFMT_FLAT
select ARCH_HAS_CPU_FINALIZE_INIT if MMU
select ARCH_HAS_DMA_PREP_COHERENT if HAS_DMA && MMU && !COLDFIRE
select ARCH_HAS_SYNC_DMA_FOR_DEVICE if HAS_DMA
select ARCH_MIGHT_HAVE_PC_PARPORT if ISA

View File

@ -1,21 +0,0 @@
/* SPDX-License-Identifier: GPL-2.0 */
/*
* include/asm-m68k/bugs.h
*
* Copyright (C) 1994 Linus Torvalds
*/
/*
* This is included by init/main.c to check for architecture-dependent bugs.
*
* Needs:
* void check_bugs(void);
*/
#ifdef CONFIG_MMU
extern void check_bugs(void); /* in arch/m68k/kernel/setup.c */
#else
static void check_bugs(void)
{
}
#endif

View File

@ -10,6 +10,7 @@
*/
#include <linux/kernel.h>
#include <linux/cpu.h>
#include <linux/mm.h>
#include <linux/sched.h>
#include <linux/delay.h>
@ -527,7 +528,7 @@ static int __init proc_hardware_init(void)
module_init(proc_hardware_init);
#endif
void check_bugs(void)
void __init arch_cpu_finalize_init(void)
{
#if defined(CONFIG_FPU) && !defined(CONFIG_M68KFPU_EMU)
if (m68k_fputype == 0) {

View File

@ -5,6 +5,7 @@ config MIPS
select ARCH_32BIT_OFF_T if !64BIT
select ARCH_BINFMT_ELF_STATE if MIPS_FP_SUPPORT
select ARCH_CLOCKSOURCE_DATA
select ARCH_HAS_CPU_FINALIZE_INIT
select ARCH_HAS_TICK_BROADCAST if GENERIC_CLOCKEVENTS_BROADCAST
select ARCH_HAS_UBSAN_SANITIZE_ALL
select ARCH_SUPPORTS_UPROBES

View File

@ -1,17 +1,11 @@
/* SPDX-License-Identifier: GPL-2.0 */
/*
* This is included by init/main.c to check for architecture-dependent bugs.
*
* Copyright (C) 2007 Maciej W. Rozycki
*
* Needs:
* void check_bugs(void);
*/
#ifndef _ASM_BUGS_H
#define _ASM_BUGS_H
#include <linux/bug.h>
#include <linux/delay.h>
#include <linux/smp.h>
#include <asm/cpu.h>
@ -31,17 +25,6 @@ static inline void check_bugs_early(void)
#endif
}
static inline void check_bugs(void)
{
unsigned int cpu = smp_processor_id();
cpu_data[cpu].udelay_val = loops_per_jiffy;
check_bugs32();
#ifdef CONFIG_64BIT
check_bugs64();
#endif
}
static inline int r4k_daddiu_bug(void)
{
#ifdef CONFIG_64BIT

View File

@ -11,6 +11,8 @@
* Copyright (C) 2000, 2001, 2002, 2007 Maciej W. Rozycki
*/
#include <linux/init.h>
#include <linux/cpu.h>
#include <linux/delay.h>
#include <linux/ioport.h>
#include <linux/export.h>
#include <linux/screen_info.h>
@ -812,3 +814,14 @@ static int __init setnocoherentio(char *str)
}
early_param("nocoherentio", setnocoherentio);
#endif
void __init arch_cpu_finalize_init(void)
{
unsigned int cpu = smp_processor_id();
cpu_data[cpu].udelay_val = loops_per_jiffy;
check_bugs32();
if (IS_ENABLED(CONFIG_CPU_R4X00_BUGS64))
check_bugs64();
}

View File

@ -1,20 +0,0 @@
/* SPDX-License-Identifier: GPL-2.0 */
/*
* include/asm-parisc/bugs.h
*
* Copyright (C) 1999 Mike Shaver
*/
/*
* This is included by init/main.c to check for architecture-dependent bugs.
*
* Needs:
* void check_bugs(void);
*/
#include <asm/processor.h>
static inline void check_bugs(void)
{
// identify_cpu(&boot_cpu_data);
}

View File

@ -234,7 +234,7 @@ config PPC_EARLY_DEBUG_40x
config PPC_EARLY_DEBUG_CPM
bool "Early serial debugging for Freescale CPM-based serial ports"
depends on SERIAL_CPM
depends on SERIAL_CPM=y
help
Select this to enable early debugging for Freescale chips
using a CPM-based serial port. This assumes that the bootwrapper

View File

@ -425,3 +425,11 @@ checkbin:
echo -n '*** Please use a different binutils version.' ; \
false ; \
fi
@if test "x${CONFIG_FTRACE_MCOUNT_USE_RECORDMCOUNT}" = "xy" -a \
"x${CONFIG_LD_IS_BFD}" = "xy" -a \
"${CONFIG_LD_VERSION}" = "23700" ; then \
echo -n '*** binutils 2.37 drops unused section symbols, which recordmcount ' ; \
echo 'is unable to handle.' ; \
echo '*** Please use a different binutils version.' ; \
false ; \
fi

View File

@ -1,15 +0,0 @@
/* SPDX-License-Identifier: GPL-2.0-or-later */
#ifndef _ASM_POWERPC_BUGS_H
#define _ASM_POWERPC_BUGS_H
/*
*/
/*
* This file is included by 'init/main.c' to check for
* architecture-dependent bugs.
*/
static inline void check_bugs(void) { }
#endif /* _ASM_POWERPC_BUGS_H */

View File

@ -34,7 +34,7 @@ static inline long find_zero(unsigned long mask)
return leading_zero_bits >> 3;
}
static inline bool has_zero(unsigned long val, unsigned long *data, const struct word_at_a_time *c)
static inline unsigned long has_zero(unsigned long val, unsigned long *data, const struct word_at_a_time *c)
{
unsigned long rhs = val | c->low_bits;
*data = rhs;

View File

@ -178,7 +178,7 @@ static bool altmap_cross_boundary(struct vmem_altmap *altmap, unsigned long star
unsigned long nr_pfn = page_size / sizeof(struct page);
unsigned long start_pfn = page_to_pfn((struct page *)start);
if ((start_pfn + nr_pfn) > altmap->end_pfn)
if ((start_pfn + nr_pfn - 1) > altmap->end_pfn)
return true;
if (start_pfn < altmap->base_pfn)
@ -279,8 +279,7 @@ void __ref vmemmap_free(unsigned long start, unsigned long end,
start = _ALIGN_DOWN(start, page_size);
if (altmap) {
alt_start = altmap->base_pfn;
alt_end = altmap->base_pfn + altmap->reserve +
altmap->free + altmap->alloc + altmap->align;
alt_end = altmap->base_pfn + altmap->reserve + altmap->free;
}
pr_debug("vmemmap_free %lx...%lx\n", start, end);

View File

@ -460,9 +460,9 @@ static int sthyi_update_cache(u64 *rc)
*
* Fills the destination with system information returned by the STHYI
* instruction. The data is generated by emulation or execution of STHYI,
* if available. The return value is the condition code that would be
* returned, the rc parameter is the return code which is passed in
* register R2 + 1.
* if available. The return value is either a negative error value or
* the condition code that would be returned, the rc parameter is the
* return code which is passed in register R2 + 1.
*/
int sthyi_fill(void *dst, u64 *rc)
{

View File

@ -360,8 +360,8 @@ static int handle_partial_execution(struct kvm_vcpu *vcpu)
*/
int handle_sthyi(struct kvm_vcpu *vcpu)
{
int reg1, reg2, r = 0;
u64 code, addr, cc = 0, rc = 0;
int reg1, reg2, cc = 0, r = 0;
u64 code, addr, rc = 0;
struct sthyi_sctns *sctns = NULL;
if (!test_kvm_facility(vcpu->kvm, 74))
@ -392,7 +392,10 @@ int handle_sthyi(struct kvm_vcpu *vcpu)
return -ENOMEM;
cc = sthyi_fill(sctns, &rc);
if (cc < 0) {
free_page((unsigned long)sctns);
return cc;
}
out:
if (!cc) {
r = write_guest(vcpu, addr, reg2, sctns, PAGE_SIZE);

View File

@ -1982,6 +1982,10 @@ static unsigned long kvm_s390_next_dirty_cmma(struct kvm_memslots *slots,
ms = slots->memslots + slotidx;
ofs = 0;
}
if (cur_gfn < ms->base_gfn)
ofs = 0;
ofs = find_next_bit(kvm_second_dirty_bitmap(ms), ms->npages, ofs);
while ((slotidx > 0) && (ofs >= ms->npages)) {
slotidx--;

View File

@ -168,7 +168,8 @@ static int setup_apcb00(struct kvm_vcpu *vcpu, unsigned long *apcb_s,
sizeof(struct kvm_s390_apcb0)))
return -EFAULT;
bitmap_and(apcb_s, apcb_s, apcb_h, sizeof(struct kvm_s390_apcb0));
bitmap_and(apcb_s, apcb_s, apcb_h,
BITS_PER_BYTE * sizeof(struct kvm_s390_apcb0));
return 0;
}
@ -190,7 +191,8 @@ static int setup_apcb11(struct kvm_vcpu *vcpu, unsigned long *apcb_s,
sizeof(struct kvm_s390_apcb1)))
return -EFAULT;
bitmap_and(apcb_s, apcb_s, apcb_h, sizeof(struct kvm_s390_apcb1));
bitmap_and(apcb_s, apcb_s, apcb_h,
BITS_PER_BYTE * sizeof(struct kvm_s390_apcb1));
return 0;
}

View File

@ -2,6 +2,7 @@
config SUPERH
def_bool y
select ARCH_HAS_BINFMT_FLAT if !MMU
select ARCH_HAS_CPU_FINALIZE_INIT
select ARCH_HAS_PTE_SPECIAL
select ARCH_HAS_TICK_BROADCAST if GENERIC_CLOCKEVENTS_BROADCAST
select ARCH_MIGHT_HAVE_PC_PARPORT

View File

@ -18,6 +18,18 @@
#include <cpu/dma-register.h>
#include <cpu/dma.h>
/*
* Some of the SoCs feature two DMAC modules. In such a case, the channels are
* distributed equally among them.
*/
#ifdef SH_DMAC_BASE1
#define SH_DMAC_NR_MD_CH (CONFIG_NR_ONCHIP_DMA_CHANNELS / 2)
#else
#define SH_DMAC_NR_MD_CH CONFIG_NR_ONCHIP_DMA_CHANNELS
#endif
#define SH_DMAC_CH_SZ 0x10
/*
* Define the default configuration for dual address memory-memory transfer.
* The 0x400 value represents auto-request, external->external.
@ -29,7 +41,7 @@ static unsigned long dma_find_base(unsigned int chan)
unsigned long base = SH_DMAC_BASE0;
#ifdef SH_DMAC_BASE1
if (chan >= 6)
if (chan >= SH_DMAC_NR_MD_CH)
base = SH_DMAC_BASE1;
#endif
@ -40,13 +52,13 @@ static unsigned long dma_base_addr(unsigned int chan)
{
unsigned long base = dma_find_base(chan);
/* Normalize offset calculation */
if (chan >= 9)
chan -= 6;
if (chan >= 4)
base += 0x10;
chan = (chan % SH_DMAC_NR_MD_CH) * SH_DMAC_CH_SZ;
return base + (chan * 0x10);
/* DMAOR is placed inside the channel register space. Step over it. */
if (chan >= DMAOR)
base += SH_DMAC_CH_SZ;
return base + chan;
}
#ifdef CONFIG_SH_DMA_IRQ_MULTI
@ -250,12 +262,11 @@ static int sh_dmac_get_dma_residue(struct dma_channel *chan)
#define NR_DMAOR 1
#endif
/*
* DMAOR bases are broken out amongst channel groups. DMAOR0 manages
* channels 0 - 5, DMAOR1 6 - 11 (optional).
*/
#define dmaor_read_reg(n) __raw_readw(dma_find_base((n)*6))
#define dmaor_write_reg(n, data) __raw_writew(data, dma_find_base(n)*6)
#define dmaor_read_reg(n) __raw_readw(dma_find_base((n) * \
SH_DMAC_NR_MD_CH) + DMAOR)
#define dmaor_write_reg(n, data) __raw_writew(data, \
dma_find_base((n) * \
SH_DMAC_NR_MD_CH) + DMAOR)
static inline int dmaor_reset(int no)
{

View File

@ -1,78 +0,0 @@
/* SPDX-License-Identifier: GPL-2.0 */
#ifndef __ASM_SH_BUGS_H
#define __ASM_SH_BUGS_H
/*
* This is included by init/main.c to check for architecture-dependent bugs.
*
* Needs:
* void check_bugs(void);
*/
/*
* I don't know of any Super-H bugs yet.
*/
#include <asm/processor.h>
extern void select_idle_routine(void);
static void __init check_bugs(void)
{
extern unsigned long loops_per_jiffy;
char *p = &init_utsname()->machine[2]; /* "sh" */
select_idle_routine();
current_cpu_data.loops_per_jiffy = loops_per_jiffy;
switch (current_cpu_data.family) {
case CPU_FAMILY_SH2:
*p++ = '2';
break;
case CPU_FAMILY_SH2A:
*p++ = '2';
*p++ = 'a';
break;
case CPU_FAMILY_SH3:
*p++ = '3';
break;
case CPU_FAMILY_SH4:
*p++ = '4';
break;
case CPU_FAMILY_SH4A:
*p++ = '4';
*p++ = 'a';
break;
case CPU_FAMILY_SH4AL_DSP:
*p++ = '4';
*p++ = 'a';
*p++ = 'l';
*p++ = '-';
*p++ = 'd';
*p++ = 's';
*p++ = 'p';
break;
case CPU_FAMILY_SH5:
*p++ = '6';
*p++ = '4';
break;
case CPU_FAMILY_UNKNOWN:
/*
* Specifically use CPU_FAMILY_UNKNOWN rather than
* default:, so we're able to have the compiler whine
* about unhandled enumerations.
*/
break;
}
printk("CPU: %s\n", get_cpu_subtype(&current_cpu_data));
#ifndef __LITTLE_ENDIAN__
/* 'eb' means 'Endian Big' */
*p++ = 'e';
*p++ = 'b';
#endif
*p = '\0';
}
#endif /* __ASM_SH_BUGS_H */

View File

@ -173,6 +173,8 @@ extern unsigned int instruction_size(unsigned int insn);
#define instruction_size(insn) (4)
#endif
void select_idle_routine(void);
#endif /* __ASSEMBLY__ */
#ifdef CONFIG_SUPERH32

View File

@ -21,7 +21,7 @@ static int __init scan_cache(unsigned long node, const char *uname,
if (!of_flat_dt_is_compatible(node, "jcore,cache"))
return 0;
j2_ccr_base = (u32 __iomem *)of_flat_dt_translate_address(node);
j2_ccr_base = ioremap(of_flat_dt_translate_address(node), 4);
return 1;
}

View File

@ -15,6 +15,7 @@
#include <linux/smp.h>
#include <linux/atomic.h>
#include <asm/pgalloc.h>
#include <asm/processor.h>
#include <asm/smp.h>
#include <asm/bl_bit.h>

View File

@ -43,6 +43,7 @@
#include <asm/smp.h>
#include <asm/mmu_context.h>
#include <asm/mmzone.h>
#include <asm/processor.h>
#include <asm/sparsemem.h>
/*
@ -362,3 +363,57 @@ int test_mode_pin(int pin)
{
return sh_mv.mv_mode_pins() & pin;
}
void __init arch_cpu_finalize_init(void)
{
char *p = &init_utsname()->machine[2]; /* "sh" */
select_idle_routine();
current_cpu_data.loops_per_jiffy = loops_per_jiffy;
switch (current_cpu_data.family) {
case CPU_FAMILY_SH2:
*p++ = '2';
break;
case CPU_FAMILY_SH2A:
*p++ = '2';
*p++ = 'a';
break;
case CPU_FAMILY_SH3:
*p++ = '3';
break;
case CPU_FAMILY_SH4:
*p++ = '4';
break;
case CPU_FAMILY_SH4A:
*p++ = '4';
*p++ = 'a';
break;
case CPU_FAMILY_SH4AL_DSP:
*p++ = '4';
*p++ = 'a';
*p++ = 'l';
*p++ = '-';
*p++ = 'd';
*p++ = 's';
*p++ = 'p';
break;
case CPU_FAMILY_UNKNOWN:
/*
* Specifically use CPU_FAMILY_UNKNOWN rather than
* default:, so we're able to have the compiler whine
* about unhandled enumerations.
*/
break;
}
pr_info("CPU: %s\n", get_cpu_subtype(&current_cpu_data));
#ifndef __LITTLE_ENDIAN__
/* 'eb' means 'Endian Big' */
*p++ = 'e';
*p++ = 'b';
#endif
*p = '\0';
}

View File

@ -52,6 +52,7 @@ config SPARC
config SPARC32
def_bool !64BIT
select ARCH_32BIT_OFF_T
select ARCH_HAS_CPU_FINALIZE_INIT if !SMP
select ARCH_HAS_SYNC_DMA_FOR_CPU
select GENERIC_ATOMIC64
select CLZ_TAB

View File

@ -1,18 +0,0 @@
/* SPDX-License-Identifier: GPL-2.0 */
/* include/asm/bugs.h: Sparc probes for various bugs.
*
* Copyright (C) 1996, 2007 David S. Miller (davem@davemloft.net)
*/
#ifdef CONFIG_SPARC32
#include <asm/cpudata.h>
#endif
extern unsigned long loops_per_jiffy;
static void __init check_bugs(void)
{
#if defined(CONFIG_SPARC32) && !defined(CONFIG_SMP)
cpu_data(0).udelay_val = loops_per_jiffy;
#endif
}

View File

@ -422,3 +422,10 @@ static int __init topology_init(void)
}
subsys_initcall(topology_init);
#if defined(CONFIG_SPARC32) && !defined(CONFIG_SMP)
void __init arch_cpu_finalize_init(void)
{
cpu_data(0).udelay_val = loops_per_jiffy;
}
#endif

View File

@ -5,6 +5,7 @@ menu "UML-specific options"
config UML
bool
default y
select ARCH_HAS_CPU_FINALIZE_INIT
select ARCH_HAS_KCOV
select ARCH_NO_PREEMPT
select HAVE_ARCH_AUDITSYSCALL

View File

@ -1,7 +0,0 @@
/* SPDX-License-Identifier: GPL-2.0 */
#ifndef __UM_BUGS_H
#define __UM_BUGS_H
void check_bugs(void);
#endif

View File

@ -3,6 +3,7 @@
* Copyright (C) 2000 - 2007 Jeff Dike (jdike@{addtoit,linux.intel}.com)
*/
#include <linux/cpu.h>
#include <linux/delay.h>
#include <linux/init.h>
#include <linux/mm.h>
@ -353,7 +354,7 @@ void __init setup_arch(char **cmdline_p)
setup_hostinfo(host_info, sizeof host_info);
}
void __init check_bugs(void)
void __init arch_cpu_finalize_init(void)
{
arch_check_bugs();
os_check_bugs();

View File

@ -60,6 +60,7 @@ config X86
select ARCH_CLOCKSOURCE_DATA
select ARCH_CLOCKSOURCE_INIT
select ARCH_HAS_ACPI_TABLE_UPGRADE if ACPI
select ARCH_HAS_CPU_FINALIZE_INIT
select ARCH_HAS_DEBUG_VIRTUAL
select ARCH_HAS_DEVMEM_IS_ALLOWED
select ARCH_HAS_ELF_RANDOMIZE
@ -2502,6 +2503,25 @@ config ARCH_ENABLE_SPLIT_PMD_PTLOCK
def_bool y
depends on X86_64 || X86_PAE
config GDS_FORCE_MITIGATION
bool "Force GDS Mitigation"
depends on CPU_SUP_INTEL
default n
help
Gather Data Sampling (GDS) is a hardware vulnerability which allows
unprivileged speculative access to data which was previously stored in
vector registers.
This option is equivalent to setting gather_data_sampling=force on the
command line. The microcode mitigation is used if present, otherwise
AVX is disabled as a mitigation. On affected systems that are missing
the microcode any userspace code that unconditionally uses AVX will
break with this option set.
Setting this option on systems not vulnerable to GDS has no effect.
If in doubt, say N.
config ARCH_ENABLE_HUGEPAGE_MIGRATION
def_bool y
depends on X86_64 && HUGETLB_PAGE && MIGRATION

View File

@ -222,8 +222,8 @@ static unsigned long vdso_addr(unsigned long start, unsigned len)
/* Round the lowest possible end address up to a PMD boundary. */
end = (start + len + PMD_SIZE - 1) & PMD_MASK;
if (end >= TASK_SIZE_MAX)
end = TASK_SIZE_MAX;
if (end >= DEFAULT_MAP_WINDOW)
end = DEFAULT_MAP_WINDOW;
end -= len;
if (end > start) {

View File

@ -4,8 +4,6 @@
#include <asm/processor.h>
extern void check_bugs(void);
#if defined(CONFIG_CPU_SUP_INTEL)
void check_mpx_erratum(struct cpuinfo_x86 *c);
#else

View File

@ -30,6 +30,8 @@ enum cpuid_leafs
CPUID_7_ECX,
CPUID_8000_0007_EBX,
CPUID_7_EDX,
CPUID_8000_001F_EAX,
CPUID_8000_0021_EAX,
};
#ifdef CONFIG_X86_FEATURE_NAMES
@ -88,8 +90,10 @@ extern const char * const x86_bug_flags[NBUGINTS*32];
CHECK_BIT_IN_MASK_WORD(REQUIRED_MASK, 16, feature_bit) || \
CHECK_BIT_IN_MASK_WORD(REQUIRED_MASK, 17, feature_bit) || \
CHECK_BIT_IN_MASK_WORD(REQUIRED_MASK, 18, feature_bit) || \
CHECK_BIT_IN_MASK_WORD(REQUIRED_MASK, 19, feature_bit) || \
CHECK_BIT_IN_MASK_WORD(REQUIRED_MASK, 20, feature_bit) || \
REQUIRED_MASK_CHECK || \
BUILD_BUG_ON_ZERO(NCAPINTS != 19))
BUILD_BUG_ON_ZERO(NCAPINTS != 21))
#define DISABLED_MASK_BIT_SET(feature_bit) \
( CHECK_BIT_IN_MASK_WORD(DISABLED_MASK, 0, feature_bit) || \
@ -111,8 +115,10 @@ extern const char * const x86_bug_flags[NBUGINTS*32];
CHECK_BIT_IN_MASK_WORD(DISABLED_MASK, 16, feature_bit) || \
CHECK_BIT_IN_MASK_WORD(DISABLED_MASK, 17, feature_bit) || \
CHECK_BIT_IN_MASK_WORD(DISABLED_MASK, 18, feature_bit) || \
CHECK_BIT_IN_MASK_WORD(DISABLED_MASK, 19, feature_bit) || \
CHECK_BIT_IN_MASK_WORD(DISABLED_MASK, 20, feature_bit) || \
DISABLED_MASK_CHECK || \
BUILD_BUG_ON_ZERO(NCAPINTS != 19))
BUILD_BUG_ON_ZERO(NCAPINTS != 21))
#define cpu_has(c, bit) \
(__builtin_constant_p(bit) && REQUIRED_MASK_BIT_SET(bit) ? 1 : \

View File

@ -13,8 +13,8 @@
/*
* Defines x86 CPU feature bits
*/
#define NCAPINTS 19 /* N 32-bit words worth of info */
#define NBUGINTS 1 /* N 32-bit bug flags */
#define NCAPINTS 21 /* N 32-bit words worth of info */
#define NBUGINTS 2 /* N 32-bit bug flags */
/*
* Note: If the comment begins with a quoted string, that string is used
@ -96,7 +96,7 @@
#define X86_FEATURE_SYSCALL32 ( 3*32+14) /* "" syscall in IA32 userspace */
#define X86_FEATURE_SYSENTER32 ( 3*32+15) /* "" sysenter in IA32 userspace */
#define X86_FEATURE_REP_GOOD ( 3*32+16) /* REP microcode works well */
#define X86_FEATURE_SME_COHERENT ( 3*32+17) /* "" AMD hardware-enforced cache coherency */
/* FREE! ( 3*32+17) */
#define X86_FEATURE_LFENCE_RDTSC ( 3*32+18) /* "" LFENCE synchronizes RDTSC */
#define X86_FEATURE_ACC_POWER ( 3*32+19) /* AMD Accumulated Power Mechanism */
#define X86_FEATURE_NOPL ( 3*32+20) /* The NOPL (0F 1F) instructions */
@ -201,7 +201,7 @@
#define X86_FEATURE_INVPCID_SINGLE ( 7*32+ 7) /* Effectively INVPCID && CR4.PCIDE=1 */
#define X86_FEATURE_HW_PSTATE ( 7*32+ 8) /* AMD HW-PState */
#define X86_FEATURE_PROC_FEEDBACK ( 7*32+ 9) /* AMD ProcFeedbackInterface */
#define X86_FEATURE_SME ( 7*32+10) /* AMD Secure Memory Encryption */
/* FREE! ( 7*32+10) */
#define X86_FEATURE_PTI ( 7*32+11) /* Kernel Page Table Isolation enabled */
#define X86_FEATURE_KERNEL_IBRS ( 7*32+12) /* "" Set/clear IBRS on kernel entry/exit */
#define X86_FEATURE_RSB_VMEXIT ( 7*32+13) /* "" Fill RSB on VM-Exit */
@ -211,7 +211,7 @@
#define X86_FEATURE_SSBD ( 7*32+17) /* Speculative Store Bypass Disable */
#define X86_FEATURE_MBA ( 7*32+18) /* Memory Bandwidth Allocation */
#define X86_FEATURE_RSB_CTXSW ( 7*32+19) /* "" Fill RSB on context switches */
#define X86_FEATURE_SEV ( 7*32+20) /* AMD Secure Encrypted Virtualization */
/* FREE! ( 7*32+20) */
#define X86_FEATURE_USE_IBPB ( 7*32+21) /* "" Indirect Branch Prediction Barrier enabled */
#define X86_FEATURE_USE_IBRS_FW ( 7*32+22) /* "" Use IBRS during runtime firmware calls */
#define X86_FEATURE_SPEC_STORE_BYPASS_DISABLE ( 7*32+23) /* "" Disable Speculative Store Bypass. */
@ -375,6 +375,13 @@
#define X86_FEATURE_ARCH_CAPABILITIES (18*32+29) /* IA32_ARCH_CAPABILITIES MSR (Intel) */
#define X86_FEATURE_SPEC_CTRL_SSBD (18*32+31) /* "" Speculative Store Bypass Disable */
/* AMD-defined memory encryption features, CPUID level 0x8000001f (EAX), word 19 */
#define X86_FEATURE_SME (19*32+ 0) /* AMD Secure Memory Encryption */
#define X86_FEATURE_SEV (19*32+ 1) /* AMD Secure Encrypted Virtualization */
#define X86_FEATURE_VM_PAGE_FLUSH (19*32+ 2) /* "" VM Page Flush MSR is supported */
#define X86_FEATURE_SEV_ES (19*32+ 3) /* AMD Secure Encrypted Virtualization - Encrypted State */
#define X86_FEATURE_SME_COHERENT (19*32+10) /* "" AMD hardware-enforced cache coherency */
/*
* BUG word(s)
*/
@ -415,5 +422,6 @@
#define X86_BUG_RETBLEED X86_BUG(26) /* CPU is affected by RETBleed */
#define X86_BUG_EIBRS_PBRSB X86_BUG(27) /* EIBRS is vulnerable to Post Barrier RSB Predictions */
#define X86_BUG_MMIO_UNKNOWN X86_BUG(28) /* CPU is too old and its MMIO Stale Data status is unknown */
#define X86_BUG_GDS X86_BUG(29) /* CPU is affected by Gather Data Sampling */
#endif /* _ASM_X86_CPUFEATURES_H */

View File

@ -84,6 +84,8 @@
#define DISABLED_MASK16 (DISABLE_PKU|DISABLE_OSPKE|DISABLE_LA57|DISABLE_UMIP)
#define DISABLED_MASK17 0
#define DISABLED_MASK18 0
#define DISABLED_MASK_CHECK BUILD_BUG_ON_ZERO(NCAPINTS != 19)
#define DISABLED_MASK19 0
#define DISABLED_MASK20 0
#define DISABLED_MASK_CHECK BUILD_BUG_ON_ZERO(NCAPINTS != 21)
#endif /* _ASM_X86_DISABLED_FEATURES_H */

View File

@ -41,7 +41,7 @@ extern int dump_fpu(struct pt_regs *ptregs, struct user_i387_struct *fpstate);
extern void fpu__init_cpu(void);
extern void fpu__init_system_xstate(void);
extern void fpu__init_cpu_xstate(void);
extern void fpu__init_system(struct cpuinfo_x86 *c);
extern void fpu__init_system(void);
extern void fpu__init_check_bugs(void);
extern void fpu__resume_cpu(void);
extern u64 fpu__get_supported_xfeatures_mask(void);

View File

@ -77,6 +77,8 @@ early_set_memory_decrypted(unsigned long vaddr, unsigned long size) { return 0;
static inline int __init
early_set_memory_encrypted(unsigned long vaddr, unsigned long size) { return 0; }
static inline void mem_encrypt_init(void) { }
#define __bss_decrypted
#endif /* CONFIG_AMD_MEM_ENCRYPT */

View File

@ -5,6 +5,7 @@
#include <asm/cpu.h>
#include <linux/earlycpio.h>
#include <linux/initrd.h>
#include <asm/microcode_amd.h>
struct ucode_patch {
struct list_head plist;

View File

@ -48,11 +48,13 @@ extern void __init load_ucode_amd_bsp(unsigned int family);
extern void load_ucode_amd_ap(unsigned int family);
extern int __init save_microcode_in_initrd_amd(unsigned int family);
void reload_ucode_amd(unsigned int cpu);
extern void amd_check_microcode(void);
#else
static inline void __init load_ucode_amd_bsp(unsigned int family) {}
static inline void load_ucode_amd_ap(unsigned int family) {}
static inline int __init
save_microcode_in_initrd_amd(unsigned int family) { return -EINVAL; }
static inline void reload_ucode_amd(unsigned int cpu) {}
static inline void amd_check_microcode(void) {}
#endif
#endif /* _ASM_X86_MICROCODE_AMD_H */

View File

@ -147,6 +147,15 @@
* Not susceptible to Post-Barrier
* Return Stack Buffer Predictions.
*/
#define ARCH_CAP_GDS_CTRL BIT(25) /*
* CPU is vulnerable to Gather
* Data Sampling (GDS) and
* has controls for mitigation.
*/
#define ARCH_CAP_GDS_NO BIT(26) /*
* CPU is not vulnerable to Gather
* Data Sampling (GDS).
*/
#define MSR_IA32_FLUSH_CMD 0x0000010b
#define L1D_FLUSH BIT(0) /*
@ -165,6 +174,8 @@
#define MSR_IA32_MCU_OPT_CTRL 0x00000123
#define RNGDS_MITG_DIS BIT(0)
#define FB_CLEAR_DIS BIT(3) /* CPU Fill buffer clear disable */
#define GDS_MITG_DIS BIT(4) /* Disable GDS mitigation */
#define GDS_MITG_LOCKED BIT(5) /* GDS mitigation locked */
#define MSR_IA32_SYSENTER_CS 0x00000174
#define MSR_IA32_SYSENTER_ESP 0x00000175
@ -462,6 +473,7 @@
#define MSR_AMD64_DE_CFG 0xc0011029
#define MSR_AMD64_DE_CFG_LFENCE_SERIALIZE_BIT 1
#define MSR_AMD64_DE_CFG_LFENCE_SERIALIZE BIT_ULL(MSR_AMD64_DE_CFG_LFENCE_SERIALIZE_BIT)
#define MSR_AMD64_DE_CFG_ZEN2_FP_BACKUP_FIX_BIT 9
#define MSR_AMD64_BU_CFG2 0xc001102a
#define MSR_AMD64_IBSFETCHCTL 0xc0011030
@ -483,6 +495,7 @@
#define MSR_AMD64_ICIBSEXTDCTL 0xc001103c
#define MSR_AMD64_IBSOPDATA4 0xc001103d
#define MSR_AMD64_IBS_REG_COUNT_MAX 8 /* includes MSR_AMD64_IBSBRTARGET */
#define MSR_AMD64_VM_PAGE_FLUSH 0xc001011e
#define MSR_AMD64_SEV 0xc0010131
#define MSR_AMD64_SEV_ENABLED_BIT 0
#define MSR_AMD64_SEV_ENABLED BIT_ULL(MSR_AMD64_SEV_ENABLED_BIT)

View File

@ -976,4 +976,6 @@ enum taa_mitigations {
TAA_MITIGATION_TSX_DISABLED,
};
extern bool gds_ucode_mitigated(void);
#endif /* _ASM_X86_PROCESSOR_H */

View File

@ -101,6 +101,8 @@
#define REQUIRED_MASK16 0
#define REQUIRED_MASK17 0
#define REQUIRED_MASK18 0
#define REQUIRED_MASK_CHECK BUILD_BUG_ON_ZERO(NCAPINTS != 19)
#define REQUIRED_MASK19 0
#define REQUIRED_MASK20 0
#define REQUIRED_MASK_CHECK BUILD_BUG_ON_ZERO(NCAPINTS != 21)
#endif /* _ASM_X86_REQUIRED_FEATURES_H */

View File

@ -26,11 +26,6 @@
#include "cpu.h"
static const int amd_erratum_383[];
static const int amd_erratum_400[];
static const int amd_erratum_1054[];
static bool cpu_has_amd_erratum(struct cpuinfo_x86 *cpu, const int *erratum);
/*
* nodes_per_socket: Stores the number of nodes per socket.
* Refer to Fam15h Models 00-0fh BKDG - CPUID Fn8000_001E_ECX
@ -38,6 +33,79 @@ static bool cpu_has_amd_erratum(struct cpuinfo_x86 *cpu, const int *erratum);
*/
static u32 nodes_per_socket = 1;
/*
* AMD errata checking
*
* Errata are defined as arrays of ints using the AMD_LEGACY_ERRATUM() or
* AMD_OSVW_ERRATUM() macros. The latter is intended for newer errata that
* have an OSVW id assigned, which it takes as first argument. Both take a
* variable number of family-specific model-stepping ranges created by
* AMD_MODEL_RANGE().
*
* Example:
*
* const int amd_erratum_319[] =
* AMD_LEGACY_ERRATUM(AMD_MODEL_RANGE(0x10, 0x2, 0x1, 0x4, 0x2),
* AMD_MODEL_RANGE(0x10, 0x8, 0x0, 0x8, 0x0),
* AMD_MODEL_RANGE(0x10, 0x9, 0x0, 0x9, 0x0));
*/
#define AMD_LEGACY_ERRATUM(...) { -1, __VA_ARGS__, 0 }
#define AMD_OSVW_ERRATUM(osvw_id, ...) { osvw_id, __VA_ARGS__, 0 }
#define AMD_MODEL_RANGE(f, m_start, s_start, m_end, s_end) \
((f << 24) | (m_start << 16) | (s_start << 12) | (m_end << 4) | (s_end))
#define AMD_MODEL_RANGE_FAMILY(range) (((range) >> 24) & 0xff)
#define AMD_MODEL_RANGE_START(range) (((range) >> 12) & 0xfff)
#define AMD_MODEL_RANGE_END(range) ((range) & 0xfff)
static const int amd_erratum_400[] =
AMD_OSVW_ERRATUM(1, AMD_MODEL_RANGE(0xf, 0x41, 0x2, 0xff, 0xf),
AMD_MODEL_RANGE(0x10, 0x2, 0x1, 0xff, 0xf));
static const int amd_erratum_383[] =
AMD_OSVW_ERRATUM(3, AMD_MODEL_RANGE(0x10, 0, 0, 0xff, 0xf));
/* #1054: Instructions Retired Performance Counter May Be Inaccurate */
static const int amd_erratum_1054[] =
AMD_LEGACY_ERRATUM(AMD_MODEL_RANGE(0x17, 0, 0, 0x2f, 0xf));
static const int amd_zenbleed[] =
AMD_LEGACY_ERRATUM(AMD_MODEL_RANGE(0x17, 0x30, 0x0, 0x4f, 0xf),
AMD_MODEL_RANGE(0x17, 0x60, 0x0, 0x7f, 0xf),
AMD_MODEL_RANGE(0x17, 0x90, 0x0, 0x91, 0xf),
AMD_MODEL_RANGE(0x17, 0xa0, 0x0, 0xaf, 0xf));
static bool cpu_has_amd_erratum(struct cpuinfo_x86 *cpu, const int *erratum)
{
int osvw_id = *erratum++;
u32 range;
u32 ms;
if (osvw_id >= 0 && osvw_id < 65536 &&
cpu_has(cpu, X86_FEATURE_OSVW)) {
u64 osvw_len;
rdmsrl(MSR_AMD64_OSVW_ID_LENGTH, osvw_len);
if (osvw_id < osvw_len) {
u64 osvw_bits;
rdmsrl(MSR_AMD64_OSVW_STATUS + (osvw_id >> 6),
osvw_bits);
return osvw_bits & (1ULL << (osvw_id & 0x3f));
}
}
/* OSVW unavailable or ID unknown, match family-model-stepping range */
ms = (cpu->x86_model << 4) | cpu->x86_stepping;
while ((range = *erratum++))
if ((cpu->x86 == AMD_MODEL_RANGE_FAMILY(range)) &&
(ms >= AMD_MODEL_RANGE_START(range)) &&
(ms <= AMD_MODEL_RANGE_END(range)))
return true;
return false;
}
static inline int rdmsrl_amd_safe(unsigned msr, unsigned long long *p)
{
u32 gprs[8] = { 0 };
@ -596,7 +664,7 @@ static void early_detect_mem_encrypt(struct cpuinfo_x86 *c)
* If BIOS has not enabled SME then don't advertise the
* SME feature (set in scattered.c).
* For SEV: If BIOS has not enabled SEV then don't advertise the
* SEV feature (set in scattered.c).
* SEV and SEV_ES feature (set in scattered.c).
*
* In all cases, since support for SME and SEV requires long mode,
* don't advertise the feature under CONFIG_X86_32.
@ -627,6 +695,7 @@ clear_all:
setup_clear_cpu_cap(X86_FEATURE_SME);
clear_sev:
setup_clear_cpu_cap(X86_FEATURE_SEV);
setup_clear_cpu_cap(X86_FEATURE_SEV_ES);
}
}
@ -918,6 +987,47 @@ static void init_amd_zn(struct cpuinfo_x86 *c)
}
}
static bool cpu_has_zenbleed_microcode(void)
{
u32 good_rev = 0;
switch (boot_cpu_data.x86_model) {
case 0x30 ... 0x3f: good_rev = 0x0830107a; break;
case 0x60 ... 0x67: good_rev = 0x0860010b; break;
case 0x68 ... 0x6f: good_rev = 0x08608105; break;
case 0x70 ... 0x7f: good_rev = 0x08701032; break;
case 0xa0 ... 0xaf: good_rev = 0x08a00008; break;
default:
return false;
break;
}
if (boot_cpu_data.microcode < good_rev)
return false;
return true;
}
static void zenbleed_check(struct cpuinfo_x86 *c)
{
if (!cpu_has_amd_erratum(c, amd_zenbleed))
return;
if (cpu_has(c, X86_FEATURE_HYPERVISOR))
return;
if (!cpu_has(c, X86_FEATURE_AVX))
return;
if (!cpu_has_zenbleed_microcode()) {
pr_notice_once("Zenbleed: please update your microcode for the most optimal fix\n");
msr_set_bit(MSR_AMD64_DE_CFG, MSR_AMD64_DE_CFG_ZEN2_FP_BACKUP_FIX_BIT);
} else {
msr_clear_bit(MSR_AMD64_DE_CFG, MSR_AMD64_DE_CFG_ZEN2_FP_BACKUP_FIX_BIT);
}
}
static void init_amd(struct cpuinfo_x86 *c)
{
early_init_amd(c);
@ -1005,6 +1115,8 @@ static void init_amd(struct cpuinfo_x86 *c)
msr_set_bit(MSR_K7_HWCR, MSR_K7_HWCR_IRPERF_EN_BIT);
check_null_seg_clears_base(c);
zenbleed_check(c);
}
#ifdef CONFIG_X86_32
@ -1100,73 +1212,6 @@ static const struct cpu_dev amd_cpu_dev = {
cpu_dev_register(amd_cpu_dev);
/*
* AMD errata checking
*
* Errata are defined as arrays of ints using the AMD_LEGACY_ERRATUM() or
* AMD_OSVW_ERRATUM() macros. The latter is intended for newer errata that
* have an OSVW id assigned, which it takes as first argument. Both take a
* variable number of family-specific model-stepping ranges created by
* AMD_MODEL_RANGE().
*
* Example:
*
* const int amd_erratum_319[] =
* AMD_LEGACY_ERRATUM(AMD_MODEL_RANGE(0x10, 0x2, 0x1, 0x4, 0x2),
* AMD_MODEL_RANGE(0x10, 0x8, 0x0, 0x8, 0x0),
* AMD_MODEL_RANGE(0x10, 0x9, 0x0, 0x9, 0x0));
*/
#define AMD_LEGACY_ERRATUM(...) { -1, __VA_ARGS__, 0 }
#define AMD_OSVW_ERRATUM(osvw_id, ...) { osvw_id, __VA_ARGS__, 0 }
#define AMD_MODEL_RANGE(f, m_start, s_start, m_end, s_end) \
((f << 24) | (m_start << 16) | (s_start << 12) | (m_end << 4) | (s_end))
#define AMD_MODEL_RANGE_FAMILY(range) (((range) >> 24) & 0xff)
#define AMD_MODEL_RANGE_START(range) (((range) >> 12) & 0xfff)
#define AMD_MODEL_RANGE_END(range) ((range) & 0xfff)
static const int amd_erratum_400[] =
AMD_OSVW_ERRATUM(1, AMD_MODEL_RANGE(0xf, 0x41, 0x2, 0xff, 0xf),
AMD_MODEL_RANGE(0x10, 0x2, 0x1, 0xff, 0xf));
static const int amd_erratum_383[] =
AMD_OSVW_ERRATUM(3, AMD_MODEL_RANGE(0x10, 0, 0, 0xff, 0xf));
/* #1054: Instructions Retired Performance Counter May Be Inaccurate */
static const int amd_erratum_1054[] =
AMD_LEGACY_ERRATUM(AMD_MODEL_RANGE(0x17, 0, 0, 0x2f, 0xf));
static bool cpu_has_amd_erratum(struct cpuinfo_x86 *cpu, const int *erratum)
{
int osvw_id = *erratum++;
u32 range;
u32 ms;
if (osvw_id >= 0 && osvw_id < 65536 &&
cpu_has(cpu, X86_FEATURE_OSVW)) {
u64 osvw_len;
rdmsrl(MSR_AMD64_OSVW_ID_LENGTH, osvw_len);
if (osvw_id < osvw_len) {
u64 osvw_bits;
rdmsrl(MSR_AMD64_OSVW_STATUS + (osvw_id >> 6),
osvw_bits);
return osvw_bits & (1ULL << (osvw_id & 0x3f));
}
}
/* OSVW unavailable or ID unknown, match family-model-stepping range */
ms = (cpu->x86_model << 4) | cpu->x86_stepping;
while ((range = *erratum++))
if ((cpu->x86 == AMD_MODEL_RANGE_FAMILY(range)) &&
(ms >= AMD_MODEL_RANGE_START(range)) &&
(ms <= AMD_MODEL_RANGE_END(range)))
return true;
return false;
}
void set_dr_addr_mask(unsigned long mask, int dr)
{
if (!boot_cpu_has(X86_FEATURE_BPEXT))
@ -1185,3 +1230,15 @@ void set_dr_addr_mask(unsigned long mask, int dr)
break;
}
}
static void zenbleed_check_cpu(void *unused)
{
struct cpuinfo_x86 *c = &cpu_data(smp_processor_id());
zenbleed_check(c);
}
void amd_check_microcode(void)
{
on_each_cpu(zenbleed_check_cpu, NULL, 1);
}

View File

@ -9,7 +9,6 @@
* - Andrew D. Balsa (code cleanup).
*/
#include <linux/init.h>
#include <linux/utsname.h>
#include <linux/cpu.h>
#include <linux/module.h>
#include <linux/nospec.h>
@ -25,9 +24,7 @@
#include <asm/msr.h>
#include <asm/vmx.h>
#include <asm/paravirt.h>
#include <asm/alternative.h>
#include <asm/pgtable.h>
#include <asm/set_memory.h>
#include <asm/intel-family.h>
#include <asm/e820/api.h>
#include <asm/hypervisor.h>
@ -47,6 +44,7 @@ static void __init md_clear_select_mitigation(void);
static void __init taa_select_mitigation(void);
static void __init mmio_select_mitigation(void);
static void __init srbds_select_mitigation(void);
static void __init gds_select_mitigation(void);
/* The base value of the SPEC_CTRL MSR without task-specific bits set */
u64 x86_spec_ctrl_base;
@ -115,21 +113,8 @@ EXPORT_SYMBOL_GPL(mds_idle_clear);
DEFINE_STATIC_KEY_FALSE(mmio_stale_data_clear);
EXPORT_SYMBOL_GPL(mmio_stale_data_clear);
void __init check_bugs(void)
void __init cpu_select_mitigations(void)
{
identify_boot_cpu();
/*
* identify_boot_cpu() initialized SMT support information, let the
* core code know.
*/
cpu_smt_check_topology();
if (!IS_ENABLED(CONFIG_SMP)) {
pr_info("CPU: ");
print_cpu_info(&boot_cpu_data);
}
/*
* Read the SPEC_CTRL MSR to account for reserved bits which may
* have unknown values. AMD64_LS_CFG MSR is cached in the early AMD
@ -165,39 +150,7 @@ void __init check_bugs(void)
l1tf_select_mitigation();
md_clear_select_mitigation();
srbds_select_mitigation();
arch_smt_update();
#ifdef CONFIG_X86_32
/*
* Check whether we are able to run this kernel safely on SMP.
*
* - i386 is no longer supported.
* - In order to run on anything without a TSC, we need to be
* compiled for a i486.
*/
if (boot_cpu_data.x86 < 4)
panic("Kernel requires i486+ for 'invlpg' and other features");
init_utsname()->machine[1] =
'0' + (boot_cpu_data.x86 > 6 ? 6 : boot_cpu_data.x86);
alternative_instructions();
fpu__init_check_bugs();
#else /* CONFIG_X86_64 */
alternative_instructions();
/*
* Make sure the first 2MB area is not mapped by huge pages
* There are typically fixed size MTRRs in there and overlapping
* MTRRs into large pages causes slow downs.
*
* Right now we don't do that with gbpages because there seems
* very little benefit for that case.
*/
if (!direct_gbpages)
set_memory_4k((unsigned long)__va(0), 1);
#endif
gds_select_mitigation();
}
/*
@ -648,6 +601,149 @@ static int __init srbds_parse_cmdline(char *str)
}
early_param("srbds", srbds_parse_cmdline);
#undef pr_fmt
#define pr_fmt(fmt) "GDS: " fmt
enum gds_mitigations {
GDS_MITIGATION_OFF,
GDS_MITIGATION_UCODE_NEEDED,
GDS_MITIGATION_FORCE,
GDS_MITIGATION_FULL,
GDS_MITIGATION_FULL_LOCKED,
GDS_MITIGATION_HYPERVISOR,
};
#if IS_ENABLED(CONFIG_GDS_FORCE_MITIGATION)
static enum gds_mitigations gds_mitigation __ro_after_init = GDS_MITIGATION_FORCE;
#else
static enum gds_mitigations gds_mitigation __ro_after_init = GDS_MITIGATION_FULL;
#endif
static const char * const gds_strings[] = {
[GDS_MITIGATION_OFF] = "Vulnerable",
[GDS_MITIGATION_UCODE_NEEDED] = "Vulnerable: No microcode",
[GDS_MITIGATION_FORCE] = "Mitigation: AVX disabled, no microcode",
[GDS_MITIGATION_FULL] = "Mitigation: Microcode",
[GDS_MITIGATION_FULL_LOCKED] = "Mitigation: Microcode (locked)",
[GDS_MITIGATION_HYPERVISOR] = "Unknown: Dependent on hypervisor status",
};
bool gds_ucode_mitigated(void)
{
return (gds_mitigation == GDS_MITIGATION_FULL ||
gds_mitigation == GDS_MITIGATION_FULL_LOCKED);
}
EXPORT_SYMBOL_GPL(gds_ucode_mitigated);
void update_gds_msr(void)
{
u64 mcu_ctrl_after;
u64 mcu_ctrl;
switch (gds_mitigation) {
case GDS_MITIGATION_OFF:
rdmsrl(MSR_IA32_MCU_OPT_CTRL, mcu_ctrl);
mcu_ctrl |= GDS_MITG_DIS;
break;
case GDS_MITIGATION_FULL_LOCKED:
/*
* The LOCKED state comes from the boot CPU. APs might not have
* the same state. Make sure the mitigation is enabled on all
* CPUs.
*/
case GDS_MITIGATION_FULL:
rdmsrl(MSR_IA32_MCU_OPT_CTRL, mcu_ctrl);
mcu_ctrl &= ~GDS_MITG_DIS;
break;
case GDS_MITIGATION_FORCE:
case GDS_MITIGATION_UCODE_NEEDED:
case GDS_MITIGATION_HYPERVISOR:
return;
};
wrmsrl(MSR_IA32_MCU_OPT_CTRL, mcu_ctrl);
/*
* Check to make sure that the WRMSR value was not ignored. Writes to
* GDS_MITG_DIS will be ignored if this processor is locked but the boot
* processor was not.
*/
rdmsrl(MSR_IA32_MCU_OPT_CTRL, mcu_ctrl_after);
WARN_ON_ONCE(mcu_ctrl != mcu_ctrl_after);
}
static void __init gds_select_mitigation(void)
{
u64 mcu_ctrl;
if (!boot_cpu_has_bug(X86_BUG_GDS))
return;
if (boot_cpu_has(X86_FEATURE_HYPERVISOR)) {
gds_mitigation = GDS_MITIGATION_HYPERVISOR;
goto out;
}
if (cpu_mitigations_off())
gds_mitigation = GDS_MITIGATION_OFF;
/* Will verify below that mitigation _can_ be disabled */
/* No microcode */
if (!(x86_read_arch_cap_msr() & ARCH_CAP_GDS_CTRL)) {
if (gds_mitigation == GDS_MITIGATION_FORCE) {
/*
* This only needs to be done on the boot CPU so do it
* here rather than in update_gds_msr()
*/
setup_clear_cpu_cap(X86_FEATURE_AVX);
pr_warn("Microcode update needed! Disabling AVX as mitigation.\n");
} else {
gds_mitigation = GDS_MITIGATION_UCODE_NEEDED;
}
goto out;
}
/* Microcode has mitigation, use it */
if (gds_mitigation == GDS_MITIGATION_FORCE)
gds_mitigation = GDS_MITIGATION_FULL;
rdmsrl(MSR_IA32_MCU_OPT_CTRL, mcu_ctrl);
if (mcu_ctrl & GDS_MITG_LOCKED) {
if (gds_mitigation == GDS_MITIGATION_OFF)
pr_warn("Mitigation locked. Disable failed.\n");
/*
* The mitigation is selected from the boot CPU. All other CPUs
* _should_ have the same state. If the boot CPU isn't locked
* but others are, update_gds_msr() will WARN() about the state
* mismatch. If the boot CPU is locked, update_gds_msr() will
* ensure the other CPUs have the mitigation enabled.
*/
gds_mitigation = GDS_MITIGATION_FULL_LOCKED;
}
update_gds_msr();
out:
pr_info("%s\n", gds_strings[gds_mitigation]);
}
static int __init gds_parse_cmdline(char *str)
{
if (!str)
return -EINVAL;
if (!boot_cpu_has_bug(X86_BUG_GDS))
return 0;
if (!strcmp(str, "off"))
gds_mitigation = GDS_MITIGATION_OFF;
else if (!strcmp(str, "force"))
gds_mitigation = GDS_MITIGATION_FORCE;
return 0;
}
early_param("gather_data_sampling", gds_parse_cmdline);
#undef pr_fmt
#define pr_fmt(fmt) "Spectre V1 : " fmt
@ -2207,6 +2303,11 @@ static ssize_t retbleed_show_state(char *buf)
return sprintf(buf, "%s\n", retbleed_strings[retbleed_mitigation]);
}
static ssize_t gds_show_state(char *buf)
{
return sysfs_emit(buf, "%s\n", gds_strings[gds_mitigation]);
}
static ssize_t cpu_show_common(struct device *dev, struct device_attribute *attr,
char *buf, unsigned int bug)
{
@ -2256,6 +2357,9 @@ static ssize_t cpu_show_common(struct device *dev, struct device_attribute *attr
case X86_BUG_RETBLEED:
return retbleed_show_state(buf);
case X86_BUG_GDS:
return gds_show_state(buf);
default:
break;
}
@ -2320,4 +2424,9 @@ ssize_t cpu_show_retbleed(struct device *dev, struct device_attribute *attr, cha
{
return cpu_show_common(dev, attr, buf, X86_BUG_RETBLEED);
}
ssize_t cpu_show_gds(struct device *dev, struct device_attribute *attr, char *buf)
{
return cpu_show_common(dev, attr, buf, X86_BUG_GDS);
}
#endif
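
Editor's sketch (not part of the diff): the gds_strings[] state selected by gds_select_mitigation() is surfaced to userspace through the new cpu_show_gds() sysfs handler, and can be steered at boot via gather_data_sampling=off or gather_data_sampling=force, which feed gds_parse_cmdline(). A minimal userspace reader, assuming the conventional vulnerabilities path for this attribute, could look like this:

/* Print the reported Gather Data Sampling mitigation state. */
#include <stdio.h>

int main(void)
{
	char line[128];
	FILE *f = fopen("/sys/devices/system/cpu/vulnerabilities/gather_data_sampling", "r");

	if (!f) {
		perror("gather_data_sampling");
		return 1;
	}
	if (fgets(line, sizeof(line), f))
		fputs(line, stdout);	/* e.g. "Mitigation: Microcode" */
	fclose(f);
	return 0;
}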


@ -17,11 +17,16 @@
#include <linux/init.h>
#include <linux/kprobes.h>
#include <linux/kgdb.h>
#include <linux/mem_encrypt.h>
#include <linux/smp.h>
#include <linux/cpu.h>
#include <linux/io.h>
#include <linux/syscore_ops.h>
#include <asm/stackprotector.h>
#include <linux/utsname.h>
#include <asm/alternative.h>
#include <asm/perf_event.h>
#include <asm/mmu_context.h>
#include <asm/archrandom.h>
@ -57,6 +62,7 @@
#ifdef CONFIG_X86_LOCAL_APIC
#include <asm/uv/uv.h>
#endif
#include <asm/set_memory.h>
#include "cpu.h"
@ -444,8 +450,6 @@ static bool pku_disabled;
static __always_inline void setup_pku(struct cpuinfo_x86 *c)
{
struct pkru_state *pk;
/* check the boot processor, plus compile options for PKU: */
if (!cpu_feature_enabled(X86_FEATURE_PKU))
return;
@ -456,9 +460,6 @@ static __always_inline void setup_pku(struct cpuinfo_x86 *c)
return;
cr4_set_bits(X86_CR4_PKE);
pk = get_xsave_addr(&init_fpstate.xsave, XFEATURE_PKRU);
if (pk)
pk->pkru = init_pkru_value;
/*
* Setting X86_CR4_PKE will cause the X86_FEATURE_OSPKE
* cpuid bit to be set. We need to ensure that we
@ -961,6 +962,12 @@ void get_cpu_cap(struct cpuinfo_x86 *c)
if (c->extended_cpuid_level >= 0x8000000a)
c->x86_capability[CPUID_8000_000A_EDX] = cpuid_edx(0x8000000a);
if (c->extended_cpuid_level >= 0x8000001f)
c->x86_capability[CPUID_8000_001F_EAX] = cpuid_eax(0x8000001f);
if (c->extended_cpuid_level >= 0x80000021)
c->x86_capability[CPUID_8000_0021_EAX] = cpuid_eax(0x80000021);
init_scattered_cpuid_features(c);
init_speculation_control(c);
init_cqm(c);
@ -1123,6 +1130,12 @@ static const __initconst struct x86_cpu_id cpu_vuln_whitelist[] = {
#define MMIO_SBDS BIT(2)
/* CPU is affected by RETbleed, speculating where you would not expect it */
#define RETBLEED BIT(3)
/* CPU is affected by SMT (cross-thread) return predictions */
#define SMT_RSB BIT(4)
/* CPU is affected by SRSO */
#define SRSO BIT(5)
/* CPU is affected by GDS */
#define GDS BIT(6)
static const struct x86_cpu_id cpu_vuln_blacklist[] __initconst = {
VULNBL_INTEL_STEPPINGS(IVYBRIDGE, X86_STEPPING_ANY, SRBDS),
@ -1135,19 +1148,21 @@ static const struct x86_cpu_id cpu_vuln_blacklist[] __initconst = {
VULNBL_INTEL_STEPPINGS(BROADWELL_X, X86_STEPPING_ANY, MMIO),
VULNBL_INTEL_STEPPINGS(BROADWELL, X86_STEPPING_ANY, SRBDS),
VULNBL_INTEL_STEPPINGS(SKYLAKE_L, X86_STEPPING_ANY, SRBDS | MMIO | RETBLEED),
VULNBL_INTEL_STEPPINGS(SKYLAKE_X, X86_STEPPING_ANY, MMIO | RETBLEED),
VULNBL_INTEL_STEPPINGS(SKYLAKE_X, X86_STEPPING_ANY, MMIO | RETBLEED | GDS),
VULNBL_INTEL_STEPPINGS(SKYLAKE, X86_STEPPING_ANY, SRBDS | MMIO | RETBLEED),
VULNBL_INTEL_STEPPINGS(KABYLAKE_L, X86_STEPPING_ANY, SRBDS | MMIO | RETBLEED),
VULNBL_INTEL_STEPPINGS(KABYLAKE, X86_STEPPING_ANY, SRBDS | MMIO | RETBLEED),
VULNBL_INTEL_STEPPINGS(KABYLAKE_L, X86_STEPPING_ANY, SRBDS | MMIO | RETBLEED | GDS),
VULNBL_INTEL_STEPPINGS(KABYLAKE, X86_STEPPING_ANY, SRBDS | MMIO | RETBLEED | GDS),
VULNBL_INTEL_STEPPINGS(CANNONLAKE_L, X86_STEPPING_ANY, RETBLEED),
VULNBL_INTEL_STEPPINGS(ICELAKE_L, X86_STEPPING_ANY, MMIO | MMIO_SBDS | RETBLEED),
VULNBL_INTEL_STEPPINGS(ICELAKE_D, X86_STEPPING_ANY, MMIO),
VULNBL_INTEL_STEPPINGS(ICELAKE_X, X86_STEPPING_ANY, MMIO),
VULNBL_INTEL_STEPPINGS(COMETLAKE, X86_STEPPING_ANY, MMIO | MMIO_SBDS | RETBLEED),
VULNBL_INTEL_STEPPINGS(ICELAKE_L, X86_STEPPING_ANY, MMIO | MMIO_SBDS | RETBLEED | GDS),
VULNBL_INTEL_STEPPINGS(ICELAKE_D, X86_STEPPING_ANY, MMIO | GDS),
VULNBL_INTEL_STEPPINGS(ICELAKE_X, X86_STEPPING_ANY, MMIO | GDS),
VULNBL_INTEL_STEPPINGS(COMETLAKE, X86_STEPPING_ANY, MMIO | MMIO_SBDS | RETBLEED | GDS),
VULNBL_INTEL_STEPPINGS(COMETLAKE_L, X86_STEPPINGS(0x0, 0x0), MMIO | RETBLEED),
VULNBL_INTEL_STEPPINGS(COMETLAKE_L, X86_STEPPING_ANY, MMIO | MMIO_SBDS | RETBLEED),
VULNBL_INTEL_STEPPINGS(COMETLAKE_L, X86_STEPPING_ANY, MMIO | MMIO_SBDS | RETBLEED | GDS),
VULNBL_INTEL_STEPPINGS(TIGERLAKE_L, X86_STEPPING_ANY, GDS),
VULNBL_INTEL_STEPPINGS(TIGERLAKE, X86_STEPPING_ANY, GDS),
VULNBL_INTEL_STEPPINGS(LAKEFIELD, X86_STEPPING_ANY, MMIO | MMIO_SBDS | RETBLEED),
VULNBL_INTEL_STEPPINGS(ROCKETLAKE, X86_STEPPING_ANY, MMIO | RETBLEED),
VULNBL_INTEL_STEPPINGS(ROCKETLAKE, X86_STEPPING_ANY, MMIO | RETBLEED | GDS),
VULNBL_INTEL_STEPPINGS(ATOM_TREMONT, X86_STEPPING_ANY, MMIO | MMIO_SBDS),
VULNBL_INTEL_STEPPINGS(ATOM_TREMONT_D, X86_STEPPING_ANY, MMIO),
VULNBL_INTEL_STEPPINGS(ATOM_TREMONT_L, X86_STEPPING_ANY, MMIO | MMIO_SBDS),
@ -1273,6 +1288,16 @@ static void __init cpu_set_bug_bits(struct cpuinfo_x86 *c)
!(ia32_cap & ARCH_CAP_PBRSB_NO))
setup_force_cpu_bug(X86_BUG_EIBRS_PBRSB);
/*
* Check if CPU is vulnerable to GDS. If running in a virtual machine on
* an affected processor, the VMM may have disabled the use of GATHER by
* disabling AVX2. The only way to do this in HW is to clear XCR0[2],
* which means that AVX will be disabled.
*/
if (cpu_matches(cpu_vuln_blacklist, GDS) && !(ia32_cap & ARCH_CAP_GDS_NO) &&
boot_cpu_has(X86_FEATURE_AVX))
setup_force_cpu_bug(X86_BUG_GDS);
if (cpu_matches(cpu_vuln_whitelist, NO_MELTDOWN))
return;
@ -1358,8 +1383,6 @@ static void __init early_identify_cpu(struct cpuinfo_x86 *c)
cpu_set_bug_bits(c);
fpu__init_system(c);
#ifdef CONFIG_X86_32
/*
* Regardless of whether PCID is enumerated, the SDM says
@ -1751,6 +1774,8 @@ void identify_secondary_cpu(struct cpuinfo_x86 *c)
validate_apic_and_package_id(c);
x86_spec_ctrl_setup_ap();
update_srbds_msr();
if (boot_cpu_has_bug(X86_BUG_GDS))
update_gds_msr();
}
static __init int setup_noclflush(char *arg)
@ -2049,8 +2074,6 @@ void cpu_init(void)
clear_all_debug_regs();
dbg_restore_debug_regs();
fpu__init_cpu();
if (is_uv_system())
uv_cpu_init();
@ -2108,8 +2131,6 @@ void cpu_init(void)
clear_all_debug_regs();
dbg_restore_debug_regs();
fpu__init_cpu();
load_fixmap_gdt(cpu);
}
#endif
@ -2125,6 +2146,8 @@ void microcode_check(void)
perf_check_microcode();
amd_check_microcode();
/* Reload CPUID max function as it might've changed. */
info.cpuid_level = cpuid_eax(0);
@ -2154,3 +2177,69 @@ void arch_smt_update(void)
/* Check whether IPI broadcasting can be enabled */
apic_smt_update();
}
void __init arch_cpu_finalize_init(void)
{
identify_boot_cpu();
/*
* identify_boot_cpu() initialized SMT support information, let the
* core code know.
*/
cpu_smt_check_topology();
if (!IS_ENABLED(CONFIG_SMP)) {
pr_info("CPU: ");
print_cpu_info(&boot_cpu_data);
}
cpu_select_mitigations();
arch_smt_update();
if (IS_ENABLED(CONFIG_X86_32)) {
/*
* Check whether this is a real i386 which is no longer
* supported and fixup the utsname.
*/
if (boot_cpu_data.x86 < 4)
panic("Kernel requires i486+ for 'invlpg' and other features");
init_utsname()->machine[1] =
'0' + (boot_cpu_data.x86 > 6 ? 6 : boot_cpu_data.x86);
}
/*
* Must be before alternatives because it might set or clear
* feature bits.
*/
fpu__init_system();
fpu__init_cpu();
alternative_instructions();
if (IS_ENABLED(CONFIG_X86_64)) {
/*
* Make sure the first 2MB area is not mapped by huge pages.
* There are typically fixed size MTRRs in there and overlapping
* MTRRs into large pages causes slowdowns.
*
* Right now we don't do that with gbpages because there seems
* to be very little benefit for that case.
*/
if (!direct_gbpages)
set_memory_4k((unsigned long)__va(0), 1);
} else {
fpu__init_check_bugs();
}
/*
* This needs to be called before any devices perform DMA
* operations that might use the SWIOTLB bounce buffers. It will
* mark the bounce buffers as decrypted so that their usage will
* not cause "plain-text" data to be decrypted when accessed. It
* must be called after late_time_init() so that Hyper-V x86/x64
* hypercalls work when the SWIOTLB bounce buffers are decrypted.
*/
mem_encrypt_init();
}


@ -76,9 +76,11 @@ extern void detect_ht(struct cpuinfo_x86 *c);
extern void check_null_seg_clears_base(struct cpuinfo_x86 *c);
unsigned int aperfmperf_get_khz(int cpu);
void cpu_select_mitigations(void);
extern void x86_spec_ctrl_setup_ap(void);
extern void update_srbds_msr(void);
extern void update_gds_msr(void);
extern u64 x86_read_arch_cap_msr(void);


@ -700,7 +700,7 @@ static enum ucode_state apply_microcode_amd(int cpu)
rdmsr(MSR_AMD64_PATCH_LEVEL, rev, dummy);
/* need to apply patch? */
if (rev >= mc_amd->hdr.patch_id) {
if (rev > mc_amd->hdr.patch_id) {
ret = UCODE_OK;
goto out;
}
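
Editor's sketch (not part of the diff): the hunk above relaxes the early-exit check in apply_microcode_amd() from rev >= patch_id to rev > patch_id, so a patch whose level equals the currently loaded revision is applied again instead of being skipped; only strictly older patches still short-circuit with UCODE_OK. The hypothetical helper below simply restates that predicate:

#include <stdbool.h>
#include <stdint.h>

/* Hypothetical restatement: apply unless the loaded revision is
 * strictly newer than the patch being offered. */
bool should_apply_patch(uint32_t loaded_rev, uint32_t patch_id)
{
	return loaded_rev <= patch_id;	/* previously: loaded_rev < patch_id */
}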


@ -593,6 +593,18 @@ static int __rdtgroup_move_task(struct task_struct *tsk,
return 0;
}
static bool is_closid_match(struct task_struct *t, struct rdtgroup *r)
{
return (rdt_alloc_capable &&
(r->type == RDTCTRL_GROUP) && (t->closid == r->closid));
}
static bool is_rmid_match(struct task_struct *t, struct rdtgroup *r)
{
return (rdt_mon_capable &&
(r->type == RDTMON_GROUP) && (t->rmid == r->mon.rmid));
}
/**
* rdtgroup_tasks_assigned - Test if tasks have been assigned to resource group
* @r: Resource group
@ -608,8 +620,7 @@ int rdtgroup_tasks_assigned(struct rdtgroup *r)
rcu_read_lock();
for_each_process_thread(p, t) {
if ((r->type == RDTCTRL_GROUP && t->closid == r->closid) ||
(r->type == RDTMON_GROUP && t->rmid == r->mon.rmid)) {
if (is_closid_match(t, r) || is_rmid_match(t, r)) {
ret = 1;
break;
}
@ -704,12 +715,15 @@ unlock:
static void show_rdt_tasks(struct rdtgroup *r, struct seq_file *s)
{
struct task_struct *p, *t;
pid_t pid;
rcu_read_lock();
for_each_process_thread(p, t) {
if ((r->type == RDTCTRL_GROUP && t->closid == r->closid) ||
(r->type == RDTMON_GROUP && t->rmid == r->mon.rmid))
seq_printf(s, "%d\n", t->pid);
if (is_closid_match(t, r) || is_rmid_match(t, r)) {
pid = task_pid_vnr(t);
if (pid)
seq_printf(s, "%d\n", pid);
}
}
rcu_read_unlock();
}
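
Editor's sketch (not part of the diff): the show_rdt_tasks() change above switches from t->pid to task_pid_vnr(t), so the resctrl tasks file reports PIDs as seen from the reader's PID namespace and omits tasks that are not visible there (task_pid_vnr() returns 0 for those, hence the pid check). From userspace the listed PIDs are therefore directly usable by the reading process; a minimal reader, assuming the conventional /sys/fs/resctrl mount point and the root group's tasks file, could look like this:

/* Print the PIDs assigned to the root resctrl group, as seen from the
 * caller's PID namespace. */
#include <stdio.h>

int main(void)
{
	char line[64];
	FILE *f = fopen("/sys/fs/resctrl/tasks", "r");

	if (!f) {
		perror("/sys/fs/resctrl/tasks");
		return 1;
	}
	while (fgets(line, sizeof(line), f))
		fputs(line, stdout);
	fclose(f);
	return 0;
}
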
@ -2148,18 +2162,6 @@ static int reset_all_ctrls(struct rdt_resource *r)
return 0;
}
static bool is_closid_match(struct task_struct *t, struct rdtgroup *r)
{
return (rdt_alloc_capable &&
(r->type == RDTCTRL_GROUP) && (t->closid == r->closid));
}
static bool is_rmid_match(struct task_struct *t, struct rdtgroup *r)
{
return (rdt_mon_capable &&
(r->type == RDTMON_GROUP) && (t->rmid == r->mon.rmid));
}
/*
* Move tasks from one to the other group. If @from is NULL, then all tasks
* in the systems are moved unconditionally (used for teardown).


@ -40,9 +40,6 @@ static const struct cpuid_bit cpuid_bits[] = {
{ X86_FEATURE_CPB, CPUID_EDX, 9, 0x80000007, 0 },
{ X86_FEATURE_PROC_FEEDBACK, CPUID_EDX, 11, 0x80000007, 0 },
{ X86_FEATURE_MBA, CPUID_EBX, 6, 0x80000008, 0 },
{ X86_FEATURE_SME, CPUID_EAX, 0, 0x8000001f, 0 },
{ X86_FEATURE_SEV, CPUID_EAX, 1, 0x8000001f, 0 },
{ X86_FEATURE_SME_COHERENT, CPUID_EAX, 10, 0x8000001f, 0 },
{ 0, 0, 0, 0, 0 }
};


@ -50,7 +50,7 @@ void fpu__init_cpu(void)
fpu__init_cpu_xstate();
}
static bool fpu__probe_without_cpuid(void)
static bool __init fpu__probe_without_cpuid(void)
{
unsigned long cr0;
u16 fsw, fcw;
@ -68,7 +68,7 @@ static bool fpu__probe_without_cpuid(void)
return fsw == 0 && (fcw & 0x103f) == 0x003f;
}
static void fpu__init_system_early_generic(struct cpuinfo_x86 *c)
static void __init fpu__init_system_early_generic(void)
{
if (!boot_cpu_has(X86_FEATURE_CPUID) &&
!test_bit(X86_FEATURE_FPU, (unsigned long *)cpu_caps_cleared)) {
@ -290,10 +290,10 @@ static void __init fpu__init_parse_early_param(void)
* Called on the boot CPU once per system bootup, to set up the initial
* FPU state that is later cloned into all processes:
*/
void __init fpu__init_system(struct cpuinfo_x86 *c)
void __init fpu__init_system(void)
{
fpu__init_parse_early_param();
fpu__init_system_early_generic(c);
fpu__init_system_early_generic();
/*
* The FPU has to be operational for some of the


@ -99,6 +99,17 @@ DEFINE_PER_CPU_READ_MOSTLY(cpumask_var_t, cpu_llc_shared_map);
DEFINE_PER_CPU_READ_MOSTLY(struct cpuinfo_x86, cpu_info);
EXPORT_PER_CPU_SYMBOL(cpu_info);
struct mwait_cpu_dead {
unsigned int control;
unsigned int status;
};
/*
* Cache line aligned data for mwait_play_dead(). Separate on purpose so
* that it's unlikely to be touched by other CPUs.
*/
static DEFINE_PER_CPU_ALIGNED(struct mwait_cpu_dead, mwait_cpu_dead);
/* Logical package management. We might want to allocate that dynamically */
unsigned int __max_logical_packages __read_mostly;
EXPORT_SYMBOL(__max_logical_packages);
@ -224,6 +235,7 @@ static void notrace start_secondary(void *unused)
#endif
load_current_idt();
cpu_init();
fpu__init_cpu();
rcu_cpu_starting(raw_smp_processor_id());
x86_cpuinit.early_percpu_clock_init();
preempt_disable();
@ -1675,10 +1687,10 @@ static bool wakeup_cpu0(void)
*/
static inline void mwait_play_dead(void)
{
struct mwait_cpu_dead *md = this_cpu_ptr(&mwait_cpu_dead);
unsigned int eax, ebx, ecx, edx;
unsigned int highest_cstate = 0;
unsigned int highest_subcstate = 0;
void *mwait_ptr;
int i;
if (boot_cpu_data.x86_vendor == X86_VENDOR_AMD ||
@ -1713,13 +1725,6 @@ static inline void mwait_play_dead(void)
(highest_subcstate - 1);
}
/*
* This should be a memory location in a cache line which is
* unlikely to be touched by other processors. The actual
* content is immaterial as it is not actually modified in any way.
*/
mwait_ptr = &current_thread_info()->flags;
wbinvd();
while (1) {
@ -1731,9 +1736,9 @@ static inline void mwait_play_dead(void)
* case where we return around the loop.
*/
mb();
clflush(mwait_ptr);
clflush(md);
mb();
__monitor(mwait_ptr, 0, 0);
__monitor(md, 0, 0);
mb();
__mwait(eax, 0);
/*


@ -53,6 +53,7 @@ static const struct cpuid_reg reverse_cpuid[] = {
[CPUID_7_ECX] = { 7, 0, CPUID_ECX},
[CPUID_8000_0007_EBX] = {0x80000007, 0, CPUID_EBX},
[CPUID_7_EDX] = { 7, 0, CPUID_EDX},
[CPUID_8000_0021_EAX] = {0x80000021, 0, CPUID_EAX},
};
static __always_inline struct cpuid_reg x86_feature_cpuid(unsigned x86_feature)


@ -1409,6 +1409,9 @@ static u64 kvm_get_arch_capabilities(void)
/* Guests don't need to know "Fill buffer clear control" exists */
data &= ~ARCH_CAP_FB_CLEAR_CTRL;
if (!boot_cpu_has_bug(X86_BUG_GDS) || gds_ucode_mitigated())
data |= ARCH_CAP_GDS_NO;
return data;
}
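
Editor's sketch (not part of the diff): the KVM hunk above advertises GDS_NO to guests whenever the host either is not affected by GDS or already has the microcode mitigation active, so guests do not mitigate redundantly. The hypothetical helper below restates that decision; the bit position is an assumption based on current upstream arch-capabilities definitions:

#include <stdbool.h>
#include <stdint.h>

#define ARCH_CAP_GDS_NO_BIT	(1ULL << 26)	/* assumed bit position */

/* Hypothetical restatement of the guest-visible capability logic. */
uint64_t advertise_gds_no(uint64_t caps, bool host_has_gds_bug, bool host_ucode_mitigated)
{
	if (!host_has_gds_bug || host_ucode_mitigated)
		caps |= ARCH_CAP_GDS_NO_BIT;
	return caps;
}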


@ -7,6 +7,7 @@
#include <linux/swapops.h>
#include <linux/kmemleak.h>
#include <linux/sched/task.h>
#include <linux/sched/mm.h>
#include <asm/set_memory.h>
#include <asm/cpu_device_id.h>
@ -26,6 +27,7 @@
#include <asm/cpufeature.h>
#include <asm/pti.h>
#include <asm/text-patching.h>
#include <asm/paravirt.h>
/*
* We need to define the tracepoints somewhere, and tlb.c
@ -735,9 +737,12 @@ void __init poking_init(void)
spinlock_t *ptl;
pte_t *ptep;
poking_mm = copy_init_mm();
poking_mm = mm_alloc();
BUG_ON(!poking_mm);
/* Xen PV guests need the PGD to be pinned. */
paravirt_arch_dup_mmap(NULL, poking_mm);
/*
* Randomize the poking address, but make sure that the following page
* will be mapped at the same PMD. We need 2 pages, so find space for 3,


@ -10,7 +10,6 @@
#include <asm/cpufeature.h> /* boot_cpu_has, ... */
#include <asm/mmu_context.h> /* vma_pkey() */
#include <asm/fpu/internal.h> /* init_fpstate */
int __execute_only_pkey(struct mm_struct *mm)
{
@ -154,7 +153,6 @@ static ssize_t init_pkru_read_file(struct file *file, char __user *user_buf,
static ssize_t init_pkru_write_file(struct file *file,
const char __user *user_buf, size_t count, loff_t *ppos)
{
struct pkru_state *pk;
char buf[32];
ssize_t len;
u32 new_init_pkru;
@ -177,10 +175,6 @@ static ssize_t init_pkru_write_file(struct file *file,
return -EINVAL;
WRITE_ONCE(init_pkru_value, new_init_pkru);
pk = get_xsave_addr(&init_fpstate.xsave, XFEATURE_PKRU);
if (!pk)
return -EINVAL;
pk->pkru = new_init_pkru;
return count;
}


@ -28,6 +28,7 @@
#include <asm/desc.h>
#include <asm/pgtable.h>
#include <asm/cpu.h>
#include <asm/fpu/internal.h>
#include <xen/interface/xen.h>
#include <xen/interface/vcpu.h>
@ -61,6 +62,7 @@ static void cpu_bringup(void)
cr4_init();
cpu_init();
fpu__init_cpu();
touch_softlockup_watchdog();
preempt_disable();

Some files were not shown because too many files have changed in this diff.