Merge 5.4.157 into android11-5.4-lts
Changes in 5.4.157
ARM: 9133/1: mm: proc-macros: ensure *_tlb_fns are 4B aligned
ARM: 9134/1: remove duplicate memcpy() definition
ARM: 9139/1: kprobes: fix arch_init_kprobes() prototype
ARM: 9141/1: only warn about XIP address when not compile testing
powerpc/bpf: Fix BPF_MOD when imm == 1
ipv6: use siphash in rt6_exception_hash()
ipv4: use siphash instead of Jenkins in fnhe_hashfun()
usbnet: sanity check for maxpacket
usbnet: fix error return code in usbnet_probe()
Revert "pinctrl: bcm: ns: support updated DT binding as syscon subnode"
ata: sata_mv: Fix the error handling of mv_chip_id()
nfc: port100: fix using -ERRNO as command type mask
Revert "net: mdiobus: Fix memory leak in __mdiobus_register"
net/tls: Fix flipped sign in tls_err_abort() calls
mmc: vub300: fix control-message timeouts
mmc: cqhci: clear HALT state after CQE enable
mmc: dw_mmc: exynos: fix the finding clock sample value
mmc: sdhci: Map more voltage level to SDHCI_POWER_330
mmc: sdhci-esdhc-imx: clear the buffer_read_ready to reset standard tuning circuit
cfg80211: scan: fix RCU in cfg80211_add_nontrans_list()
net: lan78xx: fix division by zero in send path
drm/ttm: fix memleak in ttm_transfered_destroy
tcp_bpf: Fix one concurrency problem in the tcp_bpf_send_verdict function
IB/qib: Protect from buffer overflow in struct qib_user_sdma_pkt fields
IB/hfi1: Fix abba locking issue with sc_disable()
nvmet-tcp: fix data digest pointer calculation
nvme-tcp: fix data digest pointer calculation
RDMA/mlx5: Set user priority for DCT
arm64: dts: allwinner: h5: NanoPI Neo 2: Fix ethernet node
regmap: Fix possible double-free in regcache_rbtree_exit()
net: batman-adv: fix error handling
net: Prevent infinite while loop in skb_tx_hash()
RDMA/sa_query: Use strscpy_pad instead of memcpy to copy a string
nios2: Make NIOS2_DTB_SOURCE_BOOL depend on !COMPILE_TEST
net: ethernet: microchip: lan743x: Fix driver crash when lan743x_pm_resume fails
net: ethernet: microchip: lan743x: Fix dma allocation failure by using dma_set_mask_and_coherent
net: nxp: lpc_eth.c: avoid hang when bringing interface down
net/tls: Fix flipped sign in async_wait.err assignment
phy: phy_ethtool_ksettings_get: Lock the phy for consistency
phy: phy_start_aneg: Add an unlocked version
sctp: use init_tag from inithdr for ABORT chunk
sctp: fix the processing for INIT_ACK chunk
sctp: fix the processing for COOKIE_ECHO chunk
sctp: add vtag check in sctp_sf_violation
sctp: add vtag check in sctp_sf_do_8_5_1_E_sa
sctp: add vtag check in sctp_sf_ootb
net: use netif_is_bridge_port() to check for IFF_BRIDGE_PORT
cfg80211: correct bridge/4addr mode check
KVM: s390: clear kicked_mask before sleeping again
KVM: s390: preserve deliverable_mask in __airqs_kick_single_vcpu
perf script: Check session->header.env.arch before using it
Linux 5.4.157
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I8dd3b408b22bc98c06b6a941260157df2a40de00
commit da353fac65fede6b8b4cfe207f0d9408e3121105 upstream.
sk->sk_err appears to expect a positive value, a convention that ktls
doesn't always follow and that leads to memory corruption in other code.
For instance,
    [kworker]
    tls_encrypt_done(..., err=<negative error from crypto request>)
      tls_err_abort(.., err)
        sk->sk_err = err;
    [task]
    splice_from_pipe_feed
      ...
        tls_sw_do_sendpage
          if (sk->sk_err) {
            ret = -sk->sk_err;  // ret is positive
    splice_from_pipe_feed (continued)
      ret = actor(...)  // ret is still positive and interpreted as bytes
                        // written, resulting in underflow of buf->len and
                        // sd->len, leading to huge buf->offset and bogus
                        // addresses computed in later calls to actor()
Fix all tls_err_abort() callers to pass a negative error code
consistently and centralize the error-prone sign flip there, throwing in
a warning to catch future misuse and uninlining the function so it
really does only warn once.
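The sign convention can be illustrated with a minimal userspace sketch.
This is not the kernel code: struct mock_sock, mock_err_abort_buggy(),
mock_err_abort_fixed() and mock_sendpage() are hypothetical names made up
for illustration, and assert() merely stands in for a kernel warning. The
sketch only shows why a negative value stored in sk_err makes a later
"-sk->sk_err" look like a positive byte count.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for the single socket field that matters here. */
struct mock_sock {
    int sk_err; /* convention: zero, or a positive errno value */
};

/* Buggy pattern: stores the caller's negative errno unchanged. */
static void mock_err_abort_buggy(struct mock_sock *sk, int err)
{
    sk->sk_err = err;
}

/* Fixed pattern: callers pass a negative errno and the sign flip
 * happens in exactly one place. */
static void mock_err_abort_fixed(struct mock_sock *sk, int err)
{
    assert(err < 0); /* stands in for a warning on misuse */
    sk->sk_err = -err;
}

/* Mimics the send-path check: a pending error is returned as a
 * negative errno, otherwise the number of bytes "written". */
static int mock_sendpage(const struct mock_sock *sk, int bytes)
{
    if (sk->sk_err)
        return -sk->sk_err;
    return bytes;
}
```

With the buggy helper, mock_sendpage() returns a positive number after an
error, which a caller would read as bytes written. With the fixed helper
it returns a negative errno, as callers expect.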
Cc: stable@vger.kernel.org
Fixes: c46234ebb4 ("tls: RX path for ktls")
Reported-by: syzbot+b187b77c8474f9648fae@syzkaller.appspotmail.com
Signed-off-by: Daniel Jordan <daniel.m.jordan@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Merge 5.4.82 into android11-5.4-lts
Changes in 5.4.82
devlink: Hold rtnl lock while reading netdev attributes
ipv6: addrlabel: fix possible memory leak in ip6addrlbl_net_init
net/af_iucv: set correct sk_protocol for child sockets
net/tls: missing received data after fast remote close
net/tls: Protect from calling tls_dev_del for TLS RX twice
rose: Fix Null pointer dereference in rose_send_frame()
sock: set sk_err to ee_errno on dequeue from errq
tcp: Set INET_ECN_xmit configuration in tcp_reinit_congestion_control
tun: honor IOCB_NOWAIT flag
usbnet: ipheth: fix connectivity with iOS 14
bonding: wait for sysfs kobject destruction before freeing struct slave
staging/octeon: fix up merge error
ima: extend boot_aggregate with kernel measurements
sched/fair: Fix unthrottle_cfs_rq() for leaf_cfs_rq list
netfilter: bridge: reset skb->pkt_type after NF_INET_POST_ROUTING traversal
ipv4: Fix tos mask in inet_rtm_getroute()
dt-bindings: net: correct interrupt flags in examples
chelsio/chtls: fix panic during unload reload chtls
ibmvnic: Ensure that SCRQ entry reads are correctly ordered
ibmvnic: Fix TX completion error handling
inet_ecn: Fix endianness of checksum update when setting ECT(1)
geneve: pull IP header before ECN decapsulation
net: ip6_gre: set dev->hard_header_len when using header_ops
net/x25: prevent a couple of overflows
cxgb3: fix error return code in t3_sge_alloc_qset()
net: pasemi: fix error return code in pasemi_mac_open()
vxlan: fix error return code in __vxlan_dev_create()
chelsio/chtls: fix a double free in chtls_setkey()
net: mvpp2: Fix error return code in mvpp2_open()
net: skbuff: ensure LSE is pullable before decrementing the MPLS ttl
net: openvswitch: ensure LSE is pullable before reading it
net/sched: act_mpls: ensure LSE is pullable before reading it
net/mlx5: DR, Proper handling of unsupported Connect-X6DX SW steering
net/mlx5: Fix wrong address reclaim when command interface is down
ALSA: usb-audio: US16x08: fix value count for level meters
Input: xpad - support Ardwiino Controllers
Input: i8042 - add ByteSpeed touchpad to noloop table
tracing: Remove WARN_ON in start_thread()
RDMA/i40iw: Address an mmap handler exploit in i40iw
Linux 5.4.82
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: Ie7c035895e3413f7a58012c372cfc64deb2e6081
[ Upstream commit 025cc2fb6a4e84e9a0552c0017dcd1c24b7ac7da ]
tls_device_offload_cleanup_rx doesn't clear tls_ctx->netdev after
calling tls_dev_del if TLS TX offload is also enabled. Clearing
tls_ctx->netdev gets postponed until tls_device_gc_task. This leaves a
window in which tls_device_down may get called and call tls_dev_del for
RX one extra time, confusing the driver, which may lead to a crash.
This patch corrects this racy behavior by adding a flag to prevent
tls_device_down from calling tls_dev_del a second time.
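The general "do the teardown at most once" idea can be sketched in
userspace C11 as follows. This is an illustration under assumptions, not
the patch itself: struct mock_ctx, the rx_closed flag and all mock_*()
functions are hypothetical names, and C11 atomics stand in for whatever
flag primitive the kernel code uses.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical offload context. */
struct mock_ctx {
    atomic_bool rx_closed; /* set once RX teardown has been claimed */
    int rx_del_calls;      /* how often the "driver" was asked to tear down RX */
};

static void mock_ctx_init(struct mock_ctx *ctx)
{
    atomic_init(&ctx->rx_closed, false);
    ctx->rx_del_calls = 0;
}

/* Only the first caller to flip the flag performs the teardown. */
static bool mock_rx_del_once(struct mock_ctx *ctx)
{
    if (atomic_exchange(&ctx->rx_closed, true))
        return false; /* another path already tore RX down */
    assert(ctx->rx_del_calls == 0);
    ctx->rx_del_calls++;
    return true;
}

/* Two independent paths that may both want to tear RX down. */
static void mock_cleanup_rx(struct mock_ctx *ctx)  { mock_rx_del_once(ctx); }
static void mock_device_down(struct mock_ctx *ctx) { mock_rx_del_once(ctx); }
```

Whichever of the two paths runs first performs the teardown; the other
sees the flag already set and skips it, so the driver is called once.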
Fixes: e8f6979981 ("net/tls: Add generic NIC offload infrastructure")
Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Link: https://lore.kernel.org/r/20201125221810.69870-1-saeedm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Merge 5.4.48 into android-5.4-stable
Changes in 5.4.48
ACPI: GED: use correct trigger type field in _Exx / _Lxx handling
drm/amdgpu: fix and cleanup amdgpu_gem_object_close v4
ath10k: Fix the race condition in firmware dump work queue
drm: bridge: adv7511: Extend list of audio sample rates
media: staging: imgu: do not hold spinlock during freeing mmu page table
media: imx: imx7-mipi-csis: Cleanup and fix subdev pad format handling
crypto: ccp -- don't "select" CONFIG_DMADEVICES
media: vicodec: Fix error codes in probe function
media: si2157: Better check for running tuner in init
objtool: Ignore empty alternatives
spi: spi-mem: Fix Dual/Quad modes on Octal-capable devices
drm/amdgpu: Init data to avoid oops while reading pp_num_states.
arm64/kernel: Fix range on invalidating dcache for boot page tables
libbpf: Fix memory leak and possible double-free in hashmap__clear
spi: pxa2xx: Apply CS clk quirk to BXT
x86,smap: Fix smap_{save,restore}() alternatives
sched/fair: Refill bandwidth before scaling
net: atlantic: make hw_get_regs optional
net: ena: fix error returning in ena_com_get_hash_function()
efi/libstub/x86: Work around LLVM ELF quirk build regression
ath10k: remove the max_sched_scan_reqs value
arm64: cacheflush: Fix KGDB trap detection
media: staging: ipu3: Fix stale list entries on parameter queue failure
rtw88: fix an issue about leak system resources
spi: dw: Zero DMA Tx and Rx configurations on stack
ACPICA: Dispatcher: add status checks
block: alloc map and request for new hardware queue
arm64: insn: Fix two bugs in encoding 32-bit logical immediates
block: reset mapping if failed to update hardware queue count
drm: rcar-du: Set primary plane zpos immutably at initializing
lockdown: Allow unprivileged users to see lockdown status
ixgbe: Fix XDP redirect on archs with PAGE_SIZE above 4K
platform/x86: dell-laptop: don't register micmute LED if there is no token
MIPS: Loongson: Build ATI Radeon GPU driver as module
Bluetooth: Add SCO fallback for invalid LMP parameters error
kgdb: Disable WARN_CONSOLE_UNLOCKED for all kgdb
kgdb: Prevent infinite recursive entries to the debugger
pmu/smmuv3: Clear IRQ affinity hint on device removal
ACPI/IORT: Fix PMCG node single ID mapping handling
mips: Fix cpu_has_mips64r1/2 activation for MIPS32 CPUs
spi: dw: Enable interrupts in accordance with DMA xfer mode
clocksource: dw_apb_timer: Make CPU-affiliation being optional
clocksource: dw_apb_timer_of: Fix missing clockevent timers
media: dvbdev: Fix tuner->demod media controller link
btrfs: account for trans_block_rsv in may_commit_transaction
btrfs: do not ignore error from btrfs_next_leaf() when inserting checksums
ARM: 8978/1: mm: make act_mm() respect THREAD_SIZE
batman-adv: Revert "disable ethtool link speed detection when auto negotiation off"
ice: Fix memory leak
ice: Fix for memory leaks and modify ICE_FREE_CQ_BUFS
mmc: meson-mx-sdio: trigger a soft reset after a timeout or CRC error
Bluetooth: btmtkuart: Improve exception handling in btmtuart_probe()
spi: dw: Fix Rx-only DMA transfers
x86/kvm/hyper-v: Explicitly align hcall param for kvm_hyperv_exit
net: vmxnet3: fix possible buffer overflow caused by bad DMA value in vmxnet3_get_rss()
x86: fix vmap arguments in map_irq_stack
staging: android: ion: use vmap instead of vm_map_ram
ath10k: fix kernel null pointer dereference
media: staging/intel-ipu3: Implement lock for stream on/off operations
spi: Respect DataBitLength field of SpiSerialBusV2() ACPI resource
brcmfmac: fix wrong location to get firmware feature
regulator: qcom-rpmh: Fix typos in pm8150 and pm8150l
tools api fs: Make xxx__mountpoint() more scalable
e1000: Distribute switch variables for initialization
dt-bindings: display: mediatek: control dpi pins mode to avoid leakage
drm/mediatek: set dpi pin mode to gpio low to avoid leakage current
audit: fix a net reference leak in audit_send_reply()
media: dvb: return -EREMOTEIO on i2c transfer failure.
media: platform: fcp: Set appropriate DMA parameters
MIPS: Make sparse_init() using top-down allocation
ath10k: add flush tx packets for SDIO chip
Bluetooth: btbcm: Add 2 missing models to subver tables
audit: fix a net reference leak in audit_list_rules_send()
Drivers: hv: vmbus: Always handle the VMBus messages on CPU0
dpaa2-eth: fix return codes used in ndo_setup_tc
netfilter: nft_nat: return EOPNOTSUPP if type or flags are not supported
selftests/bpf: Fix memory leak in extract_build_id()
net: bcmgenet: set Rx mode before starting netif
net: bcmgenet: Fix WoL with password after deep sleep
lib/mpi: Fix 64-bit MIPS build with Clang
exit: Move preemption fixup up, move blocking operations down
sched/core: Fix illegal RCU from offline CPUs
drivers/perf: hisi: Fix typo in events attribute array
iocost_monitor: drop string wrap around numbers when outputting json
net: lpc-enet: fix error return code in lpc_mii_init()
selinux: fix error return code in policydb_read()
drivers: net: davinci_mdio: fix potential NULL dereference in davinci_mdio_probe()
media: cec: silence shift wrapping warning in __cec_s_log_addrs()
net: allwinner: Fix use correct return type for ndo_start_xmit()
powerpc/spufs: fix copy_to_user while atomic
libertas_tf: avoid a null dereference in pointer priv
xfs: clean up the error handling in xfs_swap_extents
Crypto/chcr: fix for ccm(aes) failed test
MIPS: Truncate link address into 32bit for 32bit kernel
mips: cm: Fix an invalid error code of INTVN_*_ERR
kgdb: Fix spurious true from in_dbg_master()
xfs: reset buffer write failure state on successful completion
xfs: fix duplicate verification from xfs_qm_dqflush()
platform/x86: intel-vbtn: Use acpi_evaluate_integer()
platform/x86: intel-vbtn: Split keymap into buttons and switches parts
platform/x86: intel-vbtn: Do not advertise switches to userspace if they are not there
platform/x86: intel-vbtn: Also handle tablet-mode switch on "Detachable" and "Portable" chassis-types
iwlwifi: avoid debug max amsdu config overwriting itself
nvme: refine the Qemu Identify CNS quirk
nvme-pci: align io queue count with allocted nvme_queue in nvme_probe
nvme-tcp: use bh_lock in data_ready
ath10k: Remove msdu from idr when management pkt send fails
wcn36xx: Fix error handling path in 'wcn36xx_probe()'
net: qed*: Reduce RX and TX default ring count when running inside kdump kernel
drm/mcde: dsi: Fix return value check in mcde_dsi_bind()
mt76: avoid rx reorder buffer overflow
md: don't flush workqueue unconditionally in md_open
raid5: remove gfp flags from scribble_alloc()
iocost: don't let vrate run wild while there's no saturation signal
veth: Adjust hard_start offset on redirect XDP frames
net/mlx5e: IPoIB, Drop multicast packets that this interface sent
rtlwifi: Fix a double free in _rtl_usb_tx_urb_setup()
mwifiex: Fix memory corruption in dump_station
kgdboc: Use a platform device to handle tty drivers showing up late
x86/boot: Correct relocation destination on old linkers
sched: Defend cfs and rt bandwidth quota against overflow
mips: MAAR: Use more precise address mask
mips: Add udelay lpj numbers adjustment
crypto: stm32/crc32 - fix ext4 chksum BUG_ON()
crypto: stm32/crc32 - fix run-time self test issue.
crypto: stm32/crc32 - fix multi-instance
drm/amd/powerpay: Disable gfxoff when setting manual mode on picasso and raven
drm/amdgpu: Sync with VM root BO when switching VM to CPU update mode
selftests/bpf: CONFIG_IPV6_SEG6_BPF required for test_seg6_loop.o
x86/mm: Stop printing BRK addresses
MIPS: tools: Fix resource leak in elf-entry.c
m68k: mac: Don't call via_flush_cache() on Mac IIfx
btrfs: improve global reserve stealing logic
btrfs: qgroup: mark qgroup inconsistent if we're inherting snapshot to a new qgroup
macvlan: Skip loopback packets in RX handler
PCI: Don't disable decoding when mmio_always_on is set
MIPS: Fix IRQ tracing when call handle_fpe() and handle_msa_fpe()
bcache: fix refcount underflow in bcache_device_free()
mmc: sdhci-msm: Set SDHCI_QUIRK_MULTIBLOCK_READ_ACMD12 quirk
staging: greybus: sdio: Respect the cmd->busy_timeout from the mmc core
mmc: via-sdmmc: Respect the cmd->busy_timeout from the mmc core
ice: fix potential double free in probe unrolling
ixgbe: fix signed-integer-overflow warning
iwlwifi: mvm: fix aux station leak
mmc: sdhci-esdhc-imx: fix the mask for tuning start point
spi: dw: Return any value retrieved from the dma_transfer callback
cpuidle: Fix three reference count leaks
platform/x86: hp-wmi: Convert simple_strtoul() to kstrtou32()
platform/x86: intel-hid: Add a quirk to support HP Spectre X2 (2015)
platform/x86: intel-vbtn: Only blacklist SW_TABLET_MODE on the 9 / "Laptop" chasis-type
platform/x86: asus_wmi: Reserve more space for struct bias_args
libbpf: Fix perf_buffer__free() API for sparse allocs
bpf: Fix map permissions check
bpf: Refactor sockmap redirect code so its easy to reuse
bpf: Fix running sk_skb program types with ktls
selftests/bpf, flow_dissector: Close TAP device FD after the test
kasan: stop tests being eliminated as dead code with FORTIFY_SOURCE
string.h: fix incompatibility between FORTIFY_SOURCE and KASAN
btrfs: free alien device after device add
btrfs: include non-missing as a qualifier for the latest_bdev
btrfs: send: emit file capabilities after chown
btrfs: force chunk allocation if our global rsv is larger than metadata
btrfs: fix error handling when submitting direct I/O bio
btrfs: fix wrong file range cleanup after an error filling dealloc range
btrfs: fix space_info bytes_may_use underflow after nocow buffered write
btrfs: fix space_info bytes_may_use underflow during space cache writeout
powerpc/mm: Fix conditions to perform MMU specific management by blocks on PPC32.
mm: thp: make the THP mapcount atomic against __split_huge_pmd_locked()
mm: initialize deferred pages with interrupts enabled
mm/pagealloc.c: call touch_nmi_watchdog() on max order boundaries in deferred init
mm: call cond_resched() from deferred_init_memmap()
ima: Fix ima digest hash table key calculation
ima: Switch to ima_hash_algo for boot aggregate
ima: Evaluate error in init_ima()
ima: Directly assign the ima_default_policy pointer to ima_rules
ima: Call ima_calc_boot_aggregate() in ima_eventdigest_init()
ima: Remove __init annotation from ima_pcrread()
evm: Fix possible memory leak in evm_calc_hmac_or_hash()
ext4: fix EXT_MAX_EXTENT/INDEX to check for zeroed eh_max
ext4: fix error pointer dereference
ext4: fix race between ext4_sync_parent() and rename()
PCI: Avoid Pericom USB controller OHCI/EHCI PME# defect
PCI: Avoid FLR for AMD Matisse HD Audio & USB 3.0
PCI: Avoid FLR for AMD Starship USB 3.0
PCI: Add ACS quirk for Intel Root Complex Integrated Endpoints
PCI: vmd: Add device id for VMD device 8086:9A0B
x86/amd_nb: Add Family 19h PCI IDs
PCI: Add Loongson vendor ID
serial: 8250_pci: Move Pericom IDs to pci_ids.h
x86/amd_nb: Add AMD family 17h model 60h PCI IDs
ima: Remove redundant policy rule set in add_rules()
ima: Set again build_ima_appraise variable
PCI: Program MPS for RCiEP devices
e1000e: Disable TSO for buffer overrun workaround
e1000e: Relax condition to trigger reset for ME workaround
carl9170: remove P2P_GO support
media: go7007: fix a miss of snd_card_free
media: cedrus: Program output format during each run
serial: 8250: Avoid error message on reprobe
Bluetooth: hci_bcm: fix freeing not-requested IRQ
b43legacy: Fix case where channel status is corrupted
b43: Fix connection problem with WPA3
b43_legacy: Fix connection problem with WPA3
media: ov5640: fix use of destroyed mutex
clk: mediatek: assign the initial value to clk_init_data of mtk_mux
igb: Report speed and duplex as unknown when device is runtime suspended
hwmon: (k10temp) Add AMD family 17h model 60h PCI match
EDAC/amd64: Add AMD family 17h model 60h PCI IDs
power: vexpress: add suppress_bind_attrs to true
power: supply: core: fix HWMON temperature labels
power: supply: core: fix memory leak in HWMON error path
pinctrl: samsung: Correct setting of eint wakeup mask on s5pv210
pinctrl: samsung: Save/restore eint_mask over suspend for EINT_TYPE GPIOs
gnss: sirf: fix error return code in sirf_probe()
sparc32: fix register window handling in genregs32_[gs]et()
sparc64: fix misuses of access_process_vm() in genregs32_[sg]et()
dm crypt: avoid truncating the logical block size
alpha: fix memory barriers so that they conform to the specification
powerpc/fadump: use static allocation for reserved memory ranges
powerpc/fadump: consider reserved ranges while reserving memory
powerpc/fadump: Account for memory_limit while reserving memory
kernel/cpu_pm: Fix uninitted local in cpu_pm
ARM: tegra: Correct PL310 Auxiliary Control Register initialization
soc/tegra: pmc: Select GENERIC_PINCONF
ARM: dts: exynos: Fix GPIO polarity for thr GalaxyS3 CM36651 sensor's bus
ARM: dts: at91: sama5d2_ptc_ek: fix vbus pin
ARM: dts: s5pv210: Set keep-power-in-suspend for SDHCI1 on Aries
drivers/macintosh: Fix memleak in windfarm_pm112 driver
powerpc/32s: Fix another build failure with CONFIG_PPC_KUAP_DEBUG
powerpc/kasan: Fix issues by lowering KASAN_SHADOW_END
powerpc/kasan: Fix shadow pages allocation failure
powerpc/32: Disable KASAN with pages bigger than 16k
powerpc/64s: Don't let DT CPU features set FSCR_DSCR
powerpc/64s: Save FSCR to init_task.thread.fscr after feature init
kbuild: force to build vmlinux if CONFIG_MODVERSION=y
sunrpc: svcauth_gss_register_pseudoflavor must reject duplicate registrations.
sunrpc: clean up properly in gss_mech_unregister()
mtd: rawnand: Fix nand_gpio_waitrdy()
mtd: rawnand: onfi: Fix redundancy detection check
mtd: rawnand: brcmnand: fix hamming oob layout
mtd: rawnand: diskonchip: Fix the probe error path
mtd: rawnand: sharpsl: Fix the probe error path
mtd: rawnand: ingenic: Fix the probe error path
mtd: rawnand: xway: Fix the probe error path
mtd: rawnand: orion: Fix the probe error path
mtd: rawnand: socrates: Fix the probe error path
mtd: rawnand: oxnas: Fix the probe error path
mtd: rawnand: sunxi: Fix the probe error path
mtd: rawnand: plat_nand: Fix the probe error path
mtd: rawnand: pasemi: Fix the probe error path
mtd: rawnand: mtk: Fix the probe error path
mtd: rawnand: tmio: Fix the probe error path
w1: omap-hdq: cleanup to add missing newline for some dev_dbg
f2fs: fix checkpoint=disable:%u%%
perf probe: Do not show the skipped events
perf probe: Fix to check blacklist address correctly
perf probe: Check address correctness by map instead of _etext
perf symbols: Fix debuginfo search for Ubuntu
perf symbols: Fix kernel maps for kcore and eBPF
Linux 5.4.48
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I9954fb3f08956419e8586bcb9078e604df207fb9
[ Upstream commit e91de6afa81c10e9f855c5695eb9a53168d96b73 ]
KTLS uses a stream parser to collect TLS messages and send them to
the upper-layer TLS receive handler. This ensures the TLS receiver
has a full TLS header to parse when it runs. However, when a
socket has a BPF_SK_SKB_STREAM_VERDICT program attached before KTLS
is enabled, we end up with two stream parsers running on the same
socket.
The result is that both try to consume data from the same socket.
First the KTLS stream parser runs and calls read_sock(), which calls
tcp_read_sock(), which in turn calls tcp_rcv_skb(). This dequeues the
skb from the sk_receive_queue. When this is done, the KTLS code calls
the data_ready() callback, which, because we stacked KTLS on top of
the BPF stream verdict program, has been replaced with
sk_psock_start_strp(). This in turn kicks the BPF stream parser again,
which eventually does the same thing KTLS did above: it calls into
tcp_rcv_skb() and dequeues an skb from the sk_receive_queue.
At this point the data stream is broken. Part of the stream was
handled by the KTLS side and some other bytes may have been handled
by the BPF side. Generally this results in either missing data
or, more likely, a "Bad Message" complaint from the KTLS receive
handler, as the BPF program steals some bytes meant to be in a
TLS header and/or the TLS header length is no longer correct.
On the TX side we have already broken the idealized model, in which
ULPs can be stacked in any order using generic callbacks, to handle
this. This patch does the same thing for the RX side. We add
a sk_psock_strp_enabled() helper so TLS can learn that a BPF verdict
program is running, and a tls_sw_has_ctx_rx() helper so the BPF
side can learn that there is a TLS ULP on the socket.
Then, on the BPF side, we omit calling our stream parser to avoid
breaking the data stream for the KTLS receiver. On the
KTLS side, we run the BPF_SK_SKB_STREAM_VERDICT program once the KTLS
receiver is done with the packet but before it posts the
msg to userspace. This gives us symmetry between the TX and
RX halves and IMO makes it usable again. On the TX side we
process packets in the order BPF -> TLS -> TCP, and on
the receive side in the reverse order, TCP -> TLS -> BPF.
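The RX-side decision described above can be modelled in a few lines of
userspace C. This is only a sketch under assumptions: struct
mock_rx_sock and every function in it are hypothetical names, and the
two boolean fields stand in for what the sk_psock_strp_enabled() and
tls_sw_has_ctx_rx() helpers would report. It shows that at most one
stream parser dequeues from the receive queue, while the verdict
program still runs after TLS when both are attached.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical view of a socket's RX configuration. */
struct mock_rx_sock {
    bool tls_rx;      /* a TLS RX context is installed */
    bool bpf_verdict; /* a stream verdict program is attached */
};

/* BPF side: run its own stream parser only when TLS is not handling RX. */
static bool bpf_runs_own_parser(const struct mock_rx_sock *sk)
{
    return sk->bpf_verdict && !sk->tls_rx;
}

/* TLS side: hand each finished record to the verdict program, if any. */
static bool tls_runs_verdict_after_rx(const struct mock_rx_sock *sk)
{
    return sk->tls_rx && sk->bpf_verdict;
}

/* Number of parsers pulling skbs off the receive queue. */
static int parsers_on_queue(const struct mock_rx_sock *sk)
{
    int n = (sk->tls_rx ? 1 : 0) + (bpf_runs_own_parser(sk) ? 1 : 0);

    assert(n <= 1); /* two parsers on one queue is the bug described above */
    return n;
}
```

With both TLS and a verdict program present, only the TLS parser touches
the queue and the verdict runs afterwards, matching the order
TCP -> TLS -> BPF.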
Discovered while testing OpenSSL 3.0 Alpha2.0 release.
Fixes: d829e9c411 ("tls: convert to generic sk_msg interface")
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/159079361946.5745.605854335665044485.stgit@john-Precision-5820-Tower
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Merge 5.4.44 into android-5.4-stable
Changes in 5.4.44
ax25: fix setsockopt(SO_BINDTODEVICE)
dpaa_eth: fix usage as DSA master, try 3
net: don't return invalid table id error when we fall back to PF_UNSPEC
net: dsa: mt7530: fix roaming from DSA user ports
net: ethernet: ti: cpsw: fix ASSERT_RTNL() warning during suspend
__netif_receive_skb_core: pass skb by reference
net: inet_csk: Fix so_reuseport bind-address cache in tb->fast*
net: ipip: fix wrong address family in init error path
net/mlx5: Add command entry handling completion
net: mvpp2: fix RX hashing for non-10G ports
net: nlmsg_cancel() if put fails for nhmsg
net: qrtr: Fix passing invalid reference to qrtr_local_enqueue()
net: revert "net: get rid of an signed integer overflow in ip_idents_reserve()"
net sched: fix reporting the first-time use timestamp
net/tls: fix race condition causing kernel panic
nexthop: Fix attribute checking for groups
r8152: support additional Microsoft Surface Ethernet Adapter variant
sctp: Don't add the shutdown timer if its already been added
sctp: Start shutdown on association restart if in SHUTDOWN-SENT state and socket is closed
tipc: block BH before using dst_cache
net/mlx5e: kTLS, Destroy key object after destroying the TIS
net/mlx5e: Fix inner tirs handling
net/mlx5: Fix memory leak in mlx5_events_init
net/mlx5e: Update netdev txq on completions during closure
net/mlx5: Fix error flow in case of function_setup failure
net/mlx5: Annotate mutex destroy for root ns
net/tls: fix encryption error checking
net/tls: free record only on encryption error
net: sun: fix missing release regions in cas_init_one().
net/mlx4_core: fix a memory leak bug.
mlxsw: spectrum: Fix use-after-free of split/unsplit/type_set in case reload fails
ARM: dts: rockchip: fix phy nodename for rk3228-evb
ARM: dts: rockchip: fix phy nodename for rk3229-xms6
arm64: dts: rockchip: fix status for &gmac2phy in rk3328-evb.dts
arm64: dts: rockchip: swap interrupts interrupt-names rk3399 gpu node
ARM: dts: rockchip: swap clock-names of gpu nodes
ARM: dts: rockchip: fix pinctrl sub nodename for spi in rk322x.dtsi
gpio: tegra: mask GPIO IRQs during IRQ shutdown
ALSA: usb-audio: add mapping for ASRock TRX40 Creator
net: microchip: encx24j600: add missed kthread_stop
gfs2: move privileged user check to gfs2_quota_lock_check
gfs2: Grab glock reference sooner in gfs2_add_revoke
drm/amdgpu: drop unnecessary cancel_delayed_work_sync on PG ungate
drm/amd/powerplay: perform PG ungate prior to CG ungate
drm/amdgpu: Use GEM obj reference for KFD BOs
cachefiles: Fix race between read_waiter and read_copier involving op->to_do
usb: dwc3: pci: Enable extcon driver for Intel Merrifield
usb: phy: twl6030-usb: Fix a resource leak in an error handling path in 'twl6030_usb_probe()'
usb: gadget: legacy: fix redundant initialization warnings
net: freescale: select CONFIG_FIXED_PHY where needed
IB/i40iw: Remove bogus call to netdev_master_upper_dev_get()
riscv: stacktrace: Fix undefined reference to `walk_stackframe'
clk: ti: am33xx: fix RTC clock parent
csky: Fixup msa highest 3 bits mask
csky: Fixup perf callchain unwind
csky: Fixup remove duplicate irq_disable
hwmon: (nct7904) Fix incorrect range of temperature limit registers
cifs: Fix null pointer check in cifs_read
csky: Fixup raw_copy_from_user()
samples: bpf: Fix build error
drivers: net: hamradio: Fix suspicious RCU usage warning in bpqether.c
Input: usbtouchscreen - add support for BonXeon TP
Input: evdev - call input_flush_device() on release(), not flush()
Input: xpad - add custom init packet for Xbox One S controllers
Input: dlink-dir685-touchkeys - fix a typo in driver name
Input: i8042 - add ThinkPad S230u to i8042 reset list
Input: synaptics-rmi4 - really fix attn_data use-after-free
Input: synaptics-rmi4 - fix error return code in rmi_driver_probe()
ARM: 8970/1: decompressor: increase tag size
ARM: uaccess: consolidate uaccess asm to asm/uaccess-asm.h
ARM: uaccess: integrate uaccess_save and uaccess_restore
ARM: uaccess: fix DACR mismatch with nested exceptions
gpio: exar: Fix bad handling for ida_simple_get error path
arm64: dts: mt8173: fix vcodec-enc clock
soc: mediatek: cmdq: return send msg error code
gpu/drm: Ingenic: Fix opaque pointer casted to wrong type
IB/qib: Call kobject_put() when kobject_init_and_add() fails
ARM: dts/imx6q-bx50v3: Set display interface clock parents
ARM: dts: bcm2835-rpi-zero-w: Fix led polarity
ARM: dts: bcm: HR2: Fix PPI interrupt types
mmc: block: Fix use-after-free issue for rpmb
gpio: pxa: Fix return value of pxa_gpio_probe()
gpio: bcm-kona: Fix return value of bcm_kona_gpio_probe()
RDMA/pvrdma: Fix missing pci disable in pvrdma_pci_probe()
ALSA: hwdep: fix a left shifting 1 by 31 UB bug
ALSA: hda/realtek - Add a model for Thinkpad T570 without DAC workaround
ALSA: usb-audio: mixer: volume quirk for ESS Technology Asus USB DAC
exec: Always set cap_ambient in cap_bprm_set_creds
clk: qcom: gcc: Fix parent for gpll0_out_even
ALSA: usb-audio: Quirks for Gigabyte TRX40 Aorus Master onboard audio
ALSA: hda/realtek - Add new codec supported for ALC287
libceph: ignore pool overlay and cache logic on redirects
ceph: flush release queue when handling caps for unknown inode
RDMA/core: Fix double destruction of uobject
drm/amd/display: drop cursor position check in atomic test
IB/ipoib: Fix double free of skb in case of multicast traffic in CM mode
mm,thp: stop leaking unreleased file pages
mm: remove VM_BUG_ON(PageSlab()) from page_mapcount()
fs/binfmt_elf.c: allocate initialized memory in fill_thread_core_info()
include/asm-generic/topology.h: guard cpumask_of_node() macro argument
Revert "block: end bio with BLK_STS_AGAIN in case of non-mq devs and REQ_NOWAIT"
gpio: fix locking open drain IRQ lines
iommu: Fix reference count leak in iommu_group_alloc.
parisc: Fix kernel panic in mem_init()
cfg80211: fix debugfs rename crash
x86/syscalls: Revert "x86/syscalls: Make __X32_SYSCALL_BIT be unsigned long"
mac80211: mesh: fix discovery timer re-arming issue / crash
x86/dma: Fix max PFN arithmetic overflow on 32 bit systems
copy_xstate_to_kernel(): don't leave parts of destination uninitialized
xfrm: allow to accept packets with ipv6 NEXTHDR_HOP in xfrm_input
xfrm: do pskb_pull properly in __xfrm_transport_prep
xfrm: remove the xfrm_state_put call becofe going to out_reset
xfrm: call xfrm_output_gso when inner_protocol is set in xfrm_output
xfrm interface: fix oops when deleting a x-netns interface
xfrm: fix a warning in xfrm_policy_insert_list
xfrm: fix a NULL-ptr deref in xfrm_local_error
xfrm: fix error in comment
ip_vti: receive ipip packet by calling ip_tunnel_rcv
netfilter: nft_reject_bridge: enable reject with bridge vlan
netfilter: ipset: Fix subcounter update skip
netfilter: conntrack: make conntrack userspace helpers work again
netfilter: nfnetlink_cthelper: unbreak userspace helper support
netfilter: nf_conntrack_pptp: prevent buffer overflows in debug code
esp6: get the right proto for transport mode in esp6_gso_encap
bnxt_en: Fix accumulation of bp->net_stats_prev.
ieee80211: Fix incorrect mask for default PE duration
xsk: Add overflow check for u64 division, stored into u32
qlcnic: fix missing release in qlcnic_83xx_interrupt_test.
crypto: chelsio/chtls: properly set tp->lsndtime
nexthops: Move code from remove_nexthop_from_groups to remove_nh_grp_entry
nexthops: don't modify published nexthop groups
nexthop: Expand nexthop_is_multipath in a few places
ipv4: nexthop version of fib_info_nh_uses_dev
net: dsa: declare lockless TX feature for slave ports
bonding: Fix reference count leak in bond_sysfs_slave_add.
netfilter: conntrack: comparison of unsigned in cthelper confirmation
netfilter: conntrack: Pass value of ctinfo to __nf_conntrack_update
netfilter: nf_conntrack_pptp: fix compilation warning with W=1 build
perf: Make perf able to build with latest libbfd
Linux 5.4.44
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: Idd547df1abb0bea116f30e3224a80387529adb0b
[ Upstream commit 0cada33241d9de205522e3858b18e506ca5cce2c ]
tls_sw_recvmsg() and tls_decrypt_done() can be run concurrently.
// tls_sw_recvmsg()
	if (atomic_read(&ctx->decrypt_pending))
		crypto_wait_req(-EINPROGRESS, &ctx->async_wait);
	else
		reinit_completion(&ctx->async_wait.completion);

// tls_decrypt_done()
	pending = atomic_dec_return(&ctx->decrypt_pending);

	if (!pending && READ_ONCE(ctx->async_notify))
		complete(&ctx->async_wait.completion);
Consider the scenario where tls_decrypt_done() has passed the check
	if (!pending && READ_ONCE(ctx->async_notify))
and is about to run complete(), while tls_sw_recvmsg() reads
decrypt_pending == 0 and does reinit_completion(); tls_decrypt_done()
then runs complete(). This sequence of execution leaves the completion
signalled although nothing is pending. Consequently, the next decrypt
request will not wait for completion, and when the connection is
eventually closed and crypto resources are freed, there is no way to
handle the pending decrypt response.
This race condition can be avoided by making atomic_read() mutually
exclusive with atomic_dec_return() and complete(). Introduce a spin lock
to ensure the mutual exclusion.
Addressed similar problem in tx direction.
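The locking idea can be illustrated with a small userspace model. This is
only a sketch under stated assumptions: struct rx_model and the model_*()
helpers are hypothetical names, the spin lock is emulated with a C11
atomic_flag, and complete() is replaced by a counter. It is not the
kernel implementation.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Userspace model; every name here is hypothetical, not the kernel's. */
struct rx_model {
	atomic_flag lock;	/* stands in for the new spin lock */
	atomic_int pending;	/* in-flight async decrypt requests */
	bool notify;		/* receiver wants to be woken */
	int completions;	/* how many times "complete()" fired */
};

static void model_init(struct rx_model *m, int in_flight)
{
	atomic_flag_clear(&m->lock);
	atomic_init(&m->pending, in_flight);
	m->notify = false;
	m->completions = 0;
}

static void model_lock(struct rx_model *m)
{
	while (atomic_flag_test_and_set_explicit(&m->lock, memory_order_acquire))
		; /* spin */
}

static void model_unlock(struct rx_model *m)
{
	atomic_flag_clear_explicit(&m->lock, memory_order_release);
}

/* Completion side: decrement and signal inside one critical section. */
static void model_decrypt_done(struct rx_model *m)
{
	model_lock(m);
	int pending = atomic_fetch_sub(&m->pending, 1) - 1;
	assert(pending >= 0);
	if (pending == 0 && m->notify)
		m->completions++;
	model_unlock(m);
}

/*
 * Receiver side: publish the notify flag and sample the pending count
 * under the same lock. Returns true if the caller has to wait.
 */
static bool model_recv_must_wait(struct rx_model *m)
{
	model_lock(m);
	m->notify = true;
	int pending = atomic_load(&m->pending);
	model_unlock(m);
	return pending != 0;
}
```

Because both sides take the same lock, the receiver either sees a
non-zero count and waits for exactly one completion, or sees zero after
the completion side has already decided not to signal.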
v1->v2:
- More readable commit message.
- Corrected the lock to fix new race scenario.
- Removed barrier which is not needed now.
Fixes: a42055e8d2 ("net/tls: Add support for async encryption of records for performance")
Signed-off-by: Vinay Kumar Yadav <vinay.yadav@chelsio.com>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Try to mitigate potential future driver core api changes by adding a
padding to a lot of different networking structures:
struct ipv6_devconf
struct proto_ops
struct header_ops
struct napi_struct
struct netdev_queue
struct netdev_rx_queue
struct xfrmdev_ops
struct net_device_ops
struct net_device
struct packet_type
struct sk_buff
struct tlsdev_ops
Based on a change made to the RHEL/CENTOS 8 kernel.
Bug: 151154716
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I590f004754dbc8beafa40e71cac70a0938c38b4a
[ Upstream commit c5daa6cccdc2f94aca2c9b3fa5f94e4469997293 ]
Partially sent record cleanup path increments an SG entry
directly instead of using sg_next(). This should not be a
problem today, as encrypted messages should always be
allocated as arrays. But given this is a cleanup path it
would be easy to miss if this were ever to change. Use
sg_next(), and simplify the code.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 9e5ffed37df68d0ccfb2fdc528609e23a1e70ebe ]
Looks like when BPF support was added by commit d3b18ad31f
("tls: add bpf support to sk_msg handling") and
commit d829e9c411 ("tls: convert to generic sk_msg interface")
it broke/removed the support for in-place crypto as added by
commit 4e6d47206c ("tls: Add support for inplace records
encryption").
The inplace_crypto member of struct tls_rec is dead, inited
to zero, and sometimes set to zero again. It used to be
set to 1 when record was allocated, but the skmsg code doesn't
seem to have been written with the idea of in-place crypto
in mind.
Since non-trivial effort is required to bring the feature back
and we don't really have the HW to measure the benefit, just
remove the leftover support for now to avoid confusing readers.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Bring back tls_sw_sendpage_locked. sk_msg redirection into a socket
with TLS_TX takes the following path:
  tcp_bpf_sendmsg_redir
    tcp_bpf_push_locked
      tcp_bpf_push
        kernel_sendpage_locked
          sock->ops->sendpage_locked
Also update the flags test in tls_sw_sendpage_locked to allow flag
MSG_NO_SHARED_FRAGS. bpf_tcp_sendmsg sets this.
Link: https://lore.kernel.org/netdev/CA+FuTSdaAawmZ2N8nfDDKu3XLpXBbMtcCT0q4FntDD2gn8ASUw@mail.gmail.com/T/#t
Link: https://github.com/wdebruij/kerneltools/commits/icept.2
Fixes: 0608c69c9a ("bpf: sk_msg, sock{map|hash} redirect through ULP")
Fixes: f3de19af0f ("Revert \"net/tls: remove unused function tls_sw_sendpage_locked\"")
Signed-off-by: Willem de Bruijn <willemb@google.com>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
TLS TX needs to release and re-acquire the socket lock if send buffer
fills up.
TLS SW TX path currently depends on only allowing one thread to enter
the function by the abuse of sk_write_pending. If another writer is
already waiting for memory no new ones are allowed in.
This has two problems:
- writers don't wake other threads up when they leave the kernel;
meaning that this scheme works for single extra thread (second
application thread or delayed work) because memory becoming
available will send a wake up request, but as Mallesham and
Pooja report with larger number of threads it leads to threads
being put to sleep indefinitely;
- the delayed work does not get _scheduled_ but it may _run_ when
other writers are present leading to crashes as writers don't
expect state to change under their feet (same records get pushed
and freed multiple times); it's hard to reliably bail from the
work, however, because the mere presence of a writer does not
guarantee that the writer will push pending records before exiting.
Ensuring wakeups always happen will make the code basically open
code a mutex. Just use a mutex.
The TLS HW TX path does not have any locking (not even the
sk_write_pending hack), yet it uses a per-socket sg_tx_data
array to push records.
Fixes: a42055e8d2 ("net/tls: Add support for async encryption of records for performance")
Reported-by: Mallesham Jatharakonda <mallesh537@gmail.com>
Reported-by: Pooja Trivedi <poojatrivedi@gmail.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
TLS code has a number of #ifdefs which make the code a little
harder to follow. Recent fixes removed the ifdef around the
TLS_HW define, so we can switch to the often used pattern
of defining tls_device functions as empty static inlines
in the header when CONFIG_TLS_DEVICE=n.
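A minimal sketch of the stub pattern, for illustration only. The
CONFIG_DEMO_DEVICE switch and all demo_*() names are hypothetical and do
not correspond to the real TLS header.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct demo_sock;	/* opaque, hypothetical */

#ifdef CONFIG_DEMO_DEVICE
/* Real implementations would live in a separate object file. */
int demo_device_init(void);
void demo_device_cleanup(void);
int demo_set_device_offload(struct demo_sock *sk);
#else
/* Feature compiled out: empty static inlines with the same prototypes. */
static inline int demo_device_init(void) { return 0; }
static inline void demo_device_cleanup(void) { }
static inline int demo_set_device_offload(struct demo_sock *sk)
{
	(void)sk;
	return -EOPNOTSUPP;
}
#endif

/* The caller is identical in both configurations and needs no #ifdef. */
static int demo_register(void)
{
	int err = demo_device_init();

	if (err)
		return err;
	/* ... register the rest of the subsystem here ... */
	return 0;
}
```

With the feature disabled the compiler can inline and discard the stubs,
so the call sites cost nothing and stay free of conditionals.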
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: John Hurley <john.hurley@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Since we already have the pointer to the full original sk_proto
stored use that instead of storing all individual callback
pointers as well.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: John Hurley <john.hurley@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When an application configures kernel TLS on top of a TCP socket, it's
now possible for inet_diag_handler() to collect information regarding the
protocol version, the cipher type and TX / RX configuration, in case
INET_DIAG_INFO is requested.
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We need to make sure context does not get freed while diag
code is interrogating it. Free struct tls_context with
kfree_rcu().
We add the __rcu annotation directly in icsk, and cast it
away in the datapath accessor. Presumably all ULPs will
do a similar thing.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Looks like we were slightly overzealous with the shutdown()
cleanup. Even though the sock->sk_state can reach CLOSED again,
socket->state will not go back to SS_UNCONNECTED once the
connection is ESTABLISHED. Meaning we will see EISCONN if
we try to reconnect, and EINVAL if we try to listen.
Only listen sockets can be shutdown() and reused, but since
ESTABLISHED sockets can never be re-connected() or used for
listen() we don't need to try to clean up the ULP state early.
Fixes: 32857cf57f ("net/tls: fix transition through disconnect with close")
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
It is possible (via shutdown()) for TCP socks to go through TCP_CLOSE
state via tcp_disconnect() without actually calling tcp_close which
would then call the tls close callback. Because of this a user could
disconnect a socket then put it in a LISTEN state which would break
our assumptions about sockets always being ESTABLISHED state.
More directly because close() can call unhash() and unhash is
implemented by sockmap if a sockmap socket has TLS enabled we can
incorrectly destroy the psock from unhash() and then call its close
handler again. But because the psock (sockmap socket representation)
is already destroyed we call close handler in sk->prot. However,
in some cases (TLS BASE/BASE case) this will still point at the
sockmap close handler resulting in a circular call and crash reported
by syzbot.
To fix both above issues implement the unhash() routine for TLS.
v4:
- add note about tls offload still needing the fix;
- move sk_proto to the cold cache line;
- split TX context free into "release" and "free",
otherwise the GC work itself is in already freed
memory;
- move TX before RX for consistency;
- reuse tls_ctx_free();
- schedule the GC work after we're done with context
to avoid UAF;
- don't set the unhash in all modes, all modes "inherit"
TLS_BASE's callbacks anyway;
- disable the unhash hook for TLS_HW.
Fixes: 3c4d755915 ("tls: kernel TLS support")
Reported-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
The tls close() callback currently drops the sock lock to call
strp_done(). Split up the RX cleanup into stopping the strparser
and releasing most resources, syncing strparser and finally
freeing the context.
To avoid the need for a strp_done() call on the cleanup path
of device offload make sure we don't arm the strparser until
we are sure init will be successful.
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
The tls close() callback currently drops the sock lock, makes a
cancel_delayed_work_sync() call, and then relocks the sock.
By restructuring the code we can avoid dropping the lock and then
reclaiming it. To simplify this we do the following,
 tls_sk_proto_close
 set_bit(CLOSING)
 set_bit(SCHEDULE)
 cancel_delayed_work_sync() <- cancel workqueue
 lock_sock(sk)
 ...
 release_sock(sk)
 strp_done()
Setting the CLOSING bit prevents the SCHEDULE bit from being
cleared by any workqueue items e.g. if one happens to be
scheduled and run between when we set SCHEDULE bit and cancel
work. Then because SCHEDULE bit is set now no new work will
be scheduled.
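The interaction of the two bits can be modelled in userspace as follows.
This is a sketch with hypothetical names (struct tx_model, BIT_SCHEDULED,
BIT_CLOSING, model_*()); the work queue is reduced to two counters and
the bit operations are emulated with C11 atomics.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

enum { BIT_SCHEDULED = 1 << 0, BIT_CLOSING = 1 << 1 };

struct tx_model {
	atomic_uint bits;
	int queued;	/* times work was queued */
	int ran;	/* times the handler did real work */
};

static void model_init(struct tx_model *t)
{
	atomic_init(&t->bits, 0);
	t->queued = 0;
	t->ran = 0;
}

/* Writer path: queue work only if SCHEDULED was previously clear. */
static bool model_schedule(struct tx_model *t)
{
	unsigned int old = atomic_fetch_or(&t->bits, BIT_SCHEDULED);

	if (old & BIT_SCHEDULED)
		return false;
	t->queued++;
	return true;
}

/* Work handler: while closing, leave SCHEDULED untouched and bail out. */
static void model_work(struct tx_model *t)
{
	if (atomic_load(&t->bits) & BIT_CLOSING)
		return;
	unsigned int old = atomic_fetch_and(&t->bits, ~(unsigned int)BIT_SCHEDULED);

	if (!(old & BIT_SCHEDULED))
		return;
	t->ran++;
}

/* Close path: set CLOSING first, then SCHEDULED, then cancel the work. */
static void model_begin_close(struct tx_model *t)
{
	atomic_fetch_or(&t->bits, BIT_CLOSING);
	atomic_fetch_or(&t->bits, BIT_SCHEDULED);
	/* cancel_delayed_work_sync() would be called at this point */
}
```

Once model_begin_close() has run, a late handler invocation returns
early and model_schedule() always reports that work is already queued.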
Tested with net selftests and bpf selftests.
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
In tls_set_device_offload_rx() we prepare the software context
for RX fallback and proceed to add the connection to the device.
Unfortunately, software context prep includes arming strparser
so in case of a later error we have to release the socket lock
to call strp_done().
In preparation for not releasing the socket lock half way through
callbacks move arming strparser into a separate function.
Following patches will make use of that.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Introduce a return code for the tls_dev_resync callback.
When the driver TX resync fails, kernel can retry the resync again
until it succeeds. This prevents drivers from attempting to offload
TLS packets if the connection is known to be out of sync.
We don't worry about the RX resync since they will be retried naturally
as more encrypted records get received.
Signed-off-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit 86029d10af ("tls: zero the crypto information from tls_context
before freeing") added memzero_explicit() calls to clear the key material
before freeing struct tls_context, but it missed that tls_device.c has
its own way of freeing this structure. Replace the missing free.
Fixes: 86029d10af ("tls: zero the crypto information from tls_context before freeing")
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The new route handling in ip_mc_finish_output() from 'net' overlapped
with the new support for returning congestion notifications from BPF
programs.
In order to handle this I had to take the dev_loopback_xmit() calls
out of the switch statement.
The aquantia driver conflicts were simple overlapping changes.
Signed-off-by: David S. Miller <davem@davemloft.net>
TLS offload drivers keep track of TCP seq numbers to make sure
the packets are fed into the HW in order.
When packets get dropped on the way through the stack, the driver
will get out of sync and have to use fallback encryption, but unless
TCP seq number is resynced it will never match the packets correctly
(or even worse - use incorrect record sequence number after TCP seq
wraps).
Existing drivers (mlx5) feed the entire record on every out-of-order
event, allowing FW/HW to always be in sync.
This patch adds an alternative, more akin to the RX resync. When
driver sees a frame which is past its expected sequence number the
stream must have gotten out of order (if the sequence number is
smaller than expected it's likely a retransmission which doesn't
require resync). Driver will ask the stack to perform TX sync
before it submits the next full record, and fall back to software
crypto until stack has performed the sync.
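The ahead/behind decision has to survive TCP sequence number wrap. A
minimal sketch, assuming a wrap-safe signed comparison; enum tx_action,
seq_after() and classify_tx() are hypothetical names, not driver API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum tx_action { TX_IN_ORDER, TX_RETRANSMIT, TX_NEEDS_RESYNC };

/* True if sequence number a is later than b, modulo 2^32. */
static bool seq_after(uint32_t a, uint32_t b)
{
	return (int32_t)(b - a) < 0;
}

static enum tx_action classify_tx(uint32_t seq, uint32_t expected)
{
	if (seq == expected)
		return TX_IN_ORDER;
	if (seq_after(seq, expected))
		return TX_NEEDS_RESYNC;	/* a gap: something was dropped */
	return TX_RETRANSMIT;		/* older data: no resync needed */
}
```

A frame classified as TX_NEEDS_RESYNC would make the driver request a TX
sync and use software crypto until the stack has performed it.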
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently only RX direction is ever resynced, however, TX may
also get out of sequence if packets get dropped on the way to
the driver. Rename the resync callback and add a direction
parameter.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
TLS offload device may lose sync with the TCP stream if packets
arrive out of order. Drivers can currently request a resync at
a specific TCP sequence number. When a record is found starting
at that sequence number kernel will inform the device of the
corresponding record number.
This requires the device to constantly scan the stream for a
known pattern (constant bytes of the header) after sync is lost.
This patch adds an alternative approach which is entirely under
the control of the kernel. Kernel tracks records it had to fully
decrypt, even though TLS socket is in TLS_HW mode. If multiple
records did not have any decrypted parts - it's a pretty strong
indication that the device is out of sync.
We choose the min number of fully encrypted records to be 2,
which should hopefully be more than will get retransmitted at
a time.
After kernel decides the device is out of sync it schedules a
resync request. If the TCP socket is empty the resync gets
performed immediately. If socket is not empty we leave the
record parser to resync when next record comes.
Before resync in message parser we peek at the TCP socket and
don't attempt the sync if the socket already has some of the
next record queued.
On resync failure (encrypted data continues to flow in) we
retry with exponential backoff, up to once every 128 records
(with a 16k record that's at most once every 2M of data).
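The backoff arithmetic can be checked with a short sketch. The constant
and function names are hypothetical, and plain doubling from the minimum
of 2 up to the cap of 128 is an assumption made for illustration.

```c
#include <assert.h>
#include <stdint.h>

enum {
	RESYNC_START_IVAL = 2,		/* min fully encrypted records */
	RESYNC_MAX_IVAL = 128,		/* cap on the retry interval */
	RECORD_BYTES = 16 * 1024,	/* 16k record */
};

/* Double the retry interval after a failed resync, saturating at the cap. */
static unsigned int next_resync_interval(unsigned int cur)
{
	unsigned int next = cur * 2;

	assert(cur >= RESYNC_START_IVAL);
	return next > RESYNC_MAX_IVAL ? RESYNC_MAX_IVAL : next;
}

/* Amount of data between two retries once the cap has been reached. */
static uint64_t max_bytes_between_retries(void)
{
	return (uint64_t)RESYNC_MAX_IVAL * RECORD_BYTES;
}
```

Starting from 2, six doublings reach 128, and 128 records of 16 KiB each
amount to 2 MiB, matching the figure quoted above.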
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
handle_device_resync() doesn't describe the function very well.
The function checks if resync should be issued upon parsing of
a new record.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
TLS offload code casts record number to a u64. The buffer
should be aligned to 8 bytes, but it's actually a __be64, and
the rest of the TLS code treats it as big int. Make the
offload callbacks take a byte array; drivers can make the
choice to do the ugly cast if they want to.
Prepare for copying the record number onto the stack by
defining a constant for max size of the byte array.
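Treating the record number as a big-endian byte array can be sketched
like this. REC_SEQ_MAX and both helper names are hypothetical stand-ins
chosen for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define REC_SEQ_MAX 8	/* hypothetical upper bound on the array size */

/* Increment a big-endian integer stored as bytes, carrying leftwards. */
static void rec_seq_increment(uint8_t *seq, size_t len)
{
	for (size_t i = len; i-- > 0; ) {
		if (++seq[i] != 0)
			break;	/* no carry out of this byte */
	}
}

/* Copy the record number into a caller-provided stack buffer and advance it. */
static void rec_seq_next(const uint8_t *cur, size_t len,
			 uint8_t out[REC_SEQ_MAX])
{
	assert(len <= REC_SEQ_MAX);
	memcpy(out, cur, len);
	rec_seq_increment(out, len);
}
```

No cast to a 64-bit integer is involved, so the code does not depend on
the alignment or the endianness of the host.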
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Some ISDN files that got removed in net-next had some changes
done in mainline, take the removals.
Signed-off-by: David S. Miller <davem@davemloft.net>
While offloading TLS connections, drivers need to handle the case where
out of order packets need to be transmitted.
Other drivers obtain the entire TLS record for the specific skb to
provide as context to hardware for encryption. However, other designs
may also want to keep the hardware state intact and perform the
out of order encryption entirely on the host.
To achieve this, export the already existing software encryption
fallback path so drivers could access this.
Signed-off-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently drivers have to ensure the alignment of their tls state
structure, which leads to unnecessary layers of getters and
encapsulated structures in each driver.
Simplify all this by marking the driver state as aligned (driver_state
members are currently aligned, so no hole is added, besides ALIGN in
TLS_OFFLOAD_CONTEXT_SIZE_RX/TX would reserve this extra space, anyway.)
With that we can add a common accessor to the core.
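A rough illustration of an aligned trailing state area with one shared
accessor. The structure, the DRIVER_STATE_SIZE_TX value and the helper
names are hypothetical; only the idea of marking the area as aligned is
taken from the text above.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define DRIVER_STATE_SIZE_TX 16	/* hypothetical size of the driver area */

struct offload_ctx_tx {
	uint32_t expected_seq;
	uint8_t flags;
	/* Driver-private area, aligned so 64-bit members can live in it. */
	_Alignas(8) uint8_t driver_state[];
};

/* Common accessor: every driver gets its state the same way. */
static void *driver_ctx(struct offload_ctx_tx *ctx)
{
	return ctx->driver_state;
}

/* Example of what one driver might keep there. */
struct my_driver_state {
	uint64_t handle;
	uint32_t next_seq;
};

_Static_assert(sizeof(struct my_driver_state) <= DRIVER_STATE_SIZE_TX,
	       "driver state does not fit");

static struct offload_ctx_tx *ctx_alloc(void)
{
	return calloc(1, sizeof(struct offload_ctx_tx) + DRIVER_STATE_SIZE_TX);
}
```

The driver then simply casts the result of driver_ctx() to its own state
type, without per-driver getters or wrapper structures.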
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 bytes of driver state has been enough so far, but for drivers
which have to store an 8 byte handle it's no longer practical to
store the state directly in the context.
Drivers generally don't need much extra state on RX side, while
TX side has to be tracking TCP sequence numbers. Split the
lengths of max driver state size on RX and TX.
The struct tls_offload_context_tx currently stands at 616 bytes and
struct tls_offload_context_rx stands at 368 bytes. Upcoming work
will consume extra 8 bytes in both for kernel-driven resync.
This means that we can bump TX side to 16 bytes and still fit
into the same number of cache lines but on RX side we would be 8
bytes over.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
All callers pass prot->version as the last parameter
of tls_advance_record_sn(), yet tls_advance_record_sn()
itself needs a pointer to prot. Pass prot from callers.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
struct tls_context is slightly badly laid out. If we reorder things
right we can save 16 bytes (320 -> 304) but also make all fast path
data fit into two cache lines (one read only and one read/write,
down from four cache lines).
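How member order changes structure size can be shown with a toy example.
Both structures below are hypothetical and much smaller than struct
tls_context; exact sizes depend on the ABI, so the sketch only compares
them.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Small members scattered between large ones: padding after each. */
struct ctx_scattered {
	uint8_t tx_conf;
	void *hot_ptr;
	uint8_t rx_conf;
	uint64_t flags;
	uint16_t size;
};

/* Same members, largest first, small ones grouped at the end. */
struct ctx_packed {
	void *hot_ptr;
	uint64_t flags;
	uint16_t size;
	uint8_t tx_conf;
	uint8_t rx_conf;
};

/* Bytes saved purely by reordering. */
static size_t bytes_saved(void)
{
	return sizeof(struct ctx_scattered) - sizeof(struct ctx_packed);
}
```

Grouping the fields that the fast path touches next to each other has
the additional effect of reducing how many cache lines it needs.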
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit 38030d7cb7 ("net/tls: avoid NULL-deref on resync during device removal")
tried to fix a potential NULL-dereference by taking the
context rwsem. Unfortunately the RX resync may get called
from soft IRQ, so we can't use the rwsem to protect from
the device disappearing. Because we are guaranteed there
can be only one resync at a time (it's called from strparser)
use a bit to indicate resync is busy and make device
removal wait for the bit to get cleared.
Note that there is a leftover "flags" field in struct
tls_context already.
Fixes: 4799ac81e5 ("tls: Add rx inline crypto offload")
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
To avoid a sparse warning byteswap the be32 sequence number
before it's stored in the atomic value. While at it drop
unnecessary brackets and use kernel's u64 type.
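A sketch of converting the wire-order sequence number before it is
stored. The packing of the value (sequence in the upper 32 bits, a
request flag in bit 0) and all names are assumptions for the example.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Assemble a host-order value from 4 network-order bytes. */
static uint32_t be32_bytes_to_cpu(const uint8_t b[4])
{
	return ((uint32_t)b[0] << 24) | ((uint32_t)b[1] << 16) |
	       ((uint32_t)b[2] << 8) | (uint32_t)b[3];
}

static _Atomic uint64_t resync_req;	/* hypothetical request word */

/* Store the host-order sequence number together with a "pending" flag. */
static void request_resync(const uint8_t wire_seq[4])
{
	uint64_t seq = be32_bytes_to_cpu(wire_seq);

	atomic_store(&resync_req, (seq << 32) | 1);
}

static uint32_t pending_resync_seq(void)
{
	return (uint32_t)(atomic_load(&resync_req) >> 32);
}
```

Doing the conversion once at the point of storage keeps the atomic value
in host order, so later comparisons need no further swapping.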
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There seems to be no reason for tls_ops to be defined in netdevice.h
which is included in a lot of places. Don't wrap the struct/enum
declaration in ifdefs, it trickles down unnecessary ifdefs into
driver code.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
tls_device_sk_destruct being set on a socket used to indicate
that socket is a kTLS device one. That is no longer true -
now we use sk_validate_xmit_skb pointer for that purpose.
Remove the export. tls_device_attach() needs to be moved.
While at it, remove the dead declaration of tls_sk_destruct().
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Unlike '&&' operator, the '&' does not have short-circuit
evaluation semantics. IOW both sides of the operator always
get evaluated. Fix the wrong operator in
tls_is_sk_tx_device_offloaded(), which would lead to
out-of-bounds access for non-full sockets.
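The difference between the two operators is easy to demonstrate. In this
sketch the right-hand side counts its own evaluations instead of
touching invalid memory; struct full_sock and the helper names are
hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct full_sock {
	bool offloaded;
};

static int rhs_evaluations;

/* Right-hand side of the test; counts how often it is evaluated. */
static bool probe_offloaded(const struct full_sock *sk)
{
	rhs_evaluations++;
	return sk ? sk->offloaded : false;	/* NULL guard for the demo only */
}

/* Bitwise AND: both operands are always evaluated. */
static bool check_bitwise(bool is_full, const struct full_sock *sk)
{
	return ((unsigned int)is_full & (unsigned int)probe_offloaded(sk)) != 0;
}

/* Logical AND: the right operand is skipped when the left one is false. */
static bool check_logical(bool is_full, const struct full_sock *sk)
{
	return is_full && probe_offloaded(sk);
}
```

In the real function the right-hand side reads a field that only exists
in full sockets, which is why evaluating it unconditionally is unsafe.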
Fixes: 4799ac81e5 ("tls: Add rx inline crypto offload")
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David reports that tls triggers warnings related to
sk->sk_forward_alloc not being zero at destruction time:
WARNING: CPU: 5 PID: 6831 at net/core/stream.c:206 sk_stream_kill_queues+0x103/0x110
WARNING: CPU: 5 PID: 6831 at net/ipv4/af_inet.c:160 inet_sock_destruct+0x15b/0x170
This happens when the sender fills up the write buffer and dies
from SIGPIPE. It is due to the device implementation not cleaning
up the partially_sent_record, because
commit a42055e8d2 ("net/tls: Add support for async encryption of records for performance")
moved the partial record cleanup to the SW-only path.
Fixes: a42055e8d2 ("net/tls: Add support for async encryption of records for performance")
Reported-by: David Beckett <david.beckett@netronome.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Added support for AES128-CCM based record encryption. AES128-CCM is
similar to AES128-GCM. Both of them have same salt/iv/mac size. The
notable difference between the two is that while invoking AES128-CCM
operation, the salt||nonce (which is passed as IV) has to be prefixed
with a hardcoded value '2'. Further, CCM implementation in kernel
requires IV passed in crypto_aead_request() to be full '16' bytes.
Therefore, the record structure 'struct tls_rec' has been modified to
reserve '16' bytes for IV. This works for both GCM and CCM based cipher.
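The resulting IV layout can be sketched as below. The 4-byte salt and
8-byte nonce sizes are assumptions based on the usual AES128-GCM
parameters, and the constant and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum {
	SALT_SIZE = 4,		/* assumed, as for AES128-GCM */
	NONCE_SIZE = 8,		/* assumed, as for AES128-GCM */
	CCM_IV_B0 = 2,		/* hardcoded prefix used for CCM */
	AEAD_IV_SIZE = 16,	/* full IV buffer reserved per record */
};

/*
 * GCM: salt || nonce at the start of the buffer.
 * CCM: prefix byte 2, then salt || nonce.
 * Unused trailing bytes are left zeroed in both cases.
 */
static void build_iv(uint8_t iv[AEAD_IV_SIZE], int is_ccm,
		     const uint8_t salt[SALT_SIZE],
		     const uint8_t nonce[NONCE_SIZE])
{
	size_t off = 0;

	memset(iv, 0, AEAD_IV_SIZE);
	if (is_ccm)
		iv[off++] = CCM_IV_B0;
	memcpy(iv + off, salt, SALT_SIZE);
	off += SALT_SIZE;
	memcpy(iv + off, nonce, NONCE_SIZE);
}
```

A single 16-byte buffer is large enough for either layout, which is why
reserving 16 bytes works for both ciphers.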
Signed-off-by: Vakul Garg <vakul.garg@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
TLS device cannot use the sw context. This patch returns the original
tls device write space handler and moves the sw/device specific portions
to the relevant files.
Also, we remove the write_space call for the tls_sw flow, because it
handles partial records in its delayed tx work handler.
Fixes: a42055e8d2 ("net/tls: Add support for async encryption of records for performance")
Signed-off-by: Boris Pismenny <borisp@mellanox.com>
Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The patch enables returning 'type' in msghdr for records that are
retrieved with MSG_PEEK in recvmsg. Further it prevents records peeked
from socket from getting clubbed with any other record of different
type when records are subsequently dequeued from strparser.
For each record, we now retain its type in sk_buff's control buffer
cb[]. Inside control buffer, record's full length and offset are already
stored by strparser in 'struct strp_msg'. We store record type after
'struct strp_msg' inside 'struct tls_msg'. For tls1.2, the type is
stored just after record dequeue. For tls1.3, the type is stored after
record has been decrypted.
Inside process_rx_list(), before processing a non-data record, we check
that we are able to return the record type to the user
application. If not, the decrypted records in the tls context's rx_list
are left there without consuming any data.
Fixes: 692d7b5d1f ("tls: Fix recvmsg() to be able to peek across multiple records")
Signed-off-by: Vakul Garg <vakul.garg@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Each tls context maintains two cipher contexts (one each for tx and rx
directions). For each tls session, the constants such as protocol
version, ciphersuite, iv size, associated data size etc are same for
both the directions and need to be stored only once per tls context.
Hence these are moved from 'struct cipher_context' to 'struct
tls_prot_info' and stored only once in 'struct tls_context'.
Signed-off-by: Vakul Garg <vakul.garg@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>