Commit Graph

95 Commits

Author SHA1 Message Date
Greg Kroah-Hartman
553d3c4173 This is the 5.4.157 stable release
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAmGBh6EACgkQONu9yGCS
 aT4J7A//f9Hx5zW04Y1HOqF4Cd3zDTjSVLzgArYwHRsO22+jin+SqgxgjeXhW0d8
 3VkZeSaSvEuwWMB8HCuayl88nzDudFNHm/XReCTnt4uKiP8VOFoDMHQGDeGGl6Rr
 U2212K8Q3xIkA5OYa5Oma1/IbnL7XDUnte4iHTvIYiBvNwFFd3rDiCUi9sdFti0P
 SWZI0jFtkZVztohayTdb9y5dcIMiLbvtJEB0aX1XAmHiFWqgD0maVym2fdX2L+5c
 p6O+eZxRH0LEVham6URh61YnD9b1by+bcIUdWlgnmZkPAf3AXskmWBo1bIcISSXC
 M/8RlBqlNgKVXD0Y7890ytkTQF+EQgILj0lR5plaeYIp47YyOTYLFg/Ues7dRhn6
 XeP3sP/viqguYNzE54dX3t5HfYTbW3h/xzEXMoVZPuPRcM2f/YGAiOjxVyjv5hgv
 /4bQ1E9gfkNprXiDAad0VUfokxcqzFQR6s9asqmXaaNbvZ1a0Mk8UeR0qcl0FTvw
 dC6tQZgW2+d0Yi5kAG8pv/RCbZzgJwJa/tJ+I67XYdMUvISXkaGF5hMx6WG7wZBF
 NSW5JsBh0m8b2hKyypA3sktK0DJfx01y3/wZSXgAv+8by66hvQQuDN1mftChQnZH
 SAmQovITD85QXZ3LPiAZPtd2fRKAOWJhSQk7bP4cEmjBGPb4HoU=
 =q1OB
 -----END PGP SIGNATURE-----

Merge 5.4.157 into android11-5.4-lts

Changes in 5.4.157
	ARM: 9133/1: mm: proc-macros: ensure *_tlb_fns are 4B aligned
	ARM: 9134/1: remove duplicate memcpy() definition
	ARM: 9139/1: kprobes: fix arch_init_kprobes() prototype
	ARM: 9141/1: only warn about XIP address when not compile testing
	powerpc/bpf: Fix BPF_MOD when imm == 1
	ipv6: use siphash in rt6_exception_hash()
	ipv4: use siphash instead of Jenkins in fnhe_hashfun()
	usbnet: sanity check for maxpacket
	usbnet: fix error return code in usbnet_probe()
	Revert "pinctrl: bcm: ns: support updated DT binding as syscon subnode"
	ata: sata_mv: Fix the error handling of mv_chip_id()
	nfc: port100: fix using -ERRNO as command type mask
	Revert "net: mdiobus: Fix memory leak in __mdiobus_register"
	net/tls: Fix flipped sign in tls_err_abort() calls
	mmc: vub300: fix control-message timeouts
	mmc: cqhci: clear HALT state after CQE enable
	mmc: dw_mmc: exynos: fix the finding clock sample value
	mmc: sdhci: Map more voltage level to SDHCI_POWER_330
	mmc: sdhci-esdhc-imx: clear the buffer_read_ready to reset standard tuning circuit
	cfg80211: scan: fix RCU in cfg80211_add_nontrans_list()
	net: lan78xx: fix division by zero in send path
	drm/ttm: fix memleak in ttm_transfered_destroy
	tcp_bpf: Fix one concurrency problem in the tcp_bpf_send_verdict function
	IB/qib: Protect from buffer overflow in struct qib_user_sdma_pkt fields
	IB/hfi1: Fix abba locking issue with sc_disable()
	nvmet-tcp: fix data digest pointer calculation
	nvme-tcp: fix data digest pointer calculation
	RDMA/mlx5: Set user priority for DCT
	arm64: dts: allwinner: h5: NanoPI Neo 2: Fix ethernet node
	regmap: Fix possible double-free in regcache_rbtree_exit()
	net: batman-adv: fix error handling
	net: Prevent infinite while loop in skb_tx_hash()
	RDMA/sa_query: Use strscpy_pad instead of memcpy to copy a string
	nios2: Make NIOS2_DTB_SOURCE_BOOL depend on !COMPILE_TEST
	net: ethernet: microchip: lan743x: Fix driver crash when lan743x_pm_resume fails
	net: ethernet: microchip: lan743x: Fix dma allocation failure by using dma_set_mask_and_coherent
	net: nxp: lpc_eth.c: avoid hang when bringing interface down
	net/tls: Fix flipped sign in async_wait.err assignment
	phy: phy_ethtool_ksettings_get: Lock the phy for consistency
	phy: phy_start_aneg: Add an unlocked version
	sctp: use init_tag from inithdr for ABORT chunk
	sctp: fix the processing for INIT_ACK chunk
	sctp: fix the processing for COOKIE_ECHO chunk
	sctp: add vtag check in sctp_sf_violation
	sctp: add vtag check in sctp_sf_do_8_5_1_E_sa
	sctp: add vtag check in sctp_sf_ootb
	net: use netif_is_bridge_port() to check for IFF_BRIDGE_PORT
	cfg80211: correct bridge/4addr mode check
	KVM: s390: clear kicked_mask before sleeping again
	KVM: s390: preserve deliverable_mask in __airqs_kick_single_vcpu
	perf script: Check session->header.env.arch before using it
	Linux 5.4.157

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I8dd3b408b22bc98c06b6a941260157df2a40de00
2021-11-02 20:04:52 +01:00
Daniel Jordan
e0cfd5159f net/tls: Fix flipped sign in tls_err_abort() calls
commit da353fac65fede6b8b4cfe207f0d9408e3121105 upstream.

sk->sk_err appears to expect a positive value, a convention that ktls
doesn't always follow and that leads to memory corruption in other code.
For instance,

    [kworker]
    tls_encrypt_done(..., err=<negative error from crypto request>)
      tls_err_abort(.., err)
        sk->sk_err = err;

    [task]
    splice_from_pipe_feed
      ...
        tls_sw_do_sendpage
          if (sk->sk_err) {
            ret = -sk->sk_err;  // ret is positive

    splice_from_pipe_feed (continued)
      ret = actor(...)  // ret is still positive and interpreted as bytes
                        // written, resulting in underflow of buf->len and
                        // sd->len, leading to huge buf->offset and bogus
                        // addresses computed in later calls to actor()

Fix all tls_err_abort() callers to pass a negative error code
consistently and centralize the error-prone sign flip there, throwing in
a warning to catch future misuse and uninlining the function so it
really does only warn once.

Cc: stable@vger.kernel.org
Fixes: c46234ebb4 ("tls: RX path for ktls")
Reported-by: syzbot+b187b77c8474f9648fae@syzkaller.appspotmail.com
Signed-off-by: Daniel Jordan <daniel.m.jordan@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-11-02 19:46:12 +01:00
Greg Kroah-Hartman
5a742c2b56 This is the 5.4.82 stable release
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAl/PSigACgkQONu9yGCS
 aT6bSw//eDCpWcnLDa1Rt4bOrnO82484ebr1PZeYPfca/3QVS59j8DsVOf6Xklmz
 z2ponI6SRFxZwO2SmXrfoiOhUVI9Kd3ohTH+LSo3ezpk0klamIf60L914RBc7QFE
 wmVgOPz5LwLxfkU5a148/H4rwLGlM9oBxVcCXpnLkN03Ul4JM/P6A/T3rFrX8ZkW
 3r4NYu3jOHgNz+irosW8zAea+jIf7ALg4Gch3ILwrbM4KSQiyXbAp0mJsY+li7HE
 BSa1RJHBXkqCwK/mWT4LWuJNf871T656kKr04/rxipRu2lEcGCPghO4DGba1mjqR
 NdnuMWBjoxetlRAbWOylWT+2ngQNx+E9hFrBxg1+js/mcHvfpeM4EuSK4YCnI7rO
 6r5JZqYdw7GGHqvy51JPLx1m+NMt8XhTp5+1vOIZhjtdNrcTMBz0kxIiGbvTwdlb
 BbO+LDjmBmQYwmTcadbBPPMRLKnvx5bbNtTAzdwkvYEC8ev5RfxebFO/StTbmVRd
 JIUKkwmNw803OjhMgs+dXVw0lX8C1nLSSROKHf4+lCGFhCDnDhos5DpKpfBIwXxP
 Xv0Uf1YA4ygFVId+kuJOoXWNBkzB6UOlKMxoU1YcuRwpZHFk8b+MvTAzaCbSSl3A
 nJT6CK3K3H6WSiF9PC8i85kFJbAJbwifjx904nGBekaqU0bgI+s=
 =Faec
 -----END PGP SIGNATURE-----

Merge 5.4.82 into android11-5.4-lts

Changes in 5.4.82
	devlink: Hold rtnl lock while reading netdev attributes
	ipv6: addrlabel: fix possible memory leak in ip6addrlbl_net_init
	net/af_iucv: set correct sk_protocol for child sockets
	net/tls: missing received data after fast remote close
	net/tls: Protect from calling tls_dev_del for TLS RX twice
	rose: Fix Null pointer dereference in rose_send_frame()
	sock: set sk_err to ee_errno on dequeue from errq
	tcp: Set INET_ECN_xmit configuration in tcp_reinit_congestion_control
	tun: honor IOCB_NOWAIT flag
	usbnet: ipheth: fix connectivity with iOS 14
	bonding: wait for sysfs kobject destruction before freeing struct slave
	staging/octeon: fix up merge error
	ima: extend boot_aggregate with kernel measurements
	sched/fair: Fix unthrottle_cfs_rq() for leaf_cfs_rq list
	netfilter: bridge: reset skb->pkt_type after NF_INET_POST_ROUTING traversal
	ipv4: Fix tos mask in inet_rtm_getroute()
	dt-bindings: net: correct interrupt flags in examples
	chelsio/chtls: fix panic during unload reload chtls
	ibmvnic: Ensure that SCRQ entry reads are correctly ordered
	ibmvnic: Fix TX completion error handling
	inet_ecn: Fix endianness of checksum update when setting ECT(1)
	geneve: pull IP header before ECN decapsulation
	net: ip6_gre: set dev->hard_header_len when using header_ops
	net/x25: prevent a couple of overflows
	cxgb3: fix error return code in t3_sge_alloc_qset()
	net: pasemi: fix error return code in pasemi_mac_open()
	vxlan: fix error return code in __vxlan_dev_create()
	chelsio/chtls: fix a double free in chtls_setkey()
	net: mvpp2: Fix error return code in mvpp2_open()
	net: skbuff: ensure LSE is pullable before decrementing the MPLS ttl
	net: openvswitch: ensure LSE is pullable before reading it
	net/sched: act_mpls: ensure LSE is pullable before reading it
	net/mlx5: DR, Proper handling of unsupported Connect-X6DX SW steering
	net/mlx5: Fix wrong address reclaim when command interface is down
	ALSA: usb-audio: US16x08: fix value count for level meters
	Input: xpad - support Ardwiino Controllers
	Input: i8042 - add ByteSpeed touchpad to noloop table
	tracing: Remove WARN_ON in start_thread()
	RDMA/i40iw: Address an mmap handler exploit in i40iw
	Linux 5.4.82

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: Ie7c035895e3413f7a58012c372cfc64deb2e6081
2020-12-08 11:39:00 +01:00
Maxim Mikityanskiy
f2f25bc797 net/tls: Protect from calling tls_dev_del for TLS RX twice
[ Upstream commit 025cc2fb6a4e84e9a0552c0017dcd1c24b7ac7da ]

tls_device_offload_cleanup_rx doesn't clear tls_ctx->netdev after
calling tls_dev_del if TLX TX offload is also enabled. Clearing
tls_ctx->netdev gets postponed until tls_device_gc_task. It leaves a
time frame when tls_device_down may get called and call tls_dev_del for
RX one extra time, confusing the driver, which may lead to a crash.

This patch corrects this racy behavior by adding a flag to prevent
tls_device_down from calling tls_dev_del the second time.

Fixes: e8f6979981 ("net/tls: Add generic NIC offload infrastructure")
Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Link: https://lore.kernel.org/r/20201125221810.69870-1-saeedm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-12-08 10:40:23 +01:00
Greg Kroah-Hartman
fa46997961 This is the 5.4.48 stable release
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAl7wXk8ACgkQONu9yGCS
 aT5uyhAA1EoV9ROPRt8Vw1fzlDIrRA5X2T+FCGXskg2kKWehVHAvge4U76nZ16+i
 aYcBX3lAmN7GGVw+/GiRHf9QpiwOUF5f3ZUQZ0KuLS1gcuaXx+VC1h5yyunx3tm1
 CI01B2p+GQ3jABWopnhsujMVAeWjbD18NqY+a+xOzTn8CCyLAli+LiviWCR/apQp
 p4r6++eevWo1yMDlJGNGoMYsFcxChWhtlnDQKWCsIDCN3I1cinGz8wopiv93WqRH
 Sz3wb1YMuhXb10usNZcZFaSvDGf5XSaMxpRkyNSxN7CLv8LzbovXQOE+fFDGAYxd
 lUCjRK0wFBMzRSeZ2iGYqqQf5xyYKb6hNmViGprdqwR2c3MBHN/Xs5aDLqJEgHkr
 OXzZLyHUngRfp3GpagFGV6q06S6fgb9ca/7FuT4Hn8Z3tb5Xt7b/KlPcW3VymiSt
 I37itASNA/Qs6Njl4tDd9GjwbcOAs+s/XabasU+pXscOkf3o8fYMy2krisy176D/
 AXtRTLq4pc42I8c3tv5uCNz7Zje/qytKSPErNRBAedvOu5JX7ab6hgULPH4N7r0N
 Di/LyKqYw+ZBa4AfzcsvlR3wJLWqni+aFj5yppSrNkH7kNzZGLmlw8xIo8v1CFYw
 T86b13WmHPqvyFWQLpX5WCEYu0OCw5YCUyQXSsLZN5oC7gAwC7U=
 =FSdI
 -----END PGP SIGNATURE-----

Merge 5.4.48 into android-5.4-stable

Changes in 5.4.48
	ACPI: GED: use correct trigger type field in _Exx / _Lxx handling
	drm/amdgpu: fix and cleanup amdgpu_gem_object_close v4
	ath10k: Fix the race condition in firmware dump work queue
	drm: bridge: adv7511: Extend list of audio sample rates
	media: staging: imgu: do not hold spinlock during freeing mmu page table
	media: imx: imx7-mipi-csis: Cleanup and fix subdev pad format handling
	crypto: ccp -- don't "select" CONFIG_DMADEVICES
	media: vicodec: Fix error codes in probe function
	media: si2157: Better check for running tuner in init
	objtool: Ignore empty alternatives
	spi: spi-mem: Fix Dual/Quad modes on Octal-capable devices
	drm/amdgpu: Init data to avoid oops while reading pp_num_states.
	arm64/kernel: Fix range on invalidating dcache for boot page tables
	libbpf: Fix memory leak and possible double-free in hashmap__clear
	spi: pxa2xx: Apply CS clk quirk to BXT
	x86,smap: Fix smap_{save,restore}() alternatives
	sched/fair: Refill bandwidth before scaling
	net: atlantic: make hw_get_regs optional
	net: ena: fix error returning in ena_com_get_hash_function()
	efi/libstub/x86: Work around LLVM ELF quirk build regression
	ath10k: remove the max_sched_scan_reqs value
	arm64: cacheflush: Fix KGDB trap detection
	media: staging: ipu3: Fix stale list entries on parameter queue failure
	rtw88: fix an issue about leak system resources
	spi: dw: Zero DMA Tx and Rx configurations on stack
	ACPICA: Dispatcher: add status checks
	block: alloc map and request for new hardware queue
	arm64: insn: Fix two bugs in encoding 32-bit logical immediates
	block: reset mapping if failed to update hardware queue count
	drm: rcar-du: Set primary plane zpos immutably at initializing
	lockdown: Allow unprivileged users to see lockdown status
	ixgbe: Fix XDP redirect on archs with PAGE_SIZE above 4K
	platform/x86: dell-laptop: don't register micmute LED if there is no token
	MIPS: Loongson: Build ATI Radeon GPU driver as module
	Bluetooth: Add SCO fallback for invalid LMP parameters error
	kgdb: Disable WARN_CONSOLE_UNLOCKED for all kgdb
	kgdb: Prevent infinite recursive entries to the debugger
	pmu/smmuv3: Clear IRQ affinity hint on device removal
	ACPI/IORT: Fix PMCG node single ID mapping handling
	mips: Fix cpu_has_mips64r1/2 activation for MIPS32 CPUs
	spi: dw: Enable interrupts in accordance with DMA xfer mode
	clocksource: dw_apb_timer: Make CPU-affiliation being optional
	clocksource: dw_apb_timer_of: Fix missing clockevent timers
	media: dvbdev: Fix tuner->demod media controller link
	btrfs: account for trans_block_rsv in may_commit_transaction
	btrfs: do not ignore error from btrfs_next_leaf() when inserting checksums
	ARM: 8978/1: mm: make act_mm() respect THREAD_SIZE
	batman-adv: Revert "disable ethtool link speed detection when auto negotiation off"
	ice: Fix memory leak
	ice: Fix for memory leaks and modify ICE_FREE_CQ_BUFS
	mmc: meson-mx-sdio: trigger a soft reset after a timeout or CRC error
	Bluetooth: btmtkuart: Improve exception handling in btmtuart_probe()
	spi: dw: Fix Rx-only DMA transfers
	x86/kvm/hyper-v: Explicitly align hcall param for kvm_hyperv_exit
	net: vmxnet3: fix possible buffer overflow caused by bad DMA value in vmxnet3_get_rss()
	x86: fix vmap arguments in map_irq_stack
	staging: android: ion: use vmap instead of vm_map_ram
	ath10k: fix kernel null pointer dereference
	media: staging/intel-ipu3: Implement lock for stream on/off operations
	spi: Respect DataBitLength field of SpiSerialBusV2() ACPI resource
	brcmfmac: fix wrong location to get firmware feature
	regulator: qcom-rpmh: Fix typos in pm8150 and pm8150l
	tools api fs: Make xxx__mountpoint() more scalable
	e1000: Distribute switch variables for initialization
	dt-bindings: display: mediatek: control dpi pins mode to avoid leakage
	drm/mediatek: set dpi pin mode to gpio low to avoid leakage current
	audit: fix a net reference leak in audit_send_reply()
	media: dvb: return -EREMOTEIO on i2c transfer failure.
	media: platform: fcp: Set appropriate DMA parameters
	MIPS: Make sparse_init() using top-down allocation
	ath10k: add flush tx packets for SDIO chip
	Bluetooth: btbcm: Add 2 missing models to subver tables
	audit: fix a net reference leak in audit_list_rules_send()
	Drivers: hv: vmbus: Always handle the VMBus messages on CPU0
	dpaa2-eth: fix return codes used in ndo_setup_tc
	netfilter: nft_nat: return EOPNOTSUPP if type or flags are not supported
	selftests/bpf: Fix memory leak in extract_build_id()
	net: bcmgenet: set Rx mode before starting netif
	net: bcmgenet: Fix WoL with password after deep sleep
	lib/mpi: Fix 64-bit MIPS build with Clang
	exit: Move preemption fixup up, move blocking operations down
	sched/core: Fix illegal RCU from offline CPUs
	drivers/perf: hisi: Fix typo in events attribute array
	iocost_monitor: drop string wrap around numbers when outputting json
	net: lpc-enet: fix error return code in lpc_mii_init()
	selinux: fix error return code in policydb_read()
	drivers: net: davinci_mdio: fix potential NULL dereference in davinci_mdio_probe()
	media: cec: silence shift wrapping warning in __cec_s_log_addrs()
	net: allwinner: Fix use correct return type for ndo_start_xmit()
	powerpc/spufs: fix copy_to_user while atomic
	libertas_tf: avoid a null dereference in pointer priv
	xfs: clean up the error handling in xfs_swap_extents
	Crypto/chcr: fix for ccm(aes) failed test
	MIPS: Truncate link address into 32bit for 32bit kernel
	mips: cm: Fix an invalid error code of INTVN_*_ERR
	kgdb: Fix spurious true from in_dbg_master()
	xfs: reset buffer write failure state on successful completion
	xfs: fix duplicate verification from xfs_qm_dqflush()
	platform/x86: intel-vbtn: Use acpi_evaluate_integer()
	platform/x86: intel-vbtn: Split keymap into buttons and switches parts
	platform/x86: intel-vbtn: Do not advertise switches to userspace if they are not there
	platform/x86: intel-vbtn: Also handle tablet-mode switch on "Detachable" and "Portable" chassis-types
	iwlwifi: avoid debug max amsdu config overwriting itself
	nvme: refine the Qemu Identify CNS quirk
	nvme-pci: align io queue count with allocted nvme_queue in nvme_probe
	nvme-tcp: use bh_lock in data_ready
	ath10k: Remove msdu from idr when management pkt send fails
	wcn36xx: Fix error handling path in 'wcn36xx_probe()'
	net: qed*: Reduce RX and TX default ring count when running inside kdump kernel
	drm/mcde: dsi: Fix return value check in mcde_dsi_bind()
	mt76: avoid rx reorder buffer overflow
	md: don't flush workqueue unconditionally in md_open
	raid5: remove gfp flags from scribble_alloc()
	iocost: don't let vrate run wild while there's no saturation signal
	veth: Adjust hard_start offset on redirect XDP frames
	net/mlx5e: IPoIB, Drop multicast packets that this interface sent
	rtlwifi: Fix a double free in _rtl_usb_tx_urb_setup()
	mwifiex: Fix memory corruption in dump_station
	kgdboc: Use a platform device to handle tty drivers showing up late
	x86/boot: Correct relocation destination on old linkers
	sched: Defend cfs and rt bandwidth quota against overflow
	mips: MAAR: Use more precise address mask
	mips: Add udelay lpj numbers adjustment
	crypto: stm32/crc32 - fix ext4 chksum BUG_ON()
	crypto: stm32/crc32 - fix run-time self test issue.
	crypto: stm32/crc32 - fix multi-instance
	drm/amd/powerpay: Disable gfxoff when setting manual mode on picasso and raven
	drm/amdgpu: Sync with VM root BO when switching VM to CPU update mode
	selftests/bpf: CONFIG_IPV6_SEG6_BPF required for test_seg6_loop.o
	x86/mm: Stop printing BRK addresses
	MIPS: tools: Fix resource leak in elf-entry.c
	m68k: mac: Don't call via_flush_cache() on Mac IIfx
	btrfs: improve global reserve stealing logic
	btrfs: qgroup: mark qgroup inconsistent if we're inherting snapshot to a new qgroup
	macvlan: Skip loopback packets in RX handler
	PCI: Don't disable decoding when mmio_always_on is set
	MIPS: Fix IRQ tracing when call handle_fpe() and handle_msa_fpe()
	bcache: fix refcount underflow in bcache_device_free()
	mmc: sdhci-msm: Set SDHCI_QUIRK_MULTIBLOCK_READ_ACMD12 quirk
	staging: greybus: sdio: Respect the cmd->busy_timeout from the mmc core
	mmc: via-sdmmc: Respect the cmd->busy_timeout from the mmc core
	ice: fix potential double free in probe unrolling
	ixgbe: fix signed-integer-overflow warning
	iwlwifi: mvm: fix aux station leak
	mmc: sdhci-esdhc-imx: fix the mask for tuning start point
	spi: dw: Return any value retrieved from the dma_transfer callback
	cpuidle: Fix three reference count leaks
	platform/x86: hp-wmi: Convert simple_strtoul() to kstrtou32()
	platform/x86: intel-hid: Add a quirk to support HP Spectre X2 (2015)
	platform/x86: intel-vbtn: Only blacklist SW_TABLET_MODE on the 9 / "Laptop" chasis-type
	platform/x86: asus_wmi: Reserve more space for struct bias_args
	libbpf: Fix perf_buffer__free() API for sparse allocs
	bpf: Fix map permissions check
	bpf: Refactor sockmap redirect code so its easy to reuse
	bpf: Fix running sk_skb program types with ktls
	selftests/bpf, flow_dissector: Close TAP device FD after the test
	kasan: stop tests being eliminated as dead code with FORTIFY_SOURCE
	string.h: fix incompatibility between FORTIFY_SOURCE and KASAN
	btrfs: free alien device after device add
	btrfs: include non-missing as a qualifier for the latest_bdev
	btrfs: send: emit file capabilities after chown
	btrfs: force chunk allocation if our global rsv is larger than metadata
	btrfs: fix error handling when submitting direct I/O bio
	btrfs: fix wrong file range cleanup after an error filling dealloc range
	btrfs: fix space_info bytes_may_use underflow after nocow buffered write
	btrfs: fix space_info bytes_may_use underflow during space cache writeout
	powerpc/mm: Fix conditions to perform MMU specific management by blocks on PPC32.
	mm: thp: make the THP mapcount atomic against __split_huge_pmd_locked()
	mm: initialize deferred pages with interrupts enabled
	mm/pagealloc.c: call touch_nmi_watchdog() on max order boundaries in deferred init
	mm: call cond_resched() from deferred_init_memmap()
	ima: Fix ima digest hash table key calculation
	ima: Switch to ima_hash_algo for boot aggregate
	ima: Evaluate error in init_ima()
	ima: Directly assign the ima_default_policy pointer to ima_rules
	ima: Call ima_calc_boot_aggregate() in ima_eventdigest_init()
	ima: Remove __init annotation from ima_pcrread()
	evm: Fix possible memory leak in evm_calc_hmac_or_hash()
	ext4: fix EXT_MAX_EXTENT/INDEX to check for zeroed eh_max
	ext4: fix error pointer dereference
	ext4: fix race between ext4_sync_parent() and rename()
	PCI: Avoid Pericom USB controller OHCI/EHCI PME# defect
	PCI: Avoid FLR for AMD Matisse HD Audio & USB 3.0
	PCI: Avoid FLR for AMD Starship USB 3.0
	PCI: Add ACS quirk for Intel Root Complex Integrated Endpoints
	PCI: vmd: Add device id for VMD device 8086:9A0B
	x86/amd_nb: Add Family 19h PCI IDs
	PCI: Add Loongson vendor ID
	serial: 8250_pci: Move Pericom IDs to pci_ids.h
	x86/amd_nb: Add AMD family 17h model 60h PCI IDs
	ima: Remove redundant policy rule set in add_rules()
	ima: Set again build_ima_appraise variable
	PCI: Program MPS for RCiEP devices
	e1000e: Disable TSO for buffer overrun workaround
	e1000e: Relax condition to trigger reset for ME workaround
	carl9170: remove P2P_GO support
	media: go7007: fix a miss of snd_card_free
	media: cedrus: Program output format during each run
	serial: 8250: Avoid error message on reprobe
	Bluetooth: hci_bcm: fix freeing not-requested IRQ
	b43legacy: Fix case where channel status is corrupted
	b43: Fix connection problem with WPA3
	b43_legacy: Fix connection problem with WPA3
	media: ov5640: fix use of destroyed mutex
	clk: mediatek: assign the initial value to clk_init_data of mtk_mux
	igb: Report speed and duplex as unknown when device is runtime suspended
	hwmon: (k10temp) Add AMD family 17h model 60h PCI match
	EDAC/amd64: Add AMD family 17h model 60h PCI IDs
	power: vexpress: add suppress_bind_attrs to true
	power: supply: core: fix HWMON temperature labels
	power: supply: core: fix memory leak in HWMON error path
	pinctrl: samsung: Correct setting of eint wakeup mask on s5pv210
	pinctrl: samsung: Save/restore eint_mask over suspend for EINT_TYPE GPIOs
	gnss: sirf: fix error return code in sirf_probe()
	sparc32: fix register window handling in genregs32_[gs]et()
	sparc64: fix misuses of access_process_vm() in genregs32_[sg]et()
	dm crypt: avoid truncating the logical block size
	alpha: fix memory barriers so that they conform to the specification
	powerpc/fadump: use static allocation for reserved memory ranges
	powerpc/fadump: consider reserved ranges while reserving memory
	powerpc/fadump: Account for memory_limit while reserving memory
	kernel/cpu_pm: Fix uninitted local in cpu_pm
	ARM: tegra: Correct PL310 Auxiliary Control Register initialization
	soc/tegra: pmc: Select GENERIC_PINCONF
	ARM: dts: exynos: Fix GPIO polarity for thr GalaxyS3 CM36651 sensor's bus
	ARM: dts: at91: sama5d2_ptc_ek: fix vbus pin
	ARM: dts: s5pv210: Set keep-power-in-suspend for SDHCI1 on Aries
	drivers/macintosh: Fix memleak in windfarm_pm112 driver
	powerpc/32s: Fix another build failure with CONFIG_PPC_KUAP_DEBUG
	powerpc/kasan: Fix issues by lowering KASAN_SHADOW_END
	powerpc/kasan: Fix shadow pages allocation failure
	powerpc/32: Disable KASAN with pages bigger than 16k
	powerpc/64s: Don't let DT CPU features set FSCR_DSCR
	powerpc/64s: Save FSCR to init_task.thread.fscr after feature init
	kbuild: force to build vmlinux if CONFIG_MODVERSION=y
	sunrpc: svcauth_gss_register_pseudoflavor must reject duplicate registrations.
	sunrpc: clean up properly in gss_mech_unregister()
	mtd: rawnand: Fix nand_gpio_waitrdy()
	mtd: rawnand: onfi: Fix redundancy detection check
	mtd: rawnand: brcmnand: fix hamming oob layout
	mtd: rawnand: diskonchip: Fix the probe error path
	mtd: rawnand: sharpsl: Fix the probe error path
	mtd: rawnand: ingenic: Fix the probe error path
	mtd: rawnand: xway: Fix the probe error path
	mtd: rawnand: orion: Fix the probe error path
	mtd: rawnand: socrates: Fix the probe error path
	mtd: rawnand: oxnas: Fix the probe error path
	mtd: rawnand: sunxi: Fix the probe error path
	mtd: rawnand: plat_nand: Fix the probe error path
	mtd: rawnand: pasemi: Fix the probe error path
	mtd: rawnand: mtk: Fix the probe error path
	mtd: rawnand: tmio: Fix the probe error path
	w1: omap-hdq: cleanup to add missing newline for some dev_dbg
	f2fs: fix checkpoint=disable:%u%%
	perf probe: Do not show the skipped events
	perf probe: Fix to check blacklist address correctly
	perf probe: Check address correctness by map instead of _etext
	perf symbols: Fix debuginfo search for Ubuntu
	perf symbols: Fix kernel maps for kcore and eBPF
	Linux 5.4.48

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I9954fb3f08956419e8586bcb9078e604df207fb9
2020-06-22 11:43:59 +02:00
John Fastabend
e7b1564a24 bpf: Fix running sk_skb program types with ktls
[ Upstream commit e91de6afa81c10e9f855c5695eb9a53168d96b73 ]

KTLS uses a stream parser to collect TLS messages and send them to
the upper layer tls receive handler. This ensures the tls receiver
has a full TLS header to parse when it is run. However, when a
socket has BPF_SK_SKB_STREAM_VERDICT program attached before KTLS
is enabled we end up with two stream parsers running on the same
socket.

The result is both try to run on the same socket. First the KTLS
stream parser runs and calls read_sock() which will tcp_read_sock
which in turn calls tcp_rcv_skb(). This dequeues the skb from the
sk_receive_queue. When this is done KTLS code then data_ready()
callback which because we stacked KTLS on top of the bpf stream
verdict program has been replaced with sk_psock_start_strp(). This
will in turn kick the stream parser again and eventually do the
same thing KTLS did above calling into tcp_rcv_skb() and dequeuing
a skb from the sk_receive_queue.

At this point the data stream is broke. Part of the stream was
handled by the KTLS side some other bytes may have been handled
by the BPF side. Generally this results in either missing data
or more likely a "Bad Message" complaint from the kTLS receive
handler as the BPF program steals some bytes meant to be in a
TLS header and/or the TLS header length is no longer correct.

We've already broke the idealized model where we can stack ULPs
in any order with generic callbacks on the TX side to handle this.
So in this patch we do the same thing but for RX side. We add
a sk_psock_strp_enabled() helper so TLS can learn a BPF verdict
program is running and add a tls_sw_has_ctx_rx() helper so BPF
side can learn there is a TLS ULP on the socket.

Then on BPF side we omit calling our stream parser to avoid
breaking the data stream for the KTLS receiver. Then on the
KTLS side we call BPF_SK_SKB_STREAM_VERDICT once the KTLS
receiver is done with the packet but before it posts the
msg to userspace. This gives us symmetry between the TX and
RX halfs and IMO makes it usable again. On the TX side we
process packets in this order BPF -> TLS -> TCP and on
the receive side in the reverse order TCP -> TLS -> BPF.

Discovered while testing OpenSSL 3.0 Alpha2.0 release.

Fixes: d829e9c411 ("tls: convert to generic sk_msg interface")
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/159079361946.5745.605854335665044485.stgit@john-Precision-5820-Tower
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-06-22 09:31:12 +02:00
Greg Kroah-Hartman
f520ca124c This is the 5.4.44 stable release
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAl7XQXMACgkQONu9yGCS
 aT4OHw//YYuI/61rkff6/3qAE4gDwTZolVywu5HHzT5W7t7qeHPzJin2u04RBiS8
 4S8Mut0RUSK/0IyB0B3S342ntia1v41Q04veWm0K90iAScScjjUapLDXC/P3StA0
 iitGKJ8QFDS49+PFKFYkyXEsv6HYlDbtTmS0yxVoooSr+uqeR7m6rS1jsDsfUTaR
 T4tvfX8VPHkgfkfkOKCUq8/rM3uDW3lSk3JflIbPwRBQo9KvNPnfBetU9p//dCHG
 CB1K9K3sB6xLkKe7Ut7PlwoTq/Lc8qOma535xy3A8Iv6fVq4+hPE2jsB93WGI270
 WoEZbHpon7W6g/bU+C+CGfov2zBtz1dKHfWNcK5+dEkEQjjzKvvigfUvaKjyUUKB
 Vo5rQ3GZQ4JsMkHEJaLOlp3/SkdRd6RV/E0YErBISNeswzqsOgTrX8mz6wfQInwd
 Ww7V9LKdwSD6h2DuzutUbEm1X8i8glXammWEOUuh6zzQ3+WS57R1L+Nkr/6WxpgN
 w2g7F0+5enUbE1kIdq5OCzY1D0gBpT1o5YlrZgdL2GF5lU1b/lhsGhV6P83fl2Mf
 rTGFtg5M1pNgjbUkSH3VHHof35PM9vQZ6lrYbKMCjwymVY+BcR6nsCadfLqjMGnW
 NCYeiAmoIVCJX7q0hONww+TevZ3T+SLUjQ2os3WzooPC51MPOAQ=
 =5p6V
 -----END PGP SIGNATURE-----

Merge 5.4.44 into android-5.4-stable

Changes in 5.4.44
	ax25: fix setsockopt(SO_BINDTODEVICE)
	dpaa_eth: fix usage as DSA master, try 3
	net: don't return invalid table id error when we fall back to PF_UNSPEC
	net: dsa: mt7530: fix roaming from DSA user ports
	net: ethernet: ti: cpsw: fix ASSERT_RTNL() warning during suspend
	__netif_receive_skb_core: pass skb by reference
	net: inet_csk: Fix so_reuseport bind-address cache in tb->fast*
	net: ipip: fix wrong address family in init error path
	net/mlx5: Add command entry handling completion
	net: mvpp2: fix RX hashing for non-10G ports
	net: nlmsg_cancel() if put fails for nhmsg
	net: qrtr: Fix passing invalid reference to qrtr_local_enqueue()
	net: revert "net: get rid of an signed integer overflow in ip_idents_reserve()"
	net sched: fix reporting the first-time use timestamp
	net/tls: fix race condition causing kernel panic
	nexthop: Fix attribute checking for groups
	r8152: support additional Microsoft Surface Ethernet Adapter variant
	sctp: Don't add the shutdown timer if its already been added
	sctp: Start shutdown on association restart if in SHUTDOWN-SENT state and socket is closed
	tipc: block BH before using dst_cache
	net/mlx5e: kTLS, Destroy key object after destroying the TIS
	net/mlx5e: Fix inner tirs handling
	net/mlx5: Fix memory leak in mlx5_events_init
	net/mlx5e: Update netdev txq on completions during closure
	net/mlx5: Fix error flow in case of function_setup failure
	net/mlx5: Annotate mutex destroy for root ns
	net/tls: fix encryption error checking
	net/tls: free record only on encryption error
	net: sun: fix missing release regions in cas_init_one().
	net/mlx4_core: fix a memory leak bug.
	mlxsw: spectrum: Fix use-after-free of split/unsplit/type_set in case reload fails
	ARM: dts: rockchip: fix phy nodename for rk3228-evb
	ARM: dts: rockchip: fix phy nodename for rk3229-xms6
	arm64: dts: rockchip: fix status for &gmac2phy in rk3328-evb.dts
	arm64: dts: rockchip: swap interrupts interrupt-names rk3399 gpu node
	ARM: dts: rockchip: swap clock-names of gpu nodes
	ARM: dts: rockchip: fix pinctrl sub nodename for spi in rk322x.dtsi
	gpio: tegra: mask GPIO IRQs during IRQ shutdown
	ALSA: usb-audio: add mapping for ASRock TRX40 Creator
	net: microchip: encx24j600: add missed kthread_stop
	gfs2: move privileged user check to gfs2_quota_lock_check
	gfs2: Grab glock reference sooner in gfs2_add_revoke
	drm/amdgpu: drop unnecessary cancel_delayed_work_sync on PG ungate
	drm/amd/powerplay: perform PG ungate prior to CG ungate
	drm/amdgpu: Use GEM obj reference for KFD BOs
	cachefiles: Fix race between read_waiter and read_copier involving op->to_do
	usb: dwc3: pci: Enable extcon driver for Intel Merrifield
	usb: phy: twl6030-usb: Fix a resource leak in an error handling path in 'twl6030_usb_probe()'
	usb: gadget: legacy: fix redundant initialization warnings
	net: freescale: select CONFIG_FIXED_PHY where needed
	IB/i40iw: Remove bogus call to netdev_master_upper_dev_get()
	riscv: stacktrace: Fix undefined reference to `walk_stackframe'
	clk: ti: am33xx: fix RTC clock parent
	csky: Fixup msa highest 3 bits mask
	csky: Fixup perf callchain unwind
	csky: Fixup remove duplicate irq_disable
	hwmon: (nct7904) Fix incorrect range of temperature limit registers
	cifs: Fix null pointer check in cifs_read
	csky: Fixup raw_copy_from_user()
	samples: bpf: Fix build error
	drivers: net: hamradio: Fix suspicious RCU usage warning in bpqether.c
	Input: usbtouchscreen - add support for BonXeon TP
	Input: evdev - call input_flush_device() on release(), not flush()
	Input: xpad - add custom init packet for Xbox One S controllers
	Input: dlink-dir685-touchkeys - fix a typo in driver name
	Input: i8042 - add ThinkPad S230u to i8042 reset list
	Input: synaptics-rmi4 - really fix attn_data use-after-free
	Input: synaptics-rmi4 - fix error return code in rmi_driver_probe()
	ARM: 8970/1: decompressor: increase tag size
	ARM: uaccess: consolidate uaccess asm to asm/uaccess-asm.h
	ARM: uaccess: integrate uaccess_save and uaccess_restore
	ARM: uaccess: fix DACR mismatch with nested exceptions
	gpio: exar: Fix bad handling for ida_simple_get error path
	arm64: dts: mt8173: fix vcodec-enc clock
	soc: mediatek: cmdq: return send msg error code
	gpu/drm: Ingenic: Fix opaque pointer casted to wrong type
	IB/qib: Call kobject_put() when kobject_init_and_add() fails
	ARM: dts/imx6q-bx50v3: Set display interface clock parents
	ARM: dts: bcm2835-rpi-zero-w: Fix led polarity
	ARM: dts: bcm: HR2: Fix PPI interrupt types
	mmc: block: Fix use-after-free issue for rpmb
	gpio: pxa: Fix return value of pxa_gpio_probe()
	gpio: bcm-kona: Fix return value of bcm_kona_gpio_probe()
	RDMA/pvrdma: Fix missing pci disable in pvrdma_pci_probe()
	ALSA: hwdep: fix a left shifting 1 by 31 UB bug
	ALSA: hda/realtek - Add a model for Thinkpad T570 without DAC workaround
	ALSA: usb-audio: mixer: volume quirk for ESS Technology Asus USB DAC
	exec: Always set cap_ambient in cap_bprm_set_creds
	clk: qcom: gcc: Fix parent for gpll0_out_even
	ALSA: usb-audio: Quirks for Gigabyte TRX40 Aorus Master onboard audio
	ALSA: hda/realtek - Add new codec supported for ALC287
	libceph: ignore pool overlay and cache logic on redirects
	ceph: flush release queue when handling caps for unknown inode
	RDMA/core: Fix double destruction of uobject
	drm/amd/display: drop cursor position check in atomic test
	IB/ipoib: Fix double free of skb in case of multicast traffic in CM mode
	mm,thp: stop leaking unreleased file pages
	mm: remove VM_BUG_ON(PageSlab()) from page_mapcount()
	fs/binfmt_elf.c: allocate initialized memory in fill_thread_core_info()
	include/asm-generic/topology.h: guard cpumask_of_node() macro argument
	Revert "block: end bio with BLK_STS_AGAIN in case of non-mq devs and REQ_NOWAIT"
	gpio: fix locking open drain IRQ lines
	iommu: Fix reference count leak in iommu_group_alloc.
	parisc: Fix kernel panic in mem_init()
	cfg80211: fix debugfs rename crash
	x86/syscalls: Revert "x86/syscalls: Make __X32_SYSCALL_BIT be unsigned long"
	mac80211: mesh: fix discovery timer re-arming issue / crash
	x86/dma: Fix max PFN arithmetic overflow on 32 bit systems
	copy_xstate_to_kernel(): don't leave parts of destination uninitialized
	xfrm: allow to accept packets with ipv6 NEXTHDR_HOP in xfrm_input
	xfrm: do pskb_pull properly in __xfrm_transport_prep
	xfrm: remove the xfrm_state_put call becofe going to out_reset
	xfrm: call xfrm_output_gso when inner_protocol is set in xfrm_output
	xfrm interface: fix oops when deleting a x-netns interface
	xfrm: fix a warning in xfrm_policy_insert_list
	xfrm: fix a NULL-ptr deref in xfrm_local_error
	xfrm: fix error in comment
	ip_vti: receive ipip packet by calling ip_tunnel_rcv
	netfilter: nft_reject_bridge: enable reject with bridge vlan
	netfilter: ipset: Fix subcounter update skip
	netfilter: conntrack: make conntrack userspace helpers work again
	netfilter: nfnetlink_cthelper: unbreak userspace helper support
	netfilter: nf_conntrack_pptp: prevent buffer overflows in debug code
	esp6: get the right proto for transport mode in esp6_gso_encap
	bnxt_en: Fix accumulation of bp->net_stats_prev.
	ieee80211: Fix incorrect mask for default PE duration
	xsk: Add overflow check for u64 division, stored into u32
	qlcnic: fix missing release in qlcnic_83xx_interrupt_test.
	crypto: chelsio/chtls: properly set tp->lsndtime
	nexthops: Move code from remove_nexthop_from_groups to remove_nh_grp_entry
	nexthops: don't modify published nexthop groups
	nexthop: Expand nexthop_is_multipath in a few places
	ipv4: nexthop version of fib_info_nh_uses_dev
	net: dsa: declare lockless TX feature for slave ports
	bonding: Fix reference count leak in bond_sysfs_slave_add.
	netfilter: conntrack: comparison of unsigned in cthelper confirmation
	netfilter: conntrack: Pass value of ctinfo to __nf_conntrack_update
	netfilter: nf_conntrack_pptp: fix compilation warning with W=1 build
	perf: Make perf able to build with latest libbfd
	Linux 5.4.44

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: Idd547df1abb0bea116f30e3224a80387529adb0b
2020-06-04 08:23:35 +02:00
Vinay Kumar Yadav
cf4cc95a15 net/tls: fix race condition causing kernel panic
[ Upstream commit 0cada33241d9de205522e3858b18e506ca5cce2c ]

tls_sw_recvmsg() and tls_decrypt_done() can be run concurrently.
// tls_sw_recvmsg()
	if (atomic_read(&ctx->decrypt_pending))
		crypto_wait_req(-EINPROGRESS, &ctx->async_wait);
	else
		reinit_completion(&ctx->async_wait.completion);

//tls_decrypt_done()
  	pending = atomic_dec_return(&ctx->decrypt_pending);

  	if (!pending && READ_ONCE(ctx->async_notify))
  		complete(&ctx->async_wait.completion);

Consider the scenario tls_decrypt_done() is about to run complete()

	if (!pending && READ_ONCE(ctx->async_notify))

and tls_sw_recvmsg() reads decrypt_pending == 0, does reinit_completion(),
then tls_decrypt_done() runs complete(). This sequence of execution
results in wrong completion. Consequently, for next decrypt request,
it will not wait for completion, eventually on connection close, crypto
resources freed, there is no way to handle pending decrypt response.

This race condition can be avoided by having atomic_read() mutually
exclusive with atomic_dec_return(),complete().Intoduced spin lock to
ensure the mutual exclution.

Addressed similar problem in tx direction.

v1->v2:
- More readable commit message.
- Corrected the lock to fix new race scenario.
- Removed barrier which is not needed now.

Fixes: a42055e8d2 ("net/tls: Add support for async encryption of records for performance")
Signed-off-by: Vinay Kumar Yadav <vinay.yadav@chelsio.com>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-06-03 08:21:01 +02:00
Greg Kroah-Hartman
df04cb4bf9 ANDROID: GKI: networking: add Android ABI padding to a lot of networking structures
Try to mitigate potential future driver core api changes by adding a
padding to a lot of different networking structures:
	struct ipv6_devconf
	struct proto_ops
	struct header_ops
	struct napi_struct
	struct netdev_queue
	struct netdev_rx_queue
	struct xfrmdev_ops
	struct net_device_ops
	struct net_device
	struct packet_type
	struct sk_buff
	struct tlsdev_ops

Based on a change made to the RHEL/CENTOS 8 kernel.

Bug: 151154716
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I590f004754dbc8beafa40e71cac70a0938c38b4a
2020-05-02 12:22:29 +02:00
Jakub Kicinski
569cac5a50 net/tls: use sg_next() to walk sg entries
[ Upstream commit c5daa6cccdc2f94aca2c9b3fa5f94e4469997293 ]

Partially sent record cleanup path increments an SG entry
directly instead of using sg_next(). This should not be a
problem today, as encrypted messages should be always
allocated as arrays. But given this is a cleanup path it's
easy to miss was this ever to change. Use sg_next(), and
simplify the code.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-12-04 22:31:02 +01:00
Jakub Kicinski
a58365a79a net/tls: remove the dead inplace_crypto code
[ Upstream commit 9e5ffed37df68d0ccfb2fdc528609e23a1e70ebe ]

Looks like when BPF support was added by commit d3b18ad31f
("tls: add bpf support to sk_msg handling") and
commit d829e9c411 ("tls: convert to generic sk_msg interface")
it broke/removed the support for in-place crypto as added by
commit 4e6d47206c ("tls: Add support for inplace records
encryption").

The inplace_crypto member of struct tls_rec is dead, inited
to zero, and sometimes set to zero again. It used to be
set to 1 when record was allocated, but the skmsg code doesn't
seem to have been written with the idea of in-place crypto
in mind.

Since non trivial effort is required to bring the feature back
and we don't really have the HW to measure the benefit just
remove the left over support for now to avoid confusing readers.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-12-04 22:31:02 +01:00
Willem de Bruijn
d4ffb02dee net/tls: enable sk_msg redirect to tls socket egress
Bring back tls_sw_sendpage_locked. sk_msg redirection into a socket
with TLS_TX takes the following path:

  tcp_bpf_sendmsg_redir
    tcp_bpf_push_locked
      tcp_bpf_push
        kernel_sendpage_locked
          sock->ops->sendpage_locked

Also update the flags test in tls_sw_sendpage_locked to allow flag
MSG_NO_SHARED_FRAGS. bpf_tcp_sendmsg sets this.

Link: https://lore.kernel.org/netdev/CA+FuTSdaAawmZ2N8nfDDKu3XLpXBbMtcCT0q4FntDD2gn8ASUw@mail.gmail.com/T/#t
Link: https://github.com/wdebruij/kerneltools/commits/icept.2
Fixes: 0608c69c9a ("bpf: sk_msg, sock{map|hash} redirect through ULP")
Fixes: f3de19af0f ("Revert \"net/tls: remove unused function tls_sw_sendpage_locked\"")
Signed-off-by: Willem de Bruijn <willemb@google.com>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-19 15:03:02 -08:00
Jakub Kicinski
79ffe6087e net/tls: add a TX lock
TLS TX needs to release and re-acquire the socket lock if send buffer
fills up.

TLS SW TX path currently depends on only allowing one thread to enter
the function by the abuse of sk_write_pending. If another writer is
already waiting for memory no new ones are allowed in.

This has two problems:
 - writers don't wake other threads up when they leave the kernel;
   meaning that this scheme works for single extra thread (second
   application thread or delayed work) because memory becoming
   available will send a wake up request, but as Mallesham and
   Pooja report with larger number of threads it leads to threads
   being put to sleep indefinitely;
 - the delayed work does not get _scheduled_ but it may _run_ when
   other writers are present leading to crashes as writers don't
   expect state to change under their feet (same records get pushed
   and freed multiple times); it's hard to reliably bail from the
   work, however, because the mere presence of a writer does not
   guarantee that the writer will push pending records before exiting.

Ensuring wakeups always happen will make the code basically open
code a mutex. Just use a mutex.

The TLS HW TX path does not have any locking (not even the
sk_write_pending hack), yet it uses a per-socket sg_tx_data
array to push records.

Fixes: a42055e8d2 ("net/tls: Add support for async encryption of records for performance")
Reported-by: Mallesham  Jatharakonda <mallesh537@gmail.com>
Reported-by: Pooja Trivedi <poojatrivedi@gmail.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-06 17:33:32 -08:00
Jakub Kicinski
be2fbc155f net/tls: clean up the number of #ifdefs for CONFIG_TLS_DEVICE
TLS code has a number of #ifdefs which make the code a little
harder to follow. Recent fixes removed the ifdef around the
TLS_HW define, so we can switch to the often used pattern
of defining tls_device functions as empty static inlines
in the header when CONFIG_TLS_DEVICE=n.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: John Hurley <john.hurley@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-05 09:49:49 +02:00
Jakub Kicinski
be7bbea114 net/tls: use the full sk_proto pointer
Since we already have the pointer to the full original sk_proto
stored use that instead of storing all individual callback
pointers as well.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: John Hurley <john.hurley@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-05 09:49:49 +02:00
Davide Caratti
26811cc9f5 net: tls: export protocol version, cipher, tx_conf/rx_conf to socket diag
When an application configures kernel TLS on top of a TCP socket, it's
now possible for inet_diag_handler() to collect information regarding the
protocol version, the cipher type and TX / RX configuration, in case
INET_DIAG_INFO is requested.

Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-31 23:44:28 -07:00
Jakub Kicinski
15a7dea750 net/tls: use RCU protection on icsk->icsk_ulp_data
We need to make sure context does not get freed while diag
code is interrogating it. Free struct tls_context with
kfree_rcu().

We add the __rcu annotation directly in icsk, and cast it
away in the datapath accessor. Presumably all ULPs will
do a similar thing.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-31 23:44:28 -07:00
Jakub Kicinski
5d92e631b8 net/tls: partially revert fix transition through disconnect with close
Looks like we were slightly overzealous with the shutdown()
cleanup. Even though the sock->sk_state can reach CLOSED again,
socket->state will not got back to SS_UNCONNECTED once
connections is ESTABLISHED. Meaning we will see EISCONN if
we try to reconnect, and EINVAL if we try to listen.

Only listen sockets can be shutdown() and reused, but since
ESTABLISHED sockets can never be re-connected() or used for
listen() we don't need to try to clean up the ULP state early.

Fixes: 32857cf57f ("net/tls: fix transition through disconnect with close")
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-05 13:15:30 -07:00
John Fastabend
32857cf57f net/tls: fix transition through disconnect with close
It is possible (via shutdown()) for TCP socks to go through TCP_CLOSE
state via tcp_disconnect() without actually calling tcp_close which
would then call the tls close callback. Because of this a user could
disconnect a socket then put it in a LISTEN state which would break
our assumptions about sockets always being ESTABLISHED state.

More directly because close() can call unhash() and unhash is
implemented by sockmap if a sockmap socket has TLS enabled we can
incorrectly destroy the psock from unhash() and then call its close
handler again. But because the psock (sockmap socket representation)
is already destroyed we call close handler in sk->prot. However,
in some cases (TLS BASE/BASE case) this will still point at the
sockmap close handler resulting in a circular call and crash reported
by syzbot.

To fix both above issues implement the unhash() routine for TLS.

v4:
 - add note about tls offload still needing the fix;
 - move sk_proto to the cold cache line;
 - split TX context free into "release" and "free",
   otherwise the GC work itself is in already freed
   memory;
 - more TX before RX for consistency;
 - reuse tls_ctx_free();
 - schedule the GC work after we're done with context
   to avoid UAF;
 - don't set the unhash in all modes, all modes "inherit"
   TLS_BASE's callbacks anyway;
 - disable the unhash hook for TLS_HW.

Fixes: 3c4d755915 ("tls: kernel TLS support")
Reported-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-07-22 16:04:17 +02:00
John Fastabend
313ab00480 net/tls: remove sock unlock/lock around strp_done()
The tls close() callback currently drops the sock lock to call
strp_done(). Split up the RX cleanup into stopping the strparser
and releasing most resources, syncing strparser and finally
freeing the context.

To avoid the need for a strp_done() call on the cleanup path
of device offload make sure we don't arm the strparser until
we are sure init will be successful.

Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-07-22 16:04:16 +02:00
John Fastabend
f87e62d45e net/tls: remove close callback sock unlock/lock around TX work flush
The tls close() callback currently drops the sock lock, makes a
cancel_delayed_work_sync() call, and then relocks the sock.

By restructuring the code we can avoid droping lock and then
reclaiming it. To simplify this we do the following,

 tls_sk_proto_close
 set_bit(CLOSING)
 set_bit(SCHEDULE)
 cancel_delay_work_sync() <- cancel workqueue
 lock_sock(sk)
 ...
 release_sock(sk)
 strp_done()

Setting the CLOSING bit prevents the SCHEDULE bit from being
cleared by any workqueue items e.g. if one happens to be
scheduled and run between when we set SCHEDULE bit and cancel
work. Then because SCHEDULE bit is set now no new work will
be scheduled.

Tested with net selftests and bpf selftests.

Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-07-22 16:04:16 +02:00
Jakub Kicinski
318892ac06 net/tls: don't arm strparser immediately in tls_set_sw_offload()
In tls_set_device_offload_rx() we prepare the software context
for RX fallback and proceed to add the connection to the device.
Unfortunately, software context prep includes arming strparser
so in case of a later error we have to release the socket lock
to call strp_done().

In preparation for not releasing the socket lock half way through
callbacks move arming strparser into a separate function.
Following patches will make use of that.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-07-22 16:04:16 +02:00
Dirk van der Merwe
b5d9a834f4 net/tls: don't clear TX resync flag on error
Introduce a return code for the tls_dev_resync callback.

When the driver TX resync fails, kernel can retry the resync again
until it succeeds.  This prevents drivers from attempting to offload
TLS packets if the connection is known to be out of sync.

We don't worry about the RX resync since they will be retried naturally
as more encrypted records get received.

Signed-off-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-08 20:21:09 -07:00
David S. Miller
af144a9834 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Two cases of overlapping changes, nothing fancy.

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-08 19:48:57 -07:00
Jakub Kicinski
acd3e96d53 net/tls: make sure offload also gets the keys wiped
Commit 86029d10af ("tls: zero the crypto information from tls_context
before freeing") added memzero_explicit() calls to clear the key material
before freeing struct tls_context, but it missed tls_device.c has its
own way of freeing this structure. Replace the missing free.

Fixes: 86029d10af ("tls: zero the crypto information from tls_context before freeing")
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-01 19:22:36 -07:00
David S. Miller
d96ff269a0 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
The new route handling in ip_mc_finish_output() from 'net' overlapped
with the new support for returning congestion notifications from BPF
programs.

In order to handle this I had to take the dev_loopback_xmit() calls
out of the switch statement.

The aquantia driver conflicts were simple overlapping changes.

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-27 21:06:39 -07:00
Dirk van der Merwe
9354544cbc net/tls: fix page double free on TX cleanup
With commit 94850257cf ("tls: Fix tls_device handling of partial records")
a new path was introduced to cleanup partial records during sk_proto_close.
This path does not handle the SW KTLS tx_list cleanup.

This is unnecessary though since the free_resources calls for both
SW and offload paths will cleanup a partial record.

The visible effect is the following warning, but this bug also causes
a page double free.

    WARNING: CPU: 7 PID: 4000 at net/core/stream.c:206 sk_stream_kill_queues+0x103/0x110
    RIP: 0010:sk_stream_kill_queues+0x103/0x110
    RSP: 0018:ffffb6df87e07bd0 EFLAGS: 00010206
    RAX: 0000000000000000 RBX: ffff8c21db4971c0 RCX: 0000000000000007
    RDX: ffffffffffffffa0 RSI: 000000000000001d RDI: ffff8c21db497270
    RBP: ffff8c21db497270 R08: ffff8c29f4748600 R09: 000000010020001a
    R10: ffffb6df87e07aa0 R11: ffffffff9a445600 R12: 0000000000000007
    R13: 0000000000000000 R14: ffff8c21f03f2900 R15: ffff8c21f03b8df0
    Call Trace:
     inet_csk_destroy_sock+0x55/0x100
     tcp_close+0x25d/0x400
     ? tcp_check_oom+0x120/0x120
     tls_sk_proto_close+0x127/0x1c0
     inet_release+0x3c/0x60
     __sock_release+0x3d/0xb0
     sock_close+0x11/0x20
     __fput+0xd8/0x210
     task_work_run+0x84/0xa0
     do_exit+0x2dc/0xb90
     ? release_sock+0x43/0x90
     do_group_exit+0x3a/0xa0
     get_signal+0x295/0x720
     do_signal+0x36/0x610
     ? SYSC_recvfrom+0x11d/0x130
     exit_to_usermode_loop+0x69/0xb0
     do_syscall_64+0x173/0x180
     entry_SYSCALL_64_after_hwframe+0x3d/0xa2
    RIP: 0033:0x7fe9b9abc10d
    RSP: 002b:00007fe9b19a1d48 EFLAGS: 00000246 ORIG_RAX: 00000000000000ca
    RAX: fffffffffffffe00 RBX: 0000000000000006 RCX: 00007fe9b9abc10d
    RDX: 0000000000000002 RSI: 0000000000000080 RDI: 00007fe948003430
    RBP: 00007fe948003410 R08: 00007fe948003430 R09: 0000000000000000
    R10: 0000000000000000 R11: 0000000000000246 R12: 00005603739d9080
    R13: 00007fe9b9ab9f90 R14: 00007fe948003430 R15: 0000000000000000

Fixes: 94850257cf ("tls: Fix tls_device handling of partial records")
Signed-off-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-24 07:20:45 -07:00
Jakub Kicinski
5018007409 net/tls: add kernel-driven resync mechanism for TX
TLS offload drivers keep track of TCP seq numbers to make sure
the packets are fed into the HW in order.

When packets get dropped on the way through the stack, the driver
will get out of sync and have to use fallback encryption, but unless
TCP seq number is resynced it will never match the packets correctly
(or even worse - use incorrect record sequence number after TCP seq
wraps).

Existing drivers (mlx5) feed the entire record on every out-of-order
event, allowing FW/HW to always be in sync.

This patch adds an alternative, more akin to the RX resync.  When
driver sees a frame which is past its expected sequence number the
stream must have gotten out of order (if the sequence number is
smaller than expected its likely a retransmission which doesn't
require resync).  Driver will ask the stack to perform TX sync
before it submits the next full record, and fall back to software
crypto until stack has performed the sync.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-11 12:22:27 -07:00
Jakub Kicinski
eeb2efaf36 net/tls: generalize the resync callback
Currently only RX direction is ever resynced, however, TX may
also get out of sequence if packets get dropped on the way to
the driver.  Rename the resync callback and add a direction
parameter.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-11 12:22:27 -07:00
Jakub Kicinski
f953d33ba1 net/tls: add kernel-driven TLS RX resync
TLS offload device may lose sync with the TCP stream if packets
arrive out of order.  Drivers can currently request a resync at
a specific TCP sequence number.  When a record is found starting
at that sequence number kernel will inform the device of the
corresponding record number.

This requires the device to constantly scan the stream for a
known pattern (constant bytes of the header) after sync is lost.

This patch adds an alternative approach which is entirely under
the control of the kernel.  Kernel tracks records it had to fully
decrypt, even though TLS socket is in TLS_HW mode.  If multiple
records did not have any decrypted parts - it's a pretty strong
indication that the device is out of sync.

We choose the min number of fully encrypted records to be 2,
which should hopefully be more than will get retransmitted at
a time.

After kernel decides the device is out of sync it schedules a
resync request.  If the TCP socket is empty the resync gets
performed immediately.  If socket is not empty we leave the
record parser to resync when next record comes.

Before resync in message parser we peek at the TCP socket and
don't attempt the sync if the socket already has some of the
next record queued.

On resync failure (encrypted data continues to flow in) we
retry with exponential backoff, up to once every 128 records
(with a 16k record thats at most once every 2M of data).

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-11 12:22:26 -07:00
Jakub Kicinski
fe58a5a02c net/tls: rename handle_device_resync()
handle_device_resync() doesn't describe the function very well.
The function checks if resync should be issued upon parsing of
a new record.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-11 12:22:26 -07:00
Jakub Kicinski
89fec474fa net/tls: pass record number as a byte array
TLS offload code casts record number to a u64.  The buffer
should be aligned to 8 bytes, but its actually a __be64, and
the rest of the TLS code treats it as big int.  Make the
offload callbacks take a byte array, drivers can make the
choice to do the ugly cast if they want to.

Prepare for copying the record number onto the stack by
defining a constant for max size of the byte array.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-11 12:22:26 -07:00
David S. Miller
a6cdeeb16b Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Some ISDN files that got removed in net-next had some changes
done in mainline, take the removals.

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-07 11:00:14 -07:00
Dirk van der Merwe
b9727d7f95 net/tls: export TLS per skb encryption
While offloading TLS connections, drivers need to handle the case where
out of order packets need to be transmitted.

Other drivers obtain the entire TLS record for the specific skb to
provide as context to hardware for encryption. However, other designs
may also want to keep the hardware state intact and perform the
out of order encryption entirely on the host.

To achieve this, export the already existing software encryption
fallback path so drivers could access this.

Signed-off-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-06 14:13:40 -07:00
Jakub Kicinski
2e361176ea net/tls: simplify driver context retrieval
Currently drivers have to ensure the alignment of their tls state
structure, which leads to unnecessary layers of getters and
encapsulated structures in each driver.

Simplify all this by marking the driver state as aligned (driver_state
members are currently aligned, so no hole is added, besides ALIGN in
TLS_OFFLOAD_CONTEXT_SIZE_RX/TX would reserve this extra space, anyway.)
With that we can add a common accessor to the core.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-06 14:13:40 -07:00
Jakub Kicinski
2d6b51c692 net/tls: split the TLS_DRIVER_STATE_SIZE and bump TX to 16 bytes
8 bytes of driver state has been enough so far, but for drivers
which have to store 8 byte handle it's no longer practical to
store the state directly in the context.

Drivers generally don't need much extra state on RX side, while
TX side has to be tracking TCP sequence numbers.  Split the
lengths of max driver state size on RX and TX.

The struct tls_offload_context_tx currently stands at 616 bytes and
struct tls_offload_context_rx stands at 368 bytes.  Upcoming work
will consume extra 8 bytes in both for kernel-driven resync.
This means that we can bump TX side to 16 bytes and still fit
into the same number of cache lines but on RX side we would be 8
bytes over.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-06 14:13:40 -07:00
Jakub Kicinski
fb0f886fa2 net/tls: don't pass version to tls_advance_record_sn()
All callers pass prot->version as the last parameter
of tls_advance_record_sn(), yet tls_advance_record_sn()
itself needs a pointer to prot.  Pass prot from callers.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-04 14:33:50 -07:00
Jakub Kicinski
f0aaa2c975 net/tls: reorganize struct tls_context
struct tls_context is slightly badly laid out.  If we reorder things
right we can save 16 bytes (320 -> 304) but also make all fast path
data fit into two cache lines (one read only and one read/write,
down from four cache lines).

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-04 14:33:50 -07:00
Jakub Kicinski
e52972c11d net/tls: replace the sleeping lock around RX resync with a bit lock
Commit 38030d7cb7 ("net/tls: avoid NULL-deref on resync during device removal")
tried to fix a potential NULL-dereference by taking the
context rwsem.  Unfortunately the RX resync may get called
from soft IRQ, so we can't use the rwsem to protect from
the device disappearing.  Because we are guaranteed there
can be only one resync at a time (it's called from strparser)
use a bit to indicate resync is busy and make device
removal wait for the bit to get cleared.

Note that there is a leftover "flags" field in struct
tls_context already.

Fixes: 4799ac81e5 ("tls: Add rx inline crypto offload")
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-04 13:34:37 -07:00
Jakub Kicinski
63a1c95f3f net/tls: byte swap device req TCP seq no upon setting
To avoid a sparse warning byteswap the be32 sequence number
before it's stored in the atomic value.  While at it drop
unnecessary brackets and use kernel's u64 type.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-27 16:52:21 -04:00
Jakub Kicinski
da68b4ad02 net/tls: move definition of tls ops into net/tls.h
There seems to be no reason for tls_ops to be defined in netdevice.h
which is included in a lot of places.  Don't wrap the struct/enum
declaration in ifdefs, it trickles down unnecessary ifdefs into
driver code.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-27 16:52:21 -04:00
Jakub Kicinski
9e9957973c net/tls: remove old exports of sk_destruct functions
tls_device_sk_destruct being set on a socket used to indicate
that socket is a kTLS device one.  That is no longer true -
now we use sk_validate_xmit_skb pointer for that purpose.
Remove the export.  tls_device_attach() needs to be moved.

While at it, remove the dead declaration of tls_sk_destruct().

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-27 16:52:21 -04:00
David S. Miller
6b0a7f84ea Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflict resolution of af_smc.c from Stephen Rothwell.

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-17 11:26:25 -07:00
Jakub Kicinski
b4f47f3848 net/tls: prevent bad memory access in tls_is_sk_tx_device_offloaded()
Unlike '&&' operator, the '&' does not have short-circuit
evaluation semantics.  IOW both sides of the operator always
get evaluated.  Fix the wrong operator in
tls_is_sk_tx_device_offloaded(), which would lead to
out-of-bounds access for for non-full sockets.

Fixes: 4799ac81e5 ("tls: Add rx inline crypto offload")
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-10 22:42:28 -07:00
Jakub Kicinski
35b71a34ad net/tls: don't leak partially sent record in device mode
David reports that tls triggers warnings related to
sk->sk_forward_alloc not being zero at destruction time:

WARNING: CPU: 5 PID: 6831 at net/core/stream.c:206 sk_stream_kill_queues+0x103/0x110
WARNING: CPU: 5 PID: 6831 at net/ipv4/af_inet.c:160 inet_sock_destruct+0x15b/0x170

When sender fills up the write buffer and dies from
SIGPIPE.  This is due to the device implementation
not cleaning up the partially_sent_record.

This is because commit a42055e8d2 ("net/tls: Add support for async encryption of records for performance")
moved the partial record cleanup to the SW-only path.

Fixes: a42055e8d2 ("net/tls: Add support for async encryption of records for performance")
Reported-by: David Beckett <david.beckett@netronome.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-10 13:07:02 -07:00
Vakul Garg
f295b3ae9f net/tls: Add support of AES128-CCM based ciphers
Added support for AES128-CCM based record encryption. AES128-CCM is
similar to AES128-GCM. Both of them have same salt/iv/mac size. The
notable difference between the two is that while invoking AES128-CCM
operation, the salt||nonce (which is passed as IV) has to be prefixed
with a hardcoded value '2'. Further, CCM implementation in kernel
requires IV passed in crypto_aead_request() to be full '16' bytes.
Therefore, the record structure 'struct tls_rec' has been modified to
reserve '16' bytes for IV. This works for both GCM and CCM based cipher.

Signed-off-by: Vakul Garg <vakul.garg@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-20 11:02:05 -07:00
Boris Pismenny
7463d3a2db tls: Fix write space handling
TLS device cannot use the sw context. This patch returns the original
tls device write space handler and moves the sw/device specific portions
to the relevant files.

Also, we remove the write_space call for the tls_sw flow, because it
handles partial records in its delayed tx work handler.

Fixes: a42055e8d2 ("net/tls: Add support for async encryption of records for performance")
Signed-off-by: Boris Pismenny <borisp@mellanox.com>
Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-03 22:10:16 -08:00
Boris Pismenny
94850257cf tls: Fix tls_device handling of partial records
Cleanup the handling of partial records while fixing a bug where the
tls_push_pending_closed_record function is using the software tls
context instead of the hardware context.

The bug resulted in the following crash:
[   88.791229] BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
[   88.793271] #PF error: [normal kernel read fault]
[   88.794449] PGD 800000022a426067 P4D 800000022a426067 PUD 22a156067 PMD 0
[   88.795958] Oops: 0000 [#1] SMP PTI
[   88.796884] CPU: 2 PID: 4973 Comm: openssl Not tainted 5.0.0-rc4+ #3
[   88.798314] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
[   88.800067] RIP: 0010:tls_tx_records+0xef/0x1d0 [tls]
[   88.801256] Code: 00 02 48 89 43 08 e8 a0 0b 96 d9 48 89 df e8 48 dd
4d d9 4c 89 f8 4d 8b bf 98 00 00 00 48 05 98 00 00 00 48 89 04 24 49 39
c7 <49> 8b 1f 4d 89 fd 0f 84 af 00 00 00 41 8b 47 10 85 c0 0f 85 8d 00
[   88.805179] RSP: 0018:ffffbd888186fca8 EFLAGS: 00010213
[   88.806458] RAX: ffff9af1ed657c98 RBX: ffff9af1e88a1980 RCX: 0000000000000000
[   88.808050] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff9af1e88a1980
[   88.809724] RBP: ffff9af1e88a1980 R08: 0000000000000017 R09: ffff9af1ebeeb700
[   88.811294] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
[   88.812917] R13: ffff9af1e88a1980 R14: ffff9af1ec13f800 R15: 0000000000000000
[   88.814506] FS:  00007fcad2240740(0000) GS:ffff9af1f7880000(0000) knlGS:0000000000000000
[   88.816337] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   88.817717] CR2: 0000000000000000 CR3: 0000000228b3e000 CR4: 00000000001406e0
[   88.819328] Call Trace:
[   88.820123]  tls_push_data+0x628/0x6a0 [tls]
[   88.821283]  ? remove_wait_queue+0x20/0x60
[   88.822383]  ? n_tty_read+0x683/0x910
[   88.823363]  tls_device_sendmsg+0x53/0xa0 [tls]
[   88.824505]  sock_sendmsg+0x36/0x50
[   88.825492]  sock_write_iter+0x87/0x100
[   88.826521]  __vfs_write+0x127/0x1b0
[   88.827499]  vfs_write+0xad/0x1b0
[   88.828454]  ksys_write+0x52/0xc0
[   88.829378]  do_syscall_64+0x5b/0x180
[   88.830369]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[   88.831603] RIP: 0033:0x7fcad1451680

[ 1248.470626] BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
[ 1248.472564] #PF error: [normal kernel read fault]
[ 1248.473790] PGD 0 P4D 0
[ 1248.474642] Oops: 0000 [#1] SMP PTI
[ 1248.475651] CPU: 3 PID: 7197 Comm: openssl Tainted: G           OE 5.0.0-rc4+ #3
[ 1248.477426] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
[ 1248.479310] RIP: 0010:tls_tx_records+0x110/0x1f0 [tls]
[ 1248.480644] Code: 00 02 48 89 43 08 e8 4f cb 63 d7 48 89 df e8 f7 9c
1b d7 4c 89 f8 4d 8b bf 98 00 00 00 48 05 98 00 00 00 48 89 04 24 49 39
c7 <49> 8b 1f 4d 89 fd 0f 84 af 00 00 00 41 8b 47 10 85 c0 0f 85 8d 00
[ 1248.484825] RSP: 0018:ffffaa0a41543c08 EFLAGS: 00010213
[ 1248.486154] RAX: ffff955a2755dc98 RBX: ffff955a36031980 RCX: 0000000000000006
[ 1248.487855] RDX: 0000000000000000 RSI: 000000000000002b RDI: 0000000000000286
[ 1248.489524] RBP: ffff955a36031980 R08: 0000000000000000 R09: 00000000000002b1
[ 1248.491394] R10: 0000000000000003 R11: 00000000ad55ad55 R12: 0000000000000000
[ 1248.493162] R13: 0000000000000000 R14: ffff955a2abe6c00 R15: 0000000000000000
[ 1248.494923] FS:  0000000000000000(0000) GS:ffff955a378c0000(0000) knlGS:0000000000000000
[ 1248.496847] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1248.498357] CR2: 0000000000000000 CR3: 000000020c40e000 CR4: 00000000001406e0
[ 1248.500136] Call Trace:
[ 1248.500998]  ? tcp_check_oom+0xd0/0xd0
[ 1248.502106]  tls_sk_proto_close+0x127/0x1e0 [tls]
[ 1248.503411]  inet_release+0x3c/0x60
[ 1248.504530]  __sock_release+0x3d/0xb0
[ 1248.505611]  sock_close+0x11/0x20
[ 1248.506612]  __fput+0xb4/0x220
[ 1248.507559]  task_work_run+0x88/0xa0
[ 1248.508617]  do_exit+0x2cb/0xbc0
[ 1248.509597]  ? core_sys_select+0x17a/0x280
[ 1248.510740]  do_group_exit+0x39/0xb0
[ 1248.511789]  get_signal+0x1d0/0x630
[ 1248.512823]  do_signal+0x36/0x620
[ 1248.513822]  exit_to_usermode_loop+0x5c/0xc6
[ 1248.515003]  do_syscall_64+0x157/0x180
[ 1248.516094]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 1248.517456] RIP: 0033:0x7fb398bd3f53
[ 1248.518537] Code: Bad RIP value.

Fixes: a42055e8d2 ("net/tls: Add support for async encryption of records for performance")
Signed-off-by: Boris Pismenny <borisp@mellanox.com>
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-03 22:10:16 -08:00
Vakul Garg
2b794c4098 tls: Return type of non-data records retrieved using MSG_PEEK in recvmsg
The patch enables returning 'type' in msghdr for records that are
retrieved with MSG_PEEK in recvmsg. Further it prevents records peeked
from socket from getting clubbed with any other record of different
type when records are subsequently dequeued from strparser.

For each record, we now retain its type in sk_buff's control buffer
cb[]. Inside control buffer, record's full length and offset are already
stored by strparser in 'struct strp_msg'. We store record type after
'struct strp_msg' inside 'struct tls_msg'. For tls1.2, the type is
stored just after record dequeue. For tls1.3, the type is stored after
record has been decrypted.

Inside process_rx_list(), before processing a non-data record, we check
that we must be able to return back the record type to the user
application. If not, the decrypted records in tls context's rx_list is
left there without consuming any data.

Fixes: 692d7b5d1f ("tls: Fix recvmsg() to be able to peek across multiple records")
Signed-off-by: Vakul Garg <vakul.garg@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-24 21:58:38 -08:00
Vakul Garg
4509de1468 net/tls: Move protocol constants from cipher context to tls context
Each tls context maintains two cipher contexts (one each for tx and rx
directions). For each tls session, the constants such as protocol
version, ciphersuite, iv size, associated data size etc are same for
both the directions and need to be stored only once per tls context.
Hence these are moved from 'struct cipher_context' to 'struct
tls_prot_info' and stored only once in 'struct tls_context'.

Signed-off-by: Vakul Garg <vakul.garg@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-19 10:40:36 -08:00