OpenCloudOS-Kernel

Commit Graph

Author	SHA1	Message	Date
Alexander Aring	aee742c992	fs: dlm: fix return -EINTR on recovery stopped This patch will return -EINTR instead of 1 if recovery is stopped. In case of ping_members() the return value will be checked if the error is -EINTR for signaling another recovery was triggered and the whole recovery process will come to a clean end to process the next one. Returning 1 will abort the recovery process and can leave the recovery in a broken state. It was reported with the following kernel log message attached and a gfs2 mount stopped working: "dlm: bobvirt1: dlm_recover_members error 1" whereas 1 was returned because of a conversion of "dlm_recovery_stopped()" to an errno was missing which this patch will introduce. While on it all other possible missing errno conversions at other places were added as they are done as in other places. It might be worth to check the error case at this recovery level, because some of the functionality also returns -ENOBUFS and check why recovery ends in a broken state. However this will fix the issue if another recovery was triggered at some points of recovery handling. Reported-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com>	2021-08-19 11:33:03 -05:00
Alexander Aring	b97f85259f	fs: dlm: implement delayed ack handling This patch changes that we don't ack each message. Lowcomms will take care about to send an ack back after a bulk of messages was processed. Currently it's only when the whole receive buffer was processed, there might better positions to send an ack back but only the lowcomms implementation know when there are more data to receive. This patch has also disadvantages that we might retransmit more on errors, however this is a very rare case. Tested with make_panic on gfs2 with three nodes by running: trace-cmd record -p function -l 'dlm_send_ack' sleep 100 and trace-cmd report \| wc -l Before patch: - 20548 - 21376 - 21398 After patch: - 18338 - 20679 - 19949 Signed-off-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com>	2021-08-19 11:33:03 -05:00
Alexander Aring	62699b3f0a	fs: dlm: move receive loop into receive handler This patch moves the kernel_recvmsg() loop call into the receive_from_sock() function instead of doing the loop outside the function and abort the loop over it's return value. Signed-off-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com>	2021-07-19 11:56:36 -05:00
Alexander Aring	c51b022179	fs: dlm: fix multiple empty writequeue alloc This patch will add a mutex that a connection can allocate a writequeue entry buffer only at a sleepable context at one time. If multiple caller waits at the writequeue spinlock and the spinlock gets release it could be that multiple new writequeue page buffers were allocated instead of allocate one writequeue page buffer and other waiters will use remaining buffer of it. It will only be the case for sleepable context which is the common case. In non-sleepable contexts like retransmission we just don't care about such behaviour. Signed-off-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com>	2021-07-19 11:56:35 -05:00
Alexander Aring	8728a455d2	fs: dlm: generic connect func This patch adds a generic connect function for TCP and SCTP. If the connect functionality differs from each other additional callbacks in dlm_proto_ops were added. The sockopts callback handling will guarantee that sockets created by connect() will use the same options as sockets created by accept(). Signed-off-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com>	2021-07-19 11:56:34 -05:00
Alexander Aring	90d21fc047	fs: dlm: auto load sctp module This patch adds a "for now" better handling of missing SCTP support in the kernel and try to load the sctp module if SCTP is set. Signed-off-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com>	2021-07-19 11:56:26 -05:00
Alexander Aring	2dc6b1158c	fs: dlm: introduce generic listen This patch combines each transport layer listen functionality into one listen function. Per transport layer differences are provided by additional callbacks in dlm_proto_ops. This patch drops silently sock_set_keepalive() for listen tcp sockets only. This socket option is not set at connecting sockets, I also don't see the sense of set keepalive for sockets which are created by accept() only. Signed-off-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com>	2021-07-19 11:53:43 -05:00
Alexander Aring	a66c008cd1	fs: dlm: move to static proto ops This patch moves the per transport socket callbacks to a static const array. We can support only one transport socket for the init namespace which will be determinted by reading the dlm config at lowcomms_start(). Signed-off-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com>	2021-07-19 11:53:43 -05:00
Alexander Aring	66d5955a09	fs: dlm: introduce con_next_wq helper This patch introduce a function to determine if something is ready to being send in the writequeue. It's not just that the writequeue is not empty additional the first entry need to have a valid length field. Signed-off-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com>	2021-07-19 11:53:43 -05:00
Alexander Aring	88aa023a25	fs: dlm: cleanup and remove _send_rcom The _send_rcom() can be removed and we call directly dlm_rcom_out(). As we doing that we removing the struct dlm_ls parameter which isn't used. Signed-off-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com>	2021-07-19 11:53:43 -05:00
Alexander Aring	052849beea	fs: dlm: clear CF_APP_LIMITED on close If send_to_sock() sets CF_APP_LIMITED limited bit and it has not been cleared by a waiting lowcomms_write_space() yet and a close_connection() apprears we should clear the CF_APP_LIMITED bit again because the connection starts from a new state again at reconnect. Signed-off-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com>	2021-07-19 11:53:43 -05:00
Alexander Aring	b892e4792c	fs: dlm: fix typo in tlv prefix This patch fixes a small typo in a unused struct field. It should named be t_pad instead of o_pad. Came over this as I updated wireshark dissector. Signed-off-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com>	2021-07-19 11:53:43 -05:00
Alexander Aring	d921a23f3e	fs: dlm: use READ_ONCE for config var This patch will use READ_ONCE to signal the compiler to read this variable only one time. If we don't do that it could be that the compiler read this value more than one time, because some optimizations, from the configure data which might can be changed during this time. Signed-off-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com>	2021-07-19 11:53:43 -05:00
Alexander Aring	feb704bd17	fs: dlm: use sk->sk_socket instead of con->sock Instead of dereference "con->sock" we can get the socket structure over "sk->sk_socket" as well. This patch will switch to this behaviour. Signed-off-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com>	2021-07-19 11:53:43 -05:00
Alexander Aring	957adb68b3	fs: dlm: invalid buffer access in lookup error This patch will evaluate the message length if a dlm opts header can fit in before accessing it if a node lookup fails. The invalid sequence error means that the version detection failed and an unexpected message arrived. For debugging such situation the type of arrived message is important to know. Signed-off-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com>	2021-06-11 12:44:47 -05:00
Alexander Aring	f5fe8d5107	fs: dlm: fix race in mhandle deletion This patch fixes a race between mhandle deletion in case of receiving an acknowledge and flush of all pending mhandle in cases of an timeout or resetting node states. Fixes: `489d8e559c` ("fs: dlm: add reliable connection if reconnect") Reported-by: Guillaume Nault <gnault@redhat.com> Signed-off-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com>	2021-06-11 12:44:47 -05:00
Alexander Aring	d10a0b8875	fs: dlm: rename socket and app buffer defines This patch renames DEFAULT_BUFFER_SIZE to DLM_MAX_SOCKET_BUFSIZE and LOWCOMMS_MAX_TX_BUFFER_LEN to DLM_MAX_APP_BUFSIZE as they are proper names to define what's behind those values. The DLM_MAX_SOCKET_BUFSIZE defines the maximum size of buffer which can be handled on socket layer, the DLM_MAX_APP_BUFSIZE defines the maximum size of buffer which can be handled by the DLM application layer. Signed-off-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com>	2021-06-02 11:53:04 -05:00
Alexander Aring	ac7d5d036d	fs: dlm: introduce proto values Currently the dlm protocol values are that TCP is 0 and everything else is SCTP. This makes it difficult to introduce possible other transport layers. The only one user space tool dlm_controld, which I am aware of, handles the protocol value 1 for SCTP. We change it now to handle SCTP as 1, this will break user space API but it will fix it so we can add possible other transport layers. Signed-off-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com>	2021-06-02 11:53:04 -05:00
Alexander Aring	9a4139a794	fs: dlm: move dlm allow conn This patch checks if possible allowing new connections is allowed before queueing the listen socket to accept new connections. Signed-off-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com>	2021-06-02 11:53:04 -05:00
Alexander Aring	6c6a1cc666	fs: dlm: use alloc_ordered_workqueue The proper way to allocate ordered workqueues is to use alloc_ordered_workqueue() function. The current way implies an ordered workqueue which is also required by dlm. Signed-off-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com>	2021-06-02 11:53:04 -05:00
Alexander Aring	700ab1c363	fs: dlm: fix memory leak when fenced I got some kmemleak report when a node was fenced. The user space tool dlm_controld will therefore run some rmdir() in dlm configfs which was triggering some memleaks. This patch stores the sps and cms attributes which stores some handling for subdirectories of the configfs cluster entry and free them if they get released as the parent directory gets freed. unreferenced object 0xffff88810d9e3e00 (size 192): comm "dlm_controld", pid 342, jiffies 4294698126 (age 55438.801s) hex dump (first 32 bytes): 00 00 00 00 00 00 00 00 73 70 61 63 65 73 00 00 ........spaces.. 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace: [<00000000db8b640b>] make_cluster+0x5d/0x360 [<000000006a571db4>] configfs_mkdir+0x274/0x730 [<00000000b094501c>] vfs_mkdir+0x27e/0x340 [<0000000058b0adaf>] do_mkdirat+0xff/0x1b0 [<00000000d1ffd156>] do_syscall_64+0x40/0x80 [<00000000ab1408c8>] entry_SYSCALL_64_after_hwframe+0x44/0xae unreferenced object 0xffff88810d9e3a00 (size 192): comm "dlm_controld", pid 342, jiffies 4294698126 (age 55438.801s) hex dump (first 32 bytes): 00 00 00 00 00 00 00 00 63 6f 6d 6d 73 00 00 00 ........comms... 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace: [<00000000a7ef6ad2>] make_cluster+0x82/0x360 [<000000006a571db4>] configfs_mkdir+0x274/0x730 [<00000000b094501c>] vfs_mkdir+0x27e/0x340 [<0000000058b0adaf>] do_mkdirat+0xff/0x1b0 [<00000000d1ffd156>] do_syscall_64+0x40/0x80 [<00000000ab1408c8>] entry_SYSCALL_64_after_hwframe+0x44/0xae Signed-off-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com>	2021-06-02 11:53:04 -05:00
Alexander Aring	fcef0e6c27	fs: dlm: fix lowcomms_start error case This patch fixes the error path handling in lowcomms_start(). We need to cleanup some static allocated data structure and cleanup possible workqueue if these have started. Signed-off-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com>	2021-06-02 11:53:04 -05:00
Colin Ian King	7d3848c03e	fs: dlm: Fix spelling mistake "stucked" -> "stuck" There are spelling mistake in log messages. Fix these. Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: David Teigland <teigland@redhat.com>	2021-05-26 09:49:04 -05:00
Colin Ian King	f6089981d0	fs: dlm: Fix memory leak of object mh There is an error return path that is not kfree'ing mh after it has been successfully allocates. Fix this by moving the call to create_rcom to after the check on rc_in->rc_id check to avoid this. Thanks to Alexander Ahring Oder Aring for suggesting the correct way to fix this. Addresses-Coverity: ("Resource leak") Fixes: `a070a91cf1` ("fs: dlm: add more midcomms hooks") Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: David Teigland <teigland@redhat.com>	2021-05-26 09:49:04 -05:00
Alexander Aring	706474fbc5	fs: dlm: don't allow half transmitted messages This patch will clean a dirty page buffer if a reconnect occurs. If a page buffer was half transmitted we cannot start inside the middle of a dlm message if a node connects again. I observed invalid length receptions errors and was guessing that this behaviour occurs, after this patch I never saw an invalid message length again. This patch might drops more messages for dlm version 3.1 but 3.1 can't deal with half messages as well, for 3.2 it might trigger more re-transmissions but will not leave dlm in a broken state. Signed-off-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com>	2021-05-25 09:22:20 -05:00
Alexander Aring	5b2f981fde	fs: dlm: add midcomms debugfs functionality This patch adds functionality to debug midcomms per connection state inside a comms directory which is similar like dlm configfs. Currently there exists the possibility to read out two attributes which is the send queue counter and the version of each midcomms node state. Signed-off-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com>	2021-05-25 09:22:20 -05:00
Alexander Aring	489d8e559c	fs: dlm: add reliable connection if reconnect This patch introduce to make a tcp lowcomms connection reliable even if reconnects occurs. This is done by an application layer re-transmission handling and sequence numbers in dlm protocols. There are three new dlm commands: DLM_OPTS: This will encapsulate an existing dlm message (and rcom message if they don't have an own application side re-transmission handling). As optional handling additional tlv's (type length fields) can be appended. This can be for example a sequence number field. However because in DLM_OPTS the lockspace field is unused and a sequence number is a mandatory field it isn't made as a tlv and we put the sequence number inside the lockspace id. The possibility to add optional options are still there for future purposes. DLM_ACK: Just a dlm header to acknowledge the receive of a DLM_OPTS message to it's sender. DLM_FIN: This provides a 4 way handshake for connection termination inclusive support for half-closed connections. It's provided on application layer because SCTP doesn't support half-closed sockets, the shutdown() call can interrupted by e.g. TCP resets itself and a hard logic to implement it because the othercon paradigm in lowcomms. The 4-way termination handshake also solve problems to synchronize peer EOF arrival and that the cluster manager removes the peer in the node membership handling of DLM. In some cases messages can be still transmitted in this time and we need to wait for the node membership event. To provide a reliable connection the node will retransmit all unacknowledges message to it's peer on reconnect. The receiver will then filtering out the next received message and drop all messages which are duplicates. As RCOM_STATUS and RCOM_NAMES messages are the first messages which are exchanged and they have they own re-transmission handling, there exists logic that these messages must be first. If these messages arrives we store the dlm version field. This handling is on DLM 3.1 and after this patch 3.2 the same. A backwards compatibility handling has been added which seems to work on tests without tcpkill, however it's not recommended to use DLM 3.1 and 3.2 at the same time, because DLM 3.2 tries to fix long term bugs in the DLM protocol. Signed-off-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com>	2021-05-25 09:22:20 -05:00
Alexander Aring	8e2e40860c	fs: dlm: add union in dlm header for lockspace id This patch adds union inside the lockspace id to handle it also for another use case for a different dlm command. Signed-off-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com>	2021-05-25 09:22:20 -05:00
Alexander Aring	37a247da51	fs: dlm: move out some hash functionality This patch moves out some lowcomms hash functionality into lowcomms header to provide them to other layers like midcomms as well. Signed-off-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com>	2021-05-25 09:22:20 -05:00
Alexander Aring	2874d1a68c	fs: dlm: add functionality to re-transmit a message This patch introduces a retransmit functionality for a lowcomms message handle. It's just allocates a new buffer and transmit it again, no special handling about prioritize it because keeping bytestream in order. To avoid another connection look some refactor was done to make a new buffer allocation with a preexisting connection pointer. Signed-off-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com>	2021-05-25 09:22:20 -05:00
Alexander Aring	8f2dc78dbc	fs: dlm: make buffer handling per msg This patch makes the void pointer handle for lowcomms functionality per message and not per page allocation entry. A refcount handling for the handle was added to keep the message alive until the user doesn't need it anymore. There exists now a per message callback which will be called when allocating a new buffer. This callback will be guaranteed to be called according the order of the sending buffer, which can be used that the caller increments a sequence number for the dlm message handle. For transition process we cast the dlm_mhandle to dlm_msg and vice versa until the midcomms layer will implement a specific dlm_mhandle structure. Signed-off-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com>	2021-05-25 09:22:20 -05:00
Alexander Aring	a070a91cf1	fs: dlm: add more midcomms hooks This patch prepares hooks to redirect to the midcomms layer which will be used by the midcomms re-transmit handling. There exists the new concept of stateless buffers allocation and commits. This can be used to bypass the midcomms re-transmit handling. It is used by RCOM_STATUS and RCOM_NAMES messages, because they have their own ping-like re-transmit handling. As well these two messages will be used to determine the DLM version per node, because these two messages are per observation the first messages which are exchanged. Cluster manager events for node membership are added to add support for half-closed connections in cases that the peer connection get to an end of file but DLM still holds membership of the node. In this time DLM can still trigger new message which we should allow. After the cluster manager node removal event occurs it safe to close the connection. Signed-off-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com>	2021-05-25 09:22:20 -05:00
Alexander Aring	6fb5cf9d42	fs: dlm: public header in out utility This patch allows to use header_out() and header_in() outside of dlm util functionality. Signed-off-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com>	2021-05-25 09:22:20 -05:00
Alexander Aring	8aa31cbf20	fs: dlm: fix connection tcp EOF handling This patch fixes the EOF handling for TCP that if and EOF is received we will close the socket next time the writequeue runs empty. This is a half-closed socket functionality which doesn't exists in SCTP. The midcomms layer will do a half closed socket functionality on DLM side to solve this problem for the SCTP case. However there is still the last ack flying around but other reset functionality will take care of it if it got lost. Signed-off-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com>	2021-05-25 09:22:20 -05:00
Alexander Aring	c6aa00e3d2	fs: dlm: cancel work sync othercon These rx tx flags arguments are for signaling close_connection() from which worker they are called. Obviously the receive worker cannot cancel itself and vice versa for swork. For the othercon the receive worker should only be used, however to avoid deadlocks we should pass the same flags as the original close_connection() was called. Signed-off-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com>	2021-05-25 09:22:20 -05:00
Alexander Aring	ba868d9dea	fs: dlm: reconnect if socket error report occurs This patch will change the reconnect handling that if an error occurs if a socket error callback is occurred. This will also handle reconnects in a non blocking connecting case which is currently missing. If error ECONNREFUSED is reported we delay the reconnect by one second. Signed-off-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com>	2021-05-25 09:22:20 -05:00
Alexander Aring	7443bc9625	fs: dlm: set is othercon flag There is a is othercon flag which is never used, this patch will set it and printout a warning if the othercon ever sends a dlm message which should never be the case. Signed-off-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com>	2021-05-25 09:22:20 -05:00
Alexander Aring	b38bc9c2b3	fs: dlm: fix srcu read lock usage This patch holds the srcu connection read lock in cases where we lookup the connections and accessing it. We don't hold the srcu lock in workers function where the scheduled worker is part of the connection itself. The connection should not be freed if any worker is scheduled or pending. Signed-off-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com>	2021-05-25 09:22:20 -05:00
Alexander Aring	2df6b7627a	fs: dlm: add dlm macros for ratelimit log This patch add ratelimit macro to dlm subsystem and will set the connecting log message to ratelimit. In non blocking connecting cases it will print out this message a lot. Signed-off-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com>	2021-05-25 09:22:20 -05:00
Alexander Aring	c937aabbd7	fs: dlm: always run complete for possible waiters This patch changes the ping_members() result that we always run complete() for possible waiters. We handle the -EINTR error code as successful. This error code is returned if the recovery is stopped which is likely that a new recovery is triggered with a new members configuration and ping_members() runs again. Signed-off-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com>	2021-05-25 09:22:20 -05:00
Yang Yingliang	2fd8db2dd0	fs: dlm: fix missing unlock on error in accept_from_sock() Add the missing unlock before return from accept_from_sock() in the error handling case. Fixes: `6cde210a97` ("fs: dlm: add helper for init connection") Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Signed-off-by: David Teigland <teigland@redhat.com>	2021-03-29 13:28:18 -05:00
Alexander Aring	9d232469bc	fs: dlm: add shutdown hook This patch fixes issues which occurs when dlm lowcomms synchronize their workqueues but dlm application layer already released the lockspace. In such cases messages like: dlm: gfs2: release_lockspace final free dlm: invalid lockspace 3841231384 from 1 cmd 1 type 11 are printed on the kernel log. This patch is solving this issue by introducing a new "shutdown" hook before calling "stop" hook when the lockspace is going to be released finally. This should pretend any dlm messages sitting in the workqueues during or after lockspace removal. It's necessary to call dlm_scand_stop() as I instrumented dlm_lowcomms_get_buffer() code to report a warning after it's called after dlm_midcomms_shutdown() functionality, see below: WARNING: CPU: 1 PID: 3794 at fs/dlm/midcomms.c:1003 dlm_midcomms_get_buffer+0x167/0x180 Modules linked in: joydev iTCO_wdt intel_pmc_bxt iTCO_vendor_support drm_ttm_helper ttm pcspkr serio_raw i2c_i801 i2c_smbus drm_kms_helper virtio_scsi lpc_ich virtio_balloon virtio_console xhci_pci xhci_pci_renesas cec qemu_fw_cfg drm [last unloaded: qxl] CPU: 1 PID: 3794 Comm: dlm_scand Tainted: G W 5.11.0+ #26 Hardware name: Red Hat KVM/RHEL-AV, BIOS 1.13.0-2.module+el8.3.0+7353+9de0a3cc 04/01/2014 RIP: 0010:dlm_midcomms_get_buffer+0x167/0x180 Code: 5d 41 5c 41 5d 41 5e 41 5f c3 0f 0b 45 31 e4 5b 5d 4c 89 e0 41 5c 41 5d 41 5e 41 5f c3 4c 89 e7 45 31 e4 e8 3b f1 ec ff eb 86 <0f> 0b 4c 89 e7 45 31 e4 e8 2c f1 ec ff e9 74 ff ff ff 0f 1f 80 00 RSP: 0018:ffffa81503f8fe60 EFLAGS: 00010202 RAX: 0000000000000008 RBX: ffff8f969827f200 RCX: 0000000000000001 RDX: 0000000000000000 RSI: ffffffffad1e89a0 RDI: ffff8f96a5294160 RBP: 0000000000000001 R08: 0000000000000000 R09: ffff8f96a250bc60 R10: 00000000000045d3 R11: 0000000000000000 R12: ffff8f96a250bc60 R13: ffffa81503f8fec8 R14: 0000000000000070 R15: 0000000000000c40 FS: 0000000000000000(0000) GS:ffff8f96fbc00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000055aa3351c000 CR3: 000000010bf22000 CR4: 00000000000006e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: dlm_scan_rsbs+0x420/0x670 ? dlm_uevent+0x20/0x20 dlm_scand+0xbf/0xe0 kthread+0x13a/0x150 ? __kthread_bind_mask+0x60/0x60 ret_from_fork+0x22/0x30 To synchronize all dlm scand messages we stop it right before shutdown hook. Signed-off-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com>	2021-03-09 08:56:42 -06:00
Alexander Aring	eec054b5a7	fs: dlm: flush swork on shutdown This patch fixes the flushing of send work before shutdown. The function cancel_work_sync() is not the right workqueue functionality to use here as it would cancel the work if the work queues itself. In cases of EAGAIN in send() for dlm message we need to be sure that everything is send out before. The function flush_work() will ensure that every send work is be done inclusive in EAGAIN cases. Signed-off-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com>	2021-03-09 08:56:42 -06:00
Alexander Aring	df9e06b800	fs: dlm: remove unaligned memory access handling This patch removes unaligned memory access handling for receiving midcomms messages. This handling will not fix the unaligned memory access in general. All messages should be length aligned to 8 bytes, there exists cases where this isn't the case. It's part of the sending handling to not send such messages. As the sending handling itself, with the internal allocator of page buffers, can occur in unaligned memory access of dlm message fields we just ignore that problem for now as it seems this code is used by architecture which can handle it. This patch adds a comment to take care about that problem in a major bump of dlm protocol. Signed-off-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com>	2021-03-09 08:56:42 -06:00
Alexander Aring	710176e836	fs: dlm: check on minimum msglen size This patch adds an additional check for minimum dlm header size which is an invalid dlm message and signals a broken stream. A msglen field cannot be less than the dlm header size because the field is inclusive header lengths. Signed-off-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com>	2021-03-09 08:56:42 -06:00
Alexander Aring	f0747ebf48	fs: dlm: simplify writequeue handling This patch cleans up the current dlm sending allocator handling by using some named macros, list functionality and removes some goto statements. Signed-off-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com>	2021-03-09 08:56:42 -06:00
Alexander Aring	e1a7cbce53	fs: dlm: use GFP_ZERO for page buffer This patch uses GFP_ZERO for allocate a page for the internal dlm sending buffer allocator instead of calling memset zero after every allocation. An already allocated space will never be reused again. Signed-off-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com>	2021-03-09 08:56:42 -06:00
Alexander Aring	c45674fbdd	fs: dlm: change allocation limits While running tcpkill I experienced invalid header length values while receiving to check that a node doesn't try to send a invalid dlm message we also check on applications minimum allocation limit. Also use DEFAULT_BUFFER_SIZE as maximum allocation limit. The define LOWCOMMS_MAX_TX_BUFFER_LEN is to calculate maximum buffer limits on application layer, future midcomms layer will subtract their needs from this define. Signed-off-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com>	2021-03-09 08:56:42 -06:00
Alexander Aring	517461630d	fs: dlm: add check if dlm is currently running This patch adds checks for dlm config attributes regarding to protocol parameters as it makes only sense to change them when dlm is not running. It also adds a check for valid protocol specifiers and return invalid argument if they are not supported. Signed-off-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com>	2021-03-09 08:56:42 -06:00
Alexander Aring	8aa9540b49	fs: dlm: add errno handling to check callback This allows to return individual errno values for the config attribute check callback instead of returning invalid argument only. Signed-off-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com>	2021-03-09 08:56:42 -06:00
Alexander Aring	e9a470acd9	fs: dlm: set subclass for othercon sock_mutex This patch sets the lockdep subclass for the othercon socket mutex. In various places the connection socket mutex is held while locking the othercon socket mutex. This patch will remove lockdep warnings when such case occurs. Signed-off-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com>	2021-03-09 08:56:42 -06:00
Alexander Aring	b30a624f50	fs: dlm: set connected bit after accept This patch sets the CF_CONNECTED bit when dlm accepts a connection from another node. If we don't set this bit, next time if the connection socket gets writable it will assume an event that the connection is successfully connected. However that is only the case when the connection did a connect. Signed-off-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com>	2021-03-09 08:56:42 -06:00
Alexander Aring	e125fbeb53	fs: dlm: fix mark setting deadlock This patch fixes an deadlock issue when dlm_lowcomms_close() is called. When dlm_lowcomms_close() is called the clusters_root.subsys.su_mutex is held to remove configfs items. At this time we flushing (e.g. cancel_work_sync()) the workers of send and recv workqueue. Due the fact that we accessing configfs items (mark values), these workers will lock clusters_root.subsys.su_mutex as well which are already hold by dlm_lowcomms_close() and ends in a deadlock situation. [67170.703046] ====================================================== [67170.703965] WARNING: possible circular locking dependency detected [67170.704758] 5.11.0-rc4+ #22 Tainted: G W [67170.705433] ------------------------------------------------------ [67170.706228] dlm_controld/280 is trying to acquire lock: [67170.706915] ffff9f2f475a6948 ((wq_completion)dlm_recv){+.+.}-{0:0}, at: __flush_work+0x203/0x4c0 [67170.708026] but task is already holding lock: [67170.708758] ffffffffa132f878 (&clusters_root.subsys.su_mutex){+.+.}-{3:3}, at: configfs_rmdir+0x29b/0x310 [67170.710016] which lock already depends on the new lock. The new behaviour adds the mark value to the node address configuration which doesn't require to held the clusters_root.subsys.su_mutex by accessing mark values in a separate datastructure. However the mark values can be set now only after a node address was set which is the case when the user is using dlm_controld. Signed-off-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com>	2021-03-09 08:56:42 -06:00
Alexander Aring	92c48950b4	fs: dlm: fix debugfs dump This patch fixes the following message which randomly pops up during glocktop call: seq_file: buggy .next function table_seq_next did not update position index The issue is that seq_read_iter() in fs/seq_file.c also needs an increment of the index in an non next record case as well which this patch fixes otherwise seq_read_iter() will print out the above message. Signed-off-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com>	2021-03-09 08:56:42 -06:00
Alexander Aring	4f19d071f9	fs: dlm: check on existing node address This patch checks if we add twice the same address to a per node address array. This should never be the case and we report -EEXIST to the user space. Signed-off-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com>	2020-11-10 12:14:20 -06:00
Alexander Aring	40c6b83e5a	fs: dlm: constify addr_compare This patch just constify some function parameter which should be have a read access only. Signed-off-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com>	2020-11-10 12:14:20 -06:00
Alexander Aring	1a26bfafbc	fs: dlm: fix check for multi-homed hosts This patch will use the runtime array size dlm_local_count variable to check the actual size of the dlm_local_addr array. There exists currently a cleanup bug, because the tcp_listen_for_all() functionality might check on a dangled pointer. Signed-off-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com>	2020-11-10 12:14:20 -06:00
Alexander Aring	d11ccd451b	fs: dlm: listen socket out of connection hash This patch introduces a own connection structure for the listen socket handling instead of handling the listen socket as normal connection structure in the connection hash. We can remove some nodeid equals zero validation checks, because this nodeid should not exists anymore inside the node hash. This patch also removes the sock mutex in accept_from_sock() function because this function can't occur in another parallel context if it's scheduled on only one workqueue. Signed-off-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com>	2020-11-10 12:14:20 -06:00
Alexander Aring	13004e8afe	fs: dlm: refactor sctp sock parameter This patch refactors sctp_bind_addrs() to work with a socket parameter instead of a connection parameter. Signed-off-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com>	2020-11-10 12:14:20 -06:00
Alexander Aring	42873c903b	fs: dlm: move shutdown action to node creation This patch move the assignment for the shutdown action callback to the node creation functionality. Signed-off-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com>	2020-11-10 12:14:20 -06:00
Alexander Aring	0672c3c280	fs: dlm: move connect callback in node creation This patch moves the assignment for the connect callback to the node creation instead of assign some dummy functionality. The assignment which connect functionality will be used will be detected according to the configfs setting. Signed-off-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com>	2020-11-10 12:14:20 -06:00
Alexander Aring	6cde210a97	fs: dlm: add helper for init connection This patch will move the connection structure initialization into an own function. This avoids cases to update the othercon initialization. Signed-off-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com>	2020-11-10 12:14:20 -06:00
Alexander Aring	19633c7e20	fs: dlm: handle non blocked connect event The manpage of connect shows that in non blocked mode a writeability indicates successful connection event. This patch is handling this event inside the writeability callback. In case of SCTP we use blocking connect functionality which indicates a successful connect when the function returns with a successful return value. Signed-off-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com>	2020-11-10 12:14:20 -06:00
Alexander Aring	53a5edaa05	fs: dlm: flush othercon at close This patch ensures we also flush the othercon writequeue when a lowcomms close occurs. Signed-off-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com>	2020-11-10 12:14:20 -06:00
Alexander Aring	692f51c8cb	fs: dlm: add get buffer error handling This patch adds an error handling to the get buffer functionality if the user is requesting a buffer length which is more than possible of the internal buffer allocator. This should never happen because specific handling decided by compile time, but will warn if somebody forget about to handle this limitation right. Signed-off-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com>	2020-11-10 12:14:20 -06:00
Alexander Aring	9f8f9c774a	fs: dlm: define max send buffer This patch will set the maximum transmit buffer size for rcom messages with "names" to 4096 bytes. It's a leftover change of commit `4798cbbfbd` ("fs: dlm: rework receive handling"). Fact is that we cannot allocate a contiguous transmit buffer length above of 4096 bytes. It seems at some places the upper layer protocol will calculate according to dlm_config.ci_buffer_size the possible payload of a dlm recovery message. As compiler setting we will use now the maximum possible message which dlm can send out. Commit `4e192ee68e` ("fs: dlm: disallow buffer size below default") disallow a buffer setting smaller than the 4096 bytes and above 4096 bytes is definitely wrong because we will then write out of buffer space as we cannot allocate a contiguous buffer above 4096 bytes. The ci_buffer_size is still there to define the possible maximum receive buffer size of a recvmsg() which should be at least the maximum possible dlm message size. Signed-off-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com>	2020-11-10 12:14:20 -06:00
Alexander Aring	5cbec208dc	fs: dlm: fix proper srcu api call This patch will use call_srcu() instead of call_rcu() because the related datastructure resource are handled under srcu context. I assume the current code is fine anyway since free_conn() must be called when the related resource are not in use otherwise. However it will correct the overall handling in a srcu context. Signed-off-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com>	2020-11-10 12:14:20 -06:00
Linus Torvalds	9ff9b0d392	networking changes for the 5.10 merge window Add redirect_neigh() BPF packet redirect helper, allowing to limit stack traversal in common container configs and improving TCP back-pressure. Daniel reports ~10Gbps => ~15Gbps single stream TCP performance gain. Expand netlink policy support and improve policy export to user space. (Ge)netlink core performs request validation according to declared policies. Expand the expressiveness of those policies (min/max length and bitmasks). Allow dumping policies for particular commands. This is used for feature discovery by user space (instead of kernel version parsing or trial and error). Support IGMPv3/MLDv2 multicast listener discovery protocols in bridge. Allow more than 255 IPv4 multicast interfaces. Add support for Type of Service (ToS) reflection in SYN/SYN-ACK packets of TCPv6. In Multi-patch TCP (MPTCP) support concurrent transmission of data on multiple subflows in a load balancing scenario. Enhance advertising addresses via the RM_ADDR/ADD_ADDR options. Support SMC-Dv2 version of SMC, which enables multi-subnet deployments. Allow more calls to same peer in RxRPC. Support two new Controller Area Network (CAN) protocols - CAN-FD and ISO 15765-2:2016. Add xfrm/IPsec compat layer, solving the 32bit user space on 64bit kernel problem. Add TC actions for implementing MPLS L2 VPNs. Improve nexthop code - e.g. handle various corner cases when nexthop objects are removed from groups better, skip unnecessary notifications and make it easier to offload nexthops into HW by converting to a blocking notifier. Support adding and consuming TCP header options by BPF programs, opening the doors for easy experimental and deployment-specific TCP option use. Reorganize TCP congestion control (CC) initialization to simplify life of TCP CC implemented in BPF. Add support for shipping BPF programs with the kernel and loading them early on boot via the User Mode Driver mechanism, hence reusing all the user space infra we have. Support sleepable BPF programs, initially targeting LSM and tracing. Add bpf_d_path() helper for returning full path for given 'struct path'. Make bpf_tail_call compatible with bpf-to-bpf calls. Allow BPF programs to call map_update_elem on sockmaps. Add BPF Type Format (BTF) support for type and enum discovery, as well as support for using BTF within the kernel itself (current use is for pretty printing structures). Support listing and getting information about bpf_links via the bpf syscall. Enhance kernel interfaces around NIC firmware update. Allow specifying overwrite mask to control if settings etc. are reset during update; report expected max time operation may take to users; support firmware activation without machine reboot incl. limits of how much impact reset may have (e.g. dropping link or not). Extend ethtool configuration interface to report IEEE-standard counters, to limit the need for per-vendor logic in user space. Adopt or extend devlink use for debug, monitoring, fw update in many drivers (dsa loop, ice, ionic, sja1105, qed, mlxsw, mv88e6xxx, dpaa2-eth). In mlxsw expose critical and emergency SFP module temperature alarms. Refactor port buffer handling to make the defaults more suitable and support setting these values explicitly via the DCBNL interface. Add XDP support for Intel's igb driver. Support offloading TC flower classification and filtering rules to mscc_ocelot switches. Add PTP support for Marvell Octeontx2 and PP2.2 hardware, as well as fixed interval period pulse generator and one-step timestamping in dpaa-eth. Add support for various auth offloads in WiFi APs, e.g. SAE (WPA3) offload. Add Lynx PHY/PCS MDIO module, and convert various drivers which have this HW to use it. Convert mvpp2 to split PCS. Support Marvell Prestera 98DX3255 24-port switch ASICs, as well as 7-port Mediatek MT7531 IP. Add initial support for QCA6390 and IPQ6018 in ath11k WiFi driver, and wcn3680 support in wcn36xx. Improve performance for packets which don't require much offloads on recent Mellanox NICs by 20% by making multiple packets share a descriptor entry. Move chelsio inline crypto drivers (for TLS and IPsec) from the crypto subtree to drivers/net. Move MDIO drivers out of the phy directory. Clean up a lot of W=1 warnings, reportedly the actively developed subsections of networking drivers should now build W=1 warning free. Make sure drivers don't use in_interrupt() to dynamically adapt their code. Convert tasklets to use new tasklet_setup API (sadly this conversion is not yet complete). Signed-off-by: Jakub Kicinski <kuba@kernel.org> -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAl+ItRwACgkQMUZtbf5S IrtTMg//UxpdR/MirT1DatBU0K/UGAZY82hV7F/UC8tPgjfHZeHvWlDFxfi3YP81 PtPKbhRZ7DhwBXefUp6nY3UdvjftrJK2lJm8prJUPSsZRye8Wlcb7y65q7/P2y2U Efucyopg6RUrmrM0DUsIGYGJgylQLHnMYUl/keCsD4t5Bp4ksyi9R2t5eitGoWzh r3QGdbSa0AuWx4iu0i+tqp6Tj0ekMBMXLVb35dtU1t0joj2KTNEnSgABN3prOa8E iWYf2erOau68Ogp3yU3miCy0ZU4p/7qGHTtzbcp677692P/ekak6+zmfHLT9/Pjy 2Stq2z6GoKuVxdktr91D9pA3jxG4LxSJmr0TImcGnXbvkMP3Ez3g9RrpV5fn8j6F mZCH8TKZAoD5aJrAJAMkhZmLYE1pvDa7KolSk8WogXrbCnTEb5Nv8FHTS1Qnk3yl wSKXuvutFVNLMEHCnWQLtODbTST9DI/aOi6EctPpuOA/ZyL1v3pl+gfp37S+LUTe owMnT/7TdvKaTD0+gIyU53M6rAWTtr5YyRQorX9awIu/4Ha0F0gYD7BJZQUGtegp HzKt59NiSrFdbSH7UdyemdBF4LuCgIhS7rgfeoUXMXmuPHq7eHXyHZt5dzPPa/xP 81P0MAvdpFVwg8ij2yp2sHS7sISIRKq17fd1tIewUabxQbjXqPc= =bc1U -----END PGP SIGNATURE----- Merge tag 'net-next-5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next Pull networking updates from Jakub Kicinski: - Add redirect_neigh() BPF packet redirect helper, allowing to limit stack traversal in common container configs and improving TCP back-pressure. Daniel reports ~10Gbps => ~15Gbps single stream TCP performance gain. - Expand netlink policy support and improve policy export to user space. (Ge)netlink core performs request validation according to declared policies. Expand the expressiveness of those policies (min/max length and bitmasks). Allow dumping policies for particular commands. This is used for feature discovery by user space (instead of kernel version parsing or trial and error). - Support IGMPv3/MLDv2 multicast listener discovery protocols in bridge. - Allow more than 255 IPv4 multicast interfaces. - Add support for Type of Service (ToS) reflection in SYN/SYN-ACK packets of TCPv6. - In Multi-patch TCP (MPTCP) support concurrent transmission of data on multiple subflows in a load balancing scenario. Enhance advertising addresses via the RM_ADDR/ADD_ADDR options. - Support SMC-Dv2 version of SMC, which enables multi-subnet deployments. - Allow more calls to same peer in RxRPC. - Support two new Controller Area Network (CAN) protocols - CAN-FD and ISO 15765-2:2016. - Add xfrm/IPsec compat layer, solving the 32bit user space on 64bit kernel problem. - Add TC actions for implementing MPLS L2 VPNs. - Improve nexthop code - e.g. handle various corner cases when nexthop objects are removed from groups better, skip unnecessary notifications and make it easier to offload nexthops into HW by converting to a blocking notifier. - Support adding and consuming TCP header options by BPF programs, opening the doors for easy experimental and deployment-specific TCP option use. - Reorganize TCP congestion control (CC) initialization to simplify life of TCP CC implemented in BPF. - Add support for shipping BPF programs with the kernel and loading them early on boot via the User Mode Driver mechanism, hence reusing all the user space infra we have. - Support sleepable BPF programs, initially targeting LSM and tracing. - Add bpf_d_path() helper for returning full path for given 'struct path'. - Make bpf_tail_call compatible with bpf-to-bpf calls. - Allow BPF programs to call map_update_elem on sockmaps. - Add BPF Type Format (BTF) support for type and enum discovery, as well as support for using BTF within the kernel itself (current use is for pretty printing structures). - Support listing and getting information about bpf_links via the bpf syscall. - Enhance kernel interfaces around NIC firmware update. Allow specifying overwrite mask to control if settings etc. are reset during update; report expected max time operation may take to users; support firmware activation without machine reboot incl. limits of how much impact reset may have (e.g. dropping link or not). - Extend ethtool configuration interface to report IEEE-standard counters, to limit the need for per-vendor logic in user space. - Adopt or extend devlink use for debug, monitoring, fw update in many drivers (dsa loop, ice, ionic, sja1105, qed, mlxsw, mv88e6xxx, dpaa2-eth). - In mlxsw expose critical and emergency SFP module temperature alarms. Refactor port buffer handling to make the defaults more suitable and support setting these values explicitly via the DCBNL interface. - Add XDP support for Intel's igb driver. - Support offloading TC flower classification and filtering rules to mscc_ocelot switches. - Add PTP support for Marvell Octeontx2 and PP2.2 hardware, as well as fixed interval period pulse generator and one-step timestamping in dpaa-eth. - Add support for various auth offloads in WiFi APs, e.g. SAE (WPA3) offload. - Add Lynx PHY/PCS MDIO module, and convert various drivers which have this HW to use it. Convert mvpp2 to split PCS. - Support Marvell Prestera 98DX3255 24-port switch ASICs, as well as 7-port Mediatek MT7531 IP. - Add initial support for QCA6390 and IPQ6018 in ath11k WiFi driver, and wcn3680 support in wcn36xx. - Improve performance for packets which don't require much offloads on recent Mellanox NICs by 20% by making multiple packets share a descriptor entry. - Move chelsio inline crypto drivers (for TLS and IPsec) from the crypto subtree to drivers/net. Move MDIO drivers out of the phy directory. - Clean up a lot of W=1 warnings, reportedly the actively developed subsections of networking drivers should now build W=1 warning free. - Make sure drivers don't use in_interrupt() to dynamically adapt their code. Convert tasklets to use new tasklet_setup API (sadly this conversion is not yet complete). * tag 'net-next-5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (2583 commits) Revert "bpfilter: Fix build error with CONFIG_BPFILTER_UMH" net, sockmap: Don't call bpf_prog_put() on NULL pointer bpf, selftest: Fix flaky tcp_hdr_options test when adding addr to lo bpf, sockmap: Add locking annotations to iterator netfilter: nftables: allow re-computing sctp CRC-32C in 'payload' statements net: fix pos incrementment in ipv6_route_seq_next net/smc: fix invalid return code in smcd_new_buf_create() net/smc: fix valid DMBE buffer sizes net/smc: fix use-after-free of delayed events bpfilter: Fix build error with CONFIG_BPFILTER_UMH cxgb4/ch_ipsec: Replace the module name to ch_ipsec from chcr net: sched: Fix suspicious RCU usage while accessing tcf_tunnel_info bpf: Fix register equivalence tracking. rxrpc: Fix loss of final ack on shutdown rxrpc: Fix bundle counting for exclusive connections netfilter: restore NF_INET_NUMHOOKS ibmveth: Identify ingress large send packets. ibmveth: Switch order of ibmveth_helper calls. cxgb4: handle 4-tuple PEDIT to NAT mode translation selftests: Add VRF route leaking tests ...	2020-10-15 18:42:13 -07:00
Linus Torvalds	c024a81125	dlm for 5.10 This set continues the ongoing rework of the low level communication layer in the dlm. The focus here is on improvements to connection handling, and reworking the receiving of messages. -----BEGIN PGP SIGNATURE----- iQIcBAABAgAGBQJfhJylAAoJEDgbc8f8gGmqiRYP/R8rHeiAtBuTQebIG2S2FRjT OoCsr6F240SxyNJYAJlsV4kWGLtRQ0qnHhWku6nAreg9Yw/+0C7ZRHwDNoJsE2/9 JuCW6HhqN6jYcWWhV3BZA7wvWzPfzdC7Jnla7f9GGB9ToFlani7CLEj5qzkyEIxh KaXfFGTHBCftM20HaNcExqBwmB0bn7jiavlR2Nqnsh/FW+er1HPa4rIIJqYy6k31 ymf0XJ3kZYgf/I4iArUZkR7FKHHy1GhWW10NSQ/DDfwGtkbQ1Lw8IdBZ/zkyheAG aInFcxEt+hQPTMOBSB4hJn4+QPvyNAd9UxjFLuaHawUNglH73PXBk77kGgj8xJGU qRcaugX5brVV1tpY2UPQO8MC8ITadmKRa7uZkRoI2hIZfsZO2z+TSgRkegsSDIlD wXYLQslSYImZ5k42wHqaONxD4nW/haZxdrhul4sP8Z1+d5WmoPE1UDlXMTvbyp/N iW3+jhvPc1NAyzEPdmMj/y7zmCX+yrlkRrO1YjTkpEOIpN5uUaxg/1g8ok5OProR Xyx4b9gv3r8/3CGQvYTOiNr9ZUj8kPR8rWv4fbQXAt+kQMAKnIjyRKp4SpfFlp1L pKv9UU/sdUsduCoiRDD5+SiqNLGDWSH5UAkYvkH0cz3QDELSEYkLlMA6O00QsDAH o1f+TFcNFKsphk47DXVo =/n6H -----END PGP SIGNATURE----- Merge tag 'dlm-5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/linux-dlm Pull dlm updates from David Teigland: "This set continues the ongoing rework of the low level communication layer in the dlm. The focus here is on improvements to connection handling, and reworking the receiving of messages" * tag 'dlm-5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/linux-dlm: fs: dlm: fix race in nodeid2con fs: dlm: rework receive handling fs: dlm: disallow buffer size below default fs: dlm: handle range check as callback fs: dlm: fix mark per nodeid setting fs: dlm: remove lock dependency warning fs: dlm: use free_con to free connection fs: dlm: handle possible othercon writequeues fs: dlm: move free writequeue into con free fs: dlm: fix configfs memory leak fs: dlm: fix dlm_local_addr memory leak fs: dlm: make connection hash lockless fs: dlm: synchronize dlm before shutdown	2020-10-13 08:59:39 -07:00
Jakub Kicinski	66a9b9287d	genetlink: move to smaller ops wherever possible Bulk of the genetlink users can use smaller ops, move them. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-10-02 19:11:11 -07:00
Alexander Aring	4f2b30fd9b	fs: dlm: fix race in nodeid2con This patch fixes a race in nodeid2con in cases that we parallel running a lookup and both will create a connection structure for the same nodeid. It's a rare case to create a new connection structure to keep reader lockless we just do a lookup inside the protection area again and drop previous work if this race happens. Fixes: `a47666eb76` ("fs: dlm: make connection hash lockless") Signed-off-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com>	2020-10-01 09:25:07 -05:00
Alexander Aring	4798cbbfbd	fs: dlm: rework receive handling This patch reworks the current receive handling of dlm. As I tried to change the send handling to fix reorder issues I took a look into the receive handling and simplified it, it works as the following: Each connection has a preallocated receive buffer with a minimum length of 4096. On receive, the upper layer protocol will process all dlm message until there is not enough data anymore. If there exists "leftover" data at the end of the receive buffer because the dlm message wasn't fully received it will be copied to the begin of the preallocated receive buffer. Next receive more data will be appended to the previous "leftover" data and processing will begin again. This will remove a lot of code of the current mechanism. Inside the processing functionality we will ensure with a memmove() that the dlm message should be memory aligned. To have a dlm message always started at the beginning of the buffer will reduce some amount of memmove() calls because src and dest pointers are the same. The cluster attribute "buffer_size" becomes a new meaning, it's now the size of application layer receive buffer size. If this is changed during runtime the receive buffer will be reallocated. It's important that the receive buffer size has at minimum the size of the maximum possible dlm message size otherwise the received message cannot be placed inside the receive buffer size. Signed-off-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com>	2020-09-29 14:00:32 -05:00
Alexander Aring	4e192ee68e	fs: dlm: disallow buffer size below default I observed that the upper layer will not send messages above this value. As conclusion the application receive buffer should not below that value, otherwise we are not capable to deliver the dlm message to the upper layer. This patch forbids to set the receive buffer below the maximum possible dlm message size. Signed-off-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com>	2020-09-29 14:00:32 -05:00
Alexander Aring	e1a0ec30a5	fs: dlm: handle range check as callback This patch adds a callback to CLUSTER_ATTR macro to allow individual callbacks for attributes which might have a more complex attribute range checking just than non zero. Signed-off-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com>	2020-09-29 14:00:32 -05:00
Alexander Aring	3f78cd7d24	fs: dlm: fix mark per nodeid setting This patch fixes to set per nodeid mark configuration for accepted sockets as well. Before this patch only the listen socket mark value was used for all accepted connections. This patch will ensure that the cluster mark attribute value will be always used for all sockets, if a per nodeid mark value is specified dlm will use this value for the specific node. Signed-off-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com>	2020-09-29 14:00:32 -05:00
Alexander Aring	0461e0db94	fs: dlm: remove lock dependency warning During my experiments to make dlm robust against tcpkill application I was able to run sometimes in a circular lock dependency warning between clusters_root.subsys.su_mutex and con->sock_mutex. We don't need to held the sock_mutex when getting the mark value which held the clusters_root.subsys.su_mutex. This patch moves the specific handling just before the sock_mutex will be held. Signed-off-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com>	2020-09-29 14:00:32 -05:00
Alexander Aring	7ae0451e2e	fs: dlm: use free_con to free connection This patch use free_con() functionality to free the listen connection if listen fails. It also fixes an issue that a freed resource is still part of the connection_hash as hlist_del() is not called in this case. The only difference is that free_con() handles othercon as well, but this is never been set for the listen connection. Signed-off-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com>	2020-08-27 15:59:09 -05:00
Alexander Aring	948c47e9bc	fs: dlm: handle possible othercon writequeues This patch adds free of possible other writequeue entries in othercon member of struct connection. Signed-off-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com>	2020-08-27 15:59:09 -05:00
Alexander Aring	0de984323a	fs: dlm: move free writequeue into con free This patch just move the free of struct connection member writequeue into the functionality when struct connection will be freed instead of doing two iterations. Signed-off-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com>	2020-08-27 15:59:09 -05:00
Alexander Aring	3d2825c8c6	fs: dlm: fix configfs memory leak This patch fixes the following memory detected by kmemleak and umount gfs2 filesystem which removed the last lockspace: unreferenced object 0xffff9264f482f600 (size 192): comm "dlm_controld", pid 325, jiffies 4294690276 (age 48.136s) hex dump (first 32 bytes): 00 00 00 00 00 00 00 00 6e 6f 64 65 73 00 00 00 ........nodes... 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace: [<00000000060481d7>] make_space+0x41/0x130 [<000000008d905d46>] configfs_mkdir+0x1a2/0x5f0 [<00000000729502cf>] vfs_mkdir+0x155/0x210 [<000000000369bcf1>] do_mkdirat+0x6d/0x110 [<00000000cc478a33>] do_syscall_64+0x33/0x40 [<00000000ce9ccf01>] entry_SYSCALL_64_after_hwframe+0x44/0xa9 The patch just remembers the "nodes" entry pointer in space as I think it's created as subdirectory when parent "spaces" is created. In function drop_space() we will lost the pointer reference to nds because configfs_remove_default_groups(). However as this subdirectory is always available when "spaces" exists it will just be freed when "spaces" will be freed. Signed-off-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com>	2020-08-27 15:59:09 -05:00
Alexander Aring	043697f030	fs: dlm: fix dlm_local_addr memory leak This patch fixes the following memory detected by kmemleak and umount gfs2 filesystem which removed the last lockspace: unreferenced object 0xffff9264f4f48f00 (size 128): comm "mount", pid 425, jiffies 4294690253 (age 48.159s) hex dump (first 32 bytes): 02 00 52 48 c0 a8 7a fb 00 00 00 00 00 00 00 00 ..RH..z......... 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace: [<0000000067a34940>] kmemdup+0x18/0x40 [<00000000c935f9ab>] init_local+0x4c/0xa0 [<00000000bbd286ef>] dlm_lowcomms_start+0x28/0x160 [<00000000a86625cb>] dlm_new_lockspace+0x7e/0xb80 [<000000008df6cd63>] gdlm_mount+0x1cc/0x5de [<00000000b67df8c7>] gfs2_lm_mount.constprop.0+0x1a3/0x1d3 [<000000006642ac5e>] gfs2_fill_super+0x717/0xba9 [<00000000d3ab7118>] get_tree_bdev+0x17f/0x280 [<000000001975926e>] gfs2_get_tree+0x21/0x90 [<00000000561ce1c4>] vfs_get_tree+0x28/0xc0 [<000000007fecaf63>] path_mount+0x434/0xc00 [<00000000636b9594>] __x64_sys_mount+0xe3/0x120 [<00000000cc478a33>] do_syscall_64+0x33/0x40 [<00000000ce9ccf01>] entry_SYSCALL_64_after_hwframe+0x44/0xa9 Signed-off-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com>	2020-08-27 15:59:09 -05:00
Alexander Aring	a47666eb76	fs: dlm: make connection hash lockless There are some problems with the connections_lock. During my experiements I saw sometimes circular dependencies with sock_lock. The reason here might be code parts which runs nodeid2con() before or after sock_lock is acquired. Another issue are missing locks in for_conn() iteration. Maybe this works fine because for_conn() is running in a context where connection_hash cannot be manipulated by others anymore. However this patch changes the connection_hash to be protected by sleepable rcu. The hotpath function __find_con() is implemented lockless as it is only a reader of connection_hash and this hopefully fixes the circular locking dependencies. The iteration for_conn() will still call some sleepable functionality, that's why we use sleepable rcu in this case. This patch removes the kmemcache functionality as I think I need to make some free() functionality via call_rcu(). However allocation time isn't here an issue. The dlm_allow_con will not be protected by a lock anymore as I think it's enough to just set and flush workqueues afterwards. Signed-off-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com>	2020-08-27 15:59:09 -05:00
Alexander Aring	aa7ab1e208	fs: dlm: synchronize dlm before shutdown This patch moves the dlm workqueue dlm synchronization before shutdown handling. The patch just flushes all pending work before starting to shutdown the connection. At least for the send_workqeue we should flush the workqueue to make sure there is no new connection handling going on as dlm_allow_conn switch is turned to false before. Signed-off-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com>	2020-08-27 15:59:09 -05:00
Gustavo A. R. Silva	df561f6688	treewide: Use fallthrough pseudo-keyword Replace the existing /* fall through */ comments and its variants with the new pseudo-keyword macro fallthrough[1]. Also, remove unnecessary fall-through markings when it is the case. [1] https://www.kernel.org/doc/html/v5.7/process/deprecated.html?highlight=fallthrough#implicit-switch-case-fall-through Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>	2020-08-23 17:36:59 -05:00
Linus Torvalds	86cfccb669	dlm for 5.9 This set includes a some improvements to the dlm networking layer: improving the ability to trace dlm messages for debugging, and improved handling of bad messages or disrupted connections. -----BEGIN PGP SIGNATURE----- iQIcBAABAgAGBQJfLCPxAAoJEDgbc8f8gGmqz04P/2hvv/4rXo9AOgnnstvZV1Qy Yo01Cy807vB1c3jhIJryM2gG61GNH22RAHc2NcfjJwy04HH/1IEr6P48Po3qYEnS 8fZ8B9msxpsujVOrRoeBuLN8elI1HftyNVWaVjH7xtD+fLCDLu9i10kv3aeS+DiB T6f7yQQv7hgXS3xGvlMr2//aLwGD2ZdcRbkOEGo+k7yUjQbIDH/wdZWcPLh6y4yT p20i2ulYKjEZFmXDMa17diONISeGO6iaDhee24XPDwNDp8qI1iPGJsmxltMmn8Qf d2HPF1IDh4eM8lCwmqBtjYTnJd6rAW0v3+Ek1+wzQKVeXLFiz/MEyuOldtpsqmMO 8Og0vr6zfTCjFo8uvyj+cF7Fcj0yIPWg1yb7EauqqxreK8V9GBA1V2ZXYVd8xwea thrAUaq8f+PYQ9uy1FsN3xaO3BFN1VpcvHu4/3gU3OudnZZt2Ae670RYHKC0bq8D 2tSsqaiDnlvniHgh4xvtNIvRANkDS1ZSbkUPZhMHL7DnRJn66oDIfCr7NMbZwvCa AS0q6suUFyXFbAEJcY6XWxe3aQ3WuxIClT84MgzX/dAK2Qcl8ryWGGSVc0dp4Vl1 cd8MtmpnIWsnxqNRl4jn6cfolDheaxL8nouLtJ+3/dC9VkyDyfmrtnM+8aTZKHoa 3/xrBuVkEJAwkAAr8Pb8 =qgti -----END PGP SIGNATURE----- Merge tag 'dlm-5.9' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/linux-dlm Pull dlm updates from David Teigland: "This set includes a some improvements to the dlm networking layer: improving the ability to trace dlm messages for debugging, and improved handling of bad messages or disrupted connections" * tag 'dlm-5.9' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/linux-dlm: fs: dlm: implement tcp graceful shutdown fs: dlm: change handling of reconnects fs: dlm: don't close socket on invalid message fs: dlm: set skb mark per peer socket fs: dlm: set skb mark for listen socket net: sock: add sock_set_mark dlm: Fix kobject memleak	2020-08-06 19:44:25 -07:00
Alexander Aring	055923bf6b	fs: dlm: implement tcp graceful shutdown During my code inspection I saw there is no implementation of a graceful shutdown for tcp. This patch will introduce a graceful shutdown for tcp connections. The shutdown is implemented synchronized as dlm_lowcomms_stop() is called to end all dlm communication. After shutdown is done, a lot of flush and closing functionality will be called. However I don't see a problem with that. The waitqueue for synchronize the shutdown has a timeout of 10 seconds, if timeout a force close will be exectued. Signed-off-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com>	2020-08-06 10:30:54 -05:00
Alexander Aring	ba3ab3ca68	fs: dlm: change handling of reconnects This patch changes the handling of reconnects. At first we only close the connection related to the communication failure. If we get a new connection for an already existing connection we close the existing connection and take the new one. This patch improves significantly the stability of tcp connections while running "tcpkill -9 -i $IFACE port 21064" while generating a lot of dlm messages e.g. on a gfs2 mount with many files. My test setup shows that a deadlock is "more" unlikely. Before this patch I wasn't able to get not a deadlock after 5 seconds. After this patch my observation is that it's more likely to survive after 5 seconds and more, but still a deadlock occurs after certain time. My guess is that there are still "segments" inside the tcp writequeue or retransmit queue which get dropped when receiving a tcp reset [1]. Hard to reproduce because the right message need to be inside these queues, which might even be in the 5 first seconds with this patch. [1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/net/ipv4/tcp_input.c?h=v5.8-rc6#n4122 Signed-off-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com>	2020-08-06 10:30:54 -05:00
Alexander Aring	0ea47e4d21	fs: dlm: don't close socket on invalid message This patch doesn't close sockets when there is an invalid dlm message received. The connection will probably reconnect anyway so. To not close the connection will reduce the number of possible failtures. As we don't have a different strategy to react on such scenario just keep going the connection and ignore the message. Signed-off-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com>	2020-08-06 10:30:53 -05:00
Alexander Aring	9c9f168f5b	fs: dlm: set skb mark per peer socket This patch adds support to set the skb mark value for the DLM tcp and sctp socket per peer. The mark value will be offered as per comm value of configfs. At creation time of the peer socket it will be set as socket option. Signed-off-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com>	2020-08-06 10:30:52 -05:00
Alexander Aring	a5b7ab6352	fs: dlm: set skb mark for listen socket This patch adds support to set the skb mark value for the DLM listen tcp and sctp sockets. The mark value will be offered as cluster configuration. At creation time of the listen socket it will be set as socket option. Signed-off-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com>	2020-08-06 10:30:51 -05:00
Wang Hai	0ffddafc3a	dlm: Fix kobject memleak Currently the error return path from kobject_init_and_add() is not followed by a call to kobject_put() - which means we are leaking the kobject. Set do_unreg = 1 before kobject_init_and_add() to ensure that kobject_put() can be called in its error patch. Fixes: `901195ed7f` ("Kobject: change GFS2 to use kobject_init_and_add") Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Wang Hai <wanghai38@huawei.com> Signed-off-by: David Teigland <teigland@redhat.com>	2020-08-06 10:30:49 -05:00
Kees Cook	3f649ab728	treewide: Remove uninitialized_var() usage Using uninitialized_var() is dangerous as it papers over real bugs[1] (or can in the future), and suppresses unrelated compiler warnings (e.g. "unused variable"). If the compiler thinks it is uninitialized, either simply initialize the variable or make compiler changes. In preparation for removing[2] the[3] macro[4], remove all remaining needless uses with the following script: git grep '\buninitialized_var\b' \| cut -d: -f1 \| sort -u \| \ xargs perl -pi -e \ 's/\buninitialized_var$([^$]+)\)/\1/g; s:\s/\ (GCC be quiet\|to make compiler happy) \*/$::g;' drivers/video/fbdev/riva/riva_hw.c was manually tweaked to avoid pathological white-space. No outstanding warnings were found building allmodconfig with GCC 9.3.0 for x86_64, i386, arm64, arm, powerpc, powerpc64le, s390x, mips, sparc64, alpha, and m68k. [1] https://lore.kernel.org/lkml/20200603174714.192027-1-glider@google.com/ [2] https://lore.kernel.org/lkml/CA+55aFw+Vbj0i=1TGqCR5vQkCzWJ0QxK6CernOU6eedsudAixw@mail.gmail.com/ [3] https://lore.kernel.org/lkml/CA+55aFwgbgqhbp1fkxvRKEpzyR5J8n1vKT1VZdz9knmPuXhOeg@mail.gmail.com/ [4] https://lore.kernel.org/lkml/CA+55aFz2500WfbKXAx8s67wrm9=yVJu65TpLgN_ybYNv0VEOKA@mail.gmail.com/ Reviewed-by: Leon Romanovsky <leonro@mellanox.com> # drivers/infiniband and mlx4/mlx5 Acked-by: Jason Gunthorpe <jgg@mellanox.com> # IB Acked-by: Kalle Valo <kvalo@codeaurora.org> # wireless drivers Reviewed-by: Chao Yu <yuchao0@huawei.com> # erofs Signed-off-by: Kees Cook <keescook@chromium.org>	2020-07-16 12:35:15 -07:00
Linus Torvalds	e3cea0cad1	dlm for 5.8 This set includes a couple minor cleanups, and dropping the interruptible from a wait_event that waits for an event from the userspace cluster management. -----BEGIN PGP SIGNATURE----- iQIcBAABAgAGBQJe2op+AAoJEDgbc8f8gGmqJKkP/jE0BCnE7VbpID0sDaBLpntk +v53jxR/bpw6g+LwlJGBDLCRrcn8MnhC72W1OPZyFJ4AiuN5BHdvTVVSPSCBSHnU WfJKqbVSRt+PRaATJd8iCLdoMj858vLWFo380XVMTRsAaxYJ4dwRUMBUOx/pL8Dy pcDXDgWTC06nuRmSY8eYJYJWX3c6SdpHHbg4IA127Y9oTjIAvYyeFC4ohRtRJqrj bQlpjnfJ9bNwd5N15turGiR5IzY7UXI9wa/qylngr5B5gVxJttFpaE4OFF80HOkY Xxqqqhi9dIcyAAzftJaRQpyAq6kRmFgFAjH5Rvx+7n0BEMe+AAhQC9Coei1Zysh1 fssYrxRC3fqX1kg2alhpdBdCugFq7IUdFuwH044M+w8jcsoggGq8vGRMJEuKgnnm kLLBEi5HtAv0S9rAE9Acnqanc6lRvzJIc0vysG/0LQqzWr+F7HtRnS7kkOkO2+e1 014k20FmooJiu5XGrCz+dvmnIACpr6Z+S119uXntYGDDA+YGcJQTMs6pzLh5UTqB 40aIZPPRRj6K/Vvx2VJgEbzwmwWjjdtsYkcJEq+QPPsAkPa8EnbqICCyJDkUuY5K kltYHQ2IOa466KC8JHBS9jdVnHftk0jIl75eiP7vK+ZUEW+iwRgEXY11oiWeY23J EKfPgi0oikfmlj2Y3jme =VTRw -----END PGP SIGNATURE----- Merge tag 'dlm-5.8' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/linux-dlm Pull dlm updates from David Teigland: "This set includes a couple minor cleanups, and dropping the interruptible from a wait_event that waits for an event from the userspace cluster management" * tag 'dlm-5.8' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/linux-dlm: dlm: remove BUG() before panic() dlm: Switch to using wait_event() fs:dlm:remove unneeded semicolon in rcom.c dlm: user: Replace zero-length array with flexible-array member dlm: dlm_internal: Replace zero-length array with flexible-array member	2020-06-05 16:43:16 -07:00
Christoph Hellwig	c0425a4249	net: add a new bind_add method The SCTP protocol allows to bind multiple address to a socket. That feature is currently only exposed as a socket option. Add a bind_add method struct proto that allows to bind additional addresses, and switch the dlm code to use the method instead of going through the socket option from kernel space. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-05-29 13:10:39 -07:00
Christoph Hellwig	40ef92c6ec	sctp: add sctp_sock_set_nodelay Add a helper to directly set the SCTP_NODELAY sockopt from kernel space without going through a fake uaccess. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-05-29 13:10:39 -07:00
Christoph Hellwig	12abc5ee78	tcp: add tcp_sock_set_nodelay Add a helper to directly set the TCP_NODELAY sockopt from kernel space without going through a fake uaccess. Cleanup the callers to avoid pointless wrappers now that this is a simple function call. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Sagi Grimberg <sagi@grimberg.me> Acked-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-05-28 11:11:45 -07:00
Christoph Hellwig	26cfabf9cd	net: add sock_set_rcvbuf Add a helper to directly set the SO_RCVBUFFORCE sockopt from kernel space without going through a fake uaccess. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-05-28 11:11:44 -07:00
Christoph Hellwig	ce3d9544ce	net: add sock_set_keepalive Add a helper to directly set the SO_KEEPALIVE sockopt from kernel space without going through a fake uaccess. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-05-28 11:11:44 -07:00
Christoph Hellwig	76ee0785f4	net: add sock_set_sndtimeo Add a helper to directly set the SO_SNDTIMEO_NEW sockopt from kernel space without going through a fake uaccess. The interface is simplified to only pass the seconds value, as that is the only thing needed at the moment. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-05-28 11:11:44 -07:00
Christoph Hellwig	b58f0e8f38	net: add sock_set_reuseaddr Add a helper to directly set the SO_REUSEADDR sockopt from kernel space without going through a fake uaccess. For this the iscsi target now has to formally depend on inet to avoid a mostly theoretical compile failure. For actual operation it already did depend on having ipv4 or ipv6 support. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-05-28 11:11:44 -07:00
Christoph Hellwig	0774dc7643	dlm: use the tcp version of accept_from_sock for sctp as well The only difference between a few missing fixes applied to the SCTP one is that TCP uses ->getpeername to get the remote address, while SCTP uses kernel_getsockopt(.. SCTP_PRIMARY_ADDR). But given that getpeername is defined to return the primary address for sctp, there doesn't seem to be any reason for the different way of quering the peername, or all the code duplication. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-05-27 15:11:33 -07:00
Arnd Bergmann	fe204591cc	dlm: remove BUG() before panic() Building a kernel with clang sometimes fails with an objtool error in dlm: fs/dlm/lock.o: warning: objtool: revert_lock_pc()+0xbd: can't find jump dest instruction at .text+0xd7fc The problem is that BUG() never returns and the compiler knows that anything after it is unreachable, however the panic still emits some code that does not get fully eliminated. Having both BUG() and panic() is really pointless as the BUG() kills the current process and the subsequent panic() never hits. In most cases, we probably don't really want either and should replace the DLM_ASSERT() statements with WARN_ON(), as has been done for some of them. Remove the BUG() here so the user at least sees the panic message and we can reliably build randconfig kernels. Fixes: `e7fd41792f` ("[DLM] The core of the DLM for GFS2/CLVM") Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: clang-built-linux@googlegroups.com Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: David Teigland <teigland@redhat.com>	2020-05-12 14:06:18 -05:00
Ross Lagerwall	f084a4f4a1	dlm: Switch to using wait_event() We saw an issue in a production server on a customer deployment where DLM 4.0.7 gets "stuck" and unable to join new lockspaces. There is no useful response for the dlm in do_event() if wait_event_interruptible() is interrupted, so switch to wait_event(). Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com> Signed-off-by: David Teigland <teigland@redhat.com>	2020-05-12 14:06:17 -05:00
Wu Bo	90db4f8be3	fs:dlm:remove unneeded semicolon in rcom.c Fix the following coccicheck warning: fs/dlm/rcom.c:566:2-3: Unneeded semicolon Signed-off-by: Wu Bo <wubo40@huawei.com> Signed-off-by: David Teigland <teigland@redhat.com>	2020-05-12 14:06:16 -05:00
Gustavo A. R. Silva	3c80d3794d	dlm: user: Replace zero-length array with flexible-array member The current codebase makes use of the zero-length array language extension to the C90 standard, but the preferred mechanism to declare variable-length types such as these ones is a flexible array member[1][2], introduced in C99: struct foo { int stuff; struct boo array[]; }; By making use of the mechanism above, we will get a compiler warning in case the flexible array does not occur last in the structure, which will help us prevent some kind of undefined behavior bugs from being inadvertently introduced[3] to the codebase from now on. Also, notice that, dynamic memory allocations won't be affected by this change: "Flexible array members have incomplete type, and so the sizeof operator may not be applied. As a quirk of the original implementation of zero-length arrays, sizeof evaluates to zero."[1] This issue was found with the help of Coccinelle. [1] https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html [2] https://github.com/KSPP/linux/issues/21 [3] commit `7649773293` ("cxgb3/l2t: Fix undefined behaviour") Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Signed-off-by: David Teigland <teigland@redhat.com>	2020-05-12 14:06:15 -05:00
Gustavo A. R. Silva	a4e439a6f6	dlm: dlm_internal: Replace zero-length array with flexible-array member The current codebase makes use of the zero-length array language extension to the C90 standard, but the preferred mechanism to declare variable-length types such as these ones is a flexible array member[1][2], introduced in C99: struct foo { int stuff; struct boo array[]; }; By making use of the mechanism above, we will get a compiler warning in case the flexible array does not occur last in the structure, which will help us prevent some kind of undefined behavior bugs from being inadvertently introduced[3] to the codebase from now on. Also, notice that, dynamic memory allocations won't be affected by this change: "Flexible array members have incomplete type, and so the sizeof operator may not be applied. As a quirk of the original implementation of zero-length arrays, sizeof evaluates to zero."[1] This issue was found with the help of Coccinelle. [1] https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html [2] https://github.com/KSPP/linux/issues/21 [3] commit `7649773293` ("cxgb3/l2t: Fix undefined behaviour") Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Signed-off-by: David Teigland <teigland@redhat.com>	2020-05-12 14:06:14 -05:00
Arnd Bergmann	5311f707b4	dlm: use SO_SNDTIMEO_NEW instead of SO_SNDTIMEO_OLD Eliminate one more use of 'struct timeval' from the kernel so we can eventually remove the definition as well. The kernel supports the new format with a 64-bit time_t version of timeval here, so use that instead of the old timeval. Acked-by: Deepa Dinamani <deepa.kernel@gmail.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de>	2019-12-18 18:07:31 +01:00
Linus Torvalds	964a4eacef	dlm for 5.3 This set removes some unnecessary debugfs error handling, and checks that lowcomms workqueues are not NULL before destroying. -----BEGIN PGP SIGNATURE----- iQIcBAABAgAGBQJdKKKoAAoJEDgbc8f8gGmqpHUP/AyGkjRy9Z3AkakAzNXmvvHy r7Jh75WSQCt2VEmlxYdwlgtITL9reBO++pYIYwpdKZOKoijeAgnV56InQF8SA6nq dILfSPs0R6HXITzMdBx47Ib0SX9517UoSoi9GLIVzE2vOKMLfSc3/09QMYvU9nUr C+tVgB1KbNFr4IJ34AWr45lR34NlIetsdKprKOh/HqCUbbNMYynfWGXW09FnrLnd uDRWRXPlo5fq1wSxNoRBljb0Pbl4otBhK6BoySqt3D4Id/UJI2vd2xQJJtbwxdtt dIFu9VbLydXlpevrndAYG6j6G4gZX9IPudDee7C6rX0fV3+a3mUqYn9Spi+c49o2 37Z1J43xW8ct1prCmgFpXJDwbsbudq+cmVGvaHjvRJzySSJoyp/UDVdn2l/CEOim rCIgP/zawOq3bqrDIS0ZJXH9lnsJ55stjDtNW/5Hq0uL4934ttTElJ5vjyvS5ABi 9gsqVCRXvEV00U8AD8a8ahXvTHYUNhhARQaEKovhjq9jSb3/rHqMJ/wogRKb2VpV JspqjYavLcSyrtyG3ncX/khk/f2SjSzwvmEb8l2aN186QMH8jsm0XPekWYFQMhdR CyknNKgO0Ye+g48d1P9euuj6PNF/qqBnx5yWNC6nFDQFa648IV1LOaxKW+eGMF9o UFY42yrSmxWw+gkFxXbo =Mhya -----END PGP SIGNATURE----- Merge tag 'dlm-5.3' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/linux-dlm Pull dlm updates from David Teigland: "This set removes some unnecessary debugfs error handling, and checks that lowcomms workqueues are not NULL before destroying" * tag 'dlm-5.3' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/linux-dlm: dlm: no need to check return value of debugfs_create functions dlm: check if workqueues are NULL before flushing/destroying	2019-07-12 17:37:53 -07:00
Linus Torvalds	f632a8170a	Driver Core and debugfs changes for 5.3-rc1 Here is the "big" driver core and debugfs changes for 5.3-rc1 It's a lot of different patches, all across the tree due to some api changes and lots of debugfs cleanups. Because of this, there is going to be some merge issues with your tree at the moment, I'll follow up with the expected resolutions to make it easier for you. Other than the debugfs cleanups, in this set of changes we have: - bus iteration function cleanups (will cause build warnings with s390 and coresight drivers in your tree) - scripts/get_abi.pl tool to display and parse Documentation/ABI entries in a simple way - cleanups to Documenatation/ABI/ entries to make them parse easier due to typos and other minor things - default_attrs use for some ktype users - driver model documentation file conversions to .rst - compressed firmware file loading - deferred probe fixes All of these have been in linux-next for a while, with a bunch of merge issues that Stephen has been patient with me for. Other than the merge issues, functionality is working properly in linux-next :) Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> -----BEGIN PGP SIGNATURE----- iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCXSgpnQ8cZ3JlZ0Brcm9h aC5jb20ACgkQMUfUDdst+ykcwgCfS30OR4JmwZydWGJ7zK/cHqk+KjsAnjOxjC1K LpRyb3zX29oChFaZkc5a =XrEZ -----END PGP SIGNATURE----- Merge tag 'driver-core-5.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core Pull driver core and debugfs updates from Greg KH: "Here is the "big" driver core and debugfs changes for 5.3-rc1 It's a lot of different patches, all across the tree due to some api changes and lots of debugfs cleanups. Other than the debugfs cleanups, in this set of changes we have: - bus iteration function cleanups - scripts/get_abi.pl tool to display and parse Documentation/ABI entries in a simple way - cleanups to Documenatation/ABI/ entries to make them parse easier due to typos and other minor things - default_attrs use for some ktype users - driver model documentation file conversions to .rst - compressed firmware file loading - deferred probe fixes All of these have been in linux-next for a while, with a bunch of merge issues that Stephen has been patient with me for" * tag 'driver-core-5.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (102 commits) debugfs: make error message a bit more verbose orangefs: fix build warning from debugfs cleanup patch ubifs: fix build warning after debugfs cleanup patch driver: core: Allow subsystems to continue deferring probe drivers: base: cacheinfo: Ensure cpu hotplug work is done before Intel RDT arch_topology: Remove error messages on out-of-memory conditions lib: notifier-error-inject: no need to check return value of debugfs_create functions swiotlb: no need to check return value of debugfs_create functions ceph: no need to check return value of debugfs_create functions sunrpc: no need to check return value of debugfs_create functions ubifs: no need to check return value of debugfs_create functions orangefs: no need to check return value of debugfs_create functions nfsd: no need to check return value of debugfs_create functions lib: 842: no need to check return value of debugfs_create functions debugfs: provide pr_fmt() macro debugfs: log errors when something goes wrong drivers: s390/cio: Fix compilation warning about const qualifiers drivers: Add generic helper to match by of_node driver_find_device: Unify the match function with class_find_device() bus_find_device: Unify the match callback with class_find_device ...	2019-07-12 12:24:03 -07:00
Greg Kroah-Hartman	a48f9721e6	dlm: no need to check return value of debugfs_create functions When calling debugfs functions, there is no need to ever check the return value. The function can work or not, but the code logic should never do something different based on this. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: David Teigland <teigland@redhat.com>	2019-07-11 11:01:58 -05:00
David Windsor	b355516f45	dlm: check if workqueues are NULL before flushing/destroying If the DLM lowcomms stack is shut down before any DLM traffic can be generated, flush_workqueue() and destroy_workqueue() can be called on empty send and/or recv workqueues. Insert guard conditionals to only call flush_workqueue() and destroy_workqueue() on workqueues that are not NULL. Signed-off-by: David Windsor <dwindsor@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com>	2019-07-11 11:01:58 -05:00
Kimberly Brown	c9c5b5e156	dlm: Replace default_attrs in dlm_ktype with default_groups The kobj_type default_attrs field is being replaced by the default_groups field, so replace the default_attrs field in dlm_ktype with default_groups. Use the ATTRIBUTE_GROUPS macro to create dlm_groups. Signed-off-by: Kimberly Brown <kimbrownkd@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-06-13 13:50:22 +02:00
Thomas Gleixner	7336d0e654	treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 398 Based on 1 normalized pattern(s): this copyrighted material is made available to anyone wishing to use modify copy or redistribute it subject to the terms and conditions of the gnu general public license version 2 extracted by the scancode license scanner the SPDX license identifier GPL-2.0-only has been chosen to replace the boilerplate/reference in 44 file(s). Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Allison Randal <allison@lohutok.net> Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org> Cc: linux-spdx@vger.kernel.org Link: https://lkml.kernel.org/r/20190531081038.653000175@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-06-05 17:37:12 +02:00
Thomas Gleixner	2522fe45a1	treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 193 Based on 1 normalized pattern(s): this copyrighted material is made available to anyone wishing to use modify copy or redistribute it subject to the terms and conditions of the gnu general public license v 2 extracted by the scancode license scanner the SPDX license identifier GPL-2.0-only has been chosen to replace the boilerplate/reference in 45 file(s). Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Richard Fontana <rfontana@redhat.com> Reviewed-by: Allison Randal <allison@lohutok.net> Reviewed-by: Steve Winslow <swinslow@gmail.com> Reviewed-by: Alexios Zavras <alexios.zavras@intel.com> Cc: linux-spdx@vger.kernel.org Link: https://lkml.kernel.org/r/20190528170027.342746075@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-05-30 11:29:21 -07:00
Thomas Gleixner	ec8f24b7fa	treewide: Add SPDX license identifier - Makefile/Kconfig Add SPDX license identifiers to all Make/Kconfig files which: - Have no license information of any form These files fall under the project license, GPL v2 only. The resulting SPDX license identifier is: GPL-2.0-only Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-05-21 10:50:46 +02:00
Johannes Berg	ef6243acb4	genetlink: optionally validate strictly/dumps Add options to strictly validate messages and dump messages, sometimes perhaps validating dump messages non-strictly may be required, so add an option for that as well. Since none of this can really be applied to existing commands, set the options everwhere using the following spatch: @@ identifier ops; expression X; @@ struct genl_ops ops[] = { ..., { .cmd = X, + .validate = GENL_DONT_VALIDATE_STRICT \| GENL_DONT_VALIDATE_DUMP, ... }, ... }; For new commands one should just not copy the .validate 'opt-out' flags and thus get strict validation. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-04-27 17:07:22 -04:00
Deepa Dinamani	45bdc66159	socket: Rename SO_RCVTIMEO/ SO_SNDTIMEO with _OLD suffixes SO_RCVTIMEO and SO_SNDTIMEO socket options use struct timeval as the time format. struct timeval is not y2038 safe. The subsequent patches in the series add support for new socket timeout options with _NEW suffix that will use y2038 safe data structures. Although the existing struct timeval layout is sufficiently wide to represent timeouts, because of the way libc will interpret time_t based on user defined flag, these new flags provide a way of having a structure that is the same for all architectures consistently. Rename the existing options with _OLD suffix forms so that the right option is enabled for userspace applications according to the architecture and time_t definition of libc. Signed-off-by: Deepa Dinamani <deepa.kernel@gmail.com> Acked-by: Willem de Bruijn <willemb@google.com> Cc: ccaulfie@redhat.com Cc: deller@gmx.de Cc: paulus@samba.org Cc: ralf@linux-mips.org Cc: rth@twiddle.net Cc: cluster-devel@redhat.com Cc: linuxppc-dev@lists.ozlabs.org Cc: linux-alpha@vger.kernel.org Cc: linux-arch@vger.kernel.org Cc: linux-mips@vger.kernel.org Cc: linux-parisc@vger.kernel.org Cc: sparclinux@vger.kernel.org Signed-off-by: David S. Miller <davem@davemloft.net>	2019-02-03 11:17:31 -08:00
David Teigland	3595c55932	dlm: fix invalid cluster name warning The warning added in commit `3b0e761ba8` "dlm: print log message when cluster name is not set" did not account for the fact that lockspaces created from userland do not supply a cluster name, so bogus warnings are printed every time a userland lockspace is created. Signed-off-by: David Teigland <teigland@redhat.com>	2018-12-03 15:30:24 -06:00
Thomas Meyer	3456880ff3	dlm: NULL check before some freeing functions is not needed NULL check before some freeing functions is not needed. Signed-off-by: Thomas Meyer <thomas@m3y3r.de> Signed-off-by: David Teigland <teigland@redhat.com>	2018-12-03 10:02:01 -06:00
Wen Yang	f31a896928	dlm: NULL check before kmem_cache_destroy is not needed kmem_cache_destroy(NULL) is safe, so removes NULL check before freeing the mem. This patch also fix ifnullfree.cocci warnings. Signed-off-by: Wen Yang <wen.yang99@zte.com.cn> Signed-off-by: David Teigland <teigland@redhat.com>	2018-11-28 08:45:55 -06:00
David Teigland	8fc6ed9a35	dlm: fix missing idr_destroy for recover_idr Which would leak memory for the idr internals. Signed-off-by: David Teigland <teigland@redhat.com>	2018-11-15 11:21:21 -06:00
Vasily Averin	d47b41acee	dlm: memory leaks on error path in dlm_user_request() According to comment in dlm_user_request() ua should be freed in dlm_free_lkb() after successful attach to lkb. However ua is attached to lkb not in set_lock_args() but later, inside request_lock(). Fixes `597d0cae0f` ("[DLM] dlm: user locks") Cc: stable@kernel.org # 2.6.19 Signed-off-by: Vasily Averin <vvs@virtuozzo.com> Signed-off-by: David Teigland <teigland@redhat.com>	2018-11-15 09:57:22 -06:00
Vasily Averin	c0174726c3	dlm: lost put_lkb on error path in receive_convert() and receive_unlock() Fixes `6d40c4a708` ("dlm: improve error and debug messages") Cc: stable@kernel.org # 3.5 Signed-off-by: Vasily Averin <vvs@virtuozzo.com> Signed-off-by: David Teigland <teigland@redhat.com>	2018-11-15 09:57:22 -06:00
Vasily Averin	23851e978f	dlm: possible memory leak on error path in create_lkb() Fixes `3d6aa675ff` ("dlm: keep lkbs in idr") Cc: stable@kernel.org # 3.1 Signed-off-by: Vasily Averin <vvs@virtuozzo.com> Signed-off-by: David Teigland <teigland@redhat.com>	2018-11-15 09:57:22 -06:00
Vasily Averin	b982896cdb	dlm: fixed memory leaks after failed ls_remove_names allocation If allocation fails on last elements of array need to free already allocated elements. v2: just move existing out_rsbtbl label to right place Fixes 789924ba635f ("dlm: fix race between remove and lookup") Cc: stable@kernel.org # 3.6 Signed-off-by: Vasily Averin <vvs@virtuozzo.com> Signed-off-by: David Teigland <teigland@redhat.com>	2018-11-15 09:57:22 -06:00
Denis V. Lunev	58a923adf4	dlm: fix possible call to kfree() for non-initialized pointer Technically dlm_config_nodes() could return error and keep nodes uninitialized. After that on the fail path of we'll call kfree() for that uninitialized value. The patch is simple - we should just initialize nodes with NULL. Signed-off-by: Denis V. Lunev <den@openvz.org> Signed-off-by: David Teigland <teigland@redhat.com>	2018-11-13 11:41:09 -06:00
Bob Peterson	216f0efd19	dlm: Don't swamp the CPU with callbacks queued during recovery Before this patch, recovery would cause all callbacks to be delayed, put on a queue, and afterward they were all queued to the callback work queue. This patch does the same thing, but occasionally takes a break after 25 of them so it won't swamp the CPU at the expense of other RT processes like corosync. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com>	2018-11-08 13:17:00 -06:00
Tycho Andersen	9de30f3f7f	dlm: don't leak kernel pointer to userspace In copy_result_to_user(), we first create a struct dlm_lock_result, which contains a struct dlm_lksb, the last member of which is a pointer to the lvb. Unfortunately, we copy the entire struct dlm_lksb to the result struct, which is then copied to userspace at the end of the function, leaking the contents of sb_lvbptr, which is a valid kernel pointer in some cases (indeed, later in the same function the data it points to is copied to userspace). It is an error to leak kernel pointers to userspace, as it undermines KASLR protections (see e.g. `65eea8edc3` ("floppy: Do not copy a kernel pointer to user memory in FDGETPRM ioctl") for another example of this). Signed-off-by: Tycho Andersen <tycho@tycho.ws> Signed-off-by: David Teigland <teigland@redhat.com>	2018-11-07 16:12:45 -06:00
Tycho Andersen	3f0806d259	dlm: don't allow zero length names kobject doesn't like zero length object names, so let's test for that. Signed-off-by: Tycho Andersen <tycho@tycho.ws> Signed-off-by: David Teigland <teigland@redhat.com>	2018-11-07 16:10:45 -06:00
Tycho Andersen	d968b4e240	dlm: fix invalid free dlm_config_nodes() does not allocate nodes on failure, so we should not free() nodes when it fails. Signed-off-by: Tycho Andersen <tycho@tycho.ws> Signed-off-by: David Teigland <teigland@redhat.com>	2018-11-07 15:50:34 -06:00
David Howells	aa563d7bca	iov_iter: Separate type from direction and use accessor functions In the iov_iter struct, separate the iterator type from the iterator direction and use accessor functions to access them in most places. Convert a bunch of places to use switch-statements to access them rather then chains of bitwise-AND statements. This makes it easier to add further iterator types. Also, this can be more efficient as to implement a switch of small contiguous integers, the compiler can use ~50% fewer compare instructions than it has to use bitwise-and instructions. Further, cease passing the iterator type into the iterator setup function. The iterator function can set that itself. Only the direction is required. Signed-off-by: David Howells <dhowells@redhat.com>	2018-10-24 00:41:07 +01:00
Kees Cook	42bc47b353	treewide: Use array_size() in vmalloc() The vmalloc() function has no 2-factor argument form, so multiplication factors need to be wrapped in array_size(). This patch replaces cases of: vmalloc(a * b) with: vmalloc(array_size(a, b)) as well as handling cases of: vmalloc(a * b * c) with: vmalloc(array3_size(a, b, c)) This does, however, attempt to ignore constant size factors like: vmalloc(4 * 1024) though any constants defined via macros get caught up in the conversion. Any factors with a sizeof() of "unsigned char", "char", and "u8" were dropped, since they're redundant. The Coccinelle script used for this was: // Fix redundant parens around sizeof(). @@ type TYPE; expression THING, E; @@ ( vmalloc( - (sizeof(TYPE)) * E + sizeof(TYPE) * E , ...) \| vmalloc( - (sizeof(THING)) * E + sizeof(THING) * E , ...) ) // Drop single-byte sizes and redundant parens. @@ expression COUNT; typedef u8; typedef __u8; @@ ( vmalloc( - sizeof(u8) * (COUNT) + COUNT , ...) \| vmalloc( - sizeof(__u8) * (COUNT) + COUNT , ...) \| vmalloc( - sizeof(char) * (COUNT) + COUNT , ...) \| vmalloc( - sizeof(unsigned char) * (COUNT) + COUNT , ...) \| vmalloc( - sizeof(u8) * COUNT + COUNT , ...) \| vmalloc( - sizeof(__u8) * COUNT + COUNT , ...) \| vmalloc( - sizeof(char) * COUNT + COUNT , ...) \| vmalloc( - sizeof(unsigned char) * COUNT + COUNT , ...) ) // 2-factor product with sizeof(type/expression) and identifier or constant. @@ type TYPE; expression THING; identifier COUNT_ID; constant COUNT_CONST; @@ ( vmalloc( - sizeof(TYPE) * (COUNT_ID) + array_size(COUNT_ID, sizeof(TYPE)) , ...) \| vmalloc( - sizeof(TYPE) * COUNT_ID + array_size(COUNT_ID, sizeof(TYPE)) , ...) \| vmalloc( - sizeof(TYPE) * (COUNT_CONST) + array_size(COUNT_CONST, sizeof(TYPE)) , ...) \| vmalloc( - sizeof(TYPE) * COUNT_CONST + array_size(COUNT_CONST, sizeof(TYPE)) , ...) \| vmalloc( - sizeof(THING) * (COUNT_ID) + array_size(COUNT_ID, sizeof(THING)) , ...) \| vmalloc( - sizeof(THING) * COUNT_ID + array_size(COUNT_ID, sizeof(THING)) , ...) \| vmalloc( - sizeof(THING) * (COUNT_CONST) + array_size(COUNT_CONST, sizeof(THING)) , ...) \| vmalloc( - sizeof(THING) * COUNT_CONST + array_size(COUNT_CONST, sizeof(THING)) , ...) ) // 2-factor product, only identifiers. @@ identifier SIZE, COUNT; @@ vmalloc( - SIZE * COUNT + array_size(COUNT, SIZE) , ...) // 3-factor product with 1 sizeof(type) or sizeof(expression), with // redundant parens removed. @@ expression THING; identifier STRIDE, COUNT; type TYPE; @@ ( vmalloc( - sizeof(TYPE) * (COUNT) * (STRIDE) + array3_size(COUNT, STRIDE, sizeof(TYPE)) , ...) \| vmalloc( - sizeof(TYPE) * (COUNT) * STRIDE + array3_size(COUNT, STRIDE, sizeof(TYPE)) , ...) \| vmalloc( - sizeof(TYPE) * COUNT * (STRIDE) + array3_size(COUNT, STRIDE, sizeof(TYPE)) , ...) \| vmalloc( - sizeof(TYPE) * COUNT * STRIDE + array3_size(COUNT, STRIDE, sizeof(TYPE)) , ...) \| vmalloc( - sizeof(THING) * (COUNT) * (STRIDE) + array3_size(COUNT, STRIDE, sizeof(THING)) , ...) \| vmalloc( - sizeof(THING) * (COUNT) * STRIDE + array3_size(COUNT, STRIDE, sizeof(THING)) , ...) \| vmalloc( - sizeof(THING) * COUNT * (STRIDE) + array3_size(COUNT, STRIDE, sizeof(THING)) , ...) \| vmalloc( - sizeof(THING) * COUNT * STRIDE + array3_size(COUNT, STRIDE, sizeof(THING)) , ...) ) // 3-factor product with 2 sizeof(variable), with redundant parens removed. @@ expression THING1, THING2; identifier COUNT; type TYPE1, TYPE2; @@ ( vmalloc( - sizeof(TYPE1) * sizeof(TYPE2) * COUNT + array3_size(COUNT, sizeof(TYPE1), sizeof(TYPE2)) , ...) \| vmalloc( - sizeof(TYPE1) * sizeof(THING2) * (COUNT) + array3_size(COUNT, sizeof(TYPE1), sizeof(TYPE2)) , ...) \| vmalloc( - sizeof(THING1) * sizeof(THING2) * COUNT + array3_size(COUNT, sizeof(THING1), sizeof(THING2)) , ...) \| vmalloc( - sizeof(THING1) * sizeof(THING2) * (COUNT) + array3_size(COUNT, sizeof(THING1), sizeof(THING2)) , ...) \| vmalloc( - sizeof(TYPE1) * sizeof(THING2) * COUNT + array3_size(COUNT, sizeof(TYPE1), sizeof(THING2)) , ...) \| vmalloc( - sizeof(TYPE1) * sizeof(THING2) * (COUNT) + array3_size(COUNT, sizeof(TYPE1), sizeof(THING2)) , ...) ) // 3-factor product, only identifiers, with redundant parens removed. @@ identifier STRIDE, SIZE, COUNT; @@ ( vmalloc( - (COUNT) * STRIDE * SIZE + array3_size(COUNT, STRIDE, SIZE) , ...) \| vmalloc( - COUNT * (STRIDE) * SIZE + array3_size(COUNT, STRIDE, SIZE) , ...) \| vmalloc( - COUNT * STRIDE * (SIZE) + array3_size(COUNT, STRIDE, SIZE) , ...) \| vmalloc( - (COUNT) * (STRIDE) * SIZE + array3_size(COUNT, STRIDE, SIZE) , ...) \| vmalloc( - COUNT * (STRIDE) * (SIZE) + array3_size(COUNT, STRIDE, SIZE) , ...) \| vmalloc( - (COUNT) * STRIDE * (SIZE) + array3_size(COUNT, STRIDE, SIZE) , ...) \| vmalloc( - (COUNT) * (STRIDE) * (SIZE) + array3_size(COUNT, STRIDE, SIZE) , ...) \| vmalloc( - COUNT * STRIDE * SIZE + array3_size(COUNT, STRIDE, SIZE) , ...) ) // Any remaining multi-factor products, first at least 3-factor products // when they're not all constants... @@ expression E1, E2, E3; constant C1, C2, C3; @@ ( vmalloc(C1 * C2 * C3, ...) \| vmalloc( - E1 * E2 * E3 + array3_size(E1, E2, E3) , ...) ) // And then all remaining 2 factors products when they're not all constants. @@ expression E1, E2; constant C1, C2; @@ ( vmalloc(C1 * C2, ...) \| vmalloc( - E1 * E2 + array_size(E1, E2) , ...) ) Signed-off-by: Kees Cook <keescook@chromium.org>	2018-06-12 16:19:22 -07:00
Gang He	da3627c30d	dlm: remove O_NONBLOCK flag in sctp_connect_to_sock We should remove O_NONBLOCK flag when calling sock->ops->connect() in sctp_connect_to_sock() function. Why? 1. up to now, sctp socket connect() function ignores the flag argument, that means O_NONBLOCK flag does not take effect, then we should remove it to avoid the confusion (but is not urgent). 2. for the future, there will be a patch to fix this problem, then the flag argument will take effect, the patch has been queued at https://git.kernel.o rg/pub/scm/linux/kernel/git/davem/net.git/commit/net/sctp?id=644fbdeacf1d3ed d366e44b8ba214de9d1dd66a9. But, the O_NONBLOCK flag will make sock->ops->connect() directly return without any wait time, then the connection will not be established, DLM kernel module will call sock->ops->connect() again and again, the bad results are, CPU usage is almost 100%, even trigger soft_lockup problem if the related configurations are enabled, DLM kernel module also prints lots of messages like, [Fri Apr 27 11:23:43 2018] dlm: connecting to 172167592 [Fri Apr 27 11:23:43 2018] dlm: connecting to 172167592 [Fri Apr 27 11:23:43 2018] dlm: connecting to 172167592 [Fri Apr 27 11:23:43 2018] dlm: connecting to 172167592 The upper application (e.g. ocfs2 mount command) is hanged at new_lockspace(), the whole backtrace is as below, tb0307-nd2:~ # cat /proc/2935/stack [<0>] new_lockspace+0x957/0xac0 [dlm] [<0>] dlm_new_lockspace+0xae/0x140 [dlm] [<0>] user_cluster_connect+0xc3/0x3a0 [ocfs2_stack_user] [<0>] ocfs2_cluster_connect+0x144/0x220 [ocfs2_stackglue] [<0>] ocfs2_dlm_init+0x215/0x440 [ocfs2] [<0>] ocfs2_fill_super+0xcb0/0x1290 [ocfs2] [<0>] mount_bdev+0x173/0x1b0 [<0>] mount_fs+0x35/0x150 [<0>] vfs_kern_mount.part.23+0x54/0x100 [<0>] do_mount+0x59a/0xc40 [<0>] SyS_mount+0x80/0xd0 [<0>] do_syscall_64+0x76/0x140 [<0>] entry_SYSCALL_64_after_hwframe+0x42/0xb7 [<0>] 0xffffffffffffffff So, I think we should remove O_NONBLOCK flag here, since DLM kernel module can not handle non-block sockect in connect() properly. Signed-off-by: Gang He <ghe@suse.com> Signed-off-by: David Teigland <teigland@redhat.com>	2018-05-29 10:48:35 -05:00
Gang He	f706d83015	dlm: make sctp_connect_to_sock() return in specified time When the user setup a two-ring cluster, DLM kernel module will automatically selects to use SCTP protocol to communicate between each node. There will be about 5 minute hang in DLM kernel module, in case one ring is broken before switching to another ring, this will potentially affect the dependent upper applications, e.g. ocfs2, gfs2, clvm and clustered-MD, etc. Unfortunately, if the user setup a two-ring cluster, we can not specify DLM communication protocol with TCP explicitly, since DLM kernel module only supports SCTP protocol for multiple ring cluster. Base on my investigation, the time is spent in sock->ops->connect() function before returns ETIMEDOUT(-110) error, since O_NONBLOCK argument in connect() function does not work here, then we should make sock->ops->connect() function return in specified time via setting socket SO_SNDTIMEO atrribute. Signed-off-by: Gang He <ghe@suse.com> Signed-off-by: David Teigland <teigland@redhat.com>	2018-05-02 10:28:35 -05:00
Gang He	b09c603ca4	dlm: fix a clerical error when set SCTP_NODELAY There is a clerical error when turn off Nagle's algorithm in sctp_connect_to_sock() function, this results in turn off Nagle's algorithm failure. After this correction, DLM performance will be improved obviously when using SCTP procotol. Signed-off-by: Gang He <ghe@suse.com> Signed-off-by: Michal Kubecek <mkubecek@suse.cz> Signed-off-by: David Teigland <teigland@redhat.com>	2018-05-02 10:22:25 -05:00
Denys Vlasenko	9b2c45d479	net: make getname() functions return length rather than use int* parameter Changes since v1: Added changes in these files: drivers/infiniband/hw/usnic/usnic_transport.c drivers/staging/lustre/lnet/lnet/lib-socket.c drivers/target/iscsi/iscsi_target_login.c drivers/vhost/net.c fs/dlm/lowcomms.c fs/ocfs2/cluster/tcp.c security/tomoyo/network.c Before: All these functions either return a negative error indicator, or store length of sockaddr into "int socklen" parameter and return zero on success. "int socklen" parameter is awkward. For example, if caller does not care, it still needs to provide on-stack storage for the value it does not need. None of the many FOO_getname() functions of various protocols ever used old value of *socklen. They always just overwrite it. This change drops this parameter, and makes all these functions, on success, return length of sockaddr. It's always >= 0 and can be differentiated from an error. Tests in callers are changed from "if (err)" to "if (err < 0)", where needed. rpc_sockname() lost "int buflen" parameter, since its only use was to be passed to kernel_getsockname() as &buflen and subsequently not used in any way. Userspace API is not changed. text data bss dec hex filename 30108430 2633624 873672 33615726 200ef6e vmlinux.before.o 30108109 2633612 873672 33615393 200ee21 vmlinux.o Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com> CC: David S. Miller <davem@davemloft.net> CC: linux-kernel@vger.kernel.org CC: netdev@vger.kernel.org CC: linux-bluetooth@vger.kernel.org CC: linux-decnet-user@lists.sourceforge.net CC: linux-wireless@vger.kernel.org CC: linux-rdma@vger.kernel.org CC: linux-sctp@vger.kernel.org CC: linux-nfs@vger.kernel.org CC: linux-x25@vger.kernel.org Signed-off-by: David S. Miller <davem@davemloft.net>	2018-02-12 14:15:04 -05:00
Linus Torvalds	a9a08845e9	vfs: do bulk POLL* -> EPOLL* replacement This is the mindless scripted replacement of kernel use of POLL* variables as described by Al, done by this script: for V in IN OUT PRI ERR RDNORM RDBAND WRNORM WRBAND HUP RDHUP NVAL MSG; do L=`git grep -l -w POLL$V \| grep -v '^t' \| grep -v /um/ \| grep -v '^sa' \| grep -v '/poll.h$'\|grep -v '^D'` for f in $L; do sed -i "-es/^$[^\"]$$\<POLL$V\>$/\\1E\\2/" $f; done done with de-mangling cleanups yet to come. NOTE! On almost all architectures, the EPOLL constants have the same values as the POLL* constants do. But they keyword here is "almost". For various bad reasons they aren't the same, and epoll() doesn't actually work quite correctly in some cases due to this on Sparc et al. The next patch from Al will sort out the final differences, and we should be all done. Scripted-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2018-02-11 14:34:03 -08:00
Linus Torvalds	1ed2d76e02	Merge branch 'work.sock_recvmsg' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull kern_recvmsg reduction from Al Viro: "kernel_recvmsg() is a set_fs()-using wrapper for sock_recvmsg(). In all but one case that is not needed - use of ITER_KVEC for ->msg_iter takes care of the data and does not care about set_fs(). The only exception is svc_udp_recvfrom() where we want cmsg to be store into kernel object; everything else can just use sock_recvmsg() and be done with that. A followup converting svc_udp_recvfrom() away from set_fs() (and killing kernel_recvmsg() off) is NOT in here - I'd like to hear what netdev folks think of the approach proposed in that followup)" * 'work.sock_recvmsg' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: tipc: switch to sock_recvmsg() smc: switch to sock_recvmsg() ipvs: switch to sock_recvmsg() mISDN: switch to sock_recvmsg() drbd: switch to sock_recvmsg() lustre lnet_sock_read(): switch to sock_recvmsg() cfs2: switch to sock_recvmsg() ncpfs: switch to sock_recvmsg() dlm: switch to sock_recvmsg() svc_recvfrom(): switch to sock_recvmsg()	2018-01-30 18:59:03 -08:00
Al Viro	c8c7840ea9	dlm: switch to sock_recvmsg() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2017-12-02 20:37:47 -05:00
Al Viro	076ccb76e1	fs: annotate ->poll() instances Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2017-11-27 16:20:05 -05:00
Linus Torvalds	abc36be236	A couple of configfs cleanups: - proper use of the bool type (Thomas Meyer) - constification of struct config_item_type (Bhumika Goyal) -----BEGIN PGP SIGNATURE----- iQI/BAABCAApFiEEgdbnc3r/njty3Iq9D55TZVIEUYMFAloLSTALHGhjaEBsc3Qu ZGUACgkQD55TZVIEUYNxfhAAv3cunxiEPEAvs+1xuGd3cZYaxz7qinvIODPxIKoF kRWiuy5PUklRMnJ8seOgJ1p1QokX6Sk4cZ8HcctDJVByqODjOq4K5eaKVN1ZqJoz BUzO/gOqfs64r9yaFIlKfe8nFA+gpUftSeWyv3lThxAIJ1iSbue7OZ/A10tTOS1m RWp9FPepFv+nJMfWqeQU64BsoDQ4kgZ2NcEA+jFxNx5dlmIbLD49tk0lfddvZQXr j5WyAH73iugilLtNUGVOqSzHBY4kUvfCKUV7leirCegyMoGhFtA87m6Wzwbo6ZUI DwQLzWvuPaGv1P2PpNEHfKiNbfIEp75DRyyyf87DD3lc5ffAxQSm28mGuwcr7Rn5 Ow/yWL6ERMzCLExoCzEkXYJISy7T5LIzYDgNggKMpeWxysAduF7Onx7KfW1bTuhK mHvY7iOXCjEvaIVaF8uMKE6zvuY1vCMRXaJ+kC9jcIE3gwhg+2hmQvrdJ2uAFXY+ rkeF2Poj/JlblPU4IKWAjiPUbzB7Lv0gkypCB2pD4riaYIN5qCAgF8ULIGQp2hsO lYW1EEgp5FBop85oSO/HAGWeH9dFg0WaV7WqNRVv0AGXhKjgy+bVd7iYPpvs7mGw z9IqSQDORcG2ETLcFhZgiJpCk/itwqXBD+wgMOjJPP8lL+4kZ8FcuhtY9kc9WlJE Tew= =+tMO -----END PGP SIGNATURE----- Merge tag 'configfs-for-4.15' of git://git.infradead.org/users/hch/configfs Pull configfs updates from Christoph Hellwig: "A couple of configfs cleanups: - proper use of the bool type (Thomas Meyer) - constification of struct config_item_type (Bhumika Goyal)" * tag 'configfs-for-4.15' of git://git.infradead.org/users/hch/configfs: RDMA/cma: make config_item_type const stm class: make config_item_type const ACPI: configfs: make config_item_type const nvmet: make config_item_type const usb: gadget: configfs: make config_item_type const PCI: endpoint: make config_item_type const iio: make function argument and some structures const usb: gadget: make config_item_type structures const dlm: make config_item_type const netconsole: make config_item_type const nullb: make config_item_type const ocfs2/cluster: make config_item_type const target: make config_item_type const configfs: make ci_type field, some pointers and function arguments const configfs: make config_item_type const configfs: Fix bool initialization/comparison	2017-11-14 14:44:04 -08:00
Linus Torvalds	f0b60bfa95	dlm for 4.15 This set focuses, as usual, on fixes to the comms layer. New testing of the dlm with ocfs2 uncovered a number of bugs in the TCP connection handling during recovery, starting, and stopping. -----BEGIN PGP SIGNATURE----- iQIcBAABAgAGBQJaCeJ8AAoJEDgbc8f8gGmqpqQP/A9GekFRRvm3QfpmHG3Lj6ey O4IKINaB8F46KDBaWzTwKE3fOl0j19qICKuEibBZeJl4lGh7Q5GDOMZ20AfDU4wv Qq40OEfFombCFsVX/Qc4AvdXj7cjpfjJwrZhW6CHOkYGZDaAmsHBeCgBeTvhkqR4 dj4pGFIwBpvV1gQrIteFx110kupeT8DvCSIVzWelD+Jb18vtht7YfKehRyQ3Cyix 8sbEPiKuhmLW/6wbliRqQL5cp9ZyUU5YtBqhmE8r2QIbOOB+k1xFIvVgUylawv3P qi1SpBkX7zRM4BCTP0J3zbUzQHZhgjtgBLVMiSrAWBFb3XtpssEXVczKFDxFafEt YJtPeqHxr8zwzQeF+6MGx6amRWW0T9yHv2sB79wBkz8wL483qL39k9DNa564NoSJ rZtN0bk4g6CuDnHgEM3hzNsVU2sgdaQMZnRWYONHwvDeI+HgKJWD4nedD6wFmXlo kimrQDQCzvx8ZnCKHH0/k23BV2SoYz+80fbW+TeFCWU6gPFGKcJZ12p1e9YYLZJh yeY1Y/kdNhLWyIZlldIK1TtO0645YPBhXcaFBA/RF7g8EbwKrIG8FUZSHzWwQIoJ hGtLBhWT12BGE2NCLHSMCrKZEb+JeXIN+jKxm9g2m5k6D+nQBt5K7Ae6j6n8pwUC hxic9hQmXNxb0R51YD/+ =w3jk -----END PGP SIGNATURE----- Merge tag 'dlm-4.15' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/linux-dlm Pull dlm updates from David Teigland: "This set focuses, as usual, on fixes to the comms layer. New testing of the dlm with ocfs2 uncovered a number of bugs in the TCP connection handling during recovery, starting, and stopping" * tag 'dlm-4.15' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/linux-dlm: dlm: remove dlm_send_rcom_lookup_dump dlm: recheck kthread_should_stop() before schedule() DLM: fix NULL pointer dereference in send_to_sock() DLM: fix to reschedule rwork DLM: fix to use sk_callback_lock correctly DLM: fix overflow dlm_cb_seq DLM: fix memory leak in tcp_accept_from_sock() DLM: fix conversion deadlock when DLM_LKF_NODLCKWT flag is set DLM: use CF_CLOSE flag to stop dlm_send correctly DLM: Reanimate CF_WRITE_PENDING flag DLM: fix race condition between dlm_recoverd_stop and dlm_recoverd DLM: close othercon at send/receive error DLM: retry rcom when dlm_wait_function is timed out. DLM: fix to use sock_mutex correctly in xxx_accept_from_sock DLM: fix race condition between dlm_send and dlm_recv DLM: fix double list_del() DLM: fix remove save_cb argument from add_sock() DLM: Fix saving of NULL callbacks DLM: Eliminate CF_WRITE_PENDING flag DLM: Eliminate CF_CONNECT_PENDING flag	2017-11-14 14:06:51 -08:00
Greg Kroah-Hartman	b24413180f	License cleanup: add SPDX GPL-2.0 license identifier to files with no license Many source files in the tree are missing licensing information, which makes it harder for compliance tools to determine the correct license. By default all files without license information are under the default license of the kernel, which is GPL version 2. Update the files which contain no license information with the 'GPL-2.0' SPDX license identifier. The SPDX identifier is a legally binding shorthand, which can be used instead of the full boiler plate text. This patch is based on work done by Thomas Gleixner and Kate Stewart and Philippe Ombredanne. How this work was done: Patches were generated and checked against linux-4.14-rc6 for a subset of the use cases: - file had no licensing information it it. - file was a /uapi/ one with no licensing information in it, - file was a /uapi/ one with existing licensing information, Further patches will be generated in subsequent months to fix up cases where non-standard license headers were used, and references to license had to be inferred by heuristics based on keywords. The analysis to determine which SPDX License Identifier to be applied to a file was done in a spreadsheet of side by side results from of the output of two independent scanners (ScanCode & Windriver) producing SPDX tag:value files created by Philippe Ombredanne. Philippe prepared the base worksheet, and did an initial spot review of a few 1000 files. The 4.13 kernel was the starting point of the analysis with 60,537 files assessed. Kate Stewart did a file by file comparison of the scanner results in the spreadsheet to determine which SPDX license identifier(s) to be applied to the file. She confirmed any determination that was not immediately clear with lawyers working with the Linux Foundation. Criteria used to select files for SPDX license identifier tagging was: - Files considered eligible had to be source code files. - Make and config files were included as candidates if they contained >5 lines of source - File already had some variant of a license header in it (even if <5 lines). All documentation files were explicitly excluded. The following heuristics were used to determine which SPDX license identifiers to apply. - when both scanners couldn't find any license traces, file was considered to have no license information in it, and the top level COPYING file license applied. For non /uapi/ files that summary was: SPDX license identifier # files ---------------------------------------------------\|------- GPL-2.0 11139 and resulted in the first patch in this series. If that file was a /uapi/ path one, it was "GPL-2.0 WITH Linux-syscall-note" otherwise it was "GPL-2.0". Results of that was: SPDX license identifier # files ---------------------------------------------------\|------- GPL-2.0 WITH Linux-syscall-note 930 and resulted in the second patch in this series. - if a file had some form of licensing information in it, and was one of the /uapi/ ones, it was denoted with the Linux-syscall-note if any GPL family license was found in the file or had no licensing in it (per prior point). Results summary: SPDX license identifier # files ---------------------------------------------------\|------ GPL-2.0 WITH Linux-syscall-note 270 GPL-2.0+ WITH Linux-syscall-note 169 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause) 21 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause) 17 LGPL-2.1+ WITH Linux-syscall-note 15 GPL-1.0+ WITH Linux-syscall-note 14 ((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause) 5 LGPL-2.0+ WITH Linux-syscall-note 4 LGPL-2.1 WITH Linux-syscall-note 3 ((GPL-2.0 WITH Linux-syscall-note) OR MIT) 3 ((GPL-2.0 WITH Linux-syscall-note) AND MIT) 1 and that resulted in the third patch in this series. - when the two scanners agreed on the detected license(s), that became the concluded license(s). - when there was disagreement between the two scanners (one detected a license but the other didn't, or they both detected different licenses) a manual inspection of the file occurred. - In most cases a manual inspection of the information in the file resulted in a clear resolution of the license that should apply (and which scanner probably needed to revisit its heuristics). - When it was not immediately clear, the license identifier was confirmed with lawyers working with the Linux Foundation. - If there was any question as to the appropriate license identifier, the file was flagged for further research and to be revisited later in time. In total, over 70 hours of logged manual review was done on the spreadsheet to determine the SPDX license identifiers to apply to the source files by Kate, Philippe, Thomas and, in some cases, confirmation by lawyers working with the Linux Foundation. Kate also obtained a third independent scan of the 4.13 code base from FOSSology, and compared selected files where the other two scanners disagreed against that SPDX file, to see if there was new insights. The Windriver scanner is based on an older version of FOSSology in part, so they are related. Thomas did random spot checks in about 500 files from the spreadsheets for the uapi headers and agreed with SPDX license identifier in the files he inspected. For the non-uapi files Thomas did random spot checks in about 15000 files. In initial set of patches against 4.14-rc6, 3 files were found to have copy/paste license identifier errors, and have been fixed to reflect the correct identifier. Additionally Philippe spent 10 hours this week doing a detailed manual inspection and review of the 12,461 patched files from the initial patch version early this week with: - a full scancode scan run, collecting the matched texts, detected license ids and scores - reviewing anything where there was a license detected (about 500+ files) to ensure that the applied SPDX license was correct - reviewing anything where there was no detection but the patch license was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied SPDX license was correct This produced a worksheet with 20 files needing minor correction. This worksheet was then exported into 3 different .csv files for the different types of files to be modified. These .csv files were then reviewed by Greg. Thomas wrote a script to parse the csv files and add the proper SPDX tag to the file, in the format that the file expected. This script was further refined by Greg based on the output to detect more types of files automatically and to distinguish between header and source .c files (which need different comment types.) Finally Greg ran the script using the .csv files to generate the patches. Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org> Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-11-02 11:10:55 +01:00
Bhumika Goyal	761594b741	dlm: make config_item_type const Make config_item_type structures const as they are either passed to a function having the argument as const or stored in the const "ci_type" field of a config_item structure. Done using Coccinelle. Signed-off-by: Bhumika Goyal <bhumirks@gmail.com> Signed-off-by: Christoph Hellwig <hch@lst.de>	2017-10-19 16:15:22 +02:00
David Teigland	9250e52359	dlm: remove dlm_send_rcom_lookup_dump This function was only for debugging. It would be called in a condition that should not happen, and should probably have been removed from the final version of the original commit. Remove it because it does mutex lock under spin lock. Signed-off-by: David Teigland <teigland@redhat.com>	2017-10-09 09:29:31 -05:00
Guoqing Jiang	9e1b0211c5	dlm: recheck kthread_should_stop() before schedule() Call schedule() here could make the thread miss wake up from kthread_stop(), so it is better to recheck kthread_should_stop() before call schedule(), a symptom happened when I run indefinite test (which mostly created clustered raid1, assemble it in other nodes, then stop them) of clustered raid. $ ps aux\|grep md\|grep D root 4211 0.0 0.0 19760 2220 ? Ds 02:58 0:00 mdadm -Ssq $ cat /proc/4211/stack kthread_stop+0x4d/0x150 dlm_recoverd_stop+0x15/0x20 [dlm] dlm_release_lockspace+0x2ab/0x460 [dlm] leave+0xbf/0x150 [md_cluster] md_cluster_stop+0x18/0x30 [md_mod] bitmap_free+0x12e/0x140 [md_mod] bitmap_destroy+0x7f/0x90 [md_mod] __md_stop+0x21/0xa0 [md_mod] do_md_stop+0x15f/0x5c0 [md_mod] md_ioctl+0xa65/0x18a0 [md_mod] blkdev_ioctl+0x49e/0x8d0 block_ioctl+0x41/0x50 do_vfs_ioctl+0x96/0x5b0 SyS_ioctl+0x79/0x90 entry_SYSCALL_64_fastpath+0x1e/0xad This maybe not resolve the issue completely since the KTHREAD_SHOULD_STOP flag could be set between "break" and "schedule", but at least the chance for the symptom happen could be reduce a lot (The indefinite test runs more than 20 hours without problem and it happens easily without the change). Signed-off-by: Guoqing Jiang <gqjiang@suse.com> Signed-off-by: David Teigland <teigland@redhat.com>	2017-09-25 12:48:10 -05:00
tsutomu.owa@toshiba.co.jp	26b41099e7	DLM: fix NULL pointer dereference in send_to_sock() The writequeue and writequeue_lock member of othercon was not initialized. If lowcomms_state_change() is called from network layer, othercon->swork may be scheduled. In this case, send_to_sock() will generate a NULL pointer reference. We avoid this problem by correctly initializing writequeue and writequeue_lock member of othercon. Signed-off-by: Tadashi Miyauchi <miyauchi@toshiba-tops.co.jp> Signed-off-by: Tsutomu Owa <tsutomu.owa@toshiba.co.jp> Signed-off-by: David Teigland <teigland@redhat.com>	2017-09-25 12:45:21 -05:00
tsutomu.owa@toshiba.co.jp	0aa18464c8	DLM: fix to reschedule rwork When an error occurs in kernel_recvmsg or kernel_sendpage and close_connection is called and receive work is already scheduled, receive work is canceled. In that case, the receive work will not be scheduled forever after reconnection, because CF_READ_PENDING flag is established. Signed-off-by: Tadashi Miyauchi <miyauchi@toshiba-tops.co.jp> Signed-off-by: Tsutomu Owa <tsutomu.owa@toshiba.co.jp> Signed-off-by: David Teigland <teigland@redhat.com>	2017-09-25 12:45:21 -05:00
tsutomu.owa@toshiba.co.jp	93eaadebe9	DLM: fix to use sk_callback_lock correctly In the current implementation, we think that exclusion control between processing to set the callback function to the connection structure and processing to refer to the connection structure from the callback function was not enough. We fix them. Signed-off-by: Tadashi Miyauchi <miyauchi@toshiba-tops.co.jp> Signed-off-by: Tsutomu Owa <tsutomu.owa@toshiba.co.jp> Signed-off-by: David Teigland <teigland@redhat.com>	2017-09-25 12:45:21 -05:00
tsutomu.owa@toshiba.co.jp	ccbbea0432	DLM: fix overflow dlm_cb_seq dlm_cb_seq is 64 bits. If dlm_cb_seq overflows and returns to 0, dlm_rem_lkb_callback() will not work properly. Signed-off-by: Tadashi Miyauchi <miyauchi@toshiba-tops.co.jp> Signed-off-by: Tsutomu Owa <tsutomu.owa@toshiba.co.jp> Signed-off-by: David Teigland <teigland@redhat.com>	2017-09-25 12:45:21 -05:00

1 2 3 4 5 ...

732 Commits