llvm-project/lldb/unittests/Core/CommunicationTest.cpp

//===-- CommunicationTest.cpp ---------------------------------------------===//
//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//
//===----------------------------------------------------------------------===//

#include "lldb/Core/Communication.h"
#include "lldb/Host/Config.h"
#include "lldb/Host/ConnectionFileDescriptor.h"
#include "lldb/Host/Pipe.h"
#include "llvm/Testing/Support/Error.h"
#include "gtest/gtest.h"

#include <thread>

#if LLDB_ENABLE_POSIX
#include <fcntl.h>
#endif

using namespace lldb_private;

#ifndef _WIN32
TEST(CommunicationTest, SynchronizeWhileClosing) {
  // Set up a communication object reading from a pipe.
  Pipe pipe;
  ASSERT_THAT_ERROR(pipe.CreateNew(/*child_process_inherit=*/false).ToError(),
                    llvm::Succeeded());

  Communication comm("test");
  comm.SetConnection(std::make_unique<ConnectionFileDescriptor>(
      pipe.ReleaseReadFileDescriptor(), /*owns_fd=*/true));
  comm.SetCloseOnEOF(true);
  ASSERT_TRUE(comm.StartReadThread());

  // Ensure that we can safely synchronize with the read thread while it is
  // closing the read end (in response to us closing the write end).
  pipe.CloseWriteFileDescriptor();
  comm.SynchronizeWithReadThread();

  ASSERT_TRUE(comm.StopReadThread());
}
#endif

#if LLDB_ENABLE_POSIX
TEST(CommunicationTest, WriteAll) {
  Pipe pipe;
  ASSERT_THAT_ERROR(pipe.CreateNew(/*child_process_inherit=*/false).ToError(),
                    llvm::Succeeded());

  // Make the write end non-blocking in order to easily reproduce a partial
  // write.
  int write_fd = pipe.ReleaseWriteFileDescriptor();
  int flags = fcntl(write_fd, F_GETFL);
  ASSERT_NE(flags, -1);
  ASSERT_NE(fcntl(write_fd, F_SETFL, flags | O_NONBLOCK), -1);

  ConnectionFileDescriptor read_conn{pipe.ReleaseReadFileDescriptor(),
                                     /*owns_fd=*/true};
  Communication write_comm("test");
  write_comm.SetConnection(
      std::make_unique<ConnectionFileDescriptor>(write_fd, /*owns_fd=*/true));

  std::thread read_thread{[&read_conn]() {
    // Read using a smaller buffer to increase chances of partial write.
    char buf[128 * 1024];
    lldb::ConnectionStatus conn_status;

    do {
      read_conn.Read(buf, sizeof(buf), std::chrono::seconds(1), conn_status,
                     nullptr);
    } while (conn_status != lldb::eConnectionStatusEndOfFile);
  }};

  // Write 1 MiB of data into the pipe.
  lldb::ConnectionStatus conn_status;
  Status error;
  std::vector<uint8_t> data(1024 * 1024, 0x80);
  EXPECT_EQ(write_comm.WriteAll(data.data(), data.size(), conn_status, &error),
            data.size());
  EXPECT_EQ(conn_status, lldb::eConnectionStatusSuccess);
  EXPECT_FALSE(error.Fail());

  // Close the write end in order to trigger EOF.
  write_comm.Disconnect();
  read_thread.join();
}
#endif
Recommit "[lldb/Core] Fix a race in the Communication class" The synchronization logic in the previous had a subtle bug. Moving of the "m_read_thread_did_exit = true" into the critical section made it possible for some threads calling SynchronizeWithReadThread call to get stuck. This could happen if there were already past the point where they checked this variable. In that case, they would block on waiting for the eBroadcastBitNoMorePendingInput event, which would never come as the read thread was blocked on getting the synchronization mutex. The new version moves that line out of the critical section and before the sending of the eBroadcastBitNoMorePendingInput event, and also adds some comments to explain why the things need to be in this sequence: - m_read_thread_did_exit = true: prevents new threads for waiting on events - eBroadcastBitNoMorePendingInput: unblock any current thread waiting for the event - Disconnect(): close the connection. This is the only bit that needs to be in the critical section, and this is to ensure that we don't close the connection while the synchronizing thread is mucking with it. Original commit message follows: Communication::SynchronizeWithReadThread is called whenever a process stops to ensure that we process all of its stdout before we report the stop. If the process exits, we first call this method, and then close the connection. However, when the child process exits, the thread reading its stdout will usually (but not always) read an EOF because the other end of the pty has been closed. In response to an EOF, the Communication read thread closes it's end of the connection too. This can result in a race where the read thread is closing the connection while the synchronizing thread is attempting to get its attention via Connection::InterruptRead. The fix is to hold the synchronization mutex while closing the connection. I've found this issue while tracking down a rare flake in some of the vscode tests. I am not sure this is the cause of those failures (as I would have expected this issue to manifest itself differently), but it is an issue nonetheless. The attached test demonstrates the steps needed to reproduce the race. It will fail under tsan without this patch. Reviewers: clayborg, JDevlieghere Subscribers: mgorny, lldb-commits Tags: #lldb Differential Revision: https://reviews.llvm.org/D77295 2020-04-02 19:54:54 +08:00			`//===-- CommunicationTest.cpp ---------------------------------------------===//`
			`//`
			`// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.`
			`// See https://llvm.org/LICENSE.txt for license information.`
			`// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception`
			`//`
			`//===----------------------------------------------------------------------===//`

			`#include "lldb/Core/Communication.h"`
[lldb] [Communication] Add a WriteAll() method that resumes writing Add a Communication::WriteAll() that resumes Write() if the initial call did not write all data. Use it in GDBRemoteCommunication when sending packets in order to fix handling partial writes (i.e. just resume/retry them rather than erring out). This fixes LLDB failures when writing large packets to a pty. Differential Revision: https://reviews.llvm.org/D112169 2021-10-21 03:28:25 +08:00			`#include "lldb/Host/Config.h"`
Recommit "[lldb/Core] Fix a race in the Communication class" The synchronization logic in the previous had a subtle bug. Moving of the "m_read_thread_did_exit = true" into the critical section made it possible for some threads calling SynchronizeWithReadThread call to get stuck. This could happen if there were already past the point where they checked this variable. In that case, they would block on waiting for the eBroadcastBitNoMorePendingInput event, which would never come as the read thread was blocked on getting the synchronization mutex. The new version moves that line out of the critical section and before the sending of the eBroadcastBitNoMorePendingInput event, and also adds some comments to explain why the things need to be in this sequence: - m_read_thread_did_exit = true: prevents new threads for waiting on events - eBroadcastBitNoMorePendingInput: unblock any current thread waiting for the event - Disconnect(): close the connection. This is the only bit that needs to be in the critical section, and this is to ensure that we don't close the connection while the synchronizing thread is mucking with it. Original commit message follows: Communication::SynchronizeWithReadThread is called whenever a process stops to ensure that we process all of its stdout before we report the stop. If the process exits, we first call this method, and then close the connection. However, when the child process exits, the thread reading its stdout will usually (but not always) read an EOF because the other end of the pty has been closed. In response to an EOF, the Communication read thread closes it's end of the connection too. This can result in a race where the read thread is closing the connection while the synchronizing thread is attempting to get its attention via Connection::InterruptRead. The fix is to hold the synchronization mutex while closing the connection. I've found this issue while tracking down a rare flake in some of the vscode tests. I am not sure this is the cause of those failures (as I would have expected this issue to manifest itself differently), but it is an issue nonetheless. The attached test demonstrates the steps needed to reproduce the race. It will fail under tsan without this patch. Reviewers: clayborg, JDevlieghere Subscribers: mgorny, lldb-commits Tags: #lldb Differential Revision: https://reviews.llvm.org/D77295 2020-04-02 19:54:54 +08:00			`#include "lldb/Host/ConnectionFileDescriptor.h"`
			`#include "lldb/Host/Pipe.h"`
			`#include "llvm/Testing/Support/Error.h"`
			`#include "gtest/gtest.h"`

[lldb] [Communication] Add a WriteAll() method that resumes writing Add a Communication::WriteAll() that resumes Write() if the initial call did not write all data. Use it in GDBRemoteCommunication when sending packets in order to fix handling partial writes (i.e. just resume/retry them rather than erring out). This fixes LLDB failures when writing large packets to a pty. Differential Revision: https://reviews.llvm.org/D112169 2021-10-21 03:28:25 +08:00			`#include <thread>`

			`#if LLDB_ENABLE_POSIX`
			`#include <fcntl.h>`
			`#endif`

Recommit "[lldb/Core] Fix a race in the Communication class" The synchronization logic in the previous had a subtle bug. Moving of the "m_read_thread_did_exit = true" into the critical section made it possible for some threads calling SynchronizeWithReadThread call to get stuck. This could happen if there were already past the point where they checked this variable. In that case, they would block on waiting for the eBroadcastBitNoMorePendingInput event, which would never come as the read thread was blocked on getting the synchronization mutex. The new version moves that line out of the critical section and before the sending of the eBroadcastBitNoMorePendingInput event, and also adds some comments to explain why the things need to be in this sequence: - m_read_thread_did_exit = true: prevents new threads for waiting on events - eBroadcastBitNoMorePendingInput: unblock any current thread waiting for the event - Disconnect(): close the connection. This is the only bit that needs to be in the critical section, and this is to ensure that we don't close the connection while the synchronizing thread is mucking with it. Original commit message follows: Communication::SynchronizeWithReadThread is called whenever a process stops to ensure that we process all of its stdout before we report the stop. If the process exits, we first call this method, and then close the connection. However, when the child process exits, the thread reading its stdout will usually (but not always) read an EOF because the other end of the pty has been closed. In response to an EOF, the Communication read thread closes it's end of the connection too. This can result in a race where the read thread is closing the connection while the synchronizing thread is attempting to get its attention via Connection::InterruptRead. The fix is to hold the synchronization mutex while closing the connection. I've found this issue while tracking down a rare flake in some of the vscode tests. I am not sure this is the cause of those failures (as I would have expected this issue to manifest itself differently), but it is an issue nonetheless. The attached test demonstrates the steps needed to reproduce the race. It will fail under tsan without this patch. Reviewers: clayborg, JDevlieghere Subscribers: mgorny, lldb-commits Tags: #lldb Differential Revision: https://reviews.llvm.org/D77295 2020-04-02 19:54:54 +08:00			`using namespace lldb_private;`

[lldb] Disable the new Communication test on windows The ConnectionFileDescriptor class on windows does not support interruption (see the BytesAvailable method). Therefore this test makes no sense there. 2020-04-09 21:19:24 +08:00			`#ifndef _WIN32`
Recommit "[lldb/Core] Fix a race in the Communication class" The synchronization logic in the previous had a subtle bug. Moving of the "m_read_thread_did_exit = true" into the critical section made it possible for some threads calling SynchronizeWithReadThread call to get stuck. This could happen if there were already past the point where they checked this variable. In that case, they would block on waiting for the eBroadcastBitNoMorePendingInput event, which would never come as the read thread was blocked on getting the synchronization mutex. The new version moves that line out of the critical section and before the sending of the eBroadcastBitNoMorePendingInput event, and also adds some comments to explain why the things need to be in this sequence: - m_read_thread_did_exit = true: prevents new threads for waiting on events - eBroadcastBitNoMorePendingInput: unblock any current thread waiting for the event - Disconnect(): close the connection. This is the only bit that needs to be in the critical section, and this is to ensure that we don't close the connection while the synchronizing thread is mucking with it. Original commit message follows: Communication::SynchronizeWithReadThread is called whenever a process stops to ensure that we process all of its stdout before we report the stop. If the process exits, we first call this method, and then close the connection. However, when the child process exits, the thread reading its stdout will usually (but not always) read an EOF because the other end of the pty has been closed. In response to an EOF, the Communication read thread closes it's end of the connection too. This can result in a race where the read thread is closing the connection while the synchronizing thread is attempting to get its attention via Connection::InterruptRead. The fix is to hold the synchronization mutex while closing the connection. I've found this issue while tracking down a rare flake in some of the vscode tests. I am not sure this is the cause of those failures (as I would have expected this issue to manifest itself differently), but it is an issue nonetheless. The attached test demonstrates the steps needed to reproduce the race. It will fail under tsan without this patch. Reviewers: clayborg, JDevlieghere Subscribers: mgorny, lldb-commits Tags: #lldb Differential Revision: https://reviews.llvm.org/D77295 2020-04-02 19:54:54 +08:00			`TEST(CommunicationTest, SynchronizeWhileClosing) {`
			`// Set up a communication object reading from a pipe.`
			`Pipe pipe;`
			`ASSERT_THAT_ERROR(pipe.CreateNew(/child_process_inherit=/false).ToError(),`
			`llvm::Succeeded());`

			`Communication comm("test");`
			`comm.SetConnection(std::make_unique<ConnectionFileDescriptor>(`
			`pipe.ReleaseReadFileDescriptor(), /owns_fd=/true));`
			`comm.SetCloseOnEOF(true);`
			`ASSERT_TRUE(comm.StartReadThread());`

			`// Ensure that we can safely synchronize with the read thread while it is`
			`// closing the read end (in response to us closing the write end).`
			`pipe.CloseWriteFileDescriptor();`
			`comm.SynchronizeWithReadThread();`

			`ASSERT_TRUE(comm.StopReadThread());`
			`}`
[lldb] Disable the new Communication test on windows The ConnectionFileDescriptor class on windows does not support interruption (see the BytesAvailable method). Therefore this test makes no sense there. 2020-04-09 21:19:24 +08:00			`#endif`
[lldb] [Communication] Add a WriteAll() method that resumes writing Add a Communication::WriteAll() that resumes Write() if the initial call did not write all data. Use it in GDBRemoteCommunication when sending packets in order to fix handling partial writes (i.e. just resume/retry them rather than erring out). This fixes LLDB failures when writing large packets to a pty. Differential Revision: https://reviews.llvm.org/D112169 2021-10-21 03:28:25 +08:00
			`#if LLDB_ENABLE_POSIX`
			`TEST(CommunicationTest, WriteAll) {`
			`Pipe pipe;`
			`ASSERT_THAT_ERROR(pipe.CreateNew(/child_process_inherit=/false).ToError(),`
			`llvm::Succeeded());`

			`// Make the write end non-blocking in order to easily reproduce a partial`
			`// write.`
			`int write_fd = pipe.ReleaseWriteFileDescriptor();`
			`int flags = fcntl(write_fd, F_GETFL);`
			`ASSERT_NE(flags, -1);`
			`ASSERT_NE(fcntl(write_fd, F_SETFL, flags \| O_NONBLOCK), -1);`

			`ConnectionFileDescriptor read_conn{pipe.ReleaseReadFileDescriptor(),`
			`/owns_fd=/true};`
			`Communication write_comm("test");`
			`write_comm.SetConnection(`
			`std::make_unique<ConnectionFileDescriptor>(write_fd, /owns_fd=/true));`

			`std::thread read_thread{[&read_conn]() {`
			`// Read using a smaller buffer to increase chances of partial write.`
			`char buf[128 * 1024];`
			`lldb::ConnectionStatus conn_status;`

			`do {`
			`read_conn.Read(buf, sizeof(buf), std::chrono::seconds(1), conn_status,`
			`nullptr);`
			`} while (conn_status != lldb::eConnectionStatusEndOfFile);`
			`}};`

			`// Write 1 MiB of data into the pipe.`
			`lldb::ConnectionStatus conn_status;`
			`Status error;`
[lldb] Fix compiler warning in CommunicationTest "0x80 changes value when converted to char" 2022-01-18 17:30:33 +08:00			`std::vector<uint8_t> data(1024 * 1024, 0x80);`
[lldb] [Communication] Add a WriteAll() method that resumes writing Add a Communication::WriteAll() that resumes Write() if the initial call did not write all data. Use it in GDBRemoteCommunication when sending packets in order to fix handling partial writes (i.e. just resume/retry them rather than erring out). This fixes LLDB failures when writing large packets to a pty. Differential Revision: https://reviews.llvm.org/D112169 2021-10-21 03:28:25 +08:00			`EXPECT_EQ(write_comm.WriteAll(data.data(), data.size(), conn_status, &error),`
			`data.size());`
			`EXPECT_EQ(conn_status, lldb::eConnectionStatusSuccess);`
			`EXPECT_FALSE(error.Fail());`

			`// Close the write end in order to trigger EOF.`
			`write_comm.Disconnect();`
			`read_thread.join();`
			`}`
			`#endif`