llvm-project/clang/lib/Frontend/FrontendAction.cpp

Ignoring revisions in .git-blame-ignore-revs. Click here to bypass and see the normal blame view.

1204 lines
46 KiB
C++
Raw Normal View History

//===--- FrontendAction.cpp -----------------------------------------------===//
//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//
//===----------------------------------------------------------------------===//
#include "clang/Frontend/FrontendAction.h"
#include "clang/AST/ASTConsumer.h"
#include "clang/AST/ASTContext.h"
#include "clang/AST/DeclGroup.h"
#include "clang/Basic/Builtins.h"
#include "clang/Basic/LangStandard.h"
#include "clang/Frontend/ASTUnit.h"
#include "clang/Frontend/CompilerInstance.h"
#include "clang/Frontend/FrontendDiagnostic.h"
#include "clang/Frontend/FrontendPluginRegistry.h"
Extend the ExternalASTSource interface to allow the AST source to provide the layout of records, rather than letting Clang compute the layout itself. LLDB provides the motivation for this feature: because various layout-altering attributes (packed, aligned, etc.) don't get reliably get placed into DWARF, the record layouts computed by LLDB from the reconstructed records differ from the actual layouts, and badness occurs. This interface lets the DWARF data drive layout, so we don't need the attributes preserved to get the answer write. The testing methodology for this change is fun. I've introduced a variant of -fdump-record-layouts called -fdump-record-layouts-simple that always has the simple C format and provides size/alignment/field offsets. There is also a -cc1 option -foverride-record-layout=<file> to take the output of -fdump-record-layouts-simple and parse it to produce a set of overridden layouts, which is introduced into the AST via a testing-only ExternalASTSource (called LayoutOverrideSource). Each test contains a number of records to lay out, which use various layout-changing attributes, and then dumps the layouts. We then run the test again, using the preprocessor to eliminate the layout-changing attributes entirely (which would give us different layouts for the records), but supplying the previously-computed record layouts. Finally, we diff the layouts produced from the two runs to be sure that they are identical. Note that this code makes the assumption that we don't *have* to provide the offsets of bases or virtual bases to get the layout right, because the alignment attributes don't affect it. I believe this assumption holds, but if it does not, we can extend LayoutOverrideSource to also provide base offset information. Fixes the Clang side of <rdar://problem/10169539>. llvm-svn: 149055
2012-01-26 15:55:45 +08:00
#include "clang/Frontend/LayoutOverrideSource.h"
#include "clang/Frontend/MultiplexConsumer.h"
#include "clang/Frontend/Utils.h"
#include "clang/Lex/HeaderSearch.h"
#include "clang/Lex/LiteralSupport.h"
#include "clang/Lex/Preprocessor.h"
#include "clang/Lex/PreprocessorOptions.h"
#include "clang/Parse/ParseAST.h"
#include "clang/Serialization/ASTDeserializationListener.h"
#include "clang/Serialization/ASTReader.h"
#include "clang/Serialization/GlobalModuleIndex.h"
#include "llvm/ADT/ScopeExit.h"
#include "llvm/Support/BuryPointer.h"
#include "llvm/Support/ErrorHandling.h"
#include "llvm/Support/FileSystem.h"
#include "llvm/Support/Path.h"
#include "llvm/Support/Timer.h"
#include "llvm/Support/raw_ostream.h"
#include <system_error>
using namespace clang;
Reapply r276973 "Adjust Registry interface to not require plugins to export a registry" This differs from the previous version by being more careful about template instantiation/specialization in order to prevent errors when building with clang -Werror. Specifically: * begin is not defined in the template and is instead instantiated when Head is. I think the warning when we don't do that is wrong (PR28815) but for now at least do it this way to avoid the warning. * Instead of performing template specializations in LLVM_INSTANTIATE_REGISTRY instead provide a template definition then do explicit instantiation. No compiler I've tried has problems with doing it the other way, but strictly speaking it's not permitted by the C++ standard so better safe than sorry. Original commit message: Currently the Registry class contains the vestiges of a previous attempt to allow plugins to be used on Windows without using BUILD_SHARED_LIBS, where a plugin would have its own copy of a registry and export it to be imported by the tool that's loading the plugin. This only works if the plugin is entirely self-contained with the only interface between the plugin and tool being the registry, and in particular this conflicts with how IR pass plugins work. This patch changes things so that instead the add_node function of the registry is exported by the tool and then imported by the plugin, which solves this problem and also means that instead of every plugin having to export every registry they use instead LLVM only has to export the add_node functions. This allows plugins that use a registry to work on Windows if LLVM_EXPORT_SYMBOLS_FOR_PLUGINS is used. llvm-svn: 277806
2016-08-05 19:01:08 +08:00
LLVM_INSTANTIATE_REGISTRY(FrontendPluginRegistry)
namespace {
class DelegatingDeserializationListener : public ASTDeserializationListener {
ASTDeserializationListener *Previous;
bool DeletePrevious;
public:
explicit DelegatingDeserializationListener(
ASTDeserializationListener *Previous, bool DeletePrevious)
: Previous(Previous), DeletePrevious(DeletePrevious) {}
~DelegatingDeserializationListener() override {
if (DeletePrevious)
delete Previous;
}
void ReaderInitialized(ASTReader *Reader) override {
if (Previous)
Previous->ReaderInitialized(Reader);
}
void IdentifierRead(serialization::IdentID ID,
IdentifierInfo *II) override {
if (Previous)
Previous->IdentifierRead(ID, II);
}
void TypeRead(serialization::TypeIdx Idx, QualType T) override {
if (Previous)
Previous->TypeRead(Idx, T);
}
void DeclRead(serialization::DeclID ID, const Decl *D) override {
if (Previous)
Previous->DeclRead(ID, D);
}
void SelectorRead(serialization::SelectorID ID, Selector Sel) override {
if (Previous)
Previous->SelectorRead(ID, Sel);
}
void MacroDefinitionRead(serialization::PreprocessedEntityID PPID,
MacroDefinitionRecord *MD) override {
if (Previous)
Previous->MacroDefinitionRead(PPID, MD);
}
};
/// Dumps deserialized declarations.
class DeserializedDeclsDumper : public DelegatingDeserializationListener {
public:
explicit DeserializedDeclsDumper(ASTDeserializationListener *Previous,
bool DeletePrevious)
: DelegatingDeserializationListener(Previous, DeletePrevious) {}
void DeclRead(serialization::DeclID ID, const Decl *D) override {
llvm::outs() << "PCH DECL: " << D->getDeclKindName();
if (const NamedDecl *ND = dyn_cast<NamedDecl>(D)) {
llvm::outs() << " - ";
ND->printQualifiedName(llvm::outs());
}
llvm::outs() << "\n";
DelegatingDeserializationListener::DeclRead(ID, D);
}
};
/// Checks deserialized declarations and emits error if a name
2012-05-30 01:05:42 +08:00
/// matches one given in command-line using -error-on-deserialized-decl.
class DeserializedDeclsChecker : public DelegatingDeserializationListener {
ASTContext &Ctx;
std::set<std::string> NamesToCheck;
public:
DeserializedDeclsChecker(ASTContext &Ctx,
const std::set<std::string> &NamesToCheck,
ASTDeserializationListener *Previous,
bool DeletePrevious)
: DelegatingDeserializationListener(Previous, DeletePrevious), Ctx(Ctx),
NamesToCheck(NamesToCheck) {}
2012-05-30 01:05:42 +08:00
void DeclRead(serialization::DeclID ID, const Decl *D) override {
2012-05-30 01:05:42 +08:00
if (const NamedDecl *ND = dyn_cast<NamedDecl>(D))
if (NamesToCheck.find(ND->getNameAsString()) != NamesToCheck.end()) {
unsigned DiagID
= Ctx.getDiagnostics().getCustomDiagID(DiagnosticsEngine::Error,
"%0 was deserialized");
Ctx.getDiagnostics().Report(Ctx.getFullLoc(D->getLocation()), DiagID)
<< ND;
2012-05-30 01:05:42 +08:00
}
DelegatingDeserializationListener::DeclRead(ID, D);
}
};
} // end anonymous namespace
FrontendAction::FrontendAction() : Instance(nullptr) {}
FrontendAction::~FrontendAction() {}
void FrontendAction::setCurrentInput(const FrontendInputFile &CurrentInput,
std::unique_ptr<ASTUnit> AST) {
this->CurrentInput = CurrentInput;
CurrentASTUnit = std::move(AST);
}
Module *FrontendAction::getCurrentModule() const {
CompilerInstance &CI = getCompilerInstance();
return CI.getPreprocessor().getHeaderSearchInfo().lookupModule(
CI.getLangOpts().CurrentModule, SourceLocation(), /*AllowSearch*/false);
}
std::unique_ptr<ASTConsumer>
FrontendAction::CreateWrappedASTConsumer(CompilerInstance &CI,
StringRef InFile) {
std::unique_ptr<ASTConsumer> Consumer = CreateASTConsumer(CI, InFile);
if (!Consumer)
return nullptr;
// Validate -add-plugin args.
bool FoundAllPlugins = true;
for (const std::string &Arg : CI.getFrontendOpts().AddPluginActions) {
bool Found = false;
for (const FrontendPluginRegistry::entry &Plugin :
FrontendPluginRegistry::entries()) {
if (Plugin.getName() == Arg)
Found = true;
}
if (!Found) {
CI.getDiagnostics().Report(diag::err_fe_invalid_plugin_name) << Arg;
FoundAllPlugins = false;
}
}
if (!FoundAllPlugins)
return nullptr;
// If there are no registered plugins we don't need to wrap the consumer
if (FrontendPluginRegistry::begin() == FrontendPluginRegistry::end())
return Consumer;
// If this is a code completion run, avoid invoking the plugin consumers
if (CI.hasCodeCompletionConsumer())
return Consumer;
// Collect the list of plugins that go before the main action (in Consumers)
// or after it (in AfterConsumers)
std::vector<std::unique_ptr<ASTConsumer>> Consumers;
std::vector<std::unique_ptr<ASTConsumer>> AfterConsumers;
for (const FrontendPluginRegistry::entry &Plugin :
FrontendPluginRegistry::entries()) {
std::unique_ptr<PluginASTAction> P = Plugin.instantiate();
PluginASTAction::ActionType ActionType = P->getActionType();
if (ActionType == PluginASTAction::CmdlineAfterMainAction ||
ActionType == PluginASTAction::CmdlineBeforeMainAction) {
// This is O(|plugins| * |add_plugins|), but since both numbers are
// way below 50 in practice, that's ok.
if (llvm::any_of(CI.getFrontendOpts().AddPluginActions,
[&](const std::string &PluginAction) {
return PluginAction == Plugin.getName();
})) {
if (ActionType == PluginASTAction::CmdlineBeforeMainAction)
ActionType = PluginASTAction::AddBeforeMainAction;
else
ActionType = PluginASTAction::AddAfterMainAction;
}
}
if ((ActionType == PluginASTAction::AddBeforeMainAction ||
ActionType == PluginASTAction::AddAfterMainAction) &&
P->ParseArgs(
CI,
CI.getFrontendOpts().PluginArgs[std::string(Plugin.getName())])) {
std::unique_ptr<ASTConsumer> PluginConsumer = P->CreateASTConsumer(CI, InFile);
if (ActionType == PluginASTAction::AddBeforeMainAction) {
Consumers.push_back(std::move(PluginConsumer));
} else {
AfterConsumers.push_back(std::move(PluginConsumer));
}
}
}
// Add to Consumers the main consumer, then all the plugins that go after it
Consumers.push_back(std::move(Consumer));
if (!AfterConsumers.empty()) {
// If we have plugins after the main consumer, which may be the codegen
// action, they likely will need the ASTContext, so don't clear it in the
// codegen action.
CI.getCodeGenOpts().ClearASTBeforeBackend = false;
for (auto &C : AfterConsumers)
Consumers.push_back(std::move(C));
}
return std::make_unique<MultiplexConsumer>(std::move(Consumers));
}
/// For preprocessed files, if the first line is the linemarker and specifies
/// the original source file name, use that name as the input file name.
/// Returns the location of the first token after the line marker directive.
///
/// \param CI The compiler instance.
/// \param InputFile Populated with the filename from the line marker.
/// \param IsModuleMap If \c true, add a line note corresponding to this line
/// directive. (We need to do this because the directive will not be
/// visited by the preprocessor.)
static SourceLocation ReadOriginalFileName(CompilerInstance &CI,
std::string &InputFile,
bool IsModuleMap = false) {
auto &SourceMgr = CI.getSourceManager();
auto MainFileID = SourceMgr.getMainFileID();
auto MainFileBuf = SourceMgr.getBufferOrNone(MainFileID);
if (!MainFileBuf)
return SourceLocation();
std::unique_ptr<Lexer> RawLexer(
new Lexer(MainFileID, *MainFileBuf, SourceMgr, CI.getLangOpts()));
// If the first line has the syntax of
//
// # NUM "FILENAME"
//
// we use FILENAME as the input file name.
Token T;
if (RawLexer->LexFromRawLexer(T) || T.getKind() != tok::hash)
return SourceLocation();
if (RawLexer->LexFromRawLexer(T) || T.isAtStartOfLine() ||
T.getKind() != tok::numeric_constant)
return SourceLocation();
unsigned LineNo;
SourceLocation LineNoLoc = T.getLocation();
if (IsModuleMap) {
llvm::SmallString<16> Buffer;
if (Lexer::getSpelling(LineNoLoc, Buffer, SourceMgr, CI.getLangOpts())
.getAsInteger(10, LineNo))
return SourceLocation();
}
RawLexer->LexFromRawLexer(T);
if (T.isAtStartOfLine() || T.getKind() != tok::string_literal)
return SourceLocation();
StringLiteralParser Literal(T, CI.getPreprocessor());
if (Literal.hadError)
return SourceLocation();
RawLexer->LexFromRawLexer(T);
if (T.isNot(tok::eof) && !T.isAtStartOfLine())
return SourceLocation();
InputFile = Literal.GetString().str();
if (IsModuleMap)
CI.getSourceManager().AddLineNote(
LineNoLoc, LineNo, SourceMgr.getLineTableFilenameID(InputFile), false,
false, SrcMgr::C_User_ModuleMap);
return T.getLocation();
}
static SmallVectorImpl<char> &
operator+=(SmallVectorImpl<char> &Includes, StringRef RHS) {
Includes.append(RHS.begin(), RHS.end());
return Includes;
}
static void addHeaderInclude(StringRef HeaderName,
SmallVectorImpl<char> &Includes,
const LangOptions &LangOpts,
bool IsExternC) {
if (IsExternC && LangOpts.CPlusPlus)
Includes += "extern \"C\" {\n";
if (LangOpts.ObjC)
Includes += "#import \"";
else
Includes += "#include \"";
Includes += HeaderName;
Includes += "\"\n";
if (IsExternC && LangOpts.CPlusPlus)
Includes += "}\n";
}
/// Collect the set of header includes needed to construct the given
/// module and update the TopHeaders file set of the module.
///
/// \param Module The module we're collecting includes from.
///
/// \param Includes Will be augmented with the set of \#includes or \#imports
/// needed to load all of the named headers.
Support lazy stat'ing of files referenced by module maps. This patch adds support for a `header` declaration in a module map to specify certain `stat` information (currently, size and mtime) about that header file. This has two purposes: - It removes the need to eagerly `stat` every file referenced by a module map. Instead, we track a list of unresolved header files with each size / mtime (actually, for simplicity, we track submodules with such headers), and when attempting to look up a header file based on a `FileEntry`, we check if there are any unresolved header directives with that `FileEntry`'s size / mtime and perform deferred `stat`s if so. - It permits a preprocessed module to be compiled without the original files being present on disk. The only reason we used to need those files was to get the `stat` information in order to do header -> module lookups when using the module. If we're provided with the `stat` information in the preprocessed module, we can avoid requiring the files to exist. Unlike most `header` directives, if a `header` directive with `stat` information has no corresponding on-disk file the enclosing module is *not* marked unavailable (so that behavior is consistent regardless of whether we've resolved a header directive, and so that preprocessed modules don't get marked unavailable). We could actually do this for all `header` directives: the only reason we mark the module unavailable if headers are missing is to give a diagnostic slightly earlier (rather than waiting until we actually try to build the module / load and validate its .pcm file). Differential Revision: https://reviews.llvm.org/D33703 llvm-svn: 304515
2017-06-02 09:55:39 +08:00
static std::error_code collectModuleHeaderIncludes(
const LangOptions &LangOpts, FileManager &FileMgr, DiagnosticsEngine &Diag,
ModuleMap &ModMap, clang::Module *Module, SmallVectorImpl<char> &Includes) {
// Don't collect any headers for unavailable modules.
if (!Module->isAvailable())
return std::error_code();
Support lazy stat'ing of files referenced by module maps. This patch adds support for a `header` declaration in a module map to specify certain `stat` information (currently, size and mtime) about that header file. This has two purposes: - It removes the need to eagerly `stat` every file referenced by a module map. Instead, we track a list of unresolved header files with each size / mtime (actually, for simplicity, we track submodules with such headers), and when attempting to look up a header file based on a `FileEntry`, we check if there are any unresolved header directives with that `FileEntry`'s size / mtime and perform deferred `stat`s if so. - It permits a preprocessed module to be compiled without the original files being present on disk. The only reason we used to need those files was to get the `stat` information in order to do header -> module lookups when using the module. If we're provided with the `stat` information in the preprocessed module, we can avoid requiring the files to exist. Unlike most `header` directives, if a `header` directive with `stat` information has no corresponding on-disk file the enclosing module is *not* marked unavailable (so that behavior is consistent regardless of whether we've resolved a header directive, and so that preprocessed modules don't get marked unavailable). We could actually do this for all `header` directives: the only reason we mark the module unavailable if headers are missing is to give a diagnostic slightly earlier (rather than waiting until we actually try to build the module / load and validate its .pcm file). Differential Revision: https://reviews.llvm.org/D33703 llvm-svn: 304515
2017-06-02 09:55:39 +08:00
// Resolve all lazy header directives to header files.
ModMap.resolveHeaderDirectives(Module, /*File=*/llvm::None);
Support lazy stat'ing of files referenced by module maps. This patch adds support for a `header` declaration in a module map to specify certain `stat` information (currently, size and mtime) about that header file. This has two purposes: - It removes the need to eagerly `stat` every file referenced by a module map. Instead, we track a list of unresolved header files with each size / mtime (actually, for simplicity, we track submodules with such headers), and when attempting to look up a header file based on a `FileEntry`, we check if there are any unresolved header directives with that `FileEntry`'s size / mtime and perform deferred `stat`s if so. - It permits a preprocessed module to be compiled without the original files being present on disk. The only reason we used to need those files was to get the `stat` information in order to do header -> module lookups when using the module. If we're provided with the `stat` information in the preprocessed module, we can avoid requiring the files to exist. Unlike most `header` directives, if a `header` directive with `stat` information has no corresponding on-disk file the enclosing module is *not* marked unavailable (so that behavior is consistent regardless of whether we've resolved a header directive, and so that preprocessed modules don't get marked unavailable). We could actually do this for all `header` directives: the only reason we mark the module unavailable if headers are missing is to give a diagnostic slightly earlier (rather than waiting until we actually try to build the module / load and validate its .pcm file). Differential Revision: https://reviews.llvm.org/D33703 llvm-svn: 304515
2017-06-02 09:55:39 +08:00
// If any headers are missing, we can't build this module. In most cases,
// diagnostics for this should have already been produced; we only get here
// if explicit stat information was provided.
// FIXME: If the name resolves to a file with different stat information,
// produce a better diagnostic.
if (!Module->MissingHeaders.empty()) {
auto &MissingHeader = Module->MissingHeaders.front();
Diag.Report(MissingHeader.FileNameLoc, diag::err_module_header_missing)
<< MissingHeader.IsUmbrella << MissingHeader.FileName;
return std::error_code();
}
// Add includes for each of these headers.
for (auto HK : {Module::HK_Normal, Module::HK_Private}) {
for (Module::Header &H : Module->Headers[HK]) {
Module->addTopHeader(H.Entry);
// Use the path as specified in the module map file. We'll look for this
// file relative to the module build directory (the directory containing
// the module map file) so this will find the same file that we found
// while parsing the module map.
addHeaderInclude(H.PathRelativeToRootModuleDirectory, Includes, LangOpts,
Module->IsExternC);
}
}
// Note that Module->PrivateHeaders will not be a TopHeader.
if (Module::Header UmbrellaHeader = Module->getUmbrellaHeader()) {
Module->addTopHeader(UmbrellaHeader.Entry);
if (Module->Parent)
// Include the umbrella header for submodules.
addHeaderInclude(UmbrellaHeader.PathRelativeToRootModuleDirectory,
Includes, LangOpts, Module->IsExternC);
} else if (Module::DirectoryName UmbrellaDir = Module->getUmbrellaDir()) {
// Add all of the headers we find in this subdirectory.
std::error_code EC;
SmallString<128> DirNative;
llvm::sys::path::native(UmbrellaDir.Entry->getName(), DirNative);
llvm::vfs::FileSystem &FS = FileMgr.getVirtualFileSystem();
SmallVector<std::pair<std::string, const FileEntry *>, 8> Headers;
for (llvm::vfs::recursive_directory_iterator Dir(FS, DirNative, EC), End;
Dir != End && !EC; Dir.increment(EC)) {
// Check whether this entry has an extension typically associated with
// headers.
if (!llvm::StringSwitch<bool>(llvm::sys::path::extension(Dir->path()))
.Cases(".h", ".H", ".hh", ".hpp", true)
.Default(false))
continue;
auto Header = FileMgr.getFile(Dir->path());
// FIXME: This shouldn't happen unless there is a file system race. Is
// that worth diagnosing?
if (!Header)
continue;
// If this header is marked 'unavailable' in this module, don't include
// it.
if (ModMap.isHeaderUnavailableInModule(*Header, Module))
continue;
// Compute the relative path from the directory to this file.
SmallVector<StringRef, 16> Components;
auto PathIt = llvm::sys::path::rbegin(Dir->path());
for (int I = 0; I != Dir.level() + 1; ++I, ++PathIt)
Components.push_back(*PathIt);
SmallString<128> RelativeHeader(
UmbrellaDir.PathRelativeToRootModuleDirectory);
for (auto It = Components.rbegin(), End = Components.rend(); It != End;
++It)
llvm::sys::path::append(RelativeHeader, *It);
std::string RelName = RelativeHeader.c_str();
Headers.push_back(std::make_pair(RelName, *Header));
}
if (EC)
return EC;
// Sort header paths and make the header inclusion order deterministic
// across different OSs and filesystems.
llvm::sort(Headers.begin(), Headers.end(), [](
const std::pair<std::string, const FileEntry *> &LHS,
const std::pair<std::string, const FileEntry *> &RHS) {
return LHS.first < RHS.first;
});
for (auto &H : Headers) {
// Include this header as part of the umbrella directory.
Module->addTopHeader(H.second);
addHeaderInclude(H.first, Includes, LangOpts, Module->IsExternC);
}
}
// Recurse into submodules.
for (clang::Module::submodule_iterator Sub = Module->submodule_begin(),
SubEnd = Module->submodule_end();
Sub != SubEnd; ++Sub)
if (std::error_code Err = collectModuleHeaderIncludes(
Support lazy stat'ing of files referenced by module maps. This patch adds support for a `header` declaration in a module map to specify certain `stat` information (currently, size and mtime) about that header file. This has two purposes: - It removes the need to eagerly `stat` every file referenced by a module map. Instead, we track a list of unresolved header files with each size / mtime (actually, for simplicity, we track submodules with such headers), and when attempting to look up a header file based on a `FileEntry`, we check if there are any unresolved header directives with that `FileEntry`'s size / mtime and perform deferred `stat`s if so. - It permits a preprocessed module to be compiled without the original files being present on disk. The only reason we used to need those files was to get the `stat` information in order to do header -> module lookups when using the module. If we're provided with the `stat` information in the preprocessed module, we can avoid requiring the files to exist. Unlike most `header` directives, if a `header` directive with `stat` information has no corresponding on-disk file the enclosing module is *not* marked unavailable (so that behavior is consistent regardless of whether we've resolved a header directive, and so that preprocessed modules don't get marked unavailable). We could actually do this for all `header` directives: the only reason we mark the module unavailable if headers are missing is to give a diagnostic slightly earlier (rather than waiting until we actually try to build the module / load and validate its .pcm file). Differential Revision: https://reviews.llvm.org/D33703 llvm-svn: 304515
2017-06-02 09:55:39 +08:00
LangOpts, FileMgr, Diag, ModMap, *Sub, Includes))
return Err;
return std::error_code();
}
static bool loadModuleMapForModuleBuild(CompilerInstance &CI, bool IsSystem,
bool IsPreprocessed,
std::string &PresumedModuleMapFile,
unsigned &Offset) {
auto &SrcMgr = CI.getSourceManager();
HeaderSearch &HS = CI.getPreprocessor().getHeaderSearchInfo();
// Map the current input to a file.
FileID ModuleMapID = SrcMgr.getMainFileID();
const FileEntry *ModuleMap = SrcMgr.getFileEntryForID(ModuleMapID);
// If the module map is preprocessed, handle the initial line marker;
// line directives are not part of the module map syntax in general.
Offset = 0;
if (IsPreprocessed) {
SourceLocation EndOfLineMarker =
ReadOriginalFileName(CI, PresumedModuleMapFile, /*IsModuleMap*/ true);
if (EndOfLineMarker.isValid())
Offset = CI.getSourceManager().getDecomposedLoc(EndOfLineMarker).second;
}
// Load the module map file.
if (HS.loadModuleMapFile(ModuleMap, IsSystem, ModuleMapID, &Offset,
PresumedModuleMapFile))
return true;
if (SrcMgr.getBufferOrFake(ModuleMapID).getBufferSize() == Offset)
Offset = 0;
// Infer framework module if possible.
if (HS.getModuleMap().canInferFrameworkModule(ModuleMap->getDir())) {
SmallString<128> InferredFrameworkPath = ModuleMap->getDir()->getName();
llvm::sys::path::append(InferredFrameworkPath,
CI.getLangOpts().ModuleName + ".framework");
if (auto Dir = CI.getFileManager().getDirectory(InferredFrameworkPath))
(void)HS.getModuleMap().inferFrameworkModule(*Dir, IsSystem, nullptr);
}
return false;
}
static Module *prepareToBuildModule(CompilerInstance &CI,
StringRef ModuleMapFilename) {
if (CI.getLangOpts().CurrentModule.empty()) {
CI.getDiagnostics().Report(diag::err_missing_module_name);
// FIXME: Eventually, we could consider asking whether there was just
// a single module described in the module map, and use that as a
// default. Then it would be fairly trivial to just "compile" a module
// map with a single module (the common case).
return nullptr;
}
// Dig out the module definition.
HeaderSearch &HS = CI.getPreprocessor().getHeaderSearchInfo();
Module *M = HS.lookupModule(CI.getLangOpts().CurrentModule, SourceLocation(),
/*AllowSearch=*/true);
if (!M) {
CI.getDiagnostics().Report(diag::err_missing_module)
<< CI.getLangOpts().CurrentModule << ModuleMapFilename;
return nullptr;
}
// Check whether we can build this module at all.
if (Preprocessor::checkModuleIsAvailable(CI.getLangOpts(), CI.getTarget(),
CI.getDiagnostics(), M))
return nullptr;
// Inform the preprocessor that includes from within the input buffer should
// be resolved relative to the build directory of the module map file.
CI.getPreprocessor().setMainFileDir(M->Directory);
// If the module was inferred from a different module map (via an expanded
// umbrella module definition), track that fact.
// FIXME: It would be preferable to fill this in as part of processing
// the module map, rather than adding it after the fact.
StringRef OriginalModuleMapName = CI.getFrontendOpts().OriginalModuleMap;
if (!OriginalModuleMapName.empty()) {
auto OriginalModuleMap =
CI.getFileManager().getFile(OriginalModuleMapName,
/*openFile*/ true);
if (!OriginalModuleMap) {
CI.getDiagnostics().Report(diag::err_module_map_not_found)
<< OriginalModuleMapName;
return nullptr;
}
if (*OriginalModuleMap != CI.getSourceManager().getFileEntryForID(
CI.getSourceManager().getMainFileID())) {
M->IsInferred = true;
CI.getPreprocessor().getHeaderSearchInfo().getModuleMap()
.setInferredModuleAllowedBy(M, *OriginalModuleMap);
}
}
// If we're being run from the command-line, the module build stack will not
// have been filled in yet, so complete it now in order to allow us to detect
// module cycles.
SourceManager &SourceMgr = CI.getSourceManager();
if (SourceMgr.getModuleBuildStack().empty())
SourceMgr.pushModuleBuildStack(CI.getLangOpts().CurrentModule,
FullSourceLoc(SourceLocation(), SourceMgr));
return M;
}
/// Compute the input buffer that should be used to build the specified module.
static std::unique_ptr<llvm::MemoryBuffer>
getInputBufferForModule(CompilerInstance &CI, Module *M) {
FileManager &FileMgr = CI.getFileManager();
// Collect the set of #includes we need to build the module.
SmallString<256> HeaderContents;
std::error_code Err = std::error_code();
if (Module::Header UmbrellaHeader = M->getUmbrellaHeader())
addHeaderInclude(UmbrellaHeader.PathRelativeToRootModuleDirectory,
HeaderContents, CI.getLangOpts(), M->IsExternC);
Err = collectModuleHeaderIncludes(
Support lazy stat'ing of files referenced by module maps. This patch adds support for a `header` declaration in a module map to specify certain `stat` information (currently, size and mtime) about that header file. This has two purposes: - It removes the need to eagerly `stat` every file referenced by a module map. Instead, we track a list of unresolved header files with each size / mtime (actually, for simplicity, we track submodules with such headers), and when attempting to look up a header file based on a `FileEntry`, we check if there are any unresolved header directives with that `FileEntry`'s size / mtime and perform deferred `stat`s if so. - It permits a preprocessed module to be compiled without the original files being present on disk. The only reason we used to need those files was to get the `stat` information in order to do header -> module lookups when using the module. If we're provided with the `stat` information in the preprocessed module, we can avoid requiring the files to exist. Unlike most `header` directives, if a `header` directive with `stat` information has no corresponding on-disk file the enclosing module is *not* marked unavailable (so that behavior is consistent regardless of whether we've resolved a header directive, and so that preprocessed modules don't get marked unavailable). We could actually do this for all `header` directives: the only reason we mark the module unavailable if headers are missing is to give a diagnostic slightly earlier (rather than waiting until we actually try to build the module / load and validate its .pcm file). Differential Revision: https://reviews.llvm.org/D33703 llvm-svn: 304515
2017-06-02 09:55:39 +08:00
CI.getLangOpts(), FileMgr, CI.getDiagnostics(),
CI.getPreprocessor().getHeaderSearchInfo().getModuleMap(), M,
HeaderContents);
if (Err) {
CI.getDiagnostics().Report(diag::err_module_cannot_create_includes)
<< M->getFullModuleName() << Err.message();
return nullptr;
}
return llvm::MemoryBuffer::getMemBufferCopy(
HeaderContents, Module::getModuleInputBufferName());
}
bool FrontendAction::BeginSourceFile(CompilerInstance &CI,
const FrontendInputFile &RealInput) {
FrontendInputFile Input(RealInput);
assert(!Instance && "Already processing a source file!");
assert(!Input.isEmpty() && "Unexpected empty filename!");
setCurrentInput(Input);
setCompilerInstance(&CI);
bool HasBegunSourceFile = false;
bool ReplayASTFile = Input.getKind().getFormat() == InputKind::Precompiled &&
usesPreprocessorOnly();
// If we fail, reset state since the client will not end up calling the
// matching EndSourceFile(). All paths that return true should release this.
auto FailureCleanup = llvm::make_scope_exit([&]() {
if (HasBegunSourceFile)
CI.getDiagnosticClient().EndSourceFile();
CI.clearOutputFiles(/*EraseFiles=*/true);
CI.getLangOpts().setCompilingModule(LangOptions::CMK_None);
setCurrentInput(FrontendInputFile());
setCompilerInstance(nullptr);
});
if (!BeginInvocation(CI))
return false;
// If we're replaying the build of an AST file, import it and set up
// the initial state from its build.
if (ReplayASTFile) {
IntrusiveRefCntPtr<DiagnosticsEngine> Diags(&CI.getDiagnostics());
// The AST unit populates its own diagnostics engine rather than ours.
IntrusiveRefCntPtr<DiagnosticsEngine> ASTDiags(
new DiagnosticsEngine(Diags->getDiagnosticIDs(),
&Diags->getDiagnosticOptions()));
ASTDiags->setClient(Diags->getClient(), /*OwnsClient*/false);
// FIXME: What if the input is a memory buffer?
StringRef InputFile = Input.getFile();
std::unique_ptr<ASTUnit> AST = ASTUnit::LoadFromASTFile(
std::string(InputFile), CI.getPCHContainerReader(),
ASTUnit::LoadPreprocessorOnly, ASTDiags, CI.getFileSystemOpts(),
CI.getCodeGenOpts().DebugTypeExtRefs);
if (!AST)
return false;
// Options relating to how we treat the input (but not what we do with it)
// are inherited from the AST unit.
CI.getHeaderSearchOpts() = AST->getHeaderSearchOpts();
CI.getPreprocessorOpts() = AST->getPreprocessorOpts();
CI.getLangOpts() = AST->getLangOpts();
// Set the shared objects, these are reset when we finish processing the
// file, otherwise the CompilerInstance will happily destroy them.
CI.setFileManager(&AST->getFileManager());
CI.createSourceManager(CI.getFileManager());
CI.getSourceManager().initializeForReplay(AST->getSourceManager());
// Preload all the module files loaded transitively by the AST unit. Also
// load all module map files that were parsed as part of building the AST
// unit.
if (auto ASTReader = AST->getASTReader()) {
auto &MM = ASTReader->getModuleManager();
auto &PrimaryModule = MM.getPrimaryModule();
for (serialization::ModuleFile &MF : MM)
if (&MF != &PrimaryModule)
CI.getFrontendOpts().ModuleFiles.push_back(MF.FileName);
ASTReader->visitTopLevelModuleMaps(
PrimaryModule, [&](const FileEntry *FE) {
CI.getFrontendOpts().ModuleMapFiles.push_back(
std::string(FE->getName()));
});
}
// Set up the input file for replay purposes.
auto Kind = AST->getInputKind();
if (Kind.getFormat() == InputKind::ModuleMap) {
Module *ASTModule =
AST->getPreprocessor().getHeaderSearchInfo().lookupModule(
AST->getLangOpts().CurrentModule, SourceLocation(),
/*AllowSearch*/ false);
assert(ASTModule && "module file does not define its own module");
Input = FrontendInputFile(ASTModule->PresumedModuleMapFile, Kind);
} else {
auto &OldSM = AST->getSourceManager();
FileID ID = OldSM.getMainFileID();
if (auto *File = OldSM.getFileEntryForID(ID))
Input = FrontendInputFile(File->getName(), Kind);
else
Input = FrontendInputFile(OldSM.getBufferOrFake(ID), Kind);
}
setCurrentInput(Input, std::move(AST));
}
// AST files follow a very different path, since they share objects via the
// AST unit.
if (Input.getKind().getFormat() == InputKind::Precompiled) {
assert(!usesPreprocessorOnly() && "this case was handled above");
assert(hasASTFileSupport() &&
"This action does not have AST file support!");
IntrusiveRefCntPtr<DiagnosticsEngine> Diags(&CI.getDiagnostics());
// FIXME: What if the input is a memory buffer?
StringRef InputFile = Input.getFile();
std::unique_ptr<ASTUnit> AST = ASTUnit::LoadFromASTFile(
std::string(InputFile), CI.getPCHContainerReader(),
ASTUnit::LoadEverything, Diags, CI.getFileSystemOpts(),
CI.getCodeGenOpts().DebugTypeExtRefs);
if (!AST)
return false;
// Inform the diagnostic client we are processing a source file.
CI.getDiagnosticClient().BeginSourceFile(CI.getLangOpts(), nullptr);
HasBegunSourceFile = true;
// Set the shared objects, these are reset when we finish processing the
// file, otherwise the CompilerInstance will happily destroy them.
CI.setFileManager(&AST->getFileManager());
CI.setSourceManager(&AST->getSourceManager());
CI.setPreprocessor(AST->getPreprocessorPtr());
Preprocessor &PP = CI.getPreprocessor();
PP.getBuiltinInfo().initializeBuiltins(PP.getIdentifierTable(),
PP.getLangOpts());
CI.setASTContext(&AST->getASTContext());
setCurrentInput(Input, std::move(AST));
// Initialize the action.
if (!BeginSourceFileAction(CI))
return false;
// Create the AST consumer.
CI.setASTConsumer(CreateWrappedASTConsumer(CI, InputFile));
if (!CI.hasASTConsumer())
return false;
FailureCleanup.release();
return true;
}
// Set up the file and source managers, if needed.
if (!CI.hasFileManager()) {
if (!CI.createFileManager()) {
return false;
}
}
if (!CI.hasSourceManager())
CI.createSourceManager(CI.getFileManager());
// Set up embedding for any specified files. Do this before we load any
// source files, including the primary module map for the compilation.
for (const auto &F : CI.getFrontendOpts().ModulesEmbedFiles) {
if (auto FE = CI.getFileManager().getFile(F, /*openFile*/true))
CI.getSourceManager().setFileIsTransient(*FE);
else
CI.getDiagnostics().Report(diag::err_modules_embed_file_not_found) << F;
}
if (CI.getFrontendOpts().ModulesEmbedAllFiles)
CI.getSourceManager().setAllFilesAreTransient(true);
// IR files bypass the rest of initialization.
if (Input.getKind().getLanguage() == Language::LLVM_IR) {
assert(hasIRSupport() &&
"This action does not have IR file support!");
// Inform the diagnostic client we are processing a source file.
CI.getDiagnosticClient().BeginSourceFile(CI.getLangOpts(), nullptr);
HasBegunSourceFile = true;
// Initialize the action.
if (!BeginSourceFileAction(CI))
return false;
// Initialize the main file entry.
if (!CI.InitializeSourceManager(CurrentInput))
return false;
FailureCleanup.release();
return true;
}
// If the implicit PCH include is actually a directory, rather than
// a single file, search for a suitable PCH file in that directory.
if (!CI.getPreprocessorOpts().ImplicitPCHInclude.empty()) {
FileManager &FileMgr = CI.getFileManager();
PreprocessorOptions &PPOpts = CI.getPreprocessorOpts();
StringRef PCHInclude = PPOpts.ImplicitPCHInclude;
std::string SpecificModuleCachePath = CI.getSpecificModuleCachePath();
if (auto PCHDir = FileMgr.getOptionalDirectoryRef(PCHInclude)) {
std::error_code EC;
SmallString<128> DirNative;
llvm::sys::path::native(PCHDir->getName(), DirNative);
bool Found = false;
llvm::vfs::FileSystem &FS = FileMgr.getVirtualFileSystem();
for (llvm::vfs::directory_iterator Dir = FS.dir_begin(DirNative, EC),
DirEnd;
Dir != DirEnd && !EC; Dir.increment(EC)) {
// Check whether this is an acceptable AST file.
if (ASTReader::isAcceptableASTFile(
Dir->path(), FileMgr, CI.getPCHContainerReader(),
CI.getLangOpts(), CI.getTargetOpts(), CI.getPreprocessorOpts(),
SpecificModuleCachePath)) {
PPOpts.ImplicitPCHInclude = std::string(Dir->path());
Found = true;
break;
}
}
if (!Found) {
CI.getDiagnostics().Report(diag::err_fe_no_pch_in_dir) << PCHInclude;
return false;
}
}
}
Add support for the static analyzer to synthesize function implementations from external model files. Currently the analyzer lazily models some functions using 'BodyFarm', which constructs a fake function implementation that the analyzer can simulate that approximates the semantics of the function when it is called. BodyFarm does this by constructing the AST for such definitions on-the-fly. One strength of BodyFarm is that all symbols and types referenced by synthesized function bodies are contextual adapted to the containing translation unit. The downside is that these ASTs are hardcoded in Clang's own source code. A more scalable model is to allow these models to be defined as source code in separate "model" files and have the analyzer use those definitions lazily when a function body is needed. Among other things, it will allow more customization of the analyzer for specific APIs and platforms. This patch provides the initial infrastructure for this feature. It extends BodyFarm to use an abstract API 'CodeInjector' that can be used to synthesize function bodies. That 'CodeInjector' is implemented using a new 'ModelInjector' in libFrontend, which lazily parses a model file and injects the ASTs into the current translation unit. Models are currently found by specifying a 'model-path' as an analyzer option; if no path is specified the CodeInjector is not used, thus defaulting to the current behavior in the analyzer. Models currently contain a single function definition, and can be found by finding the file <function name>.model. This is an initial starting point for something more rich, but it bootstraps this feature for future evolution. This patch was contributed by Gábor Horváth as part of his Google Summer of Code project. Some notes: - This introduces the notion of a "model file" into FrontendAction and the Preprocessor. This nomenclature is specific to the static analyzer, but possibly could be generalized. Essentially these are sources pulled in exogenously from the principal translation. Preprocessor gets a 'InitializeForModelFile' and 'FinalizeForModelFile' which could possibly be hoisted out of Preprocessor if Preprocessor exposed a new API to change the PragmaHandlers and some other internal pieces. This can be revisited. FrontendAction gets a 'isModelParsingAction()' predicate function used to allow a new FrontendAction to recycle the Preprocessor and ASTContext. This name could probably be made something more general (i.e., not tied to 'model files') at the expense of losing the intent of why it exists. This can be revisited. - This is a moderate sized patch; it has gone through some amount of offline code review. Most of the changes to the non-analyzer parts are fairly small, and would make little sense without the analyzer changes. - Most of the analyzer changes are plumbing, with the interesting behavior being introduced by ModelInjector.cpp and ModelConsumer.cpp. - The new functionality introduced by this change is off-by-default. It requires an analyzer config option to enable. llvm-svn: 216550
2014-08-27 23:14:15 +08:00
// Set up the preprocessor if needed. When parsing model files the
// preprocessor of the original source is reused.
if (!isModelParsingAction())
CI.createPreprocessor(getTranslationUnitKind());
// Inform the diagnostic client we are processing a source file.
CI.getDiagnosticClient().BeginSourceFile(CI.getLangOpts(),
&CI.getPreprocessor());
HasBegunSourceFile = true;
// Handle C++20 header units.
// Here, the user has the option to specify that the header name should be
// looked up in the pre-processor search paths (and the main filename as
// passed by the driver might therefore be incomplete until that look-up).
if (CI.getLangOpts().CPlusPlusModules && Input.getKind().isHeaderUnit() &&
!Input.getKind().isPreprocessed()) {
StringRef FileName = Input.getFile();
InputKind Kind = Input.getKind();
if (Kind.getHeaderUnitKind() != InputKind::HeaderUnit_Abs) {
assert(CI.hasPreprocessor() &&
"trying to build a header unit without a Pre-processor?");
HeaderSearch &HS = CI.getPreprocessor().getHeaderSearchInfo();
// Relative searches begin from CWD.
const DirectoryEntry *Dir = nullptr;
if (auto DirOrErr = CI.getFileManager().getDirectory("."))
Dir = *DirOrErr;
SmallVector<std::pair<const FileEntry *, const DirectoryEntry *>, 1> CWD;
CWD.push_back({nullptr, Dir});
Optional<FileEntryRef> FE =
HS.LookupFile(FileName, SourceLocation(),
/*Angled*/ Input.getKind().getHeaderUnitKind() ==
InputKind::HeaderUnit_System,
nullptr, nullptr, CWD, nullptr, nullptr, nullptr,
nullptr, nullptr, nullptr);
if (!FE) {
CI.getDiagnostics().Report(diag::err_module_header_file_not_found)
<< FileName;
return false;
}
// We now have the filename...
FileName = FE->getFileEntry().getName();
// ... still a header unit, but now use the path as written.
Kind = Input.getKind().withHeaderUnit(InputKind::HeaderUnit_Abs);
Input = FrontendInputFile(FileName, Kind, Input.isSystem());
}
// Unless the user has overridden the name, the header unit module name is
// the pathname for the file.
if (CI.getLangOpts().ModuleName.empty())
CI.getLangOpts().ModuleName = std::string(FileName);
CI.getLangOpts().CurrentModule = CI.getLangOpts().ModuleName;
}
if (!CI.InitializeSourceManager(Input))
return false;
if (CI.getLangOpts().CPlusPlusModules && Input.getKind().isHeaderUnit() &&
Input.getKind().isPreprocessed() && !usesPreprocessorOnly()) {
// We have an input filename like foo.iih, but we want to find the right
// module name (and original file, to build the map entry).
// Check if the first line specifies the original source file name with a
// linemarker.
std::string PresumedInputFile = std::string(getCurrentFileOrBufferName());
ReadOriginalFileName(CI, PresumedInputFile);
// Unless the user overrides this, the module name is the name by which the
// original file was known.
if (CI.getLangOpts().ModuleName.empty())
CI.getLangOpts().ModuleName = std::string(PresumedInputFile);
CI.getLangOpts().CurrentModule = CI.getLangOpts().ModuleName;
}
// For module map files, we first parse the module map and synthesize a
// "<module-includes>" buffer before more conventional processing.
if (Input.getKind().getFormat() == InputKind::ModuleMap) {
CI.getLangOpts().setCompilingModule(LangOptions::CMK_ModuleMap);
std::string PresumedModuleMapFile;
unsigned OffsetToContents;
if (loadModuleMapForModuleBuild(CI, Input.isSystem(),
Input.isPreprocessed(),
PresumedModuleMapFile, OffsetToContents))
return false;
auto *CurrentModule = prepareToBuildModule(CI, Input.getFile());
if (!CurrentModule)
return false;
CurrentModule->PresumedModuleMapFile = PresumedModuleMapFile;
if (OffsetToContents)
// If the module contents are in the same file, skip to them.
CI.getPreprocessor().setSkipMainFilePreamble(OffsetToContents, true);
else {
// Otherwise, convert the module description to a suitable input buffer.
auto Buffer = getInputBufferForModule(CI, CurrentModule);
if (!Buffer)
return false;
// Reinitialize the main file entry to refer to the new input.
auto Kind = CurrentModule->IsSystem ? SrcMgr::C_System : SrcMgr::C_User;
auto &SourceMgr = CI.getSourceManager();
auto BufferID = SourceMgr.createFileID(std::move(Buffer), Kind);
2020-04-06 23:49:51 +08:00
assert(BufferID.isValid() && "couldn't create module buffer ID");
SourceMgr.setMainFileID(BufferID);
}
}
// Initialize the action.
if (!BeginSourceFileAction(CI))
return false;
// If we were asked to load any module map files, do so now.
for (const auto &Filename : CI.getFrontendOpts().ModuleMapFiles) {
if (auto File = CI.getFileManager().getFile(Filename))
CI.getPreprocessor().getHeaderSearchInfo().loadModuleMapFile(
*File, /*IsSystem*/false);
else
CI.getDiagnostics().Report(diag::err_module_map_not_found) << Filename;
}
// Add a module declaration scope so that modules from -fmodule-map-file
// arguments may shadow modules found implicitly in search paths.
CI.getPreprocessor()
.getHeaderSearchInfo()
.getModuleMap()
.finishModuleDeclarationScope();
// Create the AST context and consumer unless this is a preprocessor only
// action.
if (!usesPreprocessorOnly()) {
Add support for the static analyzer to synthesize function implementations from external model files. Currently the analyzer lazily models some functions using 'BodyFarm', which constructs a fake function implementation that the analyzer can simulate that approximates the semantics of the function when it is called. BodyFarm does this by constructing the AST for such definitions on-the-fly. One strength of BodyFarm is that all symbols and types referenced by synthesized function bodies are contextual adapted to the containing translation unit. The downside is that these ASTs are hardcoded in Clang's own source code. A more scalable model is to allow these models to be defined as source code in separate "model" files and have the analyzer use those definitions lazily when a function body is needed. Among other things, it will allow more customization of the analyzer for specific APIs and platforms. This patch provides the initial infrastructure for this feature. It extends BodyFarm to use an abstract API 'CodeInjector' that can be used to synthesize function bodies. That 'CodeInjector' is implemented using a new 'ModelInjector' in libFrontend, which lazily parses a model file and injects the ASTs into the current translation unit. Models are currently found by specifying a 'model-path' as an analyzer option; if no path is specified the CodeInjector is not used, thus defaulting to the current behavior in the analyzer. Models currently contain a single function definition, and can be found by finding the file <function name>.model. This is an initial starting point for something more rich, but it bootstraps this feature for future evolution. This patch was contributed by Gábor Horváth as part of his Google Summer of Code project. Some notes: - This introduces the notion of a "model file" into FrontendAction and the Preprocessor. This nomenclature is specific to the static analyzer, but possibly could be generalized. Essentially these are sources pulled in exogenously from the principal translation. Preprocessor gets a 'InitializeForModelFile' and 'FinalizeForModelFile' which could possibly be hoisted out of Preprocessor if Preprocessor exposed a new API to change the PragmaHandlers and some other internal pieces. This can be revisited. FrontendAction gets a 'isModelParsingAction()' predicate function used to allow a new FrontendAction to recycle the Preprocessor and ASTContext. This name could probably be made something more general (i.e., not tied to 'model files') at the expense of losing the intent of why it exists. This can be revisited. - This is a moderate sized patch; it has gone through some amount of offline code review. Most of the changes to the non-analyzer parts are fairly small, and would make little sense without the analyzer changes. - Most of the analyzer changes are plumbing, with the interesting behavior being introduced by ModelInjector.cpp and ModelConsumer.cpp. - The new functionality introduced by this change is off-by-default. It requires an analyzer config option to enable. llvm-svn: 216550
2014-08-27 23:14:15 +08:00
// Parsing a model file should reuse the existing ASTContext.
if (!isModelParsingAction())
CI.createASTContext();
// For preprocessed files, check if the first line specifies the original
// source file name with a linemarker.
std::string PresumedInputFile = std::string(getCurrentFileOrBufferName());
if (Input.isPreprocessed())
ReadOriginalFileName(CI, PresumedInputFile);
std::unique_ptr<ASTConsumer> Consumer =
CreateWrappedASTConsumer(CI, PresumedInputFile);
if (!Consumer)
return false;
Add support for the static analyzer to synthesize function implementations from external model files. Currently the analyzer lazily models some functions using 'BodyFarm', which constructs a fake function implementation that the analyzer can simulate that approximates the semantics of the function when it is called. BodyFarm does this by constructing the AST for such definitions on-the-fly. One strength of BodyFarm is that all symbols and types referenced by synthesized function bodies are contextual adapted to the containing translation unit. The downside is that these ASTs are hardcoded in Clang's own source code. A more scalable model is to allow these models to be defined as source code in separate "model" files and have the analyzer use those definitions lazily when a function body is needed. Among other things, it will allow more customization of the analyzer for specific APIs and platforms. This patch provides the initial infrastructure for this feature. It extends BodyFarm to use an abstract API 'CodeInjector' that can be used to synthesize function bodies. That 'CodeInjector' is implemented using a new 'ModelInjector' in libFrontend, which lazily parses a model file and injects the ASTs into the current translation unit. Models are currently found by specifying a 'model-path' as an analyzer option; if no path is specified the CodeInjector is not used, thus defaulting to the current behavior in the analyzer. Models currently contain a single function definition, and can be found by finding the file <function name>.model. This is an initial starting point for something more rich, but it bootstraps this feature for future evolution. This patch was contributed by Gábor Horváth as part of his Google Summer of Code project. Some notes: - This introduces the notion of a "model file" into FrontendAction and the Preprocessor. This nomenclature is specific to the static analyzer, but possibly could be generalized. Essentially these are sources pulled in exogenously from the principal translation. Preprocessor gets a 'InitializeForModelFile' and 'FinalizeForModelFile' which could possibly be hoisted out of Preprocessor if Preprocessor exposed a new API to change the PragmaHandlers and some other internal pieces. This can be revisited. FrontendAction gets a 'isModelParsingAction()' predicate function used to allow a new FrontendAction to recycle the Preprocessor and ASTContext. This name could probably be made something more general (i.e., not tied to 'model files') at the expense of losing the intent of why it exists. This can be revisited. - This is a moderate sized patch; it has gone through some amount of offline code review. Most of the changes to the non-analyzer parts are fairly small, and would make little sense without the analyzer changes. - Most of the analyzer changes are plumbing, with the interesting behavior being introduced by ModelInjector.cpp and ModelConsumer.cpp. - The new functionality introduced by this change is off-by-default. It requires an analyzer config option to enable. llvm-svn: 216550
2014-08-27 23:14:15 +08:00
// FIXME: should not overwrite ASTMutationListener when parsing model files?
if (!isModelParsingAction())
CI.getASTContext().setASTMutationListener(Consumer->GetASTMutationListener());
if (!CI.getPreprocessorOpts().ChainedIncludes.empty()) {
// Convert headers to PCH and chain them.
IntrusiveRefCntPtr<ExternalSemaSource> source, FinalReader;
source = createChainedIncludesSource(CI, FinalReader);
if (!source)
return false;
CI.setASTReader(static_cast<ASTReader *>(FinalReader.get()));
CI.getASTContext().setExternalSource(source);
} else if (CI.getLangOpts().Modules ||
!CI.getPreprocessorOpts().ImplicitPCHInclude.empty()) {
// Use PCM or PCH.
assert(hasPCHSupport() && "This action does not have PCH support!");
ASTDeserializationListener *DeserialListener =
Consumer->GetASTDeserializationListener();
bool DeleteDeserialListener = false;
if (CI.getPreprocessorOpts().DumpDeserializedPCHDecls) {
DeserialListener = new DeserializedDeclsDumper(DeserialListener,
DeleteDeserialListener);
DeleteDeserialListener = true;
}
if (!CI.getPreprocessorOpts().DeserializedPCHDeclsToErrorOn.empty()) {
DeserialListener = new DeserializedDeclsChecker(
CI.getASTContext(),
CI.getPreprocessorOpts().DeserializedPCHDeclsToErrorOn,
DeserialListener, DeleteDeserialListener);
DeleteDeserialListener = true;
}
if (!CI.getPreprocessorOpts().ImplicitPCHInclude.empty()) {
CI.createPCHExternalASTSource(
CI.getPreprocessorOpts().ImplicitPCHInclude,
CI.getPreprocessorOpts().DisablePCHOrModuleValidation,
CI.getPreprocessorOpts().AllowPCHWithCompilerErrors,
DeserialListener, DeleteDeserialListener);
if (!CI.getASTContext().getExternalSource())
return false;
}
// If modules are enabled, create the AST reader before creating
// any builtins, so that all declarations know that they might be
// extended by an external source.
if (CI.getLangOpts().Modules || !CI.hasASTContext() ||
!CI.getASTContext().getExternalSource()) {
CI.createASTReader();
CI.getASTReader()->setDeserializationListener(DeserialListener,
DeleteDeserialListener);
}
}
CI.setASTConsumer(std::move(Consumer));
if (!CI.hasASTConsumer())
return false;
}
// Initialize built-in info as long as we aren't using an external AST
// source.
if (CI.getLangOpts().Modules || !CI.hasASTContext() ||
!CI.getASTContext().getExternalSource()) {
Preprocessor &PP = CI.getPreprocessor();
PP.getBuiltinInfo().initializeBuiltins(PP.getIdentifierTable(),
PP.getLangOpts());
If a declaration is loaded, and then a module import adds a redeclaration, then ensure that querying the first declaration for its most recent declaration checks for redeclarations from the imported module. This works as follows: * The 'most recent' pointer on a canonical declaration grows a pointer to the external AST source and a generation number (space- and time-optimized for the case where there is no external source). * Each time the 'most recent' pointer is queried, if it has an external source, we check whether it's up to date, and update it if not. * The ancillary data stored on the canonical declaration is allocated lazily to avoid filling it in for declarations that end up being non-canonical. We'll still perform a redundant (ASTContext) allocation if someone asks for the most recent declaration from a decl before setPreviousDecl is called, but such cases are probably all bugs, and are now easy to find. Some finessing is still in order here -- in particular, we use a very general mechanism for handling the DefinitionData pointer on CXXRecordData, and a more targeted approach would be more compact. Also, the MayHaveOutOfDateDef mechanism should now be expunged, since it was addressing only a corner of the full problem space here. That's not covered by this patch. Early performance benchmarks show that this makes no measurable difference to Clang performance without modules enabled (and fixes a major correctness issue with modules enabled). I'll revert if a full performance comparison shows any problems. llvm-svn: 209046
2014-05-17 07:01:30 +08:00
} else {
// FIXME: If this is a problem, recover from it by creating a multiplex
// source.
assert((!CI.getLangOpts().Modules || CI.getASTReader()) &&
If a declaration is loaded, and then a module import adds a redeclaration, then ensure that querying the first declaration for its most recent declaration checks for redeclarations from the imported module. This works as follows: * The 'most recent' pointer on a canonical declaration grows a pointer to the external AST source and a generation number (space- and time-optimized for the case where there is no external source). * Each time the 'most recent' pointer is queried, if it has an external source, we check whether it's up to date, and update it if not. * The ancillary data stored on the canonical declaration is allocated lazily to avoid filling it in for declarations that end up being non-canonical. We'll still perform a redundant (ASTContext) allocation if someone asks for the most recent declaration from a decl before setPreviousDecl is called, but such cases are probably all bugs, and are now easy to find. Some finessing is still in order here -- in particular, we use a very general mechanism for handling the DefinitionData pointer on CXXRecordData, and a more targeted approach would be more compact. Also, the MayHaveOutOfDateDef mechanism should now be expunged, since it was addressing only a corner of the full problem space here. That's not covered by this patch. Early performance benchmarks show that this makes no measurable difference to Clang performance without modules enabled (and fixes a major correctness issue with modules enabled). I'll revert if a full performance comparison shows any problems. llvm-svn: 209046
2014-05-17 07:01:30 +08:00
"modules enabled but created an external source that "
"doesn't support modules");
}
// If we were asked to load any module files, do so now.
for (const auto &ModuleFile : CI.getFrontendOpts().ModuleFiles)
if (!CI.loadModuleFile(ModuleFile))
return false;
Extend the ExternalASTSource interface to allow the AST source to provide the layout of records, rather than letting Clang compute the layout itself. LLDB provides the motivation for this feature: because various layout-altering attributes (packed, aligned, etc.) don't get reliably get placed into DWARF, the record layouts computed by LLDB from the reconstructed records differ from the actual layouts, and badness occurs. This interface lets the DWARF data drive layout, so we don't need the attributes preserved to get the answer write. The testing methodology for this change is fun. I've introduced a variant of -fdump-record-layouts called -fdump-record-layouts-simple that always has the simple C format and provides size/alignment/field offsets. There is also a -cc1 option -foverride-record-layout=<file> to take the output of -fdump-record-layouts-simple and parse it to produce a set of overridden layouts, which is introduced into the AST via a testing-only ExternalASTSource (called LayoutOverrideSource). Each test contains a number of records to lay out, which use various layout-changing attributes, and then dumps the layouts. We then run the test again, using the preprocessor to eliminate the layout-changing attributes entirely (which would give us different layouts for the records), but supplying the previously-computed record layouts. Finally, we diff the layouts produced from the two runs to be sure that they are identical. Note that this code makes the assumption that we don't *have* to provide the offsets of bases or virtual bases to get the layout right, because the alignment attributes don't affect it. I believe this assumption holds, but if it does not, we can extend LayoutOverrideSource to also provide base offset information. Fixes the Clang side of <rdar://problem/10169539>. llvm-svn: 149055
2012-01-26 15:55:45 +08:00
// If there is a layout overrides file, attach an external AST source that
// provides the layouts from that file.
if (!CI.getFrontendOpts().OverrideRecordLayoutsFile.empty() &&
Extend the ExternalASTSource interface to allow the AST source to provide the layout of records, rather than letting Clang compute the layout itself. LLDB provides the motivation for this feature: because various layout-altering attributes (packed, aligned, etc.) don't get reliably get placed into DWARF, the record layouts computed by LLDB from the reconstructed records differ from the actual layouts, and badness occurs. This interface lets the DWARF data drive layout, so we don't need the attributes preserved to get the answer write. The testing methodology for this change is fun. I've introduced a variant of -fdump-record-layouts called -fdump-record-layouts-simple that always has the simple C format and provides size/alignment/field offsets. There is also a -cc1 option -foverride-record-layout=<file> to take the output of -fdump-record-layouts-simple and parse it to produce a set of overridden layouts, which is introduced into the AST via a testing-only ExternalASTSource (called LayoutOverrideSource). Each test contains a number of records to lay out, which use various layout-changing attributes, and then dumps the layouts. We then run the test again, using the preprocessor to eliminate the layout-changing attributes entirely (which would give us different layouts for the records), but supplying the previously-computed record layouts. Finally, we diff the layouts produced from the two runs to be sure that they are identical. Note that this code makes the assumption that we don't *have* to provide the offsets of bases or virtual bases to get the layout right, because the alignment attributes don't affect it. I believe this assumption holds, but if it does not, we can extend LayoutOverrideSource to also provide base offset information. Fixes the Clang side of <rdar://problem/10169539>. llvm-svn: 149055
2012-01-26 15:55:45 +08:00
CI.hasASTContext() && !CI.getASTContext().getExternalSource()) {
IntrusiveRefCntPtr<ExternalASTSource>
Extend the ExternalASTSource interface to allow the AST source to provide the layout of records, rather than letting Clang compute the layout itself. LLDB provides the motivation for this feature: because various layout-altering attributes (packed, aligned, etc.) don't get reliably get placed into DWARF, the record layouts computed by LLDB from the reconstructed records differ from the actual layouts, and badness occurs. This interface lets the DWARF data drive layout, so we don't need the attributes preserved to get the answer write. The testing methodology for this change is fun. I've introduced a variant of -fdump-record-layouts called -fdump-record-layouts-simple that always has the simple C format and provides size/alignment/field offsets. There is also a -cc1 option -foverride-record-layout=<file> to take the output of -fdump-record-layouts-simple and parse it to produce a set of overridden layouts, which is introduced into the AST via a testing-only ExternalASTSource (called LayoutOverrideSource). Each test contains a number of records to lay out, which use various layout-changing attributes, and then dumps the layouts. We then run the test again, using the preprocessor to eliminate the layout-changing attributes entirely (which would give us different layouts for the records), but supplying the previously-computed record layouts. Finally, we diff the layouts produced from the two runs to be sure that they are identical. Note that this code makes the assumption that we don't *have* to provide the offsets of bases or virtual bases to get the layout right, because the alignment attributes don't affect it. I believe this assumption holds, but if it does not, we can extend LayoutOverrideSource to also provide base offset information. Fixes the Clang side of <rdar://problem/10169539>. llvm-svn: 149055
2012-01-26 15:55:45 +08:00
Override(new LayoutOverrideSource(
CI.getFrontendOpts().OverrideRecordLayoutsFile));
CI.getASTContext().setExternalSource(Override);
}
If a declaration is loaded, and then a module import adds a redeclaration, then ensure that querying the first declaration for its most recent declaration checks for redeclarations from the imported module. This works as follows: * The 'most recent' pointer on a canonical declaration grows a pointer to the external AST source and a generation number (space- and time-optimized for the case where there is no external source). * Each time the 'most recent' pointer is queried, if it has an external source, we check whether it's up to date, and update it if not. * The ancillary data stored on the canonical declaration is allocated lazily to avoid filling it in for declarations that end up being non-canonical. We'll still perform a redundant (ASTContext) allocation if someone asks for the most recent declaration from a decl before setPreviousDecl is called, but such cases are probably all bugs, and are now easy to find. Some finessing is still in order here -- in particular, we use a very general mechanism for handling the DefinitionData pointer on CXXRecordData, and a more targeted approach would be more compact. Also, the MayHaveOutOfDateDef mechanism should now be expunged, since it was addressing only a corner of the full problem space here. That's not covered by this patch. Early performance benchmarks show that this makes no measurable difference to Clang performance without modules enabled (and fixes a major correctness issue with modules enabled). I'll revert if a full performance comparison shows any problems. llvm-svn: 209046
2014-05-17 07:01:30 +08:00
FailureCleanup.release();
return true;
}
llvm::Error FrontendAction::Execute() {
CompilerInstance &CI = getCompilerInstance();
if (CI.hasFrontendTimer()) {
llvm::TimeRegion Timer(CI.getFrontendTimer());
ExecuteAction();
}
else ExecuteAction();
// If we are supposed to rebuild the global module index, do so now unless
// there were any module-build failures.
if (CI.shouldBuildGlobalModuleIndex() && CI.hasFileManager() &&
CI.hasPreprocessor()) {
StringRef Cache =
CI.getPreprocessor().getHeaderSearchInfo().getModuleCachePath();
if (!Cache.empty()) {
if (llvm::Error Err = GlobalModuleIndex::writeIndex(
CI.getFileManager(), CI.getPCHContainerReader(), Cache)) {
// FIXME this drops the error on the floor, but
// Index/pch-from-libclang.c seems to rely on dropping at least some of
// the error conditions!
consumeError(std::move(Err));
}
}
}
return llvm::Error::success();
}
void FrontendAction::EndSourceFile() {
CompilerInstance &CI = getCompilerInstance();
// Inform the diagnostic client we are done with this source file.
CI.getDiagnosticClient().EndSourceFile();
// Inform the preprocessor we are done.
if (CI.hasPreprocessor())
CI.getPreprocessor().EndSourceFile();
// Finalize the action.
EndSourceFileAction();
// Sema references the ast consumer, so reset sema first.
//
// FIXME: There is more per-file stuff we could just drop here?
bool DisableFree = CI.getFrontendOpts().DisableFree;
if (DisableFree) {
CI.resetAndLeakSema();
CI.resetAndLeakASTContext();
llvm::BuryPointer(CI.takeASTConsumer().get());
} else {
CI.setSema(nullptr);
CI.setASTContext(nullptr);
CI.setASTConsumer(nullptr);
}
if (CI.getFrontendOpts().ShowStats) {
llvm::errs() << "\nSTATISTICS FOR '" << getCurrentFile() << "':\n";
CI.getPreprocessor().PrintStats();
CI.getPreprocessor().getIdentifierTable().PrintStats();
CI.getPreprocessor().getHeaderSearchInfo().PrintStats();
CI.getSourceManager().PrintStats();
llvm::errs() << "\n";
}
// Cleanup the output streams, and erase the output files if instructed by the
// FrontendAction.
CI.clearOutputFiles(/*EraseFiles=*/shouldEraseOutputFiles());
if (isCurrentFileAST()) {
if (DisableFree) {
CI.resetAndLeakPreprocessor();
CI.resetAndLeakSourceManager();
CI.resetAndLeakFileManager();
llvm::BuryPointer(std::move(CurrentASTUnit));
} else {
CI.setPreprocessor(nullptr);
CI.setSourceManager(nullptr);
CI.setFileManager(nullptr);
}
}
setCompilerInstance(nullptr);
setCurrentInput(FrontendInputFile());
CI.getLangOpts().setCompilingModule(LangOptions::CMK_None);
}
bool FrontendAction::shouldEraseOutputFiles() {
return getCompilerInstance().getDiagnostics().hasErrorOccurred();
}
//===----------------------------------------------------------------------===//
// Utility Actions
//===----------------------------------------------------------------------===//
void ASTFrontendAction::ExecuteAction() {
CompilerInstance &CI = getCompilerInstance();
if (!CI.hasPreprocessor())
return;
// FIXME: Move the truncation aspect of this into Sema, we delayed this till
// here so the source manager would be initialized.
if (hasCodeCompletionSupport() &&
!CI.getFrontendOpts().CodeCompletionAt.FileName.empty())
CI.createCodeCompletionConsumer();
// Use a code completion consumer?
CodeCompleteConsumer *CompletionConsumer = nullptr;
if (CI.hasCodeCompletionConsumer())
CompletionConsumer = &CI.getCodeCompletionConsumer();
if (!CI.hasSema())
CI.createSema(getTranslationUnitKind(), CompletionConsumer);
ParseAST(CI.getSema(), CI.getFrontendOpts().ShowStats,
CI.getFrontendOpts().SkipFunctionBodies);
}
void PluginASTAction::anchor() { }
std::unique_ptr<ASTConsumer>
PreprocessorFrontendAction::CreateASTConsumer(CompilerInstance &CI,
StringRef InFile) {
llvm_unreachable("Invalid CreateASTConsumer on preprocessor action!");
}
bool WrapperFrontendAction::PrepareToExecuteAction(CompilerInstance &CI) {
return WrappedAction->PrepareToExecuteAction(CI);
}
std::unique_ptr<ASTConsumer>
WrapperFrontendAction::CreateASTConsumer(CompilerInstance &CI,
StringRef InFile) {
return WrappedAction->CreateASTConsumer(CI, InFile);
}
bool WrapperFrontendAction::BeginInvocation(CompilerInstance &CI) {
return WrappedAction->BeginInvocation(CI);
}
bool WrapperFrontendAction::BeginSourceFileAction(CompilerInstance &CI) {
WrappedAction->setCurrentInput(getCurrentInput());
WrappedAction->setCompilerInstance(&CI);
auto Ret = WrappedAction->BeginSourceFileAction(CI);
// BeginSourceFileAction may change CurrentInput, e.g. during module builds.
setCurrentInput(WrappedAction->getCurrentInput());
return Ret;
}
void WrapperFrontendAction::ExecuteAction() {
WrappedAction->ExecuteAction();
}
[clang-repl] Recommit "Land initial infrastructure for incremental parsing" Original commit message: In http://lists.llvm.org/pipermail/llvm-dev/2020-July/143257.html we have mentioned our plans to make some of the incremental compilation facilities available in llvm mainline. This patch proposes a minimal version of a repl, clang-repl, which enables interpreter-like interaction for C++. For instance: ./bin/clang-repl clang-repl> int i = 42; clang-repl> extern "C" int printf(const char*,...); clang-repl> auto r1 = printf("i=%d\n", i); i=42 clang-repl> quit The patch allows very limited functionality, for example, it crashes on invalid C++. The design of the proposed patch follows closely the design of cling. The idea is to gather feedback and gradually evolve both clang-repl and cling to what the community agrees upon. The IncrementalParser class is responsible for driving the clang parser and codegen and allows the compiler infrastructure to process more than one input. Every input adds to the “ever-growing” translation unit. That model is enabled by an IncrementalAction which prevents teardown when HandleTranslationUnit. The IncrementalExecutor class hides some of the underlying implementation details of the concrete JIT infrastructure. It exposes the minimal set of functionality required by our incremental compiler/interpreter. The Transaction class keeps track of the AST and the LLVM IR for each incremental input. That tracking information will be later used to implement error recovery. The Interpreter class orchestrates the IncrementalParser and the IncrementalExecutor to model interpreter-like behavior. It provides the public API which can be used (in future) when using the interpreter library. Differential revision: https://reviews.llvm.org/D96033
2021-05-13 13:41:44 +08:00
void WrapperFrontendAction::EndSourceFile() { WrappedAction->EndSourceFile(); }
void WrapperFrontendAction::EndSourceFileAction() {
WrappedAction->EndSourceFileAction();
}
bool WrapperFrontendAction::shouldEraseOutputFiles() {
return WrappedAction->shouldEraseOutputFiles();
}
bool WrapperFrontendAction::usesPreprocessorOnly() const {
return WrappedAction->usesPreprocessorOnly();
}
TranslationUnitKind WrapperFrontendAction::getTranslationUnitKind() {
return WrappedAction->getTranslationUnitKind();
}
bool WrapperFrontendAction::hasPCHSupport() const {
return WrappedAction->hasPCHSupport();
}
bool WrapperFrontendAction::hasASTFileSupport() const {
return WrappedAction->hasASTFileSupport();
}
bool WrapperFrontendAction::hasIRSupport() const {
return WrappedAction->hasIRSupport();
}
bool WrapperFrontendAction::hasCodeCompletionSupport() const {
return WrappedAction->hasCodeCompletionSupport();
}
WrapperFrontendAction::WrapperFrontendAction(
std::unique_ptr<FrontendAction> WrappedAction)
: WrappedAction(std::move(WrappedAction)) {}