Switch builtin strings to use string tables (llvm#118734)
The Clang binary (and any binary linking Clang as a library), when built
using PIE, ends up with a pretty shocking number of dynamic relocations
to apply to the executable image: roughly 400k.

Each of these takes up space in the executable and, perhaps most
interestingly, costs start-up time when the relocations are applied.

The largest pattern I identified was the strings used to describe
target builtins. The addresses of these string literals were stored into
huge arrays, each one requiring a dynamic relocation. The way to avoid
this is to design the target builtins to use a single large table of
strings and offsets within the table for the individual strings. This
switches the builtin management to such a scheme.
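
As a rough illustration of the difference (a minimal sketch, not code from this commit; `NamesByPointer`, `NameTable`, `NameOffsets`, and `nameAt` are hypothetical names), compare an array of string pointers with a single table plus integer offsets:

```cpp
#include <array>
#include <cstddef>
#include <string_view>

// Before: an array of pointers to string literals. In a PIE binary each
// element holds an absolute address, so each element needs its own
// dynamic relocation when the image is loaded.
const char *const NamesByPointer[] = {
    "__builtin_abs",
    "__builtin_memcpy",
    "__builtin_trap",
};

// After: one contiguous table of NUL-terminated strings...
constexpr char NameTable[] =
    "__builtin_abs\0"
    "__builtin_memcpy\0"
    "__builtin_trap\0";

// ...plus plain integer offsets into it. Integers are position-independent,
// so this array needs no relocations at all.
constexpr std::array<int, 3> NameOffsets = {0, 14, 31};

// Recover a string by indexing into the table at the stored offset.
constexpr std::string_view nameAt(std::size_t I) {
  return std::string_view(&NameTable[NameOffsets[I]]);
}
```

The trade-off is that every lookup now needs the table base as well as the offset, which is why the accessors in the diff below take a string table alongside each `Info`.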

This saves over 100k dynamic relocations by my measurement, an over 25%
reduction. Just looking at byte size improvements, using the `bloaty`
tool to compare a newly built `clang` binary to an old one:

```
    FILE SIZE        VM SIZE
 --------------  --------------
  +1.4%  +653Ki  +1.4%  +653Ki    .rodata
  +0.0%    +960  +0.0%    +960    .text
  +0.0%    +197  +0.0%    +197    .dynstr
  +0.0%    +184  +0.0%    +184    .eh_frame
  +0.0%     +96  +0.0%     +96    .dynsym
  +0.0%     +40  +0.0%     +40    .eh_frame_hdr
  +114%     +32  [ = ]       0    [Unmapped]
  +0.0%     +20  +0.0%     +20    .gnu.hash
  +0.0%      +8  +0.0%      +8    .gnu.version
  +0.9%      +7  +0.9%      +7    [LOAD #2 [R]]
  [ = ]       0 -75.4% -3.00Ki    .relro_padding
 -16.1%  -802Ki -16.1%  -802Ki    .data.rel.ro
 -27.3% -2.52Mi -27.3% -2.52Mi    .rela.dyn
  -1.6% -2.66Mi  -1.6% -2.66Mi    TOTAL
```

We get a 16% reduction in the `.data.rel.ro` section, and a nearly 30%
reduction in `.rela.dyn` where those relocations are stored.
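
The byte counts can be cross-checked against the relocation count with a little arithmetic. This is only a sketch under stated assumptions: it assumes a 64-bit ELF target, where one `Elf64_Rela` record is 24 bytes and a pointer is 8 bytes, and it uses the rounded figures from the `bloaty` output, so the results are approximate. The constant names are hypothetical.

```cpp
#include <cstdint>

// Assumption: 64-bit ELF. One Elf64_Rela record has three 8-byte fields
// (r_offset, r_info, r_addend), and a pointer occupies 8 bytes.
constexpr std::int64_t RelaEntryBytes = 24;
constexpr std::int64_t PointerBytes = 8;

// Rounded savings reported by bloaty, converted to bytes.
constexpr std::int64_t RelaDynSaved = 252 * 1024 * 1024 / 100; // 2.52 MiB
constexpr std::int64_t DataRelRoSaved = 802 * 1024;            // 802 KiB

// Roughly how many relocation records and pointer slots went away.
constexpr std::int64_t RelocsRemoved = RelaDynSaved / RelaEntryBytes;
constexpr std::int64_t PointersRemoved = DataRelRoSaved / PointerBytes;

static_assert(RelocsRemoved > 100000, "about 110k relocation records");
static_assert(PointersRemoved > 100000, "about 103k pointer-sized slots");
```

Both estimates land a little above 100k, which is consistent with the relocation savings quoted above.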

This is also visible, at least in my benchmarking of binary start-up
overhead:

```
Benchmark 1: ./old_clang --version
  Time (mean ± σ):      17.6 ms ±   1.5 ms    [User: 4.1 ms, System: 13.3 ms]
  Range (min … max):    14.2 ms …  22.8 ms    162 runs

Benchmark 2: ./new_clang --version
  Time (mean ± σ):      15.5 ms ±   1.4 ms    [User: 3.6 ms, System: 11.8 ms]
  Range (min … max):    12.4 ms …  20.3 ms    216 runs

Summary
  './new_clang --version' ran
    1.13 ± 0.14 times faster than './old_clang --version'
```

We get about 2ms faster `--version` runs. While there is a lot of noise
in binary execution time, this delta is pretty consistent, and
represents over 10% improvement. This is particularly interesting to me
because for very short source files, repeatedly starting the `clang`
binary is actually the dominant cost. For example, `configure` scripts
running against the `clang` compiler are slow in large part because of
binary start up time, not the time to process the actual inputs to the
compiler.

----

This PR implements the string tables using `constexpr` code and the
existing macro system. I understand that the builtins are moving towards
a TableGen model, and if complete that would provide more options for
modeling this. Unfortunately, that migration isn't complete, and even
the parts that are migrated still rely on the ability to break out of
the TableGen model and directly expand an X-macro style `BUILTIN(...)`
textually. I looked at trying to complete the move to TableGen, but it
would require both the difficult migration of the remaining targets and
solving some tricky problems with how to move away from any macro-based
expansion.

I was also able to find a reasonably clean and effective way of doing
this with the existing macros and some `constexpr` code that I think is
clean enough to be a pretty good intermediate state, and maybe give a
good target for the eventual TableGen solution. I was also able to
factor the macros into a set of consistent patterns that avoid a
significant regression in overall boilerplate.
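
The general shape of the technique can be sketched in a few lines. This is a simplified stand-in for the real code in `Builtins.h` shown in the diff below, assuming C++17; every `Demo`-prefixed name is hypothetical, and the two entries are illustrative rather than copied from `Builtins.def`. The same X-macro list is expanded twice: once to concatenate the strings into one table, and once to record each string's size, which a `constexpr` function then turns into running offsets.

```cpp
#include <array>
#include <cstddef>
#include <string_view>

// Hypothetical stand-in for a `.def` file: each entry is (ID, TYPE, ATTRS).
#define DEMO_BUILTINS(X)                                                       \
  X(__builtin_abs, "ii", "ncF")                                                \
  X(__builtin_trap, "v", "nr")

struct DemoInfo {
  int Name, Type, Attrs; // string sizes on input, table offsets on output
};

template <std::size_t N> struct DemoStorage {
  const char *StringTable;
  std::array<DemoInfo, N> Infos;
};

// Replace each size with the sum of all preceding sizes, i.e. its offset.
template <std::size_t N>
constexpr DemoStorage<N> makeDemo(const char *Strings,
                                  std::array<DemoInfo, N> Infos) {
  int Offset = 0;
  for (auto &I : Infos) {
    DemoInfo New = {};
    New.Name = Offset;
    Offset += I.Name;
    New.Type = Offset;
    Offset += I.Type;
    New.Attrs = Offset;
    Offset += I.Attrs;
    I = New;
  }
  return {Strings, Infos};
}

// First expansion: concatenate every string, each with its own NUL.
#define DEMO_STR_TABLE(ID, TYPE, ATTRS) #ID "\0" TYPE "\0" ATTRS "\0"
// Second expansion: record each string's size, including the NUL.
#define DEMO_ENTRY(ID, TYPE, ATTRS)                                            \
  DemoInfo{sizeof(#ID), sizeof(TYPE), sizeof(ATTRS)},

constexpr auto Demo =
    makeDemo<2>(DEMO_BUILTINS(DEMO_STR_TABLE), {DEMO_BUILTINS(DEMO_ENTRY)});

// Look up a builtin's name through its offset into the shared table.
constexpr std::string_view demoName(std::size_t I) {
  return std::string_view(Demo.StringTable + Demo.Infos[I].Name);
}

static_assert(demoName(1) == "__builtin_trap");
```

Because `sizeof` on a string literal counts the terminating NUL, the sizes line up exactly with the `"\0"` separators in the table, and no offset has to be written by hand.
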
chandlerc authored Dec 9, 2024
1 parent f6c51ea commit be2df95
Showing 48 changed files with 606 additions and 309 deletions.
205 changes: 169 additions & 36 deletions clang/include/clang/Basic/Builtins.h
@@ -55,6 +55,7 @@ struct HeaderDesc {
#undef HEADER
} ID;

constexpr HeaderDesc() : ID() {}
constexpr HeaderDesc(HeaderID ID) : ID(ID) {}

const char *getName() const;
@@ -68,23 +69,152 @@ enum ID {
FirstTSBuiltin
};

// The info used to represent each builtin.
struct Info {
llvm::StringLiteral Name;
const char *Type, *Attributes;
const char *Features;
// Rather than store pointers to the string literals describing these four
// aspects of builtins, we store offsets into a common string table.
struct StrOffsets {
int Name;
int Type;
int Attributes;
int Features;
} Offsets;

HeaderDesc Header;
LanguageID Langs;
};

// The storage for `N` builtins. This contains a single pointer to the string
// table used for these builtins and an array of metadata for each builtin.
template <size_t N> struct Storage {
const char *StringTable;

std::array<Info, N> Infos;

// A constexpr function to construct the storage for a given string table in
// the first argument and an array in the second argument. This is *only*
// expected to be used at compile time, we should mark it `consteval` when
// available.
//
// The `Infos` array is particularly special. This function expects an array
// of `Info` structs, where the string offsets of each entry refer to the
// *sizes* of those strings rather than their offsets, and for the target
// string to be in the provided string table at an offset the sum of all
// previous string sizes. This function walks the `Infos` array computing the
// running sum and replacing the sizes with the actual offsets in the string
// table that should be used. This arrangement is designed to make it easy to
// expand `.def` and `.inc` files with X-macros to construct both the string
// table and the `Info` structs in the arguments to this function.
static constexpr Storage<N> Make(const char *Strings,
std::array<Info, N> Infos) {
// Translate lengths to offsets.
int Offset = 0;
for (auto &I : Infos) {
Info::StrOffsets NewOffsets = {};
NewOffsets.Name = Offset;
Offset += I.Offsets.Name;
NewOffsets.Type = Offset;
Offset += I.Offsets.Type;
NewOffsets.Attributes = Offset;
Offset += I.Offsets.Attributes;
NewOffsets.Features = Offset;
Offset += I.Offsets.Features;
I.Offsets = NewOffsets;
}
return {Strings, Infos};
}
};

// A detail macro used below to emit a string literal that, after string literal
// concatenation, ends up triggering the `-Woverlength-strings` warning. While
// the warning is useful in general to catch accidentally excessive strings,
// here we are creating them intentionally.
//
// This relies on a subtle aspect of `_Pragma`: that the *diagnostic* ones don't
// turn into actual tokens that would disrupt string literal concatenation.
#ifdef __clang__
#define CLANG_BUILTIN_DETAIL_STR_TABLE(S) \
_Pragma("clang diagnostic push") \
_Pragma("clang diagnostic ignored \"-Woverlength-strings\"") \
S _Pragma("clang diagnostic pop")
#else
#define CLANG_BUILTIN_DETAIL_STR_TABLE(S) S
#endif

// A macro that can be used with `Builtins.def` and similar files as an X-macro
// to add the string arguments to a builtin string table. This is typically the
// target for the `BUILTIN`, `LANGBUILTIN`, or `LIBBUILTIN` macros in those
// files.
#define CLANG_BUILTIN_STR_TABLE(ID, TYPE, ATTRS) \
CLANG_BUILTIN_DETAIL_STR_TABLE(#ID "\0" TYPE "\0" ATTRS "\0" /*FEATURE*/ "\0")

// A macro that can be used with target builtin `.def` and `.inc` files as an
// X-macro to add the string arguments to a builtin string table. this is
// typically the target for the `TARGET_BUILTIN` macro.
#define CLANG_TARGET_BUILTIN_STR_TABLE(ID, TYPE, ATTRS, FEATURE) \
CLANG_BUILTIN_DETAIL_STR_TABLE(#ID "\0" TYPE "\0" ATTRS "\0" FEATURE "\0")

// A macro that can be used with target builtin `.def` and `.inc` files as an
// X-macro to add the string arguments to a builtin string table. this is
// typically the target for the `TARGET_HEADER_BUILTIN` macro. We can't delegate
// to `TARGET_BUILTIN` because the `FEATURE` string changes position.
#define CLANG_TARGET_HEADER_BUILTIN_STR_TABLE(ID, TYPE, ATTRS, HEADER, LANGS, \
FEATURE) \
CLANG_BUILTIN_DETAIL_STR_TABLE(#ID "\0" TYPE "\0" ATTRS "\0" FEATURE "\0")

// A detail macro used internally to compute the desired string table
// `StrOffsets` struct for arguments to `Storage::Make`.
#define CLANG_BUILTIN_DETAIL_STR_OFFSETS(ID, TYPE, ATTRS) \
Builtin::Info::StrOffsets { \
sizeof(#ID), sizeof(TYPE), sizeof(ATTRS), sizeof("") \
}

// A detail macro used internally to compute the desired string table
// `StrOffsets` struct for arguments to `Storage::Make`.
#define CLANG_TARGET_BUILTIN_DETAIL_STR_OFFSETS(ID, TYPE, ATTRS, FEATURE) \
Builtin::Info::StrOffsets { \
sizeof(#ID), sizeof(TYPE), sizeof(ATTRS), sizeof(FEATURE) \
}

// A set of macros that can be used with builtin `.def` files as an X-macro to
// create an `Info` struct for a particular builtin. It both computes the
// `StrOffsets` value for the string table (the lengths here, translated to
// offsets by the Storage::Make function), and the other metadata for each
// builtin.
//
// There is a corresponding macro for each of `BUILTIN`, `LANGBUILTIN`,
// `LIBBUILTIN`, `TARGET_BUILTIN`, and `TARGET_HEADER_BUILTIN`.
#define CLANG_BUILTIN_ENTRY(ID, TYPE, ATTRS) \
Builtin::Info{CLANG_BUILTIN_DETAIL_STR_OFFSETS(ID, TYPE, ATTRS), \
HeaderDesc::NO_HEADER, ALL_LANGUAGES},
#define CLANG_LANGBUILTIN_ENTRY(ID, TYPE, ATTRS, LANG) \
Builtin::Info{CLANG_BUILTIN_DETAIL_STR_OFFSETS(ID, TYPE, ATTRS), \
HeaderDesc::NO_HEADER, LANG},
#define CLANG_LIBBUILTIN_ENTRY(ID, TYPE, ATTRS, HEADER, LANG) \
Builtin::Info{CLANG_BUILTIN_DETAIL_STR_OFFSETS(ID, TYPE, ATTRS), \
HeaderDesc::HEADER, LANG},
#define CLANG_TARGET_BUILTIN_ENTRY(ID, TYPE, ATTRS, FEATURE) \
Builtin::Info{ \
CLANG_TARGET_BUILTIN_DETAIL_STR_OFFSETS(ID, TYPE, ATTRS, FEATURE), \
HeaderDesc::NO_HEADER, ALL_LANGUAGES},
#define CLANG_TARGET_HEADER_BUILTIN_ENTRY(ID, TYPE, ATTRS, HEADER, LANG, \
FEATURE) \
Builtin::Info{ \
CLANG_TARGET_BUILTIN_DETAIL_STR_OFFSETS(ID, TYPE, ATTRS, FEATURE), \
HeaderDesc::HEADER, LANG},

/// Holds information about both target-independent and
/// target-specific builtins, allowing easy queries by clients.
///
/// Builtins from an optional auxiliary target are stored in
/// AuxTSRecords. Their IDs are shifted up by TSRecords.size() and need to
/// be translated back with getAuxBuiltinID() before use.
class Context {
llvm::ArrayRef<Info> TSRecords;
llvm::ArrayRef<Info> AuxTSRecords;
const char *TSStrTable = nullptr;
const char *AuxTSStrTable = nullptr;

llvm::ArrayRef<Info> TSInfos;
llvm::ArrayRef<Info> AuxTSInfos;

public:
Context() = default;
@@ -100,12 +230,13 @@ class Context {

/// Return the identifier name for the specified builtin,
/// e.g. "__builtin_abs".
llvm::StringRef getName(unsigned ID) const { return getRecord(ID).Name; }
llvm::StringRef getName(unsigned ID) const;

/// Get the type descriptor string for the specified builtin.
const char *getTypeString(unsigned ID) const {
return getRecord(ID).Type;
}
const char *getTypeString(unsigned ID) const;

/// Get the attributes descriptor string for the specified builtin.
const char *getAttributesString(unsigned ID) const;

/// Return true if this function is a target-specific builtin.
bool isTSBuiltin(unsigned ID) const {
@@ -114,40 +245,40 @@ class Context {

/// Return true if this function has no side effects.
bool isPure(unsigned ID) const {
return strchr(getRecord(ID).Attributes, 'U') != nullptr;
return strchr(getAttributesString(ID), 'U') != nullptr;
}

/// Return true if this function has no side effects and doesn't
/// read memory.
bool isConst(unsigned ID) const {
return strchr(getRecord(ID).Attributes, 'c') != nullptr;
return strchr(getAttributesString(ID), 'c') != nullptr;
}

/// Return true if we know this builtin never throws an exception.
bool isNoThrow(unsigned ID) const {
return strchr(getRecord(ID).Attributes, 'n') != nullptr;
return strchr(getAttributesString(ID), 'n') != nullptr;
}

/// Return true if we know this builtin never returns.
bool isNoReturn(unsigned ID) const {
return strchr(getRecord(ID).Attributes, 'r') != nullptr;
return strchr(getAttributesString(ID), 'r') != nullptr;
}

/// Return true if we know this builtin can return twice.
bool isReturnsTwice(unsigned ID) const {
return strchr(getRecord(ID).Attributes, 'j') != nullptr;
return strchr(getAttributesString(ID), 'j') != nullptr;
}

/// Returns true if this builtin does not perform the side-effects
/// of its arguments.
bool isUnevaluated(unsigned ID) const {
return strchr(getRecord(ID).Attributes, 'u') != nullptr;
return strchr(getAttributesString(ID), 'u') != nullptr;
}

/// Return true if this is a builtin for a libc/libm function,
/// with a "__builtin_" prefix (e.g. __builtin_abs).
bool isLibFunction(unsigned ID) const {
return strchr(getRecord(ID).Attributes, 'F') != nullptr;
return strchr(getAttributesString(ID), 'F') != nullptr;
}

/// Determines whether this builtin is a predefined libc/libm
@@ -158,29 +289,29 @@ class Context {
/// they do not, but they are recognized as builtins once we see
/// a declaration.
bool isPredefinedLibFunction(unsigned ID) const {
return strchr(getRecord(ID).Attributes, 'f') != nullptr;
return strchr(getAttributesString(ID), 'f') != nullptr;
}

/// Returns true if this builtin requires appropriate header in other
/// compilers. In Clang it will work even without including it, but we can emit
/// a warning about missing header.
bool isHeaderDependentFunction(unsigned ID) const {
return strchr(getRecord(ID).Attributes, 'h') != nullptr;
return strchr(getAttributesString(ID), 'h') != nullptr;
}

/// Determines whether this builtin is a predefined compiler-rt/libgcc
/// function, such as "__clear_cache", where we know the signature a
/// priori.
bool isPredefinedRuntimeFunction(unsigned ID) const {
return strchr(getRecord(ID).Attributes, 'i') != nullptr;
return strchr(getAttributesString(ID), 'i') != nullptr;
}

/// Determines whether this builtin is a C++ standard library function
/// that lives in (possibly-versioned) namespace std, possibly a template
/// specialization, where the signature is determined by the standard library
/// declaration.
bool isInStdNamespace(unsigned ID) const {
return strchr(getRecord(ID).Attributes, 'z') != nullptr;
return strchr(getAttributesString(ID), 'z') != nullptr;
}

/// Determines whether this builtin can have its address taken with no
@@ -194,33 +325,33 @@ class Context {

/// Determines whether this builtin has custom typechecking.
bool hasCustomTypechecking(unsigned ID) const {
return strchr(getRecord(ID).Attributes, 't') != nullptr;
return strchr(getAttributesString(ID), 't') != nullptr;
}

/// Determines whether a declaration of this builtin should be recognized
/// even if the type doesn't match the specified signature.
bool allowTypeMismatch(unsigned ID) const {
return strchr(getRecord(ID).Attributes, 'T') != nullptr ||
return strchr(getAttributesString(ID), 'T') != nullptr ||
hasCustomTypechecking(ID);
}

/// Determines whether this builtin has a result or any arguments which
/// are pointer types.
bool hasPtrArgsOrResult(unsigned ID) const {
return strchr(getRecord(ID).Type, '*') != nullptr;
return strchr(getTypeString(ID), '*') != nullptr;
}

/// Return true if this builtin has a result or any arguments which are
/// reference types.
bool hasReferenceArgsOrResult(unsigned ID) const {
return strchr(getRecord(ID).Type, '&') != nullptr ||
strchr(getRecord(ID).Type, 'A') != nullptr;
return strchr(getTypeString(ID), '&') != nullptr ||
strchr(getTypeString(ID), 'A') != nullptr;
}

/// If this is a library function that comes from a specific
/// header, retrieve that header name.
const char *getHeaderName(unsigned ID) const {
return getRecord(ID).Header.getName();
return getInfo(ID).Header.getName();
}

/// Determine whether this builtin is like printf in its
@@ -245,27 +376,25 @@ class Context {
/// Such functions can be const when the MathErrno lang option and FP
/// exceptions are disabled.
bool isConstWithoutErrnoAndExceptions(unsigned ID) const {
return strchr(getRecord(ID).Attributes, 'e') != nullptr;
return strchr(getAttributesString(ID), 'e') != nullptr;
}

bool isConstWithoutExceptions(unsigned ID) const {
return strchr(getRecord(ID).Attributes, 'g') != nullptr;
return strchr(getAttributesString(ID), 'g') != nullptr;
}

const char *getRequiredFeatures(unsigned ID) const {
return getRecord(ID).Features;
}
const char *getRequiredFeatures(unsigned ID) const;

unsigned getRequiredVectorWidth(unsigned ID) const;

/// Return true if builtin ID belongs to AuxTarget.
bool isAuxBuiltinID(unsigned ID) const {
return ID >= (Builtin::FirstTSBuiltin + TSRecords.size());
return ID >= (Builtin::FirstTSBuiltin + TSInfos.size());
}

/// Return real builtin ID (i.e. ID it would have during compilation
/// for AuxTarget).
unsigned getAuxBuiltinID(unsigned ID) const { return ID - TSRecords.size(); }
unsigned getAuxBuiltinID(unsigned ID) const { return ID - TSInfos.size(); }

/// Returns true if this is a libc/libm function without the '__builtin_'
/// prefix.
@@ -277,16 +406,20 @@ class Context {

/// Return true if this function can be constant evaluated by Clang frontend.
bool isConstantEvaluated(unsigned ID) const {
return strchr(getRecord(ID).Attributes, 'E') != nullptr;
return strchr(getAttributesString(ID), 'E') != nullptr;
}

/// Returns true if this is an immediate (consteval) function
bool isImmediate(unsigned ID) const {
return strchr(getRecord(ID).Attributes, 'G') != nullptr;
return strchr(getAttributesString(ID), 'G') != nullptr;
}

private:
const Info &getRecord(unsigned ID) const;
std::pair<const char *, const Info &> getStrTableAndInfo(unsigned ID) const;

const Info &getInfo(unsigned ID) const {
return getStrTableAndInfo(ID).second;
}

/// Helper function for isPrintfLike and isScanfLike.
bool isLike(unsigned ID, unsigned &FormatIdx, bool &HasVAListArg,
1 change: 1 addition & 0 deletions clang/include/clang/Basic/BuiltinsPPC.def
@@ -1138,5 +1138,6 @@ UNALIASED_CUSTOM_BUILTIN(mma_pmxvbf16ger2nn, "vW512*VVi15i15i3", true,
// FIXME: Obviously incomplete.

#undef BUILTIN
#undef TARGET_BUILTIN
#undef CUSTOM_BUILTIN
#undef UNALIASED_CUSTOM_BUILTIN
11 changes: 6 additions & 5 deletions clang/include/clang/Basic/TargetInfo.h
@@ -16,6 +16,7 @@

#include "clang/Basic/AddressSpaces.h"
#include "clang/Basic/BitmaskEnum.h"
#include "clang/Basic/Builtins.h"
#include "clang/Basic/CFProtectionOptions.h"
#include "clang/Basic/CodeGenOptions.h"
#include "clang/Basic/LLVM.h"
@@ -1009,11 +1010,11 @@ class TargetInfo : public TransferrableTargetInfo,
virtual void getTargetDefines(const LangOptions &Opts,
MacroBuilder &Builder) const = 0;


/// Return information about target-specific builtins for
/// the current primary target, and info about which builtins are non-portable
/// across the current set of primary and secondary targets.
virtual ArrayRef<Builtin::Info> getTargetBuiltins() const = 0;
/// Return information about target-specific builtins for the current primary
/// target, and info about which builtins are non-portable across the current
/// set of primary and secondary targets.
virtual std::pair<const char *, ArrayRef<Builtin::Info>>
getTargetBuiltinStorage() const = 0;

/// Returns target-specific min and max values VScale_Range.
virtual std::optional<std::pair<unsigned, unsigned>>