Skip to content

Commit

Permalink
Store base pointer offset for map/list/set children of blocks and use…
Browse files Browse the repository at this point in the history
… this to migrate during garbage collection with -O3 (#1026)

Previously there existed a bug that would occur if the following
happened:

1. A garbage collection is triggered with at least two roots:
   a. A map/list/set which is a child of a symbol-layout block
b. A symbol-layout block that recursively points to the symbol-layout
block which is the parent of the map/list/set
2. Garbage collection does not copy the symbol which is the parent of
the map/list/set when tackling root 1a above. It merely tackles the
internal memory allocated by that map/list/set.
3. Garbage collection moves the block to the new allocation semispace.
The root pointer to the map/list/set still points to the collection
semispace.
4. Garbage collection ends, leaking a pointer into the collection
semispace. This memory has been corrupted by storing forwarding pointers
to the allocation semispace.
5. The backend tries to read the map/list/set, but its length field is
corrupted by the forwarding pointer, leading to a segfault.

In order to fix this, we annotate each map/list/set child of a block
with a "base pointer offset". This offset is an integer that, when added
to the address of the map/list/set pointer which is 8 bytes following
it, will return the address of the block header of the containing block.
We do not try to modify `get_value_type` since that would be very
intrusive as a change; instead we only modify the logic which selects
children of blocks using llvm `getelementptr` instructions, as well as
the code that creates map/list/set children in blocks in the first
place.

Once these annotations have been applied, which is the subject of the
first 5 commits in this PR, we modify the garbage collector to use them
in order to reconstruct the base pointer when it receives a garbage
collection root. We refactor out a `migrate_root` function which handles
GC roots, and rename `migrate_roots` to `migrate_static_roots` to avoid
confusion. Then we modify `migrate_root` to reconstruct the base
pointer, migrate the base pointer, and then write the updated derived
pointer back to the array of gc roots.

We no longer need to call `migrate_map`, `migrate_set`, or
`migrate_list` on the resulting map/list/set because this will now occur
for us during evacuation of the relevant parent block.

Code to handle the case where a map/list/set pointer is bare on the
alwaysgcspace rather than a child of a block remains largely unchanged,
except that we need to modify how we convert those collection to the
kore heap in order to accommodate the new memory layout.

Finally, we add a test. I ran the test on the master branch of K and it
failed.
  • Loading branch information
Dwight Guth authored Apr 17, 2024
1 parent 902bc76 commit 8fa45cc
Show file tree
Hide file tree
Showing 15 changed files with 2,630 additions and 56 deletions.
2 changes: 2 additions & 0 deletions include/kllvm/codegen/CreateTerm.h
Original file line number Diff line number Diff line change
Expand Up @@ -89,6 +89,8 @@ new_module(std::string const &name, llvm::LLVMContext &context);
llvm::StructType *get_block_type(
llvm::Module *module, kore_definition *definition,
kore_symbol const *symbol);
uint64_t get_block_offset(
kore_definition *definition, kore_symbol const *symbol, int idx);
uint64_t get_block_header_val(
llvm::Module *module, kore_symbol const *symbol, llvm::Type *block_type);
llvm::Value *get_block_header(
Expand Down
3 changes: 3 additions & 0 deletions include/kllvm/codegen/Util.h
Original file line number Diff line number Diff line change
Expand Up @@ -2,6 +2,7 @@
#define KLLVM_UTIL_H

#include <llvm/Config/llvm-config.h>
#include <llvm/IR/Constants.h>
#include <llvm/IR/DerivedTypes.h>
#include <llvm/IR/Function.h>
#include <llvm/IR/Module.h>
Expand All @@ -11,6 +12,8 @@

namespace kllvm {

llvm::Constant *get_offset_of_member(llvm::Module *, llvm::StructType *, int);

// Returns a reference to the function declaration for a memory allocation
// function with the given name, adding a declaration to the current module if
// one does not yet exist
Expand Down
2 changes: 2 additions & 0 deletions include/runtime/collect.h
Original file line number Diff line number Diff line change
Expand Up @@ -29,6 +29,8 @@ extern "C" {
extern size_t numBytesLiveAtCollection[1 << AGE_WIDTH];
extern bool collect_old;
size_t get_size(uint64_t, uint16_t);
void migrate_static_roots(void);
void migrate(block **block_ptr);
void migrate_once(block **);
void migrate_list(void *l);
void migrate_map(void *m);
Expand Down
12 changes: 10 additions & 2 deletions lib/codegen/CreateStaticTerm.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -58,10 +58,18 @@ llvm::Constant *create_static_term::not_injection_case(
block_vals.push_back(llvm::ConstantArray::get(
empty_array_type, llvm::ArrayRef<llvm::Constant *>()));

int idx = 2;
int idx = 0;
for (auto const &child : constructor->get_arguments()) {
auto *sort = dynamic_cast<kore_composite_sort *>(
symbol->get_arguments()[idx].get());
auto cat = sort->get_category(definition_);
if (is_collection_sort(cat)) {
block_vals.push_back(get_offset_of_member(
module_, block_header_type,
get_block_offset(definition_, symbol, idx)));
}
llvm::Constant *child_value = nullptr;
if (idx++ == 2 && val != nullptr) {
if (idx++ == 0 && val != nullptr) {
child_value = val;
} else {
child_value = (*this)(child.get()).first;
Expand Down
43 changes: 38 additions & 5 deletions lib/codegen/CreateTerm.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -200,12 +200,39 @@ llvm::StructType *get_block_type(
types.push_back(empty_array_type);
for (auto const &arg : symbol->get_arguments()) {
auto *sort = dynamic_cast<kore_composite_sort *>(arg.get());
auto cat = sort->get_category(definition);
if (is_collection_sort(cat)) {
types.push_back(llvm::Type::getInt64Ty(module->getContext()));
}
llvm::Type *type = getvalue_type(sort->get_category(definition), module);
types.push_back(type);
}
return llvm::StructType::get(module->getContext(), types);
}

uint64_t get_block_offset(
kore_definition *definition, kore_symbol const *symbol, int idx) {
uint64_t result = 2;
int i = 0;
for (auto const &arg : symbol->get_arguments()) {
auto *sort = dynamic_cast<kore_composite_sort *>(arg.get());
auto cat = sort->get_category(definition);
if (is_collection_sort(cat)) {
if (i == idx) {
return result + 1;
}
result += 2;
} else {
if (i == idx) {
return result;
}
result += 1;
}
i++;
}
throw std::invalid_argument("idx not within bounds of symbol");
}

uint64_t get_block_header_val(
llvm::Module *module, kore_symbol const *symbol, llvm::Type *block_type) {
uint64_t header_val = symbol->get_tag();
Expand Down Expand Up @@ -779,24 +806,30 @@ llvm::Value *create_term::not_injection_case(
llvm::StructType *block_type = get_block_type(module_, definition_, symbol);
llvm::Value *block_header
= get_block_header(module_, definition_, symbol, block_type);
int idx = 2;
int idx = 0;
std::vector<llvm::Value *> children;
for (auto const &child : constructor->get_arguments()) {
auto *sort = dynamic_cast<kore_composite_sort *>(child->get_sort().get());
auto cat = sort->get_category(definition_);
if (is_collection_sort(cat)) {
children.push_back(get_offset_of_member(
module_, block_type, get_block_offset(definition_, symbol, idx)));
}
llvm::Value *child_value = nullptr;
if (idx == 2 && val != nullptr) {
if (idx == 0 && val != nullptr) {
child_value = val;
} else {
std::string new_location = fmt::format("{}:{}", location_stack, idx - 2);
std::string new_location = fmt::format("{}:{}", location_stack, idx);
if (is_injection_symbol(child.get(), definition_->get_inj_symbol())) {
new_location = location_stack;
}
child_value = create_allocation(child.get(), new_location).first;
}

auto *sort = dynamic_cast<kore_composite_sort *>(child->get_sort().get());
if (sort && is_collection_sort(sort->get_category(definition_))) {
child_value = new llvm::LoadInst(
block_type->elements()[idx], child_value, "", current_block_);
block_type->elements()[get_block_offset(definition_, symbol, idx)],
child_value, "", current_block_);
}
children.push_back(child_value);
idx++;
Expand Down
7 changes: 5 additions & 2 deletions lib/codegen/Decision.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -246,7 +246,9 @@ void switch_node::codegen(decision *d) {
block_type, cast,
{llvm::ConstantInt::get(llvm::Type::getInt64Ty(d->ctx_), 0),
llvm::ConstantInt::get(
llvm::Type::getInt32Ty(d->ctx_), offset + 2)},
llvm::Type::getInt32Ty(d->ctx_),
get_block_offset(
d->definition_, switch_case.get_constructor(), offset))},
"", d->current_block_);

llvm::Value *child = nullptr;
Expand Down Expand Up @@ -851,7 +853,8 @@ void abort_when_stuck(
llvm::Value *child_ptr = llvm::GetElementPtrInst::CreateInBounds(
block_type, block,
{llvm::ConstantInt::get(llvm::Type::getInt64Ty(ctx), 0),
llvm::ConstantInt::get(llvm::Type::getInt32Ty(ctx), idx + 2)},
llvm::ConstantInt::get(
llvm::Type::getInt32Ty(ctx), get_block_offset(d, symbol, idx))},
"", current_block);
if (is_collection_sort(cat)) {
child_value = new llvm::LoadInst(
Expand Down
27 changes: 8 additions & 19 deletions lib/codegen/EmitConfigParser.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -859,7 +859,9 @@ static void get_store(
llvm::Type *arg_ty = get_arg_type(cat, module);
llvm::Value *child_ptr = llvm::GetElementPtrInst::CreateInBounds(
block_type, cast,
{zero, llvm::ConstantInt::get(llvm::Type::getInt32Ty(ctx), idx++ + 2)},
{zero, llvm::ConstantInt::get(
llvm::Type::getInt32Ty(ctx),
get_block_offset(definition, symbol, idx++))},
"", case_block);
if (is_collection_sort(cat)) {
arg = new llvm::LoadInst(arg_ty, arg, "", case_block);
Expand Down Expand Up @@ -954,7 +956,9 @@ static void get_visitor(
value_type cat = composite_sort->get_category(definition);
llvm::Value *child_ptr = llvm::GetElementPtrInst::CreateInBounds(
block_type, cast,
{zero, llvm::ConstantInt::get(llvm::Type::getInt32Ty(ctx), idx++ + 2)},
{zero, llvm::ConstantInt::get(
llvm::Type::getInt32Ty(ctx),
get_block_offset(definition, symbol, idx++))},
"", case_block);
llvm::Value *child = new llvm::LoadInst(
getvalue_type(cat, module), child_ptr, "", case_block);
Expand Down Expand Up @@ -1114,34 +1118,19 @@ static void get_visitor(
}
}

static llvm::Constant *get_offset_of_member(
[[maybe_unused]] llvm::Module *mod, llvm::StructType *struct_ty,
int nth_member) {
#if LLVM_VERSION_MAJOR >= 17
auto offset
= llvm::DataLayout(mod).getStructLayout(struct_ty)->getElementOffset(
nth_member);
auto *offset_ty = llvm::Type::getInt64Ty(mod->getContext());
return llvm::ConstantInt::get(offset_ty, offset);
#else
return llvm::ConstantExpr::getOffsetOf(struct_ty, nth_member);
#endif
}

static llvm::Constant *get_layout_data(
uint16_t layout, kore_symbol *symbol, llvm::Module *module,
kore_definition *def) {
uint8_t len = symbol->get_arguments().size();
std::vector<llvm::Constant *> elements;
llvm::LLVMContext &ctx = module->getContext();
auto *block_type = get_block_type(module, def, symbol);
int i = 2;
int i = 0;
for (auto const &sort : symbol->get_arguments()) {
value_type cat
= dynamic_cast<kore_composite_sort *>(sort.get())->get_category(def);
auto *offset = get_offset_of_member(
module, block_type,
i++); //llvm::ConstantExpr::getOffsetOf(BlockType, i++);
module, block_type, get_block_offset(def, symbol, i++));
elements.push_back(llvm::ConstantStruct::get(
llvm::StructType::getTypeByName(
module->getContext(), layoutitem_struct),
Expand Down
14 changes: 14 additions & 0 deletions lib/codegen/Util.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -22,4 +22,18 @@ llvm::Function *kore_heap_alloc(std::string const &name, llvm::Module *module) {
return get_or_insert_function(module, name, alloc_type);
}

llvm::Constant *get_offset_of_member(
[[maybe_unused]] llvm::Module *mod, llvm::StructType *struct_ty,
int nth_member) {
#if LLVM_VERSION_MAJOR >= 17
auto offset
= llvm::DataLayout(mod).getStructLayout(struct_ty)->getElementOffset(
nth_member);
auto *offset_ty = llvm::Type::getInt64Ty(mod->getContext());
return llvm::ConstantInt::get(offset_ty, offset);
#else
return llvm::ConstantExpr::getOffsetOf(struct_ty, nth_member);
#endif
}

} // namespace kllvm
2 changes: 1 addition & 1 deletion runtime/collect/CMakeLists.txt
Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@
add_library(collect STATIC
collect.cpp
migrate_roots.cpp
migrate_static_roots.cpp
migrate_collection.cpp
)

Expand Down
Loading

0 comments on commit 8fa45cc

Please sign in to comment.