[RFC] #[ramfunc]
#100
This attribute lets you place functions in RAM. Closes #42.
Oh and I have not tested this on hardware. I have only looked at the output of |
@japaric This is fantastic! Especially the ability to have exception/interrupt handlers without veneer.
Yes, we should do that. |
macros/src/lib.rs
/// ```
/// # use cortex_m_rt_macros::ramfunc;
/// #[ramfunc]
/// unsafe fn computation_heavy_function() {
Why `unsafe`? Also, I think it would be better to show a computation-heavy function example with something that actually consumes and/or produces something via parameters and/or return values. ;) Showing how to apply this to an interrupt/exception handler would also be helpful.
Why unsafe?
Oh, I literally copy-pasted the `#[pre_init]` example and forgot to remove the `unsafe` keyword. Will fix it in a bit.
Something that should be mentioned in the docs is that #[ramfunc]
fn foo() {
// ..
bar();
// ..
}
fn bar() { .. } Could result in Sure you can add |
// function is in so we can't use a mangled version of the path to the function -- we don't know
// its full path! Instead we'll use a random string for the section name of each function.
let mut rng = rand::thread_rng();
let hash = (0..16)
Using an RNG in a procedural macro doesn't feel quite right to me, but I don't see any other option to generate a unique section name from only the AST of the function. Also, I have no idea if the RNG negatively impacts incremental compilation. I don't think it affects the reproducibility of the builds because the random section names won't make it to the final binary -- the linker merges all of them and puts them under `.ramfunc`.
I remember reading somewhere that symbol names can affect link order, so it might break reproducibility... Not sure.
Is it possible to hash the TokenStream instead? This would guarantee builds are reproducible.
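A minimal sketch of that suggestion, assuming the macro can obtain a stable textual form of the annotated function (for example via `TokenStream::to_string()`). The helper names `fnv1a_64` and `section_name` are hypothetical and not part of this PR. FNV-1a is hand-rolled here because the algorithm behind `std`'s `DefaultHasher` is unspecified and may change between Rust releases.

```rust
/// 64-bit FNV-1a over raw bytes. The constants are the standard FNV-1a
/// offset basis and prime, so the result is stable across toolchains.
fn fnv1a_64(bytes: &[u8]) -> u64 {
    const OFFSET_BASIS: u64 = 0xcbf2_9ce4_8422_2325;
    const PRIME: u64 = 0x0000_0100_0000_01b3;
    let mut hash = OFFSET_BASIS;
    for &b in bytes {
        hash ^= u64::from(b);
        hash = hash.wrapping_mul(PRIME);
    }
    hash
}

/// Hypothetical helper: derives a deterministic section name from the
/// textual form of the annotated function.
fn section_name(fn_tokens: &str) -> String {
    format!(".ramfunc.{:016x}", fnv1a_64(fn_tokens.as_bytes()))
}
```

One caveat of this approach: two textually identical functions in different modules would hash to the same name and so share a section, which the random string avoids.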
*(.ramfunc.*);

. = ALIGN(4); /* 4-byte align the end (VMA) of this section */
} > RAM AT > FLASH
Do we want it in RAM? It is common to put the `ramfunc` in CCM (core-coupled memory) RAM so that it does not contend with the data RAM, or vice versa.
Can we make this user-specifiable?
Can we make this user-specifiable?
In general, we don't have good support for multiple memory regions -- we hardcode most things to Flash and RAM (e.g. we don't have a way to say put this `static mut` in CCRAM and that one in RAM).
By user-specifiable, do you mean that (a) the user should be able to say put all `#[ramfunc]` functions into CCRAM or put all of them in RAM; or (b) the user should be able to specify that this one function goes in CCRAM and that one goes in RAM?
Preferably for each one, but if we can make all of them settable at once, that would already be a good improvement.
Also I agree with @therealprof, a |
Sounds good to me. I'm quite used to this sort of thing from working with big-name RTOS from a vendor that rhymes with "Nentor" and the little pitfalls tend to be fairly well known by people who have a need for this. |
👍
I'm fine with
Is that actually a problem? Presumably if the instantiation was generated it's because it's in use, so wouldn't be eliminated by the linker later anyway, but I might be misunderstanding how the generic instantiations get made. In any event it seems like it might be more useful to be able to have ramfunc generics than not. Since calling a function from a ramfunc doesn't make the called function live in RAM unless it gets inlined, you can't even make your own type-specific instantiations which call the generic and be confident it will still be in RAM. Otherwise this looks good. As @korken89 mentioned it would be really nice if we could specify what section ramfuncs end up in, but I think we also want that for statics, and for things like #59, and even being able to move stacks to CCM. The same consideration about being able to initialise in a single loop would apply to non- |
Continuing thread #100 (comment) here because review comments get hidden when new commits are pushed. We could support placing functions in CCRAM by adding a An alternative is to extend the This can be further generalized to allow placing (*) Using (**) Similarly the attribute doesn't have enough information to determine if something should go in On the one hand, support for multiple memory regions looks like a breaking change so it would be good to batch it with breaking addition of attributes. On the other hand, I don't have the bandwidth to sit down and design something reasonable at the moment so I would prefer to postpone it for v0.7.0. Disclaimer: I never had the need to call stuff from RAM so take my comment with a grain of salt. I'm wondering if having #[ramfunc]
fn foo() {
// ..
bar(args);
// ..
}
#[ramfunc]
fn bar(args: Args) { .. } If OTOH, if Or, instead of that guideline we could add a // `ramcall!(fun)` expands into:
{
#[ramfunc]
#[inline(never)]
fn thunk() { fun() } // LLVM will likely inline `fun` into `thunk`
thunk();
} Then the recommendation would be to use |
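A rough `macro_rules!` sketch of the `ramcall!` expansion quoted above. Two things here are assumptions made for illustration only. First, `#[ramfunc]` is replaced by a hand-written `link_section`, gated with `cfg_attr` so the snippet also builds on a hosted target. Second, the fixed `.ramfunc.thunk` section name means all thunks share one input section in this sketch. `TICKS` and `tick` exist only to have something to call.

```rust
use core::sync::atomic::{AtomicU32, Ordering};

/// Calls `$fun` through a thunk that is never inlined into the caller.
/// On a bare-metal target (`target_os = "none"`) the thunk is placed in a
/// `.ramfunc.*` input section; elsewhere the attribute is not applied.
macro_rules! ramcall {
    ($fun:path) => {{
        #[cfg_attr(target_os = "none", link_section = ".ramfunc.thunk")]
        #[inline(never)]
        fn thunk() {
            // LLVM is free to inline `$fun` here, which pulls it into RAM too.
            $fun()
        }
        thunk()
    }};
}

// Demo callee, hypothetical.
static TICKS: AtomicU32 = AtomicU32::new(0);

fn tick() {
    TICKS.fetch_add(1, Ordering::Relaxed);
}
```

Usage would be `ramcall!(tick);`. Whether `tick` itself ends up in RAM still depends on LLVM inlining it into the thunk, which is the concern raised in this thread.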
True, but why would you do that? If you wanted to make sure it's inlined, why not declare it just |
On inline: On sections: I think we'd really need to be able to differentiate There are related issues around setting up memory protection units with details of the sections and which ones should have which attributes too. |
Because you are not 100% sure whether you inlining it is better or not -- that's the job of LLVM. What you want to be sure of is that if it's not inlined then
That might work. It seems
But having more memory regions is not zero cost. The loop for initializing Manually placing things in .bss is unsafe (e.g. placing re: generics
Not necessarily. The default is to use multiple codegen units -- basically the Rust source code is split in chunks and each one is codegen-ed independently of the other. A simple example of instances lowered to machine code but still GC-ed is: you have The scenario you are picturing is probably only achievable using fat LTO (codegen-units = 1), but I'm not 100% sure it's guaranteed to occur. |
I don't buy it. People who fiddle with such optimisations should also be capable of figuring out whether inlining is worth it or not and not leave such a decision to LLVM... After all small changes in rustc, LLVM or the program may change the inlining decision, so in doubt an automatic decision shouldn't be trusted anyway. |
The main aspect here is respecting the declared wish of the developer: if a function is declared |
@japaric wrote this:
I recently had the need to write a function that is guaranteed to run from RAM (no access to Flash allowed before the function is finished). I found it impossible to do that in Rust, for the reason mentioned in the quote. Unfortunately that makes this feature useless for my use case. I don't know what could be done to change that. I left an experience report here: https://github.com/rust-embedded/cortex-m-rt/issues/42#issuecomment-559061416 |
Context_for_my_usecase - just wanted to see if there are updates for this thread. I happen to be working on a driver that writes a
This is a 2 in 1 RFC and implementation PR. This builds on top of #90 so you can ignore the first commit. See #42 for some background information.
@rust-embedded/cortex-m Please review this RFC and if you are in favor approve this PR.
Summary
Add a `#[ramfunc]` attribute to this crate. This attribute can be applied to functions to place them in RAM.
Motivation
Running functions from RAM may result in faster execution times. Thus placing
"hot code" in RAM may improve the performance of an application.
Design and rationale
Expansion
The `#[ramfunc]` attribute will expand into a `#[link_section = ..]` attribute that will place the function in a `.ramfunc.$hash` section. Each `#[ramfunc]` function will be placed in a different linker section (in a similar fashion to `-ffunction-sections`) to let the linker GC (Garbage Collect) the unused functions. To this end, `#[ramfunc]` won't accept generic functions because if the generic function has several instantiations all of them would end up in the same linker section.
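As an illustration of the expansion described above, here is a hand-written approximation of what the attribute might produce for a small function that takes input and returns a value. The section suffix `checksum` stands in for the generated `$hash`, and `checksum` itself is a made-up example. The `cfg_attr` gate is an addition for this sketch so that it also builds on hosted targets. `#[inline(never)]` is likewise not part of the RFC as written (see the unresolved questions).

```rust
// Approximate hand expansion of:
//
//     #[ramfunc]
//     pub fn checksum(data: &[u8]) -> u32 { .. }
//
// On bare-metal targets (`target_os = "none"`) the function lands in its own
// `.ramfunc.*` input section; on other targets the attribute is not applied.
#[cfg_attr(target_os = "none", link_section = ".ramfunc.checksum")]
#[inline(never)] // assumption: keeps the body from being inlined into Flash code
pub fn checksum(data: &[u8]) -> u32 {
    // Simple rotate-and-xor checksum, chosen only to have a loop to run.
    data.iter()
        .fold(0u32, |acc, &b| acc.rotate_left(5) ^ u32::from(b))
}
```

Note that in the Rust 2024 edition the attribute has to be written as `unsafe(link_section = "..")`.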
`link.x` will be tweaked to instruct the linker to collect all the input sections of the form `.ramfunc.*` into a single `.ramfunc` output section that will be placed in RAM.
The `Reset` handler will be updated to initialize the `.ramfunc` section before the user entry point is called.
Why an attribute?
This is implemented as an attribute to make it composable with the attributes to
be added in RFC #90. For example you can place an exception handler in RAM. The
compiler won't generate a veneer in this case as the address of the RAM location
will be placed in the vector table.
Why a separate linker section?
Using a separate section lets the compiler / linker assign the right ELF flags to `.ramfunc` (i.e. "AX"). That way `.ramfunc` ends up showing in the output of `objdump -C`. This also means that `.data` would only contain ... data so the section can be inspected (not disassembled) separately.
The downside (?) is that the default format of `size` will report `.ramfunc` under text. But in any case it's (usually) better to look at the output in System V format.
Unresolved questions?
Is there a reliable way to initialize both `.data` and `.ramfunc` in a single for loop? (see `init_data` in the `Reset` function) Right now I initialize each one separately and this produces more code. I haven't tried initializing them in a single pass.
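One way to reason about the single-pass question: a single copy loop can only cover both sections if `.ramfunc` directly follows `.data` in both the run (RAM) and load (Flash) address spaces, with no padding that differs between the two. The sketch below just checks that condition on plain addresses. `Region` and `merge` are hypothetical names, and whether a given linker script actually produces such a layout is an assumption that would need to be verified against the map file.

```rust
/// Load (Flash) and run (RAM) placement of one initialized section, in bytes.
#[derive(Clone, Copy, Debug, PartialEq)]
struct Region {
    vma: usize, // run address in RAM
    lma: usize, // load address in Flash
    len: usize, // size in bytes
}

/// Returns the single region covering `a` followed by `b`, or `None` if one
/// copy loop cannot initialize both (a gap in either address space).
fn merge(a: Region, b: Region) -> Option<Region> {
    let adjacent_in_ram = a.vma + a.len == b.vma;
    let adjacent_in_flash = a.lma + a.len == b.lma;
    if adjacent_in_ram && adjacent_in_flash {
        Some(Region { vma: a.vma, lma: a.lma, len: a.len + b.len })
    } else {
        None
    }
}
```

If the two regions cannot be merged, the two separate loops described above remain the safe fallback.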
Should `#[ramfunc]` imply `#[inline(never)]`? Right now the compiler is free to inline functions marked as `#[ramfunc]` into other functions. That could cause the function to not end up in RAM.
Bikeshed the attribute name and linker section name? IAR and TI use the name `ramfunc` for this functionality.