Design choices for a proc macro implementation #423
Comments
There is
Given that we don't output this module, I think it violates the Rust API guideline that input syntax is evocative of the output (C-EVOCATIVE). What's the problem with just continuing to use the existing
I don't think we should try to support this, since we want to change the naming scheme in
This is already possible to implement whether one uses proc-macros or not, but the reason I didn't do it is because it will only be a textual match;
I don't really want the user to be able to specify explicit retain semantics; if their use-case requires it, they can just define the method manually (see LAYERED_SAFETY.md for a bit more on the philosophy that we don't have to handle every use-case under the sun because the user can do it themselves).
Now that we can use proc-macros to name the internal method something else, I'd rather just do something like:
// Input
#[method(test)]
pub fn test(&self) -> i32 {
    1
}
// Output
builder.add_method(sel!(test), Self::__test);
fn __test(&self, _cmd: Sel) -> i32 {
    1
}
pub fn test(&self) -> i32 {
    unsafe { msg_send![self, test] }
}
Agreed,
Heh, yeah, no-one ever said this task would be easy - but I think macro implementation difficulty should not really be considered when trying to find ideal syntax.
Yeah, perhaps it would indeed be smart to try to take ideas from Also, I would like to note, I think the
Yeah, except it somewhat conflicts with #345 (unless we start requiring
I don't think those need to be implemented using a proc-macro, or even be
At the very least, I think the name Also, you should check out |
Hmm, that is a point to consider. I'm not entirely sure if I would agree that it violates those principles though. Conceptually, a module is used as input, and everything contained is produced as output, it's just that the scope of everything in the module is imported afterwards, rendering the module redundant (except for grouping input). Alternatively, we could leave the module in place in the output, but what purpose would that really serve?
There are really a few points here that I think make proc macros interesting to consider:
I don't think the current approach is bad though, I mean it works quite well so far. Maybe in the end it's better to just keep it that way, I don't know.
True, but maybe it would still be useful for people who are using it for their own code outside of
Yeah, I eventually realized that what you describe would be an issue, but I thought I should at least try to implement it as an experiment. I suppose my impression (which wasn't tested) was that the error message from a hypothetical case like that would still be clear enough that the user would have some idea of what was happening. But I certainly agree it's a hack and that handling it through the type system would be far better if feasible. This is also probably the most complex part of the macro implementation, so removing it would be nice.
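To make the textual-match limitation concrete, here is a minimal sketch of a purely textual return-type check. The helper `looks_like_id_return` and the alias `MyString` are hypothetical and are not part of `objc2` or the proof-of-concept macros; the sketch only assumes the two shapes mentioned in the proposal (`Id<...>` and `Result<Id<...>, ...>`).

```rust
/// Hypothetical helper (not an objc2 API): decides from the *text* of a
/// return type whether a method would be treated as returning `Id<...>`.
fn looks_like_id_return(ret: &str) -> bool {
    // Ignore whitespace so that `Result< Id<T>, E >` is handled too.
    let ret: String = ret.chars().filter(|c| !c.is_whitespace()).collect();
    // Direct `Id<...>` return.
    if ret.starts_with("Id<") {
        return true;
    }
    // `Result<Id<...>, ...>` return.
    if let Some(inner) = ret.strip_prefix("Result<") {
        return inner.starts_with("Id<");
    }
    false
}

fn main() {
    assert!(looks_like_id_return("Id<NSString>"));
    assert!(looks_like_id_return("Result<Id<NSString>, Id<NSError>>"));
    assert!(!looks_like_id_return("usize"));
    // Given a hypothetical `type MyString = Id<NSString>;`, the macro only
    // sees the token `MyString`, so the textual check misses it.
    assert!(!looks_like_id_return("MyString"));
}
```

The last assertion is the problematic case: a macro sees tokens, not resolved types, so an alias (or a path such as `rc::Id<...>`) slips past the check. Handling this through the type system would avoid that.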
Okay, that seems reasonable to me.
I actually agree that that example is a little confusing. The
Something like the latter part is what I was imagining long term I guess.
Yeah, something like that could work too. The point about being more descriptive is probably a good idea.
That looks interesting. The macro design and grammar look nice but I don't really understand what the implementation is doing (there's a lot of low-level stuff in there). Also, they seem to manage to decouple macro invocations for the class |
Well, we could probably just wrap things in
// In user-code:
#[objc2::declare]
unsafe extern "Objective-C" {
    #[super = NSObject]
    // #[inherits = ...]
    #[name = "MyApplicationDelegate"]
    struct Delegate {
        text_field: IvarDrop<Id<NSTextField>>,
        web_view: IvarDrop<Id<WKWebView>>,
    }
    impl Delegate {
        #[method(initWithTextField:andWebView:)]
        unsafe fn initWithTextField_andWebView(
            this: Allocated<Self>,
            text_field: *mut NSTextField,
            web_view: *mut WKWebView,
        ) -> Option<&mut Self> {
            let this = unsafe { msg_send![super(this), init]? };
            Ivar::write(&mut this.text_field, unsafe { Id::retain(text_field) }?);
            Ivar::write(&mut this.web_view, unsafe { Id::retain(web_view) }?);
            Some(this)
        }
    }
    unsafe impl NSApplicationDelegate for Delegate {
        // ...
    }
}
// In `icrate`:
#[objc2::bridge]
unsafe extern "Objective-C" {
    /// Doc comment
    #[derive(PartialEq, Eq, Hash, Debug)]
    #[super = NSObject]
    // #[name = "NSString"]
    pub type NSString;
    impl NSString {
        #[method(length)]
        fn length() -> usize;
    }
    /// Doc comment
    // #[name = "NSLocking"]
    pub unsafe trait NSLocking {
        #[method(lock)]
        unsafe fn lock(&self);
        #[method(unlock)]
        unsafe fn unlock(&self);
    }
}
(Assuming that works with
Sorry, I meant: Convert
That's totally cool, and a reasonable point - I'd be fine with either way in the initial implementation. Later on, once we actually have some knowledge of the compile-time cost it would impose, I could do the type-system parts for you (or mentor you in it).
Ah, okay.
Yeah, their implementation is doing things with |
Also, noting the general principles I try to follow when designing macros (which I've already somewhat stated above, but just for completeness):
Also, I've explicitly opted for having the user manually specify that they implement |
Unfortunately, I don't think we can get that to work. The problem is that the items allowed in There is a verbatim item there, which allows us to at least represent arbitrary syntax (if we produce the item ourselves, instead of parsing it), but I don't think we can actually parse it that way as input to the macro without getting an error from the compiler first. You can try some examples with ast explorer, which uses Those few examples I was using (e.g., I think we're probably stuck with |
Ah yeah, you're right, I thought proc-macro attributes allowed arbitrary token streams as their input. In that case, I think keeping Perhaps I could be persuaded to instead allow something like |
Oh, I guess I hadn't considered a function-like proc macro replacement for I guess the disadvantage to that approach, versus attributes, is that it looks less natural if you want to pass arguments to it beyond the syntactic input.
This could also work. I think I tried that with only a single
That would be interesting, and probably useful for other projects too, but I'm not sure what the chances are that such a change would be approved by the compiler team. If you think it's worth exploring though I wouldn't be against it. |
Given the recent discussion with regard to how to group the items and the issues with both It seems we could actually use that for building the I haven't tried a full implementation, but based on this small example which compiles, it seems that it should be possible:
#![deny(unsafe_op_in_unsafe_fn)]
use icrate::{
    objc2::{
        declare::{Ivar, IvarDrop},
        declare_class,
        msg_send,
        rc::Id,
        ClassType,
    },
    AppKit::NSTextField,
    Foundation::NSObject,
    WebKit::WKWebView,
};
use linkme::distributed_slice;
pub trait DynMethodImplementation {
    fn callee(&self) -> objc2::encode::Encoding;
    fn ret(&self) -> objc2::encode::Encoding;
    fn args(&self) -> &'static [objc2::encode::Encoding];
    fn __imp(self) -> objc2::runtime::Imp;
}
impl<Callee, Ret, Args, F> DynMethodImplementation for F
where
    F: objc2::declare::MethodImplementation<Callee = Callee, Ret = Ret, Args = Args>,
    Callee: objc2::RefEncode + ?Sized,
    Ret: objc2::encode::__unstable::EncodeReturn,
    Args: objc2::encode::__unstable::EncodeArguments,
{
    fn callee(&self) -> objc2::encode::Encoding {
        F::Callee::ENCODING_REF
    }
    fn ret(&self) -> objc2::encode::Encoding {
        F::Ret::ENCODING_RETURN
    }
    fn args(&self) -> &'static [objc2::encode::Encoding] {
        F::Args::ENCODINGS
    }
    fn __imp(self) -> objc2::runtime::Imp {
        F::__imp(self)
    }
}
#[distributed_slice]
pub static DELEGATE_INSTANCE_METHODS: [&(dyn DynMethodImplementation + Sync)] = [..];
#[distributed_slice(DELEGATE_INSTANCE_METHODS)]
static DELEGATE_INSTANCE_METHOD_0: &(dyn DynMethodImplementation + Sync) =
    &(Delegate::__init_withTextField_andWebView as unsafe extern fn(_, _, _, _) -> _);
declare_class!(
    struct Delegate {
        text_field: IvarDrop<Id<NSTextField>, "_text_field">,
        web_view: IvarDrop<Id<WKWebView>, "_web_view">,
    }
    mod ivars;
    unsafe impl ClassType for Delegate {
        type Super = NSObject;
        const NAME: &'static str = "Delegate";
    }
    unsafe impl Delegate {
        #[method(initWithTextField:andWebView:)]
        #[allow(non_snake_case)]
        unsafe fn __init_withTextField_andWebView(
            self: &mut Self,
            text_field: *mut NSTextField,
            web_view: *mut WKWebView,
        ) -> Option<&mut Self> {
            let this: Option<&mut Self> = msg_send![super(self), init];
            let this = this?;
            Ivar::write(&mut this.text_field, unsafe { Id::retain(text_field) }?);
            Ivar::write(&mut this.web_view, unsafe { Id::retain(web_view) }?);
            Some(this)
        }
    }
);
fn main() {
}
Thoughts on this approach? I think it would be really nice from an ergonomics point of view if we could allow decoupling these invocations like how the |
Hmm, in my eyes the points against
So I think I'd rather avoid it, and keep the declaration and the implementations together (e.g. inside |
That's a fair point. But it is a little bit niche though, compared to say,
I think robustness or future-proof-ness is probably the most convincing argument against it.
There doesn't seem to be much (or really any) discussion about its soundness that I can find. The author has a pretty good track record with developing robust crates though, for whatever that counts for. But maybe it would just be best to ask about it on the
Alright, although since I've already started looking at that, I'd like to try to finish putting together a working implementation to see if it's even feasible, if only for my own curiosity. If it turns out not to be a good idea, that's okay. But maybe another possibility (if it does work) would be to allow both approaches: we could always gate a |
By the way, I notice that in the docs there is this statement (with regard to safety):
So the claim is at least made that it is safe. Also, since it does operate only during compilation and linking, I'm not sure how hypothetical problematic unsafety (in the sense of UB) could actually manifest. |
Since I've been working on a proof-of-concept implementation of proc macro equivalents for `declare_class!`, `extern_methods!`, etc., I encountered a number of different points where there were some interesting choices to be made in the design space and I thought it would be a good idea to discuss some of those.
For a point of reference, take a current `macro_rules!` based definition like this:
The equivalent in terms of the proof-of-concept proc macros currently looks like this:
A couple of observations about this:
Originally, I was thinking it would make sense to have more separate macros like `#[class]`, `#[protocol]`, etc., when I proposed something looking closer to this:
But at that time I didn't realize yet that we need to be able to parse the class `struct` and the class `impl` together in order to correctly define the `::class()` method (because it registers the methods when first called).
Unfortunately, there is also no practical way (that I know of) to manage state across proc-macro invocations. So the only real obvious choice as an alternative is to place the respective `struct` and `impl` items within an enclosing item so the proc macro can work similarly to `declare_class`. Which leads to the choice of using `mod`.
Given an invocation like this:
What happens is the macro expects to find, within the `mod <ClassName>`, a `struct <ClassName> { ... }`, or a `type <ClassName>;` (note the lack of `=`). The actual `mod` is just a dummy item and is not emitted, only the items it encloses are emitted. Furthermore, the name of the `struct` or `type` must exactly match the name of the `mod`, and only a single `struct` xor `type` is allowed.
Within a class `#[objc(super = <superclass>)] mod C { ... }`, an `impl` is translated in the following way.
Specifying the selector is not mandatory (if omitted, it is computed from the current camel-case/snake-case hybrid scheme we use, correctly handling trailing `:`).
Also, `#[method]`/`#[method_id]` are not necessary since we determine this from the method return type (looking for `-> Id<...>` or `Result<Id<...>, ...>`), although as with selectors it is possible to manually control this behavior. In that case you can specify `#[objc(managed)] fn f(&self, args*) -> ...` (without explicit retain semantics) or `#[objc(managed = "init")] fn f(args*) -> ...` (with explicit retain semantics).
For `impl C`, we translate methods `fn(args*) -> ... { ... }` as class methods, `fn f(&self, args*) -> ... { ... }` as instance methods, similar to `declare_class!`. Methods `fn(&self?, args*) -> ...;` (which are not valid Rust syntax, but which we can parse) are handled as with `extern_methods!`.
For `impl T for C`, we translate the enclosed methods as protocol methods.
One choice I've been considering is splitting this behavior up a little more and using `extern` blocks along with `mod`, in the following sense.
For `#[objc] mod C { ... }` we would only allow a class `struct` and not a class `type`. Furthermore, we would no longer parse methods without bodies like `fn f(...) -> ...;` within `impl` items in the class `mod`.
Instead, to handle those cases, you would now write this:
The obvious disadvantages to this approach are that it's maybe a little uglier (since we don't have `impl C`) and we'd probably still need an outer enclosing `mod` to handle protocol translations, since we also can't write `impl P for C` within `extern`.
Advantages are that it's arguably clearer what is happening semantically, specifically because we are using `extern type` here. It's also arguably easier to parse, since within `extern`, having `type T;` and `fn f() -> ...;` is valid syntax.
The latter part is not a huge issue, since in the case of syn, it handles those non-valid syntax cases as a raw `TokenStream`, it just requires re-implementing some of the parsing for those items by hand. But to be honest, I am already doing some of that in order to parse items within a class `mod` without backtracking (e.g., several items are ambiguous until after you parse attributes and visibility qualifiers).
This is also the approach that cxx and wasm-bindgen use with their proc-macros.
Actually, with `cxx` you have this:
where the stuff in `extern "Rust" { ... }` is used for generating header files for using Rust definitions from C++. AFAIK, we don't have an equivalent for that (and maybe it's out of scope), but it might be worth considering as a future option.
And there's also the part where `cxx` uses the `include!` directive in the `extern "C++"` block for generating bindings. Something that might be interesting for us to consider, if proc macros seem like the way to go, is making the `header-translator` functionality available in terms of macro invocations instead of requiring it to be run externally.
I think that's all I have to say about this for now. I didn't mention macros for `static`, `fn`, and `enum`, but I was planning on just re-using the `#[objc]` macro for that. It's trivial to determine which item it is applied to, so it seemed to make sense to minimize the number of names we use for the macros. But maybe something other than `#[objc]` would be appropriate too.
Any thoughts or feedback on this? Does it make sense to split the functionality into `extern` even if it's more verbose?
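As a rough illustration of the selector inference mentioned above (computing the selector from the method name when none is given), here is a minimal sketch. The function `selector_from_name` is hypothetical, and the rule it implements (one `_`-separated piece per argument, each followed by `:`) is an assumption based on names like `initWithTextField_andWebView`, not the actual proof-of-concept implementation.

```rust
/// Hypothetical sketch (not an objc2 API): derive a selector from a Rust
/// method name and the number of non-receiver arguments.
///
/// Assumed rule: each `_`-separated piece of the name labels one argument,
/// and every argument label is followed by `:`.
fn selector_from_name(name: &str, arg_count: usize) -> Option<String> {
    if arg_count == 0 {
        // No arguments: the selector is the name itself, with no colon.
        return Some(name.to_string());
    }
    let pieces: Vec<&str> = name.split('_').collect();
    if pieces.len() != arg_count {
        // The name does not line up with the argument list, so the selector
        // would have to be specified manually.
        return None;
    }
    Some(pieces.iter().map(|p| format!("{}:", p)).collect())
}

fn main() {
    println!("{:?}", selector_from_name("length", 0));
    println!("{:?}", selector_from_name("setValue", 1));
    println!("{:?}", selector_from_name("initWithTextField_andWebView", 2));
}
```

Returning `None` on a mismatch is a design choice in this sketch: it corresponds to the case where inference fails and the user falls back to writing the selector explicitly.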