Existing by coincidence, programming deliberately
Refactoring boilerplate code is usually easy in dynamically-typed languages, but sometimes takes a bit more effort when constrained by strong, static typing. This is something I was puzzling over recently, when the penny dropped for me about how Rust's macros can be used to bridge the gap.
If you have control over all of the code in question, macros probably aren't needed of course. Some combination of generics, traits and enums would typically provide a better (and more readable) solution. But there are times when the types involved are out of your control and that is a niche which macros can thrive in.
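For comparison, here is a minimal sketch of the trait-based approach that works when you do own the types. The `Routes` trait and the `ProdApp` and `MockApp` types are made up for illustration; none of them come from actix-web.

```rust
// Hypothetical trait capturing the behaviour both types share.
pub trait Routes {
    fn route(&mut self, path: &str);
}

// Hypothetical stand-in for a production application type.
#[derive(Default)]
pub struct ProdApp {
    pub routes: Vec<String>,
}

// Hypothetical stand-in for a test application type.
#[derive(Default)]
pub struct MockApp {
    pub routes: Vec<String>,
}

impl Routes for ProdApp {
    fn route(&mut self, path: &str) {
        self.routes.push(path.to_string());
    }
}

impl Routes for MockApp {
    fn route(&mut self, path: &str) {
        self.routes.push(path.to_string());
    }
}

// One generic function serves both types because they share the trait.
pub fn init_routes<A: Routes>(app: &mut A) {
    app.route("/users");
    app.route("/users/{id}");
}
```

This only works because both types implement a trait that you were free to write. The rest of this post is about what to do when that option is not available.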
Here is a basic example I encountered last week. Actix-web exports a TestServer struct that helps you test your endpoints. TestServer::new expects a configuration function that takes one argument of type TestApp:
TestServer::new(|app| {
    // Set up routes, resources and middleware
    // by calling methods on `app`...
})
For the production code there is a similar App type, which has methods with the same signatures as the ones on TestApp.
Since the routes you want to test are likely the same as the routes on your production server, a natural next step might be to try and write a common route-setup function that works in both contexts.
Unfortunately though, TestApp and App are not related. Those "common" methods aren't inherited from some shared trait; they're defined independently on each structure.
So in order to write a function that sets routes on either App or TestApp, you'd have to wrap them in an enum and write code to manually forward all of the method calls to the inner structs. Looking up the associated type information to get those definitions right is tedious busywork, but a simple macro allows you to skip it:
macro_rules! init_routes {
    ($app:expr) => {
        // All the initialisation code stays the same,
        // methods are just called on `$app` instead...
    }
}
Now you can invoke the macro from both your test setup and the production code, without it needing to know anything about the type information for $app:
TestServer::new(|app| {
    init_routes!(app);
})
The compiler will concern itself with type-checking later, after the macro has been expanded. All the macro cares about is whether its match arm is satisfied.
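To make that concrete without pulling in actix-web, here is a small self-contained sketch. `DemoApp` and `DemoTestApp` are hypothetical stand-ins for App and TestApp, and their `route` method is invented for the example. The only thing they share is a method with the same name and shape.

```rust
// Two deliberately unrelated stand-in types: no shared trait, just
// methods that happen to have the same name and signature.
#[derive(Default)]
pub struct DemoApp {
    pub routes: Vec<String>,
}

impl DemoApp {
    pub fn route(mut self, path: &str) -> Self {
        self.routes.push(path.to_string());
        self
    }
}

#[derive(Default)]
pub struct DemoTestApp {
    pub routes: Vec<String>,
}

impl DemoTestApp {
    pub fn route(mut self, path: &str) -> Self {
        self.routes.push(path.to_string());
        self
    }
}

// The macro only requires an expression. Whether `.route` exists on it
// is checked after expansion, separately at each call site.
macro_rules! init_routes {
    ($app:expr) => {
        $app.route("/users").route("/users/{id}")
    };
}

pub fn production_app() -> DemoApp {
    init_routes!(DemoApp::default())
}

pub fn test_app() -> DemoTestApp {
    init_routes!(DemoTestApp::default())
}
```

Both functions expand the same macro body, yet each expansion is type-checked against a different concrete type.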
Macros can also be used to eliminate duplication where there is no executable code involved at all, only type information.
For instance, staying with actix, let's say you have an actor that handles some messages:
pub struct MyActor {
    // Whatever state your actor needs...
}

impl Handler<Foo> for MyActor {
    type Result = Result<HashMap<String, String>, Error>;

    fn handle(&mut self, msg: Foo, context: &mut Self::Context) -> Self::Result {
        let mut result = HashMap::new();
        // Populate `result` somehow...
        Ok(result)
    }
}

impl Handler<Bar> for MyActor {
    type Result = Result<bool, Error>;

    fn handle(&mut self, msg: Bar, context: &mut Self::Context) -> Self::Result {
        let mut result = false;
        // ...
        Ok(result)
    }
}
The code to define the Foo and Bar messages might look something like this:
pub struct Foo {
    pub id: String,
    pub foo: String,
}

impl Message for Foo {
    type Result = <MyActor as Handler<Foo>>::Result;
}

pub struct Bar {
    pub id: String,
    pub bar: String,
}

impl Message for Bar {
    type Result = <MyActor as Handler<Bar>>::Result;
}
And that pattern might be repeated many more times for other message types too. Conventional refactoring is immediately off the agenda here because we only have types, properties and the Message trait to work with. But that's all meat and drink for a macro:
macro_rules! message {
    ($message:ident {$($property:ident: $type:ty),*}) => {
        pub struct $message {
            $(pub $property: $type,)*
        }

        impl Message for $message {
            type Result = <MyActor as Handler<$message>>::Result;
        }
    }
}
With that in place your message boilerplate now looks like this:
message!(Foo {
    id: String,
    foo: String
});

message!(Bar {
    id: String,
    bar: String
});
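If you want to experiment with this pattern outside of an actix project, the sketch below reproduces it using only the standard library. The `Message` and `Handler` traits here are simplified stand-ins that I've assumed for illustration (the real actix traits carry extra bounds and a context argument), and `String` is an arbitrary choice of error type.

```rust
use std::collections::HashMap;

// Simplified stand-ins for the actix traits, assumed for this sketch.
pub trait Message {
    type Result;
}

pub trait Handler<M> {
    type Result;
    fn handle(&mut self, msg: M) -> Self::Result;
}

pub struct MyActor;

macro_rules! message {
    ($message:ident {$($property:ident: $type:ty),*}) => {
        pub struct $message {
            $(pub $property: $type,)*
        }

        impl Message for $message {
            type Result = <MyActor as Handler<$message>>::Result;
        }
    }
}

message!(Foo {
    id: String,
    foo: String
});

message!(Bar {
    id: String,
    bar: String
});

impl Handler<Foo> for MyActor {
    type Result = Result<HashMap<String, String>, String>;

    fn handle(&mut self, msg: Foo) -> Self::Result {
        let mut result = HashMap::new();
        result.insert(msg.id, msg.foo);
        Ok(result)
    }
}

impl Handler<Bar> for MyActor {
    type Result = Result<bool, String>;

    fn handle(&mut self, msg: Bar) -> Self::Result {
        Ok(msg.id == msg.bar)
    }
}

// The return type is written via `Message`, showing that the generated
// impl ties each message to its handler's result type.
pub fn handle_foo(actor: &mut MyActor, msg: Foo) -> <Foo as Message>::Result {
    actor.handle(msg)
}
```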
An important introduction here is the pair of $( and ),* wrapping the declarations of $property and $type in the match arm of the macro. That denotes repetition: the enclosed portion of the match can be repeated zero or more times, with a , separating each item. Replacing the * with a + would change that to one or more times, and the , could be swapped for a different separator token or left out entirely, within the limits of what the compiler allows to follow each fragment type (a ty fragment can be followed by , or ; for example, but not by an arbitrary token). But , is a good choice here because it makes the macro more intuitive.
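The variations described above can be sketched as follows. The macro and struct names are invented for the example, and the derives are only there to make the generated types easy to inspect.

```rust
// `*` with a `,` separator: zero or more fields are accepted.
macro_rules! comma_struct {
    ($name:ident { $($field:ident : $ty:ty),* }) => {
        #[derive(Debug, Default, PartialEq)]
        pub struct $name {
            $(pub $field: $ty,)*
        }
    };
}

// `+` with a `;` separator: at least one field is required.
macro_rules! semicolon_struct {
    ($name:ident { $($field:ident : $ty:ty);+ }) => {
        #[derive(Debug, Default, PartialEq)]
        pub struct $name {
            $(pub $field: $ty,)+
        }
    };
}

// No separator: the `;` is part of the repeated pattern itself,
// so every field, including the last, must be terminated by one.
macro_rules! terminated_struct {
    ($name:ident { $($field:ident : $ty:ty ;)* }) => {
        #[derive(Debug, Default, PartialEq)]
        pub struct $name {
            $(pub $field: $ty,)*
        }
    };
}

comma_struct!(Empty {});
comma_struct!(Point { x: i32, y: i32 });
semicolon_struct!(Size { width: u32; height: u32 });
terminated_struct!(Flags { on: bool; verbose: bool; });

// semicolon_struct!(Nothing {});
// The line above would be rejected, because `+` needs at least one field.
```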
That's all well and good for straightforward cases, but what about when a refactoring has many levels? Perhaps there is a core pattern to be extracted in addition to higher levels that depend on the core? This can also be achieved, although there are a couple of gotchas to be careful of.
Returning to the previous example, you'll have noticed that Foo and Bar share a common id property. If there are no other message types, we could just move id into the macro body inline. But what if there are Baz and Qux messages, which don't have an id property? Nested macros can help you with that:
macro_rules! id_message {
    ($message:ident {$($property:ident: $type:ty),*}) => {
        message!($message {
            id: String
            $(, $property: $type)*
        });
    }
}

id_message!(Foo {
    foo: String
});

id_message!(Bar {
    bar: String
});

message!(Baz {
    baz: String
});

message!(Qux {
    qux: String
});
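One detail worth noticing is that the forwarding arm puts the comma at the front of the repeated group, which means it still expands correctly when no extra properties are given. Here is a pared-down sketch you can compile on its own. It drops the Message impl, adds derives so the structs are easy to inspect, and `OnlyId` is a made-up type for the zero-property case.

```rust
// Pared-down version of `message!` that only generates the struct.
macro_rules! message {
    ($message:ident {$($property:ident: $type:ty),*}) => {
        #[derive(Debug, Default)]
        pub struct $message {
            $(pub $property: $type,)*
        }
    };
}

// Forwards to `message!`, prepending the shared `id` property.
// The leading `,` inside the repetition means no stray comma is
// emitted when the caller passes no properties at all.
macro_rules! id_message {
    ($message:ident {$($property:ident: $type:ty),*}) => {
        message!($message {
            id: String
            $(, $property: $type)*
        });
    };
}

id_message!(Foo { foo: String });
id_message!(OnlyId {});
message!(Baz { baz: String });
```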
Another way of expressing the same thing might have been to try and write a higher-order macro, nesting one macro_rules! directly inside another. But in this case that wouldn't work because nested repetition is ambiguous syntax:
macro_rules! message_macro {
    ($macro:ident {$($common_property:ident: $common_type:ty),*}) => {
        macro_rules! $macro {
            ($message:ident {$($property:ident: $type:ty),*}) => {
                pub struct $message {
                    $(pub $common_property: $common_type,)*
                    $(pub $property: $type,)*
                }

                impl Message for $message {
                    type Result = <MyActor as Handler<$message>>::Result;
                }
            }
        }
    }
}

message_macro!(message {});

message_macro!(id_message {
    id: String
});
Here, it would require some magical thinking to infer that we meant for $($property:ident: $type:ty),* to be interpreted as part of the child macro's match arm rather than the parent macro's body. The compiler very reasonably points this out to us with the following error message:
error: attempted to repeat an expression containing no syntax variables matched as repeating at this depth
--> src/main.rs:4:31
|
4 | ($message:ident {$($property:ident: $type:ty),*}) => {
|
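For completeness, there is a commonly used workaround that removes the ambiguity, although I'd treat it as a curiosity rather than a recommendation. The idea is to pass a literal `$` token into the outer macro and use that captured token wherever the inner macro needs a dollar sign. The sketch below is a simplified, assumed variant of the example above: it generates only the struct (no Message impl), and the `$d` parameter name is my own choice.

```rust
macro_rules! message_macro {
    // `$d` captures a literal `$` token supplied by the caller.
    ($d:tt $name:ident { $($common:ident : $common_type:ty),* }) => {
        macro_rules! $name {
            // Every `$d` below becomes `$` in the generated macro, so
            // `$d(...)` is a repetition belonging to the inner macro.
            ($d message:ident { $d($d property:ident : $d ty:ty),* }) => {
                #[derive(Debug, Default)]
                pub struct $d message {
                    // Expanded by the outer macro.
                    $(pub $common: $common_type,)*
                    // Expanded by the inner macro.
                    $d(pub $d property: $d ty,)*
                }
            };
        }
    };
}

message_macro!($ message {});
message_macro!($ id_message { id: String });

id_message!(Foo { foo: String });
message!(Baz { baz: String });
```

It works, but the result is even harder to read than ordinary macro code, which is a good argument for the simpler forwarding approach shown earlier.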
The second obstacle to be wary of with nesting is macro hygiene. Rust's macros are hygienic, which means each macro has its own context for the local identifiers its expansion introduces. A local binding introduced by one macro cannot be referenced by name from code written in another; instead, you must pass the identifier in from outside. Even if the expansion looks like it would make sense when rolled out manually, the compiler will still complain.
More concretely, consider the following macros for defining route handlers with actix-web:
macro_rules! endpoint {
    ($handler:ident: $dispatcher:ident ($path_type:ty) {$($property:ident: $value:expr),*}) => {
        pub fn $handler(
            (path, state): (Path<$path_type>, State<ServerState>),
        ) -> FutureResponse<HttpResponse> {
            state
                .actor
                .send($dispatcher {
                    $($property: $value),*
                })
                .from_err()
                .and_then(|res| match res {
                    Ok(body) => Ok(HttpResponse::Ok().json(body)),
                    Err(_) => Ok(HttpResponse::InternalServerError().into()),
                })
                .responder()
        }
    }
}

macro_rules! uid_endpoint {
    ($handler:ident: $dispatcher:ident) => {
        endpoint! {
            $handler: $dispatcher (UidParam) {
                uid: path.uid.clone()
            }
        }
    }
}
uid_endpoint! would fail compilation here with:
error[E0425]: cannot find value `path` in this scope
Even though we know path exists because endpoint! always declares it, we still have to pass the identifier in so that uid_endpoint! is allowed to reference it:
macro_rules! endpoint {
    ($handler:ident: $dispatcher:ident ($path:ident: $path_type:ty) {$($property:ident: $value:expr),*}) => {
        pub fn $handler(
            ($path, state): (Path<$path_type>, State<ServerState>),
        ) -> FutureResponse<HttpResponse> {
            state
                .actor
                .send($dispatcher {
                    $($property: $value),*
                })
                .from_err()
                .and_then(|res| match res {
                    Ok(body) => Ok(HttpResponse::Ok().json(body)),
                    Err(_) => Ok(HttpResponse::InternalServerError().into()),
                })
                .responder()
        }
    }
}

macro_rules! uid_endpoint {
    ($handler:ident: $dispatcher:ident) => {
        endpoint! {
            $handler: $dispatcher (path: UidParam) {
                uid: path.uid.clone()
            }
        }
    }
}
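The same rule can be reproduced in a few lines without actix-web. In the sketch below, `handler!` and `uid_handler!` are hypothetical, cut-down analogues of the macros above, and `UidParam` is a plain struct standing in for the real path type.

```rust
// Plain stand-in for the path parameter type.
pub struct UidParam {
    pub uid: String,
}

// The caller supplies the parameter name (`$arg`), so the body
// expression it passes in is allowed to refer to that parameter.
macro_rules! handler {
    ($name:ident ($arg:ident: $arg_type:ty) => $body:expr) => {
        pub fn $name($arg: $arg_type) -> String {
            $body
        }
    };
}

macro_rules! uid_handler {
    ($name:ident) => {
        // Both uses of `path` are written in this macro, so they share
        // a hygiene context and refer to the same binding.
        handler!($name (path: UidParam) => path.uid.clone());
    };
}

uid_handler!(get_uid);

// If `handler!` hard-coded the parameter name as `path` instead of
// accepting `$arg`, the `path` written in `uid_handler!` would fail to
// resolve with E0425, just like the first version of `uid_endpoint!`.
```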
Something that's obvious from the examples in this post is that macros can get pretty grawlixy and hard to read at times. If the compiler is complaining about some code in one of your macros and you're struggling to identify the problem, it can be helpful to look at the rolled-out macro expansions. There is a compiler flag that lets you print them from the command line:
rustc -Z unstable-options --pretty expanded <MODULE PATH>
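Where <MODULE PATH> is the path to the source module containing the macro you want to print. Note that -Z flags are only accepted by a nightly compiler, and newer toolchains spell this option -Zunpretty=expanded.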
A general rule of thumb I've found helpful is to try and keep lower-level macros as simple as you can, limiting all repetition to just the outermost macros where possible.