James McGlashan (DarkFox)

Programmer, hacker, cypherpunk.

Rust macro gotchas and two new crates (using macros)!

Note: I will dive straight into Rust code in this article. If you are not familiar with Rust you might want to read through http://doc.rust-lang.org first!

While researching concepts for a personal project, I found hoverbear/raft, an incomplete implementation of the Raft consensus algorithm written purely in Rust. I have since contributed to it and created two crates that use Rust's macro system to take the tedium out of a couple of common tasks. These two crates are wrapped_enum and scoped_log.

Wrapped_enum

I started writing this macro before I found hoverbear/raft, but I have improved it since.

When wrapping types in an enum, we have to define an impl From<$ty> for each type that we want to wrap. This repetition can quite easily be factored out with macro repetitions.

#[derive(Debug)] // required because the wrapping enum below derives Debug
pub enum ErrorsFromMyModule {
    /// Documentation is not required on normally defined enums
    EveBrokeIt,
}

wrapped_enum!{
    #[derive(Debug)]
    /// Documentation is really good; and is required for the variants of a
    /// public enum due to the gotchas of macros!
    pub enum Error {
        #[doc = "Another gotcha! Multiline comments require this manual \
                doc attribute. \
                io::Error occurs when functions from std::io fail"]
        Io(io::Error),
        /// Local error
        Local(ErrorsFromMyModule),
    }
} // No manual `impl From<io::Error> for Error`. Thanks macro system!
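For comparison, here is a rough hand-written sketch of what the invocation above saves us from typing. The helpers fail_io, fail_local and run are hypothetical and exist only to show the ? operator picking up the From impls.

```rust
use std::io;

#[derive(Debug)]
pub enum ErrorsFromMyModule {
    EveBrokeIt,
}

#[derive(Debug)]
pub enum Error {
    Io(io::Error),
    Local(ErrorsFromMyModule),
}

// One impl per wrapped type: this is the repetition the macro generates.
impl From<io::Error> for Error {
    fn from(ty: io::Error) -> Self {
        Error::Io(ty)
    }
}

impl From<ErrorsFromMyModule> for Error {
    fn from(ty: ErrorsFromMyModule) -> Self {
        Error::Local(ty)
    }
}

// Hypothetical helper that always fails with an io::Error.
fn fail_io() -> Result<(), io::Error> {
    Err(io::Error::new(io::ErrorKind::Other, "disk on fire"))
}

// Hypothetical helper that always fails with a module-local error.
fn fail_local() -> Result<(), ErrorsFromMyModule> {
    Err(ErrorsFromMyModule::EveBrokeIt)
}

// `?` converts either error type into `Error` through the From impls.
pub fn run(use_io: bool) -> Result<(), Error> {
    if use_io {
        fail_io()?;
    } else {
        fail_local()?;
    }
    Ok(())
}
```

Every additional wrapped type costs another five-line impl block, which is exactly the part the macro repeats for us.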

So, how does this macro work?

macro_rules! wrapped_enum {
    ($(#[$attr:meta])*
     pub enum $enum_name:ident {
         $($(#[$variant_attr:meta])+
         $enum_variant_name:ident($ty:ty)),+
         $(,)*
     }
    ) => (
        $(#[$attr])*
        pub enum $enum_name { $($(#[$variant_attr])+ $enum_variant_name($ty)),+ }
        $(impl From<$ty> for $enum_name {
            fn from (ty: $ty) -> Self {
                $enum_name::$enum_variant_name(ty)
            }
        })+
    );
    ($(#[$attr:meta])*
     enum $enum_name:ident {
         $($enum_variant_name:ident($ty:ty)),+
         $(,)*
     }
    ) => (
        $(#[$attr])*
        enum $enum_name { $($enum_variant_name($ty)),+ }
        $(impl From<$ty> for $enum_name {
            fn from (ty: $ty) -> Self {
                $enum_name::$enum_variant_name(ty)
            }
        })+
    );
}

Too much? Not really, but here is the first gotcha! The keyword pub makes the enum accessible from external modules. This keyword is important, but we can't make it optional without duplicating code, which is why the macro has two nearly identical arms.

Looking closer, we accept zero or more attributes on the enum itself. For variants, on the other hand, it gets messy: the public arm requires at least one attribute per variant, while the private arm doesn't accept attributes on variants at all. These attributes include documentation.

After the macro has reassembled the enum and implemented the appropriate From traits, the consumer may implement further traits, such as Display, by unwrapping each variant.

impl fmt::Display for Error {
  fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
    match *self {
      Error::Io(ref io_error) => fmt::Display::fmt(io_error, f),
      Error::Local(ref local_error) => fmt::Debug::fmt(local_error, f),
    }
  }
}

FAQ

Q: Why not just use the error-trait crate?

Wrapping enums is less complicated and has more use cases than just handling errors. In a personal project I wrap multiple mio::Evented types into an enum that also implements mio::Evented. Here the wrapping simplifies working with multiple transports (TCP, Unix domain sockets and UDP).
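A minimal sketch of that non-error use case follows. The Transport trait and the Tcp and Udp types are hypothetical stand-ins (they are not mio's API), and the From impls are written by hand so the block stays self-contained.

```rust
// Hypothetical stand-in for a trait such as mio::Evented.
pub trait Transport {
    fn name(&self) -> &'static str;
}

pub struct Tcp;
pub struct Udp;

impl Transport for Tcp {
    fn name(&self) -> &'static str {
        "tcp"
    }
}

impl Transport for Udp {
    fn name(&self) -> &'static str {
        "udp"
    }
}

// The wrapping enum: one variant per concrete transport.
pub enum AnyTransport {
    Tcp(Tcp),
    Udp(Udp),
}

impl From<Tcp> for AnyTransport {
    fn from(ty: Tcp) -> Self {
        AnyTransport::Tcp(ty)
    }
}

impl From<Udp> for AnyTransport {
    fn from(ty: Udp) -> Self {
        AnyTransport::Udp(ty)
    }
}

// The wrapper implements the same trait by delegating to the inner value.
impl Transport for AnyTransport {
    fn name(&self) -> &'static str {
        match *self {
            AnyTransport::Tcp(ref t) => t.name(),
            AnyTransport::Udp(ref u) => u.name(),
        }
    }
}

// Callers can now treat a mixed collection uniformly.
pub fn names(transports: &[AnyTransport]) -> Vec<&'static str> {
    transports.iter().map(|t| t.name()).collect()
}
```

The enum gives static dispatch over a closed set of types, which is the trade-off against storing boxed trait objects.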

Gotcha list

  • pub keyword can't be made optional without duplicating code;
  • Macro expansion fails when more than one non-terminal (NT) path could match, without checking whether a path is even valid.
    • Documentation is required for public enums; (defined as 1 or more; breaks with more, see multi-line issue)
    • Documentation is forbidden for private enums; (not defined as we can't make it optional)
    • Multi-line docs must use #[doc = ""] instead of ///; (Each /// line expands to a #[doc] per line)
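For readers on newer toolchains, several of these gotchas may no longer apply. The following is a sketch, not the crate's implementation, and it assumes a compiler that supports the vis fragment specifier and the $(...)? repetition operator. The name wrapped_enum_sketch and the example enums are hypothetical.

```rust
use std::io;

// A single arm: `$vis:vis` matches `pub`, `pub(crate)` or nothing at all,
// and `*` accepts zero or more attributes (including `///` docs) per variant.
macro_rules! wrapped_enum_sketch {
    (
        $(#[$attr:meta])*
        $vis:vis enum $name:ident {
            $(
                $(#[$variant_attr:meta])*
                $variant:ident($ty:ty)
            ),+ $(,)?
        }
    ) => {
        $(#[$attr])*
        $vis enum $name {
            $(
                $(#[$variant_attr])*
                $variant($ty),
            )+
        }
        $(
            impl From<$ty> for $name {
                fn from(inner: $ty) -> Self {
                    $name::$variant(inner)
                }
            }
        )+
    };
}

wrapped_enum_sketch! {
    /// A public enum with
    /// multi-line documentation.
    #[derive(Debug)]
    pub enum PublicError {
        /// Each `///` line becomes its own #[doc] attribute,
        /// which the `*` repetition accepts.
        Io(io::Error),
        Parse(std::num::ParseIntError),
    }
}

wrapped_enum_sketch! {
    #[derive(Debug)]
    enum PrivateValue {
        Number(i32),
        /// Documentation on a private variant is optional here.
        Text(String),
    }
}
```

Whether this removes every pitfall listed above depends on the compiler version in use, so treat it as a starting point rather than a drop-in replacement.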

Improvement using syntax extensions

This macro could be replaced with a syntax extension that works with Rust's normal #[derive(X)] attributes, so the consumer gets the wrapping without the pitfalls caused by the limitations of the macro system.

#[derive(Debug, from_variants)]
pub enum Error {
    Io(io::Error),
    Local(ErrorsFromMyModule),
}

Where do I get the crate?

On crates.io!

Scoped-log

I created this crate purely for Raft to solve a particular issue: including relevant information in debugging output. In Raft's case, this includes the ServerID for connection-related logging.

The magic behind this crate is a thread_local! containing a RefCell<Vec<String>> to maintain state that is unique per thread. The push_log_scope! macro pushes the new message onto the end of this Vec<String> and binds a macro-hygienic variable in the scope in which the macro was called. When the variable reaches the end of the code block, its destructor runs.

thread_local!(pub static __LOG_SCOPES: RefCell<Vec<String>> = RefCell::new(Vec::new()));
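To illustrate what "unique per thread" means here, the following sketch uses only the standard library. The names LOG_SCOPES, push, depth and depths_across_threads are hypothetical and are not part of the crate.

```rust
use std::cell::RefCell;
use std::thread;

thread_local!(static LOG_SCOPES: RefCell<Vec<String>> = RefCell::new(Vec::new()));

/// Push a scope onto the calling thread's stack.
pub fn push(scope: &str) {
    LOG_SCOPES.with(|scopes| scopes.borrow_mut().push(scope.to_string()));
}

/// Number of scopes currently on the calling thread's stack.
pub fn depth() -> usize {
    LOG_SCOPES.with(|scopes| scopes.borrow().len())
}

/// Pushes one scope on the calling thread, then reports the depth seen by
/// a freshly spawned thread and the depth seen by the calling thread.
pub fn depths_across_threads() -> (usize, usize) {
    push("server 1");
    // The new thread gets its own, freshly initialised Vec.
    let child = thread::spawn(depth).join().unwrap();
    (child, depth())
}
```

A spawned thread starts with an empty stack, so scopes pushed on one thread never leak into log lines produced by another.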

In the case of hoverbear/raft we used to handle logs with the log crate directly. This crate became a joint effort between myself and Dan Burkert - another contributor of hoverbear/raft.

The following snippet is simplified to maintain relevant logic.

if let (duration, timeout, handle) = connection.reset_peer(event_loop, token) {
  info!("{:?}: reset, will attempt to reconnect in {}ms", self, duration);
  // Do things with the timeout and handle.
}

Here the debugging information that we want to include is self. This required either lending self to the function reset_peer, returning the duration up the stack, or doing what I did: defining scoped metadata that remains accessible within the thread.

Now the same relevant code has been simplified to:

// In server.rs
push_log_scope!("{:?}", self);
if let (timeout, handle) = connection.reset_peer(event_loop, token) {
  // do things
}

// In connection.rs
scoped_info!("reset, will attempt to reconnect in {}ms", duration);

Returning to the scope handling. Scope is defined as follows:

pub struct Scope;

impl Drop for Scope {
  fn drop(&mut self) {
    if !cfg!(log_level = "off") {
      __LOG_SCOPES.with(|f| f.borrow_mut().pop());
    }
  }
}

The push_log_scope! macro ends with a let __logger_scoped__message = Scope, which then lives until the end of the code block that the push_log_scope! macro was called from. Example:

push_log_scope!("1");
scoped_info!("alice");
{
  push_log_scope!("2");
  scoped_info!("bob");
}
scoped_info!("carol");

And the output is:

INFO:example:main: 1: alice
INFO:example:main: 1: 2: bob
INFO:example:main: 1: carol

The scopes carry over into called functions because the state lives in the thread_local! RefCell<Vec<String>> rather than in any single stack frame.
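The whole mechanism can be sketched in a few lines without the log crate. This is an approximation under stated assumptions, not the crate's code: SCOPES, push_scope_sketch!, scoped, child and demo are hypothetical names, and lines are collected into a Vec instead of being logged.

```rust
use std::cell::RefCell;

thread_local!(static SCOPES: RefCell<Vec<String>> = RefCell::new(Vec::new()));

/// Guard whose destructor pops the most recently pushed scope.
pub struct Scope;

impl Drop for Scope {
    fn drop(&mut self) {
        SCOPES.with(|s| {
            s.borrow_mut().pop();
        });
    }
}

/// Pushes a formatted scope, then binds a guard in the caller's block.
/// The binding is hygienic, so repeated calls in one block do not clash.
macro_rules! push_scope_sketch {
    ($($arg:tt)+) => {
        SCOPES.with(|s| s.borrow_mut().push(format!($($arg)+)));
        let _guard = Scope;
    };
}

/// Prefixes a message with every scope active on this thread.
pub fn scoped(message: &str) -> String {
    SCOPES.with(|s| {
        let mut line = String::new();
        for scope in s.borrow().iter() {
            line.push_str(scope);
            line.push_str(": ");
        }
        line.push_str(message);
        line
    })
}

/// A separate function: it sees the scopes pushed by its caller.
fn child() -> String {
    scoped("bob")
}

/// Mirrors the alice/bob/carol example above.
pub fn demo() -> Vec<String> {
    let mut lines = Vec::new();
    push_scope_sketch!("1");
    lines.push(scoped("alice"));
    {
        push_scope_sketch!("{}", 2);
        lines.push(child());
    } // the inner guard is dropped here, popping "2"
    lines.push(scoped("carol"));
    lines
}
```

Calling demo() yields the same three prefixes as the output above, minus the log crate's own "INFO:example:main:" header.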

Improvement using syntax extensions

The most tedious task when using scoped logs is deciding where to start our scopes. This requires manually walking the control path and being careful not to duplicate scoped information when a function is called both from an external module and from inside the type itself. A syntax extension could help in two ways:

  • Lint checks ensuring that the scopes never repeat in the same control flow and that they do not outlive the object they document.

  • Attributes can handle the decision making for where to insert the push_log_scope! macro.

For the latest status of a syntax extension alternative, see rs-scoped-log/issue#4.

Where do I get the crate?

On crates.io!