Samuel Tardieu @ rfc1149.net

Defining new control structures using C macros

While procrastinating today, I stumbled upon a blog post describing how to start some background work from C code in a convenient way. The solution the author came up with allows a developer to write code such as:

int baz(void)
{
  do_something();

  start_deferred_work();
    do_something_else();
  end_deferred_work();

  do_something_different();

  return some_value;
}

The implementation uses two macros, start_deferred_work to fork the current process and end_deferred_work to terminate the child process using exit(0).

I do not intend to discuss the validity or the efficiency of this method. I am more interested in how the author could have introduced a new control structure instead of those two macros. The main reason I do not like them is that they risk leaving you with an open scope if you forget the end_deferred_work macro, and they do not play nicely with automatic indentation, since the implicitly created scope is not visible in the source.
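To make the open-scope concern concrete, here is a guess at how such a pair of macros could be written. The definitions below are my own reconstruction, not the original author's code, and deferred_demo is a hypothetical helper that only exists to exercise them:

```c
#include <stdlib.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical reconstruction: the first macro opens a scope... */
#define start_deferred_work() if (fork() == 0) {
/* ...and the second one closes it after terminating the child. */
#define end_deferred_work()   exit(0); }

/* Runs an empty piece of deferred work and returns the exit status
   of the child process, or -1 if no child could be waited for. */
int deferred_demo(void)
{
  int status = -1;

  start_deferred_work();
    /* Deferred work would go here; it runs in the child only. */
  end_deferred_work();

  /* Only the parent reaches this point. */
  if (wait(&status) < 0)
    return -1;
  return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

With definitions of this kind, forgetting end_deferred_work() leaves the opening brace unmatched, and the error is reported far from the place where the mistake was made.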

Many languages allow you to introduce new control structures through delayed evaluation blocks (e.g., Scala, Factor, Smalltalk, Ruby) or some form of macro expansion (e.g., Lisp-like languages). This is not directly possible in C, as the C preprocessor only allows basic text manipulations. However, those text manipulations can sometimes be sufficient to mimic new control structures.

Let us rewrite those two macros as a pseudo control structure named detach. After forking, the child must perform two actions in order: execute the user-supplied code, then exit. We will use a for loop as a way to write those two actions A() and B() in reverse order: if B() never returns (for instance because it terminates the current process), then A(); B(); may be written as for (;; B()) A();. The loop has no condition, so it runs A() once and then evaluates B(), which ends the process before a second iteration can start. Note that the value of B() is discarded, so a B() that merely evaluates to false would not stop the loop. Since exit(0) makes the current process quit, we can use this for construct to insert a call to exit(0) after the user-supplied code:

#include <stdlib.h>
#include <unistd.h>

#define detach if (fork()); else for (;; exit(0))

After this definition, we can use detach as if it were a built-in C control structure comparable to for or while:

/* Example 1 */
detach
  do_something();

/* Example 2 */
detach {
  do_something();
  do_something_else();
}

/* Example 3 */
detach
  for (int i = 0; i < 2; i++)
    do_something(i);

/* Example 4 */
if (execute_in_background)
  detach do_something();  /* Execute in the background */
else
  do_something();         /* Execute in the foreground */
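To check that the body of detach runs exactly once in the child while the parent carries on, one possible sketch is to let the child report through a pipe. The helper name count_detached_runs and the use of a pipe are my own additions for the purpose of observation:

```c
#include <stdlib.h>
#include <sys/wait.h>
#include <unistd.h>

#define detach if (fork()); else for (;; exit(0))

/* Hypothetical helper: returns how many times the detached body ran,
   measured as the number of bytes the child wrote to a pipe.
   Returns -1 if the pipe could not be created. */
int count_detached_runs(void)
{
  int fds[2];
  char c;
  int runs = 0;

  if (pipe(fds) != 0)
    return -1;

  detach {
    /* Body: runs in the child only. */
    if (write(fds[1], "A", 1) != 1)
      _exit(1);
  }

  /* Only the parent gets here; the child left through exit(0). */
  close(fds[1]);
  while (read(fds[0], &c, 1) == 1)
    runs++;                 /* one byte per execution of the body */
  close(fds[0]);
  wait(NULL);               /* reap the child */
  return runs;
}
```

The read loop ends when the child exits and its copy of the write end is closed, so the count reflects everything the child did before exit(0).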

The astute reader will wonder why the macro definition uses a surprising if (fork()); else instead of if (!fork()), which might at first appear to be equivalent. The reason is that the two constructs can behave differently: in C, an else is paired with the closest preceding if in the same scope that does not already have an else clause. This allows the compiler to parse code such as

if (condition)
  if (other_condition) do_something(); else do_something_else();

without ambiguity. Here, we want users of our macro to be able to use it inside a similar if statement, as in example 4. If we used the incorrect if (!fork()) form, a variant of example 4 that calls do_something_else() in its else branch would expand as follows (indented to show how the compiler understands it):

if (execute_in_background)
  if (!fork())
    for (;; exit(0))
      do_something();
  else
    do_something_else();

According to the rule described earlier, the else do_something_else(); part is matched with the if (!fork()) on line 2 instead of the if (execute_in_background) on line 1. If execute_in_background is true, do_something() and do_something_else() are both executed (the former in the child process, the latter in the parent process), and if execute_in_background is false, neither function is called.

Using our proposed macro, the expansion will give (once again indented as the compiler understands it)

if (execute_in_background)
  if (fork());
  else
    for (;; exit(0))
      do_something();
else
  do_something_else();

which correctly matches the last else with the right if.
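The difference between the two forms can also be observed at run time. The following sketch defines both variants side by side; detach_bad, in_background, in_foreground and the two wrapper functions are hypothetical names introduced only for this comparison:

```c
#include <stdlib.h>
#include <sys/wait.h>
#include <unistd.h>

#define detach     if (fork()); else for (;; exit(0))
#define detach_bad if (!fork()) for (;; exit(0))   /* captures a following else */

static int foreground_calls;

static void in_background(void) {}
static void in_foreground(void) { foreground_calls++; }

/* Returns how many times in_foreground() ran in the calling process. */
int foreground_calls_with_good_macro(int execute_in_background)
{
  foreground_calls = 0;
  if (execute_in_background)
    detach in_background();
  else
    in_foreground();
  while (wait(NULL) > 0) {}   /* reap the child, if any */
  return foreground_calls;
}

/* Same code with the naive macro: the else binds to if (!fork()). */
int foreground_calls_with_bad_macro(int execute_in_background)
{
  foreground_calls = 0;
  if (execute_in_background)
    detach_bad in_background();
  else
    in_foreground();
  while (wait(NULL) > 0) {}   /* reap the child, if any */
  return foreground_calls;
}
```

With the naive variant, passing 0 calls nothing at all, and passing 1 runs in_foreground() in the parent in addition to in_background() in the child, which is exactly the behavior described above.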

For the same reason, we can properly nest calls to our macro:

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

#define detach if (fork()); else for (;; exit(0))

int main(void)
{
  printf("In parent (pid = %d)\n", getpid());
  detach {
    printf("In child (pid = %d, ppid = %d)\n", getpid(), getppid());
    detach printf("In grand child (pid = %d, ppid = %d)\n", getpid(), getppid());
    printf("Still in child here (pid = %d)\n", getpid());
  }
  printf("Still in parent (pid = %d)\n", getpid());
}

On my system, this program prints:

In parent (pid = 31402)
Still in parent (pid = 31402)
In child (pid = 31403, ppid = 31402)
Still in child here (pid = 31403)
In grand child (pid = 31404, ppid = 31403)

So the C macro system, although very limited, lets you define new control structures to use in your code. In C++, the Boost library makes heavy use of macros to define, for example, new iteration constructs such as BOOST_FOREACH.
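As a much more modest illustration in plain C, a macro can wrap a for header to give an array-iteration construct. This is only a sketch of the idea and has nothing to do with how Boost implements its macros; ARRAY_LEN, foreach_index and sum_demo are names I made up for the example, and the loop declaration requires C99:

```c
#include <stddef.h>

/* Number of elements in a true array (not a pointer). */
#define ARRAY_LEN(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical control structure: iterate i over the indices of array. */
#define foreach_index(i, array) \
  for (size_t i = 0; i < ARRAY_LEN(array); i++)

/* Sums a small array using the new construct. */
int sum_demo(void)
{
  int values[] = {1, 2, 3, 4};
  int sum = 0;

  foreach_index(i, values)
    sum += values[i];

  return sum;
}
```

Like detach, foreach_index accepts either a single statement or a braced block, because the macro ends where a statement is expected.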

Note that, once again, I am not discussing here the appropriateness of spawning a child process to execute background code, but only a better way of implementing the original proposal.

Aren’t C macros cool?

Edit: the initial version of this article used a more complicated macro which required the compiler to respect the C99 standard. A commenter pointed out a simpler form, which is the form now used here.
