EM - Event Manager (version 1.2)

EM is a distributed event management system which allows programs to synchronize their operation across thread, process, and machine boundaries.  The basic concept is that one or more threads/processes may wait for a named event to happen, and another thread/process may trigger that event, which wakes up all the waiters.  The implementation is TCP-based client/server.

License:

CC0
To the extent possible under law, the contributors to this project have waived all copyright and related or neighboring rights to this work. This work is published from: United States.  The project home is http://geeky-boy.com/em.  To contact me, Steve Ford, project owner, you can find my email address at http://geeky-boy.com.  Can't see it?  Keep looking.


  1. EM - Event Manager (version 1.2)
    1. Introduction
    2. Quick Start
    3. Security
    4. Memory of Events
      1. Improvements
    5. Server: emd
    6. Command Line Client Tools
      1. emwait
      2. emtrigger
      3. emquit
      4. EmTail
    7. C API Client Interface
      1. Error Handling
      2. em_create()
      3. em_wait()
      4. em_trigger()
      5. em_quit()
      6. em_delete()
    8. Java API Client Interface
    9. C# API Client Interface
    10. Protocol
      1. Message specification
      2. EM Protocol 1.1
        1. Hello
        2. Flags
        3. Trigger
        4. Quit
    11. Development Tips
      1. Package Version Number
      2. Protocol Version Number
      3. Building
    12. To Do
    13. Release Notes
      1. 1.2 (15-Oct-2012)
      2. 1.1 (1-Oct-2012)
      3. 1.0 (26-Sep-2012)


Introduction

The event manager system is available as a set of C-based command-line tools for Unix shell, as a set of Java-based command-line tools, as a Unix C API, as a Java API, and as a C# (.NET) API.  The Java tools and API should work on any OS that supports Java, including Windows.  Be aware that the event manager server (emd) is currently only available as a C-based command-line tool for Unix and Cygwin (Windows).

You can find event manager documentation at:

This html file was created with SeaMonkey, a free HTML editor.

Quick Start

These instructions assume that you are running on Unix.  I have tried it on Linux, Solaris, FreeBSD, HP-UX, AIX, and Cygwin (on Windows 7).

  1. Download the "em_1.2.tar" file.
  2. Unpack the tar file:
        tar xvf em_1.2.tar
        cd em
  3. Build the package:
        ./bld.sh
        ./tst.sh
  4. Run the server:
        ./emd 12000 test_emd
  5. In a different shell, run the emwait client (the "-1" tells emwait to wait forever):
        ./emwait localhost 12000 test_event -1
  6. In a different shell, run the emtrigger tool:
        ./emtrigger localhost 12000 test_event
    This will cause the emwait command to return.
  7. Run the emquit tool:
        ./emquit localhost 12000 test_emd

Security

The event manager protocol is NOT secure.  If you design a system using EM, somebody else can connect to your server and trigger events at will, interfering with your system's proper behavior.  Do not run an event manager server that is accessible from the Internet.

One fairly simple way to make the protocol more secure is to model the protocol after RADIUS, which uses shared secrets and hashing.  This does not encrypt the communication - somebody would still be able to snoop - but it prevents clients without the shared secret from accessing the EM server.  This approach eliminates the need to have usernames and passwords.

Memory of Events

It is easy to model EM after condition variables with broadcast signalling.  However, there is a subtle but important way that the current implementation of EM might behave differently than expected.  Consider the following sequence (assuming both processes are using the API and have already connected to the server):
    1. process 1: trigger X
    2. process 1: trigger Y
    3. process 2: wait Y  (wakes up immediately)
    4. process 2: wait X  (sleeps indefinitely)

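This happens because the wait call in step 3 reads the pending triggers in the order they arrived and discards each one until it finds the event it is waiting for.  I.e. the trigger X from step 1 is consumed and discarded in step 3, so nothing is left to satisfy the wait in step 4.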

Contrast this with:
    1. process 1: trigger X
    2. process 1: trigger Y
    3. process 2: wait X  (wakes up immediately)
    4. process 2: wait Y  (wakes up immediately)

In this scenario, step 3 consumes trigger X and returns, leaving trigger Y to be consumed in step 4.  Thus, the memory of events can only be relied upon given an understanding of how events are distributed and consumed.
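
The two sequences above can be reproduced with a small stand-alone model.  This is a sketch of the behavior described in this section, not EM's actual code: pending_t, pending_push() and pending_wait() are hypothetical names, and the fixed-size array merely stands in for however the real API buffers trigger notifications received on its connection.

```c
#include <string.h>

#define MAX_PENDING 16
#define MAX_NAME 64

/* Hypothetical model of the trigger notifications buffered on one connection. */
typedef struct {
    char names[MAX_PENDING][MAX_NAME];
    int head;   /* index of the oldest unread notification */
    int count;  /* number of unread notifications */
} pending_t;

/* Record a trigger notification.  Returns 0 on success, -1 if the buffer
 * is full or the name is too long. */
int pending_push(pending_t *p, const char *ev_name)
{
    if (p->count == MAX_PENDING || strlen(ev_name) >= MAX_NAME) return -1;
    strcpy(p->names[(p->head + p->count) % MAX_PENDING], ev_name);
    p->count++;
    return 0;
}

/* Model of a wait: read notifications oldest-first, discarding each one,
 * until ev_name is found.  Returns 1 if it was found (the wait wakes up
 * immediately) or 0 if the buffer ran dry (a real wait would now sleep). */
int pending_wait(pending_t *p, const char *ev_name)
{
    while (p->count > 0) {
        const char *name = p->names[p->head];
        p->head = (p->head + 1) % MAX_PENDING;
        p->count--;
        if (strcmp(name, ev_name) == 0) return 1;
    }
    return 0;
}
```

Starting from a zeroed pending_t, pushing X and then Y and waiting for Y and then X gives 1 followed by 0 (the second wait would sleep).  Waiting for X and then Y gives 1 both times.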

The issue of event memory becomes even more important with the command-line tools, which behave differently than the APIs.  Due to the lack of a persistent TCP connection between invocations, the command-line tools have no memory at all of past event triggers.  I.e. if an event is triggered, and then emwait is invoked, the emwait will not wake up.  It will only wake up for events that are triggered while the emwait is waiting.

Users of the command-line tools must take care to avoid race conditions between emwait and emtrigger.

Improvements

Given the non-intuitive and sometimes awkward behavior of EM event memory, it would be nice to add more-formal operations which better-model operating system synchronization primitives, like counting semaphores, mutexes, and condition variables.  These objects would be stateful across connects and disconnects, allowing command-line tools to behave the same as the APIs.  Implementing them would involve much more communication between the clients and the server - e.g. a "wait" operation would need to explicitly tell the server that the client is waiting, and which object it is waiting for.  The server would need to maintain queues of waiting clients for each object and implement the correct semantics.  (For example, a mutex should be automatically released if the client dies.  This is easy to detect with the API by using a TCP disconnect, but harder to detect with command-line tools where each function disconnects upon completion.)

Server: emd

The emd program is a Unix server which runs as a daemon.  Multiple instances of it can be run within a network, and even on the same host (differentiated by the TCP listen port).  Each instance of emd effectively defines an event name space and a synchronization space.  So a client connected to one emd instance can trigger an event, causing all waiters for that event to wake up, but only those waiters which are connected to the same emd instance.  (Note: using the API, a program could be written to connect to multiple servers so as to coordinate activities between two spaces.)

Usage:
    emd [-q | -Q] port server_name

where:
    -q - quiet: do not print informational messages.  Messages are normally printed to stdout.
    -Q - very quiet: do not print informational messages or error messages.  Messages (informational and error) are normally printed to stdout.
    port - TCP listen port, number between 1 and 65535
    server_name - string consisting of characters A-Z, a-z, 0-9, and "_" (underscore).  When the emquit command is used to terminate an instance of emd, the server name must match.

Command Line Client Tools

There are three event manager client command-line tools available, corresponding to the primary API functions.  The C-based tools for Unix are:
    emwait - wait for an event to happen.
    emtrigger - trigger an event (wake up any waiters).
    emquit - terminate an instance of emd.

The Java-based tools are:
    EmWait
    EmTrigger
    EmQuit

The C#-based tools are:
    EmWait
    EmTrigger
    EmQuit

In addition, there is another useful tool available in Java only:
    EmTail
This program monitors a log file and generates events when log messages matching regular expression patterns are seen.  Think of the tool as being conceptually-similar to the Unix command sequence, "tail -f file | grep patterns | EmTrigger".

Note that using the Java-based command-line tools in a script can lead to differences in timing.  While the Java language is fast once the JVM gets started and the initialization is over, that start-up cost is considerable, especially if multiple commands are being executed in parallel on the same machine.  For example, the "tst_java.sh" script is basically the same as the "tst.sh" script for C, except that all the EmWait times had to be multiplied by 20.  I tried a multiplier of 10, and it still didn't do the trick.  When writing scripts, I suggest using the C-based tools whenever possible.  (Note: the Java-based API should perform just fine.  A Java-based application would only suffer the start-up cost once, with individual calls to the API performing well.)

These commands are normally used within a set of cooperating shell scripts running in parallel to coordinate their activities.

emwait

Usage:
    emwait hostname port ev_name ms_timeout        (C-based)
or:
    java -cp em.jar EmWait hostname port ev_name ms_timeout        (Java-based)
or:
    EmWait hostname port ev_name ms_timeout        (C#-based)

Where:
    hostname - DNS name of machine running desired emd instance.
    port - TCP listen port for desired emd instance.
    ev_name - string consisting of characters A-Z, a-z, 0-9, and "_" (underscore).
    ms_timeout - number of milliseconds to wait for event before timing out.  A value of -1 means "wait forever".

On timeout, emwait prints the word "timeout" to standard out and exits with a status of 2.

emtrigger

Usage:
    emtrigger hostname port ev_name        (C-based)
or:
    java -cp em.jar EmTrigger hostname port ev_name        (Java-based)
or:
    EmTrigger hostname port ev_name        (C#-based)

Where:
    hostname - DNS name of machine running desired emd instance.
    port - TCP listen port for desired emd instance.
    ev_name - string consisting of characters A-Z, a-z, 0-9, and "_" (underscore).

emquit

Usage:
    emquit hostname port serv_name        (C-based)
or:
    java -cp em.jar EmQuit hostname port serv_name        (Java-based)
or:
    EmQuit hostname port serv_name        (C#-based)

Where:
    hostname - DNS name of machine running desired emd instance.
    port - TCP listen port for desired emd instance.
    serv_name - string consisting of characters A-Z, a-z, 0-9, and "_" (underscore).  Must match the serv_name provided when the desired emd instance was started.

EmTail

Usage:
    java -cp em.jar EmTail hostname port log_file cfg_file [quit_event]        (Java-based)

Where:
    hostname - DNS name of machine running desired emd instance.
    port - TCP listen port for desired emd instance.
    log_file - input file.  Normally a continuously-growing text file written by a different running program.
    cfg_file - configuration file.  File containing event names and regular expression patterns.
    quit_event - optional parameter specifying an event name which EmTail should monitor, and exit if triggered.  If not supplied, no events are monitored by EmTail.

The EmTail tool first reads the cfg_file which defines a list of event names and regular expression patterns.  Then the tool reads the log file, comparing each line to the list of patterns, and triggering events corresponding to the patterns which match.  The tool reads the log file similarly to how "tail -f" does - it reads till end of file, and then continues to monitor the file for file growth, reading any new lines as they become available.

The purpose of EmTail is to allow events to be triggered by programs which were not directly-integrated with EM.  So long as that program writes a log file with log lines that indicate internal events, those events can be fed into EM.

The configuration file is in the form of a Java properties file.  For example consider the file "eg.cfg":

# config file for EmTail.  This is a comment line.
connect=connection to .* successful
failed=there were [1-9]\\d* failures
winfile=c:\\\\

Then consider the command-line:
    java -cp em.jar EmTail localhost 12000 eg.log eg.cfg
This assumes that an emd is running at port 12000.

For each line in "eg.log" which matches the regular expression pattern "connection to .* successful", EmTail will trigger the event named "connect".  Note the two back-slashes in "there were [1-9]\\d* failures".  The idea is that we want to match a non-zero number, so we first match a non-zero digit ("[1-9]"), and then any number of any digits ("\d*").  But backslash is a special escape character for the Java properties file parser, so you need two of them to get "\d*" into the matching engine.  Similarly, you see 4 back-slashes in "winfile=c:\\\\".  This passes "c:\\" to the matching engine, which tells it to match "c:\".

Be aware that the patterns are allowed to match sub-strings in the log file.  I.e. if the above "eg.log" file contains "file c:\abc\xyz.txt is invalid", it will trigger the event named "winfile".

Finally, unlike "tail -f", the EmTail tool always processes the entire existing contents of the file before monitoring it for growth.

If you are unfamiliar with regular expressions, see http://www.regular-expressions.info/quickstart.html for a short tutorial.

C API Client Interface

There are five C API functions available:
    em_create() - create an event manager object and associate it with an emd instance.
    em_wait() - wait for an event to happen.
    em_trigger() - trigger an event (wake up any waiters).
    em_quit() - terminate an instance of emd.
    em_delete() - delete an event manager object.

When an event manager object em is created, it establishes a TCP connection to the desired instance of emd and indicates if it should receive event notifications.  If the em object will only be used to trigger events, then the application should indicate that it should not receive event notifications.  But if the application will use the em object to wait for events, then it must indicate that it should receive event notifications.  In this case, the em object can be used for both triggering and waiting for events.

Note that because the em object holds a persistent TCP connection to emd, the order of trigger / wait is not important.  If an event is triggered before em_wait() is called, the trigger will be remembered, and the wait will return immediately.  This is different from the command line interface.

Error Handling

Each of the API functions has as its final parameter a pointer to a variable of type "char *".  The API function will set this variable to NULL if there is no error, or will set it to point at a constant C string if there is an error.  For example:

    char *errstr;
    em_trigger(em, "event_name", &errstr);
    if (errstr != NULL) printf("%s\n", errstr);

The API also has the following convention regarding the error strings: if the error string starts with upper-case 'E' then additional detail can be obtained via the errno family: errno, perror(), strerror(), or strerror_r().  Furthermore, be aware that any error string starting with 'E' actually starts with the 2-character sequence "E:", which can be skipped if desired when printing.  For example:

    char *errstr;
    em_trigger(em, "event_name", &errstr);
    if (errstr != NULL) {
        if (errstr[0] == 'E')
            perror(&errstr[2]);    /* skip the "E:" */
        else
            fprintf(stderr, "%s\n", errstr);
    }

em_create()

Synopsis:
    #include "em_api.h"

    em_t *em_create(char *hostname, int port, int em_flags, char **errstr);

The em_create() function establishes a TCP connection to an instance of emd and indicates if it should receive event notifications.  When the event manager object is no longer needed, em_delete() should be called.  The function takes the following arguments:

    hostname - pointer to C string containing DNS name of machine running desired emd instance.
    port - integer TCP listen port for desired emd instance (in host order).
    em_flags - integer bit map.  0 or bit-wise OR of optional flags.  Current flags are:
        EM_FLAGS_RCV_EVS - (value=1) indicate that event notifications should be received.
    errstr - caller should pass a pointer to a variable of type "char *".  See error handling for details.

Return Value:
    event manager object.

em_wait()

Synopsis:
    #include "em_api.h"

    int em_wait(em_t *em, char *ev_name, unsigned int ms_timeout, char **errstr);

The em_wait() function waits for the desired event to be triggered, or until a timeout.  The function takes the following arguments:

    em - event manager object.  See em_create().
    ev_name - name of event, C string consisting of characters A-Z, a-z, 0-9, and "_" (underscore).
    ms_timeout - integer number of milliseconds to wait for event.  Special values:
        -1 - wait forever.
        0 - wait minimum amount of time (0 for C API; i.e. immediate return).
    errstr - caller should pass a pointer to a variable of type "char *".  See error handling for details.

Return Value:
    0 - event happened.
    -1 - timeout happened.

em_trigger()

Synopsis:
    #include "em_api.h"

    int em_trigger(em_t *em, char *ev_name, char **errstr);

The em_trigger() function triggers the desired event, waking up all processes which are waiting for it.  The function takes the following arguments:

    em - event manager object.  See em_create().
    ev_name - name of event, C string consisting of characters A-Z, a-z, 0-9, and "_" (underscore).
    errstr - caller should pass a pointer to a variable of type "char *".  See error handling for details.

Return Value:
    none.

em_quit()

Synopsis:
    #include "em_api.h"

    int em_quit(em_t *em, char *serv_name, char **errstr);

The em_quit() function tells the connected emd instance to exit.  The function takes the following arguments:

    em - event manager object.  See em_create().
    serv_name - name of emd instance, C string consisting of characters A-Z, a-z, 0-9, and "_" (underscore).  Must match the serv_name provided when the emd instance was started.
    errstr - caller should pass a pointer to a variable of type "char *".  See error handling for details.

Return Value:
    none.

em_delete()

Synopsis:
    #include "em_api.h"

    int em_delete(em_t *em, char **errstr);

The em_delete() function disconnects from the emd server and deletes the object.  The function takes the following arguments:

    em - event manager object.  See em_create().
    errstr - caller should pass a pointer to a variable of type "char *".  See error handling for details.

Return Value:
    none.

Java API Client Interface

See javadoc documentation.

C# API Client Interface

The C# API is essentially identical to the Java API.  See javadoc documentation.  Yes, I know that I could use SandCastle and do it right.  Sometimes I suck.

A few deviations from standard .NET convention:

Protocol

The message protocol for EM was developed with the following requirements in mind:
    - Protocol must be pure ASCII.  This makes it easier to test and debug (e.g. using telnet as a client).
    - Protocol must allow for future growth, and support interoperability between versions.
    - Protocol must be easy to implement.
    - Basic message format must be reusable for other applications (i.e. flexible and not specific to EM).

Note that "high performance" and "efficiency" are not among the requirements.  It is not expected that applications will be using EM for handling many thousands of events per second.  For that kind of performance, you might want to consider using a high-performance messaging middleware, like Informatica's Ultra Messaging product.  (Full disclosure: I work for Informatica.)

Many programmers will conclude that this protocol is over-engineered for the simple application of EM, and I would agree with them.  But I wanted a simple and general protocol in my personal toolbox.  Also, regarding the lack of performance requirements, I did put in some effort to make the server pretty efficient.  For example, it does non-blocking reads with a large buffer, and can pull many smaller messages out of that buffer.

Also note that a particular application system does not need to implement all aspects of this protocol.  For example, the Event Manager system does not implement hex encoding of disallowed characters.

Message specification

message

So, a message is a cmdBlock, followed by zero or more optionBlocks, followed by a lineEnd:
    message := cmdBlock [ ';' optionBlock ... ] lineEnd

A lineEnd can either be a single LF "\n", or a single CR "\r", or a CR/LF pair "\r\n".  Any implementation must be able to handle any lineEnd sequence dynamically.

A semi-colon character is used to separate the cmdBlock from the first optionBlock, and to separate multiple optionBlocks.  The maximum size of a message is 4096 characters (including lineEnd).

A particular command type defines the valid options.  There is no particular limit to the number of optionBlocks in a message, subject to the maximum message size.
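
One way to accept all three lineEnd forms without look-ahead is to treat CR and LF as interchangeable terminator bytes and to skip any terminator bytes found at the start of the next message.  The sketch below takes that approach.  It is not EM's implementation: em_next_msg() is a hypothetical helper, and silently skipping empty lines is an assumption (an empty line can never be a valid message, since a message always starts with a cmdChar).

```c
#include <stddef.h>

/*
 * Find the next message in buf[0..len).
 * On success, *start and *msg_len describe the message text (without its
 * lineEnd) relative to buf, and the return value is the number of bytes
 * the caller should consume.  Returns 0 if no complete message is
 * buffered yet, in which case the caller should read more data.
 */
size_t em_next_msg(const char *buf, size_t len, size_t *start, size_t *msg_len)
{
    size_t i = 0, s;

    /* Skip leftover terminator bytes, e.g. the LF of a CR/LF pair. */
    while (i < len && (buf[i] == '\r' || buf[i] == '\n')) i++;
    s = i;

    /* Scan the message text up to the next terminator byte. */
    while (i < len && buf[i] != '\r' && buf[i] != '\n') i++;
    if (i == len) return 0;   /* no terminator yet */

    *start = s;
    *msg_len = i - s;
    return i + 1;             /* the text plus one terminator byte */
}
```

A caller would invoke this in a loop over its receive buffer, advancing by the returned count each time.  It should also reject a peer that has buffered 4096 characters without any lineEnd, since that exceeds the maximum message size.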

cmdBlock

A cmdBlock consists of a cmdChar followed by zero or more positional parameters:
    cmdBlock := cmdChar [ parameter [ ',' parameter] ... ]

A cmdChar is one of the 26 ASCII letters, which defines the command type.  The case of the cmdChar indicates whether the command is required (upper-case: must be processed) or optional (lower-case: may be processed).  Note that this protocol specification does not define the meanings of the command types; that is determined by the application.

When multiple parameters are included, a comma character is used to separate them.

A particular command type defines the proper number of parameters.  There is no particular limit to the number of parameters in a cmdBlock, subject to the maximum message size.

optionBlock

An optionBlock consists of an optChar followed by zero or more positional parameters:
    optionBlock := optChar [ parameter [ ',' parameter ] ... ]

An optChar is one of the 26 ASCII letters, which defines the option type.  The case of the optChar indicates whether the option is required (upper-case: must be processed) or optional (lower-case: may be processed).  Note that this protocol specification does not define the meanings of the option types; that is determined by the application.

When multiple parameters are included, a comma character is used to separate them.

A particular option type defines the proper number of parameters.  There is no particular limit to the number of parameters in an optionBlock, subject to the maximum message size.

parameter

A parameter is an arbitrary string of standard printable ASCII characters, not including semi-colon ";", comma ",", and backslash "\".  Note that since CR, LF, null, etc are not printable, they are not allowed in a parameter string.

If the data to be included in a parameter includes disallowed characters, they may be encoded in hex using the C language character literal syntax: \x00 - \xFF.  Thus, if the data is "abc" followed by a backslash character, it should be encoded as "abc\x5C" (without the double quotes).  Hex should be encoded with upper-case A-F.  It is permissible to encode allowed characters in hex form.  Thus "abc\x5C" can also be encoded as "\x61bc\x5C"; both represent the data "abc" followed by a backslash.

The maximum length of a parameter is 255 characters.  For escaped characters, the multiple characters embodying the escaped character must all be counted towards the maximum length.  I.e. "abc\x5C" counts as 7 characters, even though it only represents 4 characters of data.
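
The encoding rules above can be sketched as a small encoder.  Note that EM itself does not implement hex encoding, so this is purely illustrative: em_param_encode() and EM_MAX_PARAM are hypothetical names, and treating the space character as an allowed printable character is an assumption.

```c
#include <stddef.h>

#define EM_MAX_PARAM 255  /* maximum encoded parameter length */

/*
 * Encode data[0..len) as a protocol parameter into out, which has room
 * for out_cap bytes including the terminating NUL.
 * Returns the encoded length, or -1 if the result would exceed
 * EM_MAX_PARAM characters or would not fit in out.
 */
int em_param_encode(const unsigned char *data, size_t len,
                    char *out, size_t out_cap)
{
    static const char hex[] = "0123456789ABCDEF";
    size_t i, o = 0;

    for (i = 0; i < len; i++) {
        unsigned char c = data[i];
        /* Allowed as-is: printable ASCII except ';', ',' and '\'. */
        int plain = c >= 0x20 && c <= 0x7E &&
                    c != ';' && c != ',' && c != '\\';
        size_t need = plain ? 1 : 4;

        /* Escapes count in full towards the maximum length. */
        if (o + need > EM_MAX_PARAM || o + need + 1 > out_cap) return -1;
        if (plain) {
            out[o++] = (char)c;
        } else {
            out[o++] = '\\';
            out[o++] = 'x';
            out[o++] = hex[c >> 4];
            out[o++] = hex[c & 0x0F];
        }
    }
    if (out_cap == 0) return -1;
    out[o] = '\0';
    return (int)o;
}
```

Encoding the four bytes "abc" plus a backslash produces the seven characters abc\x5C, matching the length rule above.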

Example

"H1,1;Atest" followed by newline: command type "H" (upper-case, so it MUST be processed) with two parameters, each set to the value "1", and one option of type "A" (upper-case, so it is a required option that MUST be processed) with a single parameter with the value "test".  The "\n" newline character ends the message.  To make a newline part of the A option's parameter value, the message would be encoded as "H1,1;Atest\x0A" followed by a newline.  Also note that the backslash-hex encoding must actually be in the data stream.  Thus, to write that message as a C string literal, you would enter it as "H1,1;Atest\\x0A\n".  The double-backslash prevents C from interpreting the hex literal, since we want the hex literal to be sent on the wire.

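The message grammar is simple enough to split in place with strchr().  The following is a minimal sketch, not the parser shipped with EM: em_block_t, em_parse_msg() and MAX_PARAMS are hypothetical names, the lineEnd is assumed to have been removed already, and hex escapes are left undecoded.

```c
#include <ctype.h>
#include <string.h>

#define MAX_PARAMS 8

typedef struct {
    char type;                 /* cmdChar or optChar */
    int required;              /* 1 if upper-case (MUST be processed) */
    int num_params;
    char *params[MAX_PARAMS];  /* point into the caller's buffer */
} em_block_t;

/*
 * Split msg in place.  blocks[0] receives the cmdBlock and the remaining
 * entries receive the optionBlocks.  Returns the number of blocks, or -1
 * if the message is malformed or has too many blocks or parameters.
 */
int em_parse_msg(char *msg, em_block_t *blocks, int max_blocks)
{
    int n = 0;
    char *p = msg;

    while (p != NULL) {
        char *next = strchr(p, ';');   /* start of the next optionBlock */
        em_block_t *b;

        if (next != NULL) *next++ = '\0';
        if (n == max_blocks || !isalpha((unsigned char)p[0])) return -1;
        b = &blocks[n++];
        b->type = p[0];
        b->required = isupper((unsigned char)p[0]) != 0;
        b->num_params = 0;

        p++;
        if (*p != '\0') {              /* nothing after the letter: no parameters */
            for (;;) {
                char *comma = strchr(p, ',');
                if (comma != NULL) *comma = '\0';
                if (b->num_params == MAX_PARAMS) return -1;
                b->params[b->num_params++] = p;
                if (comma == NULL) break;
                p = comma + 1;
            }
        }
        p = next;
    }
    return n;
}
```

Given a writable copy of "H1,1;Atest", this returns 2: a required block "H" with parameters "1" and "1", and a required block "A" with the parameter "test".
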
EM Protocol 1.1

Be aware that the Java API does not handle message optionBlocks; this version of EM does not require their use.

This version of EM does not require the use of disallowed characters (comma, semicolon, non-printables) in parameters.  Thus EM does not implement the C language hex syntax for literal characters.

In the message definitions below, parameters which are classified as "names" are further restricted to consist of only characters A-Z, a-z, 0-9, and underscore "_".
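
A check for the "name" restriction could look like the sketch below.  The function em_name_is_valid() is a hypothetical helper, not part of the EM API, and rejecting an empty name is an assumption that this document does not state.

```c
#include <stddef.h>

/* Returns 1 if name is non-empty and consists only of A-Z, a-z, 0-9
 * and '_', otherwise 0. */
int em_name_is_valid(const char *name)
{
    size_t i;

    if (name == NULL || name[0] == '\0') return 0;
    for (i = 0; name[i] != '\0'; i++) {
        char c = name[i];
        int ok = (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z') ||
                 (c >= '0' && c <= '9') || c == '_';
        if (!ok) return 0;
    }
    return 1;
}
```

Because names can never contain a comma, semi-colon, or backslash, a name that passes this check can be placed in a message without any hex encoding.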

Hello

Command: H (hello), MUST be processed by receiver.
Usage: Upon connection accept, server sends hello.  Client processes and responds with hello.  Server processes.
Command Parameter 1: Protocol version spec, "1.1" in this version.

Option: a (application name), MAY be processed by receiver.
Usage: client may include this option when it sends hello message.  Server must not.
Option Parameter 1: string, name of application.

Flags

Command: F (flags), MUST be processed by receiver.
Usage: after hello handshake, client may send this to set client flags.  Client can skip this if client flags = 0.
Command Parameter 1: integer, decimal representation of flags bitmap.  See em_flags parameter to em_create().

No options.

Trigger

Command: T (trigger), MUST be processed by receiver.
Usage: client sends to initiate an event trigger.  Server sends to notify client that an event was triggered.
Command Parameter 1: string, name of event.

No options.

Quit

Command: Q (quit), MUST be processed by receiver.
Usage: client sends to request the server to exit gracefully.
Command Parameter 1: string, name supplied to server when it was started.

No options.
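
Since the Flags, Trigger, and Quit messages each carry one command parameter and no options, building them is a one-line format operation.  The sketch below is illustrative only: em_format_msg() is a hypothetical helper rather than a function from the EM sources, and it always terminates the message with "\n", which is one of the three lineEnd forms a receiver must accept.

```c
#include <stdio.h>
#include <string.h>

/*
 * Build a message with one parameter and no optionBlocks, ending in "\n".
 * Returns the message length, or -1 if the parameter is longer than the
 * 255-character limit or the message does not fit in out.
 */
int em_format_msg(char *out, size_t cap, char cmd_char, const char *param)
{
    int n;

    if (strlen(param) > 255) return -1;
    n = snprintf(out, cap, "%c%s\n", cmd_char, param);
    if (n < 0 || (size_t)n >= cap) return -1;
    return n;
}
```

For example, calling it with 'T' and "test_event" yields "Ttest_event\n", with 'F' and "1" yields "F1\n" (flags bitmap with EM_FLAGS_RCV_EVS set), and with 'Q' and "test_emd" yields "Qtest_emd\n".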

Development Tips

Package Version Number

When the package version number needs to be bumped:

Important: see next section.

Protocol Version Number

There is also a protocol version number, defined in em_parse.h and Em.java (EM_VERSION_MAJOR and EM_VERSION_MINOR).  This is NOT NOT NOT the same as the package version number.  As bugs and minor enhancements are made to the package, I would not expect the protocol version to change.

When client and server handshake with hello messages, each component will compare the major version in the hello message with the major version implemented by the component.  If different, they will disconnect.  (Client and server do not care if minor versions are different; that is supplied for informational purposes.)

The minor version number should be bumped when command and/or option types are added or deleted which are lower-case (i.e. MAY be processed, but not required).

The major version number should be bumped (and minor set back to 1) when command and/or option types are added or deleted which are upper-case (i.e. MUST be processed).  Other things which would require a major version change: significant changes in the sequencing of messages, fundamental changes in message format, an expansion of the protocol features being used.
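
The handshake rule described in this section (disconnect on a major version mismatch, ignore the minor version) can be sketched as follows.  This is not the code from em_parse.h: em_version_compatible() and MY_VERSION_MAJOR are hypothetical names, and the version spec is assumed to have the form "major.minor", as in "1.1".

```c
#include <stdio.h>

#define MY_VERSION_MAJOR 1   /* stand-in for EM_VERSION_MAJOR */

/*
 * Check the version spec received in a peer's hello message.
 * Returns 1 if the major version matches ours, 0 if it differs (the
 * caller should disconnect), or -1 if the spec cannot be parsed.
 * If peer_minor is not NULL, it receives the peer's minor version,
 * which is informational only.
 */
int em_version_compatible(const char *peer_spec, int *peer_minor)
{
    int major, minor;
    char extra;

    /* Exactly two numbers must be converted; trailing text is an error. */
    if (sscanf(peer_spec, "%d.%d%c", &major, &minor, &extra) != 2) return -1;
    if (peer_minor != NULL) *peer_minor = minor;
    return major == MY_VERSION_MAJOR;
}
```

With this check, a peer announcing "1.7" is accepted and a peer announcing "2.1" is rejected.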

Building

Yes, I am an old-school C programmer with a strong Unix bent.  This means that I edit with vi and build using shell scripts.  Java and .NET programmers will snicker at the fact that I don't use an IDE.  Fortunately, I'm old enough not to care.  If you want to cram all this into an IDE, be my guest.  In particular, Phil Viso provided the IDE-specific files for the .NET port.  I'll keep using vi and the build scripts from Cygwin.  :-)

To Do

Possible enhancements:

Release Notes

1.2 (15-Oct-2012)

1.1 (1-Oct-2012)

1.0 (26-Sep-2012)