Case Study: Single Monitor Node

Introduction

In this scenario, a “monitor” agent is loaded and started. The agent monitors the local host and reports host status such as CPU usage and load average. For this example, we will configure and start the agent, pause for 60 seconds, then stop the agent.

The sequence of events in the experiment is controlled by the Magi Orchestrator. At run time, the orchestrator reads a file which contains the desired event streams and orchestrates interaction among agents running in the experiment. First we show the event file, then the expected orchestration output when run on a live experiment.

Specifying the Experiment Events

This is the Agent Abstraction Language (AAL) file the orchestrator uses for our experiment. Like all AAL files, it contains three sections: the event streams, the agent definitions, and a specific mapping between an instantiated (running) experiment and a set of groups. In future versions of the AAL format, the physical mapping portion (hostname to group name) will reside elsewhere or be parameterized.

# Declare a single event stream via the "streamstarts" keyword. This
# stream will start monitoring, wait for 60 seconds, then stop
# monitoring. (Note: this clause is a current AAL requirement, but will
# be removed in a future AAL format.)
streamstarts: [monitor]

# Declare a single group via the "groups" keyword. The group,
# "monitor_group", contains the single node "control". "control" is the
# name of the single node in the experiment. On a DETER testbed, this is
# specified in an NS file or a topDL file. See DETER documentation for
# more information. The "groups" clause is where the testbed
# instantiation maps onto Magi. You, as the experiment designer, map your
# nodes onto a series of groups. Magi then orchestrates the experiment
# via these groups. In this case, we have a single group named
# "monitor_group" that contains our single node.
groups:
  monitor_group: [node1]

# Use the "agents" keyword to map agents onto groups. In this case, we
# have a single group to map to a single agent. We declare a single agent,
# "monitor_agent" which runs the monitoring code (implemented by the
# nodeStats agent) and runs on all nodes in "monitor_group".
agents:
  monitor_agent:
    group: monitor_group        # Run the agent on this group
    path: /share/magi/modules/nodeStats.tar.gz             # load the agent code from this location. This
                                                          # location can be a tar file, full path to agent
    execargs: []                # Give empty initialization command to the agent
                                # implementation.

# Use the keyword "eventstreams" to define your event streams. Each entry
# in a stream is one of two things: an event or a trigger. Events are
# similar to remote procedure calls on agents. Triggers are
# synchronization and/or branching points in your event stream. In this
# case our event stream is simple: configure the agent, tell it to begin
# monitoring the local node, wait for 60 seconds, then tell the agent to
# stop monitoring. When the event stream has no more events or triggers,
# the orchestrator will exit.
eventstreams:
  monitor:
  # The first event sends an initialization call to all monitor agents.
  # This invokes the method "setConfiguration" with the given arguments
  # on every instance of monitor_agent. The configuration is set to check
  # the node's stats every 5 seconds. The experimentDescription string is
  # associated with the data collected (in the local Magi database).
  - type: event
    agent: monitor_agent
    method: setConfiguration
    args:
      interval: 5
      experimentDescription: "This is a sample experiment."

  # The next event tells the agent(s) to start data collection. It does
  # this by invoking the method "startCollection" on the agents. Note
  # there is no sequence point between this event and the last, so these
  # events happen quickly one after the other.
  - type: event
    agent: monitor_agent
    method: startCollection
    args: {}

  # Next, we set a timeout trigger. This trigger causes the execution
  # stream to pause for the given number of milliseconds (in this case
  # 60,000). This is an example of a trigger and synchronization point in
  # the AAL.
  - type: trigger
    triggers: [{timeout: 60000}]

  # After the timeout, the next event tells the agents to stop data
  # collection via the stopCollection method.
  - type: event
    agent: monitor_agent
    method: stopCollection
    args: {}


  # Since there are no more events, event streams, or triggers, the orchestrator will exit when
  # it reaches this point.
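
The cross-references in the AAL above (stream names, group names, agent names) can be checked before swapping in an experiment. The following is a minimal sketch, not part of Magi: the dictionary is a hand-written mirror of the AAL file, and `check_aal` and `total_timeout_seconds` are hypothetical helper names introduced only for illustration. A real tool would load the YAML text with a YAML parser instead of using a literal.

```python
# Hypothetical in-memory mirror of the AAL file above (assumption: the
# YAML maps directly onto dicts and lists like this).
aal = {
    "streamstarts": ["monitor"],
    "groups": {"monitor_group": ["node1"]},
    "agents": {
        "monitor_agent": {
            "group": "monitor_group",
            "path": "/share/magi/modules/nodeStats.tar.gz",
            "execargs": [],
        }
    },
    "eventstreams": {
        "monitor": [
            {"type": "event", "agent": "monitor_agent",
             "method": "setConfiguration",
             "args": {"interval": 5,
                      "experimentDescription": "This is a sample experiment."}},
            {"type": "event", "agent": "monitor_agent",
             "method": "startCollection", "args": {}},
            {"type": "trigger", "triggers": [{"timeout": 60000}]},
            {"type": "event", "agent": "monitor_agent",
             "method": "stopCollection", "args": {}},
        ]
    },
}


def check_aal(doc):
    """Return a list of problems found; an empty list means consistent."""
    problems = []
    for key in ("streamstarts", "groups", "agents", "eventstreams"):
        if key not in doc:
            problems.append("missing section: " + key)
    if problems:
        return problems
    # Every stream listed in streamstarts must be defined.
    for name in doc["streamstarts"]:
        if name not in doc["eventstreams"]:
            problems.append("undefined stream: " + name)
    # Every agent must run on a declared group.
    for agent, spec in doc["agents"].items():
        if spec.get("group") not in doc["groups"]:
            problems.append("agent %s uses unknown group %s"
                            % (agent, spec.get("group")))
    # Every event must target a declared agent.
    for stream, items in doc["eventstreams"].items():
        for item in items:
            if item["type"] == "event" and item["agent"] not in doc["agents"]:
                problems.append("stream %s calls unknown agent %s"
                                % (stream, item["agent"]))
    return problems


def total_timeout_seconds(doc, stream):
    """Sum the timeout triggers (given in milliseconds) of one stream."""
    total_ms = 0
    for item in doc["eventstreams"][stream]:
        if item["type"] == "trigger":
            for trig in item["triggers"]:
                total_ms += trig.get("timeout", 0)
    return total_ms / 1000.0


print(check_aal(aal))
print(total_timeout_seconds(aal, "monitor"))
```

For the AAL in this case study the check finds no problems, and the single timeout trigger accounts for the 60 second pause between starting and stopping collection.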

Specifying the Experiment Instantiation

Note that this example is specific to DETER and Emulab, since we require a testbed on which to run the sample experiment. Magi itself is testbed agnostic. This NS file declares a single node running Ubuntu 12.04 and tells the node, on startup, to execute the Magi bootstrap script. This script installs and configures Magi on the node.

set ns [new Simulator]
source tb_compat.tcl

set control [$ns node]
tb-set-node-os $control Ubuntu1204-64-STD
tb-set-node-startcmd $control "sudo python /share/magi/current/magi_bootstrap.py"

$ns rtproto Static
$ns run

Executing and Orchestrating the Sample Experiment

Swap in the experiment using the NS file given here. See the standard DETER documentation for instructions on doing that.

The orchestrator requires two things to run: an instantiated experiment and an AAL file. When you run the orchestrator, you pass it the AAL file and a single node’s hostname from the experiment. It uses this hostname to connect to the Magi messaging system and orchestrate the experiment. It sends events and receives triggers via the Magi message system to and from this node.

In DETER, nodes can be referenced by a host_name.experiment_name.group_name hostname construction. For this example, we assume the hostname is “node1”, the experiment name is “monitor”, and the DETER group name is “montage”; therefore our connection to the experiment is via the node node1.monitor.montage. We also assume the AAL above exists in the current directory and is named myEvents.aal.
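
The host_name.experiment_name.group_name construction can be written as a small helper. This is only an illustrative sketch: `deter_hostname` is a hypothetical name, not a Magi or DETER function, and the empty-component check is an assumption added for safety.

```python
def deter_hostname(host_name, experiment_name, group_name):
    """Join the three components into host.experiment.group form."""
    parts = [host_name, experiment_name, group_name]
    if any(not part for part in parts):
        # Assumption: an empty component is never a valid name.
        raise ValueError("all three name components are required")
    return ".".join(parts)


print(deter_hostname("node1", "monitor", "montage"))
# prints: node1.monitor.montage
```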

The command below runs the orchestrator as described, writing the output log to event.log.

> magi_orchestrator --control node1.monitor.montage --events myEvents.aal --vlogfile event.log
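
When the orchestrator is launched from a script, the same command line can be assembled as an argument list. The sketch below uses only the three options shown above; `orchestrator_command` is a hypothetical helper name, and the code builds and prints the command without running it.

```python
import shlex


def orchestrator_command(control, events, logfile):
    """Build the argument list for the magi_orchestrator call shown above."""
    return [
        "magi_orchestrator",
        "--control", control,   # hostname of one node in the experiment
        "--events", events,     # path to the AAL file
        "--vlogfile", logfile,  # where the output log is written
    ]


cmd = orchestrator_command("node1.monitor.montage", "myEvents.aal", "event.log")
# Quote each argument so the printed line is safe to paste into a shell.
print(" ".join(shlex.quote(arg) for arg in cmd))
```

On a node where magi_orchestrator is installed, a list like `cmd` could be passed to `subprocess.run` to start the orchestrator.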

Once run, you will see the orchestrator step through the events in the AAL file. The orchestrator adds an initialization event stream to the start of the events in the AAL. This stream creates groups for group communication (a single node in the group in this case) and loads the agents (just the “nodeStats” agent in this case). These are the “stream initialization” events in the output. Every orchestrator run begins this way. Once the initialization stream is complete, the event streams from the AAL are executed in parallel. In this example, we have a single event stream named “monitor”.

The orchestrator output generally has a single line for each event or trigger in a stream. The line shows the event stream name, the event status, then event-specific information. In the case of an event, this is the method call, the method arguments, and the target on which the method is invoked. For triggers, this is the trigger status.

In the event of an error, such as a method returning False or an exception thrown by an agent, the orchestrator will display the error, unload agents, tear down the communication groups it has built, then exit. The output line will give information about the error.

Running the magi_orchestrator command above gives the output below. (Note the exact output format may differ slightly as there is ongoing development on this orchestrator module.)

stream initialization : sent : joinGroup monitor_group --> __ALL__
stream initialization : done : trigger GroupBuildDone monitor_group  complete.
stream initialization : sent : loadAgent monitor_agent --> monitor_group
stream initialization : done : trigger AgentLoadDone  monitor_agent complete.
stream initialization : DONE : complete.
stream monitor        : sent : setConfiguration(['This is a sample e ... ) --> monitor_group
stream monitor        : sent : startCollection(None) --> monitor_group
stream monitor        : sent : stopCollection(None) --> monitor_group
stream monitor        : DONE : complete.
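
The three-part line layout (stream name, status, event-specific information) can be split apart for post-processing. This is a minimal sketch that assumes the " : " separator and the "stream " prefix seen in the sample output above. Since the output format may change, treat it as illustrative only; `parse_line` is a hypothetical helper, not part of Magi.

```python
def parse_line(line):
    """Split one orchestrator output line into stream, status and detail."""
    # Split on the first two " : " separators only, so any later colons
    # stay inside the detail field.
    stream, status, detail = [part.strip() for part in line.split(" : ", 2)]
    if not stream.startswith("stream "):
        raise ValueError("unexpected line: " + line)
    return {
        "stream": stream[len("stream "):].strip(),
        "status": status,
        "detail": detail,
    }


sample = [
    "stream initialization : sent : joinGroup monitor_group --> __ALL__",
    "stream monitor        : sent : startCollection(None) --> monitor_group",
    "stream monitor        : DONE : complete.",
]
for line in sample:
    print(parse_line(line))
```

A log reader could use the status field, for example, to detect when each stream reports DONE.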