Data Management

Data management is a central part of experimentation, and the data management layer is correspondingly a central part of the MAGI framework.

The following terms are used in the context of MAGI’s data management layer.

Sensor: MAGI agent that senses information and needs to store it.
Collector: Database server that can be used to store data.
Shard: In a distributed database setup, the data is partitioned and stored across multiple database servers. This partitioning of data is known as sharding, and each of the database servers is known as a shard.

MAGI’s data management layer is highly configurable: experimenters can set up a centralized or a distributed database, and can also configure, at the node level, where sensors collect data.

In case of a distributed/sharded database setup, MAGI sets up a global database server. This server gives a holistic view of the database.

MAGI data management uses MongoDB at its base.

Data Manager Configuration

The data management layer configuration is part of MAGI’s experiment-level and node-level configuration files.

As mentioned earlier, MAGI’s data management layer is highly configurable. More information is available at DBDL: Configure the MAGI Data Management Layer.

MAGI’s data management layer enables an experimenter to do the following.

Sense and Collect

An agent developer should follow these steps to populate MAGI’s database:

Import the database management utility
from magi.util import database
Initialize a database collection, passing it a unique name. We suggest using the agent name: each agent implementation that extends one of the predefined agents, such as DispatchAgent, has a variable "name" that stores the agent name.
self.collection = database.getCollection(self.name)

Insert data. Each record can be inserted as a dictionary of key-value pairs.

self.collection.insert({"key1": "value1", "key2": "value2"})

Note

The database management utility adds three more entries to each record:

host: <node’s hostname>
created: <record creation time>
agent: <agent name>
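
The steps and the note above can be pictured with a small stand-in. The sketch below does not use MAGI's code: `FakeCollection` and `get_collection` are hypothetical names that imitate, in memory, what the note describes (adding `host`, `created`, and `agent` to every inserted record). In a real agent you would call `database.getCollection(self.name)` from `magi.util` instead, and its internal behavior may differ from this sketch.

```python
import socket
import time


class FakeCollection:
    """Hypothetical in-memory stand-in for the collection returned by
    magi.util.database.getCollection(); not MAGI's implementation."""

    def __init__(self, agent_name):
        self.agent_name = agent_name
        self.records = []

    def insert(self, record):
        # Copy so the caller's dictionary is not modified.
        stored = dict(record)
        # Mimic the three entries the note says are added per record.
        stored["host"] = socket.gethostname()
        stored["created"] = time.time()
        stored["agent"] = self.agent_name
        self.records.append(stored)
        return stored


def get_collection(agent_name):
    """Hypothetical counterpart of database.getCollection(name)."""
    return FakeCollection(agent_name)


collection = get_collection("user_agent")
saved = collection.insert({"key1": "value1", "key2": "value2"})
print(sorted(saved))
```

The printed list contains the two keys supplied by the agent plus the three bookkeeping keys, which is why a record queried later carries more fields than the agent inserted.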

Query and Analyze

In a distributed database setup, a user can connect to the MongoDB server running on the global server node to get an experiment-wide view.

However, in case of an unsharded setup, a user would have to connect to the appropriate collector based on the sensor-collector mapping to fetch data stored by a particular sensor.
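
As a rough illustration of that lookup, the sketch below resolves a sensor to the address of its collector. The mapping dictionary, the `collector_address` helper, and the domain suffix are hypothetical and used only for illustration; the real mapping comes from MAGI's configuration files, whose format is not reproduced here. The port is the one used in the mongo shell example on this page.

```python
# Hypothetical sensor-to-collector mapping, for illustration only.
SENSOR_TO_COLLECTOR = {
    "node-1": "node-1",
    "node-2": "node-1",
}

MONGO_PORT = 27018  # port used in this page's mongo shell example


def collector_address(sensor, mapping, domain="myExperiment.myProject"):
    """Return 'host:port' of the collector that stores a sensor's data."""
    if sensor not in mapping:
        raise KeyError("no collector configured for sensor %r" % sensor)
    return "%s.%s:%d" % (mapping[sensor], domain, MONGO_PORT)


address = collector_address("node-2", SENSOR_TO_COLLECTOR)
print(address)
```

Here both sensors collect at `node-1`, so data sensed on `node-2` has to be fetched from the database server on `node-1`.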

MAGI, by default, sets up a non-distributed database, with all the sensors collecting at the same collector.

> mongo node-1.myExperiment.myProject:27018
mongo> use magi
switched to db magi
mongo> db.experiment_data.find()
{ "agent" : "user_agent", "host" : "node-1", "created" : 1409075736.646182,
"key1" : "value1", "key2" : "value2" }
{ "agent" : "user_agent", "host" : "node-2", "created" : 1409075737.514683,
"key3" : "value3", "key4" : "value4" }

In a distributed setup, the configuration file contains information about a global server host. An experimenter can connect to the global server to get an experiment-wide view of the database, or connect to individual collectors to get their local views.
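
The same kind of query can also be issued from Python. The following is a minimal sketch, assuming the `pymongo` package is installed on the machine running it; `build_filter` and `fetch_records` are hypothetical helper names, while the database name (`magi`), the collection name (`experiment_data`), and the host and port are taken from the shell session above.

```python
def build_filter(agent=None, host=None, created_after=None):
    """Build a MongoDB query filter on the fields added to each record."""
    query = {}
    if agent is not None:
        query["agent"] = agent
    if host is not None:
        query["host"] = host
    if created_after is not None:
        # "created" holds a numeric timestamp, as in the session above.
        query["created"] = {"$gt": created_after}
    return query


def fetch_records(server, port, query):
    """Run the query against the 'experiment_data' collection of 'magi'."""
    # Imported here so build_filter can be used without pymongo installed.
    from pymongo import MongoClient

    client = MongoClient(server, port)
    try:
        return list(client["magi"]["experiment_data"].find(query))
    finally:
        client.close()


query = build_filter(agent="user_agent", created_after=1409075737.0)
print(query)
# Needs a reachable database server:
# records = fetch_records("node-1.myExperiment.myProject", 27018, query)
```

Pointing `fetch_records` at the global server instead of a single collector would be the way to query the experiment-wide view described above.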

For more advanced queries, refer to the MongoDB documentation available at http://docs.mongodb.org/manual/tutorial/query-documents/.
