
Networking and Database Checking

The HEPDB database package comes with a distribution system for transmitting updates to and from a central repository. It is based on a master-slave protocol in which one node keeps the primary copy of the database, implements updates to that copy, and distributes the updates to secondary copies at the slave nodes. Since it is critical to maintain a certain degree of control over the contents of the database, individual members of the collaboration will not in general be permitted to implement updates to the master database. Other large collaborations typically give control to one person (hereinafter referred to as the database ``czar'') who receives update submissions from individuals responsible for various segments of the database. After a reasonable number of these requests have accumulated and been tested, they are incorporated into the master database and verified, then sent out to the slave nodes and verified again. Ideally, this will happen roughly monthly, although more frequent updates will certainly occur during detector commissioning.

The files and programs HEPDB uses to implement this master-slave protocol are described below.

The master database server for SNO will probably be placed at a location with good access to the internet, with all other institutions acting as slave servers. The configuration of each server or application is determined by the hepdb.names file in the directory pointed to by the particular CDSERV variable the job is given. When accessing information, a database application reads directly from the database file named in the hepdb.names file. The cdserv program on the master server will run only when the database czar deems it appropriate. When an application requests a change to the database, a journal file describing the change is automatically written to the queue directory through the application's interaction with HEPDB and its cdserv program. Because a slave server only processes files that appear in its todo directory, and the queue and todo directories are distinct, the slave server does not process the journal file and hence does not insert the changes into its local copy of the database. Instead, the database czar, upon determining that a sufficient number of database updates are ready, starts the cdmove program, which in all likelihood will run on the master server. This program moves the journal files from the slaves' local queue directories to the todo directory of the master server.
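The flow of journal files through these directories can be illustrated with a short sketch. The Python below is not the real cdmove (which also handles the network transfer between nodes, locking, and error recovery); the directory arguments are placeholders, and the sketch simply shows the queue-to-todo move that drives the protocol.

    import shutil
    from pathlib import Path

    def move_journal_files(src_queue: Path, dst_todo: Path) -> None:
        """Move pending journal files from a queue directory to a todo
        directory, where a cdserv process will later pick them up.
        Illustrative sketch only, not the actual cdmove program."""
        dst_todo.mkdir(parents=True, exist_ok=True)
        for journal in sorted(src_queue.iterdir()):   # oldest-first by name
            if journal.is_file():
                shutil.move(str(journal), str(dst_todo / journal.name))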

Before the master server's cdserv program is allowed to process the journal files, the database czar must check the integrity of the submitted database changes. (It is for this reason that the master server's cdserv program does not run continuously; otherwise the journal files copied over by the cdmove program would be automatically inserted into the master server's official copy of the database by cdserv, and then propagated out to the entire collaboration by cdmove.) The check is done as follows. A full ``check'' copy of the database is created, together with copies of all the requested changes: the journal files copied from remote slave nodes by cdmove, and the system interfaces' database write requests (see Section 7.2). A cdserv process is started, along with all appropriate system interface routines that write to the database (e.g., DAQ, CMA, etc.). These process the copied journal files and interface write requests, respectively, thereby applying the submitted changes to the database copy. Then a comprehensive SNOMAN test job is run to check that the submitted updates are okay. Note that the SNOMAN test job has not yet been created, and probably will be only after we have a better idea what to look for.
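Schematically, the czar's checking procedure might look like the sketch below. This is a sketch only: the file layout is assumed, the startup of cdserv and the interface routines is indicated as a comment, and run_test_job is a placeholder for the future SNOMAN test job, which does not yet exist.

    import shutil
    import subprocess
    from pathlib import Path

    def check_updates(official_db: Path, check_dir: Path,
                      journals: list) -> bool:
        """Build a full 'check' copy of the database, stage all submitted
        journal files in its todo directory, and validate the result.
        'run_test_job' is a placeholder for the future SNOMAN test job."""
        check_dir.mkdir(parents=True, exist_ok=True)
        check_db = check_dir / official_db.name
        shutil.copy(official_db, check_db)        # full check copy
        todo = check_dir / "todo"
        todo.mkdir(exist_ok=True)
        for jf in journals:                       # stage requested changes
            shutil.copy(jf, todo / jf.name)
        # ... start cdserv and the system interface routines (DAQ, CMA,
        # etc.) against the check copy so the changes are applied ...
        result = subprocess.run(["run_test_job", str(check_db)])
        return result.returncode == 0             # updates okay?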

Once the checking procedure is satisfactorily completed, the official cdserv process is started. The master server then processes the journal files, updating the master copy of the relevant database files. The master server also puts copies of the journal files in the appropriate directories for distribution to those slave sites identified in the master's hepdb.names file. The cdmove program looks in those directories and moves the copies of the journal files to the todo directory at each slave site. An email message is then sent to the local database contacts informing them that an update set has been propagated and that they can process the updates; alternatively, a slave node can run its cdserv process continuously. Either way, processing these files updates the official local copies of the database.
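On the slave side, processing an update set amounts to applying, in order, whatever journal files cdmove has deposited in todo. A minimal sketch, in which apply_journal stands in for the update machinery inside cdserv:

    from pathlib import Path

    def apply_journal(journal: Path) -> None:
        """Stand-in for cdserv's actual work: parse the journal file and
        apply the recorded changes to the local official database copy."""
        ...

    def process_todo(todo_dir: Path) -> None:
        """Apply each pending journal file in order, then remove it."""
        for journal in sorted(todo_dir.iterdir()):
            if journal.is_file():
                apply_journal(journal)
                journal.unlink()      # this update has been applied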

The basic master-slave interaction is shown in Fig. 3 for the illustrative case in which the slave server ``slave1'' submits an update. The update is processed by slave1, which puts the resulting journal file into its queue directory (step ``1''). Some time later, the master node starts cdmove, which moves the file from slave1 to its todo/queue directory (step ``2''). Assuming the ``check'' copy checks out okay, the master node's server applies the update to the official database and places copies of the update, in the form of journal files, in special local directories, one for each slave node in the system (steps ``3''). Some time later, cdmove places these files in the todo directories on the corresponding slave nodes (steps ``4''). The cdserv processes running on the slave nodes then apply the update to the local copies of the official database (steps ``5'').
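Using the move_journal_files and process_todo sketches defined above, the five steps of Fig. 3 can be traced end to end. All paths below are hypothetical, and the per-slave distribution directory name is an assumption.

    from pathlib import Path

    slave1 = Path("/hepdb/slave1")    # hypothetical layouts
    master = Path("/hepdb/master")
    slave2 = Path("/hepdb/slave2")

    # Step 1: slave1's application has written a journal file to
    #         slave1/queue (done by HEPDB, not shown here).
    # Step 2: cdmove carries it to the master's todo directory.
    move_journal_files(slave1 / "queue", master / "todo")
    # Step 3: after the check passes, the master's cdserv applies the
    #         update and places copies in per-slave distribution
    #         directories (assumed here to be master/dist/<node>).
    # Step 4: cdmove delivers those copies to each slave's todo directory.
    move_journal_files(master / "dist" / "slave2", slave2 / "todo")
    # Step 5: each slave's cdserv applies the update locally.
    process_todo(slave2 / "todo")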

A process dedicated to serving the DAQ read requests will run continuously on the ``site node'' (a slave server running on site); as such it will be monitored, but it will not be regularly stopped and started by the database czar. This is because the DAQ will be making read requests whenever it wants to load in new constants for the electronics (say), and it should not have to wait for the next time the database czar decides to update the database. Database reads use the program sdb_output_titles, which neither generates journal files nor requires the cdserv process to be running, so reads have no impact on the regular operations of the other parts of the database.
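The key point is that the read path never touches the queue or todo directories. Purely as an illustration (the function below is not the sdb_output_titles interface, and the bank lookup is left abstract), a read reduces to opening the database file read-only:

    from pathlib import Path

    def read_titles(db_file: Path, bank_name: str) -> bytes:
        """Schematic read path: open the database file read-only and
        extract the requested titles bank.  No journal file is written
        and no cdserv process is needed, so reads cannot disturb the
        update machinery."""
        with db_file.open("rb") as f:
            data = f.read()
        return extract_bank(data, bank_name)

    def extract_bank(data: bytes, bank_name: str) -> bytes:
        ...   # stand-in for HEPDB's record lookup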

Until the site gets a good connection to the internet, the master node will reside elsewhere. Under this scenario, updates made on site (of which there will be many, especially during the commissioning phase) will not actually appear in the site's official copy of the database until they have propagated from the site to the master and back again. For large update sets, this time lag will be too long.

To overcome this problem, a ``mirror'' database will be implemented on site. The SNODB mirror consists of a separate server on site, running as a master, along with a process which manages the placement of local updates in the relevant todo and queue directories. The mirror takes local updates and applies them to the site's mirror database immediately, while also submitting the same updates for propagation to the master node. These updates will eventually be processed on the master node and propagated back to the site node, where they will be applied to the local official copy of the database. Other nodes will also be submitting updates, which we want applied to both the site mirror and the official databases; however, updates that originated from the site node must not be re-applied to the site mirror database. A special filter has therefore been implemented to exclude them, as sketched below. The mirror database is depicted diagrammatically in Fig. 3. See Sec. 4.4 for details on how to set up the mirror on the site node. (Users at other nodes should avail themselves of a test database if they wish to see locally-generated updates quasi-instantly.)
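A sketch of the filtering idea follows, assuming (purely for illustration) that each journal file's name carries a tag identifying the node on which it originated; the real filter may identify origins differently.

    from pathlib import Path

    LOCAL_NODE = "site"    # hypothetical origin tag for the site node

    def filter_incoming(todo_dir: Path) -> None:
        """Discard incoming journal files that originated on this node,
        since those updates were already applied to the mirror database
        when they were first made.  The name-based origin tag is an
        assumption for illustration."""
        for journal in todo_dir.iterdir():
            if journal.is_file() and journal.name.startswith(LOCAL_NODE + "_"):
                journal.unlink()    # already applied via the mirror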


