1. ============= DRMS Module overview ==============

DRMS modules share a common main function found in
jsoc/src/base/drms/jsoc_main.c. The module executable is run like a
normal Unix command line program (see below) and its main function
performs the following steps:

  1. Parses the command line options and stores them in a
     global data structure.
  2. Connects to a DRMS session master process.
  3. Calls the module's main function "DoIt()".
  4. If DoIt() returns a non-zero status code, indicating an error,
     it sends an abort request to the DRMS session master.
     If DoIt() returns a zero status code (success) it sends a message
     to the DRMS session master asking it to commit the data generated
     or modified by the module when the session ends.
  5. Disconnects from the DRMS session master.

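The control flow of steps 2 to 5 can be sketched in a few lines of C.
This is only an illustration of the commit/abort decision, not the code
in jsoc_main.c: the type SketchSession, the enum SessionRequest and the
function sketch_run_module() are hypothetical names invented for this
sketch, and command line parsing (step 1) is left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; the real calls live in jsoc_main.c. */
typedef enum { REQ_NONE, REQ_COMMIT, REQ_ABORT } SessionRequest;

typedef struct {
    int connected;            /* 1 while attached to the session master */
    SessionRequest last_req;  /* last request sent to the session master */
} SketchSession;

/* Runs steps 2-5 from the list above around a module entry point. */
static int sketch_run_module(SketchSession *s, int (*doit)(void))
{
    int status;

    assert(s != NULL && doit != NULL);

    s->connected = 1;                                     /* step 2 */
    status = doit();                                      /* step 3 */
    s->last_req = (status != 0) ? REQ_ABORT : REQ_COMMIT; /* step 4 */
    s->connected = 0;                                     /* step 5 */

    return status;
}

/* Two toy module bodies used to exercise both branches. */
static int doit_ok(void)   { return 0; }
static int doit_fail(void) { return 1; }
```

Calling sketch_run_module() with doit_ok leaves last_req at REQ_COMMIT,
while doit_fail leaves it at REQ_ABORT. In both cases the session ends
up disconnected.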

The top-level file for implementing a module should
look something like this:


------------------ module.c ------------------------
#include "drms.h"
#include "jsoc_main.h"

/* List of default parameter values. */
DefaultParams_t default_params[] = {
  {"parm_name1", "default value1"},
  {"parm_name2", "default value2"},
  {"mandatory_parm3", NULL },
  /* ... more default named parameter values ... */
  {NULL, NULL} /* List must end in {NULL, NULL}. */
};

/* Module main function. */
int DoIt(void)
{
  int status = 0;

  /* Do work...
     ...
     Leave status at 0 to indicate success.
     Set status to a non-zero value to indicate failure. */

  return status;
}
----------------------------------------------------


2. =========== DRMS command line parsing ==============

The functions associated with command line parsing are found in
jsoc/src/util/util/cmdparams.{c,h}.

The DRMS main program parses the module command line and stores
the information in a global data structure

  CmdParams_t cmdparams;

that can be used to access the parameters from anywhere within the
module code, including library subroutines. The command line consists
of four types of tokens:

  * named parameters given in one of the forms "variable= value",
    "variable=value" or "--variable value"
  * single letter flags "-a -b -c" which can also be written in
    concatenated form "-abc". Flags are translated into single
    letter named parameters with the value "1".
  * unnamed argument strings of the form "value"
  * command line files of the form "@filename". Each line of such a
    file is parsed as an additional command line. Command files may
    contain references to other command files. Blank lines or lines
    beginning with "#" are treated as comment lines and ignored.
    Command line files are a convenient mechanism to circumvent the
    limitation on the number of command line arguments in most
    operating systems.


-------- example -----------
Example: Assume that the file inputs.conf contains the three lines

  # This is a test
  input1.txt
  input2.txt

then the command line

  module.exe -vf test=debug abc.txt --log logfile def.bin @inputs.conf

will be parsed into 4 named parameters

  v    = "1"
  f    = "1"
  test = "debug"
  log  = "logfile"

and 4 unnamed arguments

  abc.txt
  def.bin
  input1.txt
  input2.txt
-------- end example ------

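The four token types can be told apart by their leading characters.
The following is a minimal sketch of such a classification, assuming
the rules are exactly as listed above. The enum TokenKind and the
function sketch_classify() are hypothetical names; the actual parser in
cmdparams.c may handle more cases (for example negative numbers) and
may be structured differently.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical token classes; the real parser is in cmdparams.c. */
typedef enum { TOK_NAMED, TOK_FLAGS, TOK_ARG, TOK_CMDFILE } TokenKind;

/* Classifies one command line token by its leading characters. */
static TokenKind sketch_classify(const char *tok)
{
    assert(tok != NULL);

    if (tok[0] == '@')
        return TOK_CMDFILE;              /* "@filename" */
    if (tok[0] == '-' && tok[1] == '-')
        return TOK_NAMED;                /* "--variable value" */
    if (tok[0] == '-' && tok[1] != '\0')
        return TOK_FLAGS;                /* "-a" or "-abc" */
    if (strchr(tok, '=') != NULL)
        return TOK_NAMED;                /* "variable=value" */
    return TOK_ARG;                      /* plain "value" */
}
```

The sketch looks at one token at a time. For the forms "--variable
value" and "variable= value" a caller would still have to consume the
following token as the value, so that "logfile" in the example above
is not counted as an unnamed argument.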

The values of the named parameters are read using the following
functions:

  char   *cmdparams_get_str(CmdParams_t *parms, char *name, int *status);
  int8_t  cmdparams_get_int8(CmdParams_t *parms, char *name, int *status);
  int16_t cmdparams_get_int16(CmdParams_t *parms, char *name, int *status);
  int32_t cmdparams_get_int32(CmdParams_t *parms, char *name, int *status);
  int64_t cmdparams_get_int64(CmdParams_t *parms, char *name, int *status);
  float   cmdparams_get_float(CmdParams_t *parms, char *name, int *status);
  double  cmdparams_get_double(CmdParams_t *parms, char *name, int *status);
  double  cmdparams_get_time(CmdParams_t *parms, char *name, int *status);

If the named parameter was not given on the command line
the functions above try to obtain its value from the environment
using the getenv function. Therefore the commands

  module.exe blah="Hello"

and

  setenv blah Hello
  module.exe

should have the same outcome.

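The lookup order described above (command line first, environment
second) can be sketched as follows. The struct SketchParam and the
function sketch_lookup() are hypothetical and do not reflect the
internal layout of CmdParams_t. Only getenv() is a real library call.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical name/value pair; not the real CmdParams_t layout. */
typedef struct {
    const char *name;
    const char *value;
} SketchParam;

/* Returns the value for name: command line first, then environment.
   Returns NULL if the name is found in neither place. */
static const char *sketch_lookup(const SketchParam *parms, size_t n,
                                 const char *name)
{
    size_t i;

    assert(name != NULL);
    assert(parms != NULL || n == 0);

    for (i = 0; i < n; i++) {
        if (strcmp(parms[i].name, name) == 0)
            return parms[i].value;   /* given on the command line */
    }
    return getenv(name);             /* fall back to the environment */
}
```

With a table containing {"blah", "Hello"} the call
sketch_lookup(table, 1, "blah") returns "Hello" without consulting the
environment. With an empty table the same call returns whatever
getenv("blah") returns.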

The function

  int cmdparams_exists(CmdParams_t *parms, char *name);

returns 1 if a named parameter matching the string in "name"
was given on the command line, and 0 if no such parameter was
given.

The (string) values of the unnamed arguments are read using the
following functions:

  char *cmdparams_getarg(CmdParams_t *parms, int num);
  int   cmdparams_numargs(CmdParams_t *parms);

The call

  cmdparams_getarg(&cmdparams, 0);

returns the name of the running program (argv[0]).

Default values for parameters can be given in the global array
default_params that must be present in the module. The array
takes the following form:

  DefaultParams_t default_params[] = {
    {"parm_name1", "default value1"},
    {"parm_name2", "default value2"},
    {"mandatory_parm3", NULL },
    /* ... more default named parameter values ... */
    {NULL, NULL} /* List must end in {NULL, NULL}. */
  };

If the value field for a given parameter is NULL, the parameter
is mandatory and must be present on the command line. If it is
missing, an error message is printed and the module terminates
immediately after command line parsing.

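The mandatory-parameter rule amounts to a walk over the table up to
the {NULL, NULL} terminator. The sketch below shows one way such a
check could look. The struct SketchDefault only mirrors the
{name, value} shape shown above and is not the real DefaultParams_t;
sketch_first_missing() is a hypothetical helper, and the list of names
given on the command line is passed in as a plain array for
simplicity.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical mirror of a {name, value} default entry. */
typedef struct {
    const char *name;
    const char *value;   /* NULL marks a mandatory parameter */
} SketchDefault;

/* Returns the first mandatory name that is absent from given[],
   or NULL when every mandatory parameter was supplied. */
static const char *sketch_first_missing(const SketchDefault *defs,
                                        const char *const *given,
                                        size_t ngiven)
{
    const SketchDefault *d;
    size_t i;

    assert(defs != NULL);
    assert(given != NULL || ngiven == 0);

    for (d = defs; d->name != NULL; d++) {   /* stop at {NULL, NULL} */
        int found = 0;

        if (d->value != NULL)
            continue;                        /* has a default: optional */
        for (i = 0; i < ngiven; i++) {
            if (strcmp(given[i], d->name) == 0) {
                found = 1;
                break;
            }
        }
        if (!found)
            return d->name;
    }
    return NULL;
}
```

A caller would print an error naming the returned parameter and exit
when the result is not NULL.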

3. =========== DRMS data functions ==============

The module reads and writes data using the functions described in
jsoc/CM/*/drms_api.txt.


4. =========== Running a DRMS module ===========

Running one or more DRMS modules involves three main steps:

  a) starting a DRMS session,
  b) running the module(s) and
  c) closing the session.

The final step will either commit all the data generated by
modules in the session or discard it if an error occurred.

The script /jsoc/scripts/drms/drms_run automates the three steps
detailed below, and allows modules (or scripts containing multiple
module commands) to be run with a single command.
The command

  host:~> drms_run <command> [options...]

will start a new DRMS server, run <command> and, depending on the exit
status of <command>, either commit or discard changes to the
database and stop the DRMS server. drms_run will use the drms_server
executable pointed to by the environment variable DRMS_SERVER_EXE. If
DRMS_SERVER_EXE is not set drms_run will assume that an executable
"drms_server" is in your path. The output from the DRMS server is
written to the file pointed to by the environment variable
DRMS_LOGFILE. If DRMS_LOGFILE is not set drms_run will create the log
file /tmp/DRMS.<pid>, where <pid> is the PID of the drms_run
script interpreter.

The three steps are carried out as follows:

a) Before you run modules you must have a DRMS server running to
   act as a session master. This can be done by running the command

     host:~> jsoc/bin/<target>/drms_server -f

   The server prints the host and port on which it is listening
   for connections. For example:

     akhenaten:~/jsoc> bin/custom.akhenaten/drms_server -f
     DRMS_HOST = akhenaten.Stanford.EDU
     DRMS_PORT = 33137
     DRMS_PID = 20955
     DRMS_SESSIONID = 38
     DRMS server started with pid=20955, noshare=0, noroe=0
     ...

   The "-f" flag makes the server run in the foreground. Without
   "-f" the drms_server command spawns a server in a background
   process, prints the connection info to stdout (as above)
   and exits.

   The server will print log messages to stdout and
   stderr (TBD: Clean up error handling and logging.), and these
   should be redirected to a file if you intend to keep them.


b) Now you can run the module(s). The modules do not need to run on
   the same host as the server. They can run on any host as long as
   they are able to open a TCP socket connection to the server
   process.

   When running a module, the named parameter DRMSSESSION must be set
   to indicate the host and port where the DRMS server is listening
   for connection attempts. It is perhaps most convenient to do this
   by setting the environment variable DRMSSESSION. In the example above
   this would mean executing the command:

     akhenaten:~/jsoc> setenv DRMSSESSION akhenaten:33137

   Each module that connects causes the server to spawn a new thread
   to service the new client. The server can service multiple
   clients simultaneously, but database operations are serialized
   within the server and executed sequentially using a shared
   connection to the DRMS database.


c) When all modules have finished, close the session in one of
   two ways:

   1) If the modules finished successfully, tell the DRMS server to
      stop and commit all data generated or modified by the modules
      to the DRMS database by sending a SIGUSR1 signal to it. In the
      example above that would mean issuing the command

        akhenaten:~/jsoc> kill -s USR1 20955

   2) If an error occurred, tell the DRMS server to abort and discard
      all data generated by the modules by sending it a SIGTERM,
      SIGQUIT or SIGINT. In the example above that could be done by
      pressing CTRL-C in the terminal where the server is running or
      by issuing the command

        akhenaten:~/jsoc> kill -s INT 20955

      It should be safe to kill the server with SIGKILL (kill -9).
      It will have the same effect as a regular abort except that it
      leaves a stale entry in DRMS's active session table.

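Two details of steps b) and c) lend themselves to a short C sketch:
splitting a DRMSSESSION value of the form "host:port", and mapping the
signals listed in step c) to the resulting session outcome. Both
functions and the enum SketchEnd are hypothetical names made up for
illustration. They show the rules as stated in this document and are
not taken from the DRMS client or server source.

```c
#include <assert.h>
#include <signal.h>
#include <stdlib.h>
#include <string.h>

/* Splits a DRMSSESSION value of the form "host:port".
   Returns 0 on success, -1 if the value is malformed or the host
   does not fit into the caller's buffer. */
static int sketch_split_session(const char *spec, char *host,
                                size_t hostlen, int *port)
{
    const char *colon;
    char *end;
    long p;
    size_t n;

    assert(spec != NULL && host != NULL && port != NULL);

    colon = strrchr(spec, ':');
    if (colon == NULL || colon == spec)
        return -1;                       /* no colon, or empty host */

    n = (size_t)(colon - spec);
    if (n >= hostlen)
        return -1;                       /* host buffer too small */

    p = strtol(colon + 1, &end, 10);
    if (end == colon + 1 || *end != '\0' || p < 1 || p > 65535)
        return -1;                       /* port missing or out of range */

    memcpy(host, spec, n);
    host[n] = '\0';
    *port = (int)p;
    return 0;
}

/* Hypothetical outcomes of closing a session. */
typedef enum { END_NONE, END_COMMIT, END_ABORT } SketchEnd;

/* Maps a signal to the session outcome described in step c). */
static SketchEnd sketch_outcome(int signo)
{
    switch (signo) {
    case SIGUSR1:
        return END_COMMIT;   /* stop and commit */
    case SIGTERM:
    case SIGQUIT:
    case SIGINT:
        return END_ABORT;    /* abort and discard */
    default:
        return END_NONE;     /* not a session-ending signal here */
    }
}
```

For the session started in the example above,
sketch_split_session("akhenaten:33137", ...) yields the host
"akhenaten" and the port 33137. SIGKILL is deliberately absent from
sketch_outcome() because a process cannot catch it.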