I have just shipped version 1.5b via DISTRIBUTE. All the changes are
described in LISTNEWS MEMO, except for some details. First, I realized that
some servers keep abending in a fashion that closely resembles what happens
when you have a mangled STATS file. This was, I think, a bug introduced by
versions < 1.4c, and unless you're sure you already checked this when
installing 1.4c, I urge you to verify the integrity of your STATS files:
there should always be a userid at the beginning of each line, and it should
not be so long that it truncates the decimal data. If a line fails either
test, just delete it and FILE the file back. No big deal, as you can see. If
in doubt, just do a STAT listname RESET (new command) and that's it.
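If you'd rather let a machine do the squinting, here is a small REXX sketch
of that check. The layout it assumes (a userid token first, then decimal
counters) and the A-disk location are assumptions on my part, so adjust to
taste:

  /* CHKSTATS EXEC -- flag suspicious lines in a listname STATS file */
  /* Assumed layout: a userid token, then decimal counters.          */
  arg fn .
  if fn = '' then do
     say 'Usage: CHKSTATS listname'
     exit 24
  end
  'EXECIO * DISKR' fn 'STATS A (STEM REC. FINIS'
  if rc <> 0 then exit rc
  do i = 1 to rec.0
     parse var rec.i userid data
     if userid = '' | datatype(word(data, 1), 'N') = 0 then
        say 'Check line' i':' strip(rec.i)
  end

Delete whatever it flags, FILE, and you're done.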
Performance: if you are not interested in performance, you can skip this
section :-) While testing DISTRIBUTE, I realized that it took A LOT of time
to process anything > 50 lines or so... I did some analysis of the problem,
and some timing too. Here are the results (the sample file I used was 2,000
records long, and it was distributed to only one person (me) to avoid
loading the network with junk files...)
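(The CONNECT/VIRTCPU/TOTCPU lines below are in CP QUERY TIME format; if you
want to take the same kind of measurement, here is a minimal REXX sketch
that brackets the command under test with two Q TIME snapshots:)

  /* TIMEIT EXEC -- show CPU usage around a single command          */
  arg cmd                              /* the command to measure    */
  'EXECIO * CP (STEM B. STRING QUERY TIME'
  address command cmd                  /* run the command           */
  'EXECIO * CP (STEM A. STRING QUERY TIME'
  say 'Before:' b.2                    /* the CONNECT= ... line     */
  say 'After: ' a.2                    /* subtract by hand :-)      */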
Version 1.5a:
CONNECT= 00:00:54 VIRTCPU= 000:28.94 TOTCPU= 000:34.06
That was just utterly unacceptable.
After installing the Res=Disk feature of the DD card, which avoids loading
the dataset into storage, retrieving it and writing it back to disk:
CONNECT= 00:00:41 VIRTCPU= 000:06.43 TOTCPU= 000:10.06
That was more reasonable, but still a lot of CPU...
I then ran some benchmarks and found out that EXECIO * CARD is unthinkably
slow. I wrote my own replacement, using DIAG 14 to read the spool file:
CONNECT= 00:00:22 VIRTCPU= 000:04.96 TOTCPU= 000:06.51
Amazing... 3.5 sec TCPU just for the EXECIO * CARD part!!!!
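For reference, here is the call being replaced -- a one-line sketch,
assuming a file is sitting first in your virtual reader:

  /* The EXECIO way: read the next reader file into a stem.         */
  'EXECIO * CARD (STEM REC.'   /* ~3.5 sec TCPU for 2,000 records!  */
  say rec.0 'records read'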
Then I wrote a faster equivalent of EXECIO * DISKW, using 16K-byte blocks
and a BALR call to DMSCRD to read the lines:
CONNECT= 00:00:17 VIRTCPU= 000:04.11 TOTCPU= 000:05.45
That's only one more second; I had expected a bigger improvement and was
disappointed. Anyway, the module was written, so I left it in.
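For comparison, the EXECIO form it replaces looks like this (the file name
is hypothetical). EXECIO pushes the records through one at a time, where the
new module reads the stacked lines via DMSCRD and writes them out in 16K
blocks:

  /* Hypothetical stand-in data, then the slow write path.          */
  rec.1 = 'first line'
  rec.2 = 'second line'
  rec.0 = 2
  'EXECIO' rec.0 'DISKW MYFILE DATA A (STEM REC. FINIS'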
Then, an idea... Why not use BALR to call DMSCAT, instead of issuing 2,000
SVC 202s?
CONNECT= 00:00:15 VIRTCPU= 000:03.91 TOTCPU= 000:05.28
That's 0.3 sec saved by just three more lines of code... SVC 202 is really
slow.
More stats on the EXECIO * CARD replacement:
  EXECIO:          3.35/7.36
  DIAG14 + SVC202: 0.91/1.09
  DIAG14 + BALR:   0.53/0.71
So you can divide TCPU by 10, *even* when you stack all the data in 2,000
successive calls! I already had a FREAD program that uses DIAG 14 plus block
writes to perform a READCARD operation, and it's 10-15 times faster than
READCARD (in TCPU). But then it only makes a few calls to DMSBLKW, and
certainly not N calls to DMSCAT, which in turn makes N calls to DMSFREE, etc.
Conclusion: EXECIO *MUST* have been tuned to give the poorest possible
performance. :-) In case you're interested, the sources of the new modules
(LSVCARDR, LSVWRF80) explain how to use them.
I also improved LSVBITFD. It's now a nucleus extension, and it keeps the
data file in storage until reset by a NODESGEN <WTONLY> command. That's a
25% saving in TCPU, and it can only help your DASDs. Not a major
improvement, though.
By the way, I feel a little uneasy about those new modules. There should be
no problem, but a BALR call normally means doing an 'L 15,ADMSxxx' first,
and it is possible that the address has changed in VM/SP4, although I'd bet
on the opposite. If it has, the files will need to be re-assembled. So if
you run VM/SP4 and get an addressing exception or suchlike, you know what
the problem is. But I'm pretty sure there won't be any problem. Just send a
dummy DISTRIBUTE job to your server if you want to be sure.
Sorry about all that blahblah, but optimisation is really one of my
favourite games :-) It took about 20 secs of TCPU (4341-2 units) to perform
the DISTRIBUTE command for the 3,300-record shipment you're going to
receive, and there were more than 40 users. Given that LSVBITFD requires
about 0.25 sec per user (each user being on a different node), that accounts
for 10 secs, leaving 10 secs of overhead, of which 5 were spent reading,
writing and punching two copies of the file (even with CARD PUNCH that takes
time on a 4341+3350). So it's no longer such a CPU hog, and I've not lost my
day ;-)
Ah, about FRMOPxx: I just realized I had misread the new topology map. Most
fortunately, FRHEC11 will be linked directly to FRMOP22 (the MVS node) and
the dreaded FRMOP11 will be bypassed -- the node with two entries in
BITEARN NODE (plus one for their test RSCS), with two RSCSs of which only
one has link-restart execs (but then the other has all its network links
configured FIFO), and which tends to polarize itself from time to time in
such a way that files can go to Spain (eg a Netserv AFD PUT command) but
can't come back (eg no NOTE from Netserv, even though the file HAS been
stored), etc. At least JES2 has an automatic restart feature for links, and
it usually doesn't configure them FIFO :-) Oh well, I know I'm being mean;
I'm just so glad my files won't have to go through that node!... ;-)
Oh, one last change: I have changed "Reply-to:" to "Reply-To:", since some
UN*X systems seem (*most unexpectedly*) not to upcase the tag before the
comparison.
I wish all the problems and bugs were that simple :-)
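(Header tags are supposed to be compared case-insensitively, so the
receiving side ought to be doing something like this little REXX sketch --
the sample data is made up, obviously:)

  /* Compare header tags case-insensitively.                        */
  line = 'Reply-to: someone@somenode'  /* sample header line        */
  parse var line tag ':' value
  if translate(tag) = 'REPLY-TO' then
     say 'Reply address is' strip(value)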
Eric