Nicolas Raymond ~  Free Grunge Textures

How to Destroy a Sysplex

To say we had an interesting Business Recovery Exercise this week would be an understatement!

Since bringing our BR (business recovery) / DR (disaster recovery) solution in house, rather than performing it offsite, we’ve had a total of five BR Exercises this year alone.  This is pretty impressive for our shop since we used to go YEARS between BR Exercises.  Now our clients can declare a BR Exercise without prior notice to ensure our infrastructure is sound and solid.

Our infrastructure IS sound and solid…provided no one messes with it!

Two months earlier I was doing what I thought was helpful clean up on RACF.  I was adding a new PROFILE for a monitoring application.  Our RACF expert had just recently retired and our new RACF person was not quite trained and up to speed.

On occasion I would go in and “fix up” some things in RACF, trying to be helpful.  Although I had ADMIN rights to reset PASSWORDS when I was on-call, I wasn’t really supposed to mess around in RACF.

But what’s the worst that can happen?

I honestly thought I was doing something good by deleting a VERY suspicious * (G)ENERIC profile.

Disaster Recovery_RACF_profiles
(* I have my very own screen shot auditing script that captures my screen every minute on my workstation.  It was able to capture the quiet destruction of the sysplex.)

To me this generic profile seemed a security risk, so I decided to take matters into my own hands (since the new guy surely was not going to) and DELETED this profile!

Disaster Recovery_RACF_delete
Ah oh!!!
Disaster Recovery_RACF_warning
“You still have a chance to undo this Paul!!!”
Disaster Recovery_RACF_refresh
Nope.  Profile is deleted.

Quiet Disaster

What I didn’t realize was that instead of making the system more secure, I had deleted a VERY important PROFILE that’s used at IPL.

As Michael Cairns excellently describes in his article “Addressing Common RACF Configuration Issues”, that * GENERIC PROFILE was the catchall profile.

[The] class SURROGAT profile consisting simply of "**" or "*.*" (sometimes called a catchall profile). It applies to all user IDs that aren't matched by a more specific profile and probably covers your user ID unless steps have been taken to avoid this.
...
Without a catchall generic profile of some kind in the class STARTED, a previously undefined started task will fall back to the contents of ICHRIN03. 
...
If fallback to ICHRIN03 can happen, you need to know what privileges it's granting.

That’s exactly what happened.

We started the Business Recovery Exercise, and on the very first IPL the system came to a screeching halt.  Apparently JES2 (Job Entry Subsystem) did not have the authority it needed, and our ICHRIN03 was poorly coded.
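
To make the fallback chain concrete, here is a minimal Python sketch of the idea. It is only an illustration: real RACF generic matching is far richer, and every name in it (the profile names, the user IDs, the `resolve_started_task` helper) is hypothetical rather than taken from our shop.

```python
# Hypothetical, heavily simplified model of how a started task gets its identity.
# Real RACF generic matching is far richer; all names below are made up.

def resolve_started_task(task, started_profiles, ichrin03):
    """Return (user_id, source) for a started task name."""
    specific = f"{task}.*"
    if specific in started_profiles:
        # A more specific profile always wins.
        return started_profiles[specific], "specific STARTED profile"
    if "**" in started_profiles:
        # Otherwise the catchall generic profile covers the task.
        return started_profiles["**"], "catchall profile"
    if task in ichrin03:
        # With no catchall, the lookup falls back to the ICHRIN03 table.
        return ichrin03[task], "ICHRIN03 fallback"
    return None, "no identity found"

# Assumed setup: JES2 has no profile of its own and relies on the catchall.
started_profiles = {"VTAM.*": "STCVTAM", "**": "STCUSER"}
ichrin03 = {"VTAM": "STCVTAM"}  # a table that never mentions JES2

before = resolve_started_task("JES2", started_profiles, ichrin03)
del started_profiles["**"]  # the "helpful" clean up
after = resolve_started_task("JES2", started_profiles, ichrin03)

print(before)
print(after)
```

In this toy model nothing looks wrong at the moment of the delete; the difference only shows up the next time the task is started, which is why the damage stayed quiet until an IPL.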

But…NOTHING has changed!!!!

Imagine the frustration my colleagues (and I, before the discovery) were experiencing.  Here we were doing our FIFTH BR exercise this year.  It always worked.  It never failed.  We had a perfect mirror of our working production.  Nothing had changed!

To make a long story (and a painful one for me) short, we opened a Service Request with Severity 1 with IBM.  This is equivalent to calling 911 or pressing the nuclear panic button when you need IBM support and need it fast!

We were directed to a teleconference with their JES and RACF experts, whose AWE-INSPIRING expertise guided us to the discovery that yes, we were missing that * GENERIC PROFILE in RACF.  Since JES2 at our shop started in a certain sequence, we were unable to re-create this PROFILE on our BR system.

Since this was a mirror of our production we discovered that we were in fact vulnerable on our PRODUCTION SYSTEM!!!

If we had IPL’d any of our production LPARs, meaning recycled them, there was NO WAY they were coming back up.  JES2 would have run into the same authority error and the entire system would have been, in a manner of speaking…toast!

Luckily we caught this and were able to RECREATE the profile on our PRODUCTION system so we could mirror it over to the BR SYSTEM and finish the exercise.

Take away lessons:

  1. NEVER…  EVER…   MESS WITH RACF! (At least without knowing what you’re doing.  My RACF roles have been relinquished to the appropriate people.)
  2. Business / Disaster Recovery Exercises are there for a REASON!  If you’re not doing it at your shop, how do you know you’re not vulnerable?

</CONFESSION AND LESSON>

http://www.flickr.com/photos/microraptor/5240669099/

Please refer to the ‘MVS Initialization and Tuning Reference’ or the ‘MVS Initialization and Tuning Guide’

Time and time again I was told to reference the “MVS Initialization and Tuning Reference” or the “MVS Initialization and Tuning Guide”, more than any other manuals in the entire z/OS Internet Library.

Both these manuals cover the settings and configuration of MVS which is for the most part everything that’s in the PARMLIB dataset.

Unfortunately these manuals are about as exciting as reading the phone book.  Do I suggest that Millennial Mainframers go through the entire manuals and compare them to what their shop uses?  Yes!

It took me a while, but I grabbed our shop’s SYS1.PARMLIB listing, went through the entire thing, and looked up what each setting did, primarily in the “Init and Tuning Reference”, using my handy WikidPad to take notes as I journeyed through.

I often reference my initial notes and discoveries to this day.

PARMLIB

 Why is PARMLIB so important?

To put it in desktop-computer terms, PARMLIB is like the System Settings in Windows or OS X.  Within this dataset are the defaults and settings for the entire z/OS system.  Knowing where to poke around as a Systems Programmer, or where to ask for changes, is an essential skill.

More often than not, if you’re having issues with z/OS the first place you will want to check is the PARMLIB.

Here are a few important members, just to give you an idea of how PARMLIB works:

IEASYMxx

Defines static system symbols.

SYSDEF LPARNAME(TST1) 
SYSNAME(TST1) 
SYMDEF(&SYSETC='&SYSNAME') 
SYMDEF(&RMDSCLS='N') 
SYMDEF(&S='&SYSNAME(-2:2)') 
SYMDEF(&SN='101') 
SYMDEF(&DOMAIN='CNMT1') 
SYMDEF(&NVASSLU='Z') 
SYMDEF(&MASTYPE='MASTEST') 
SYSPARM(T1)

Now if I use the symbol &SYSNAME in another member, the system will translate it into TST1.
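
As a rough illustration of that translation, here is a small Python sketch. It is not how z/OS implements symbol substitution; the `substitute` helper and its regular expression are my own, and the substring handling assumes that `&SYSNAME(-2:2)` means "start two characters from the end, take two characters".

```python
import re

# &NAME, optionally followed by (start:length), optionally ended by a period.
SYMBOL = re.compile(r"&([A-Z0-9@#$]+)(?:\((-?\d+):(\d+)\))?\.?")

def substitute(text, symbols):
    """Replace &NAME and &NAME(start:length) using the symbols dict."""
    def repl(match):
        name, start, length = match.group(1), match.group(2), match.group(3)
        if name not in symbols:
            return match.group(0)  # leave unknown symbols untouched
        value = symbols[name]
        if start is not None:
            s, n = int(start), int(length)
            # Assumed rule: negative start counts from the end, positive is 1-based.
            begin = len(value) + s if s < 0 else s - 1
            value = value[begin:begin + n]
        return value
    return SYMBOL.sub(repl, text)

symbols = {"SYSNAME": "TST1"}
symbols["S"] = substitute("&SYSNAME(-2:2)", symbols)

print(substitute("SYS1.LOGREC.&SYSNAME.", symbols))
print(symbols["S"])
```

The first call is the same pattern the IEASYSxx listing uses for dataset names, where the trailing period marks the end of the symbol.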

 

 

PROGxx

PROGxx (authorized program list, exits, LNKLST sets and LPA)

The PROGxx parmlib member contains the following optional statement types:

  • APF, which defines the format and contents of the APF-authorized program library list.
  • EXIT, which controls the use of exits and exit routines.
  • SYSLIB, which allows for the definition of alternate data sets for the system defaults (SYS1.LINKLIB, SYS1.MIGLIB, SYS1.CSSLIB, SYS1.SIEALNKE, SYS1.SIEAMIGE, and SYS1.LPALIB) at the beginning of the LNKLST and the LPALST concatenations.
  • LNKLST, which controls the definition and activation of a LNKLST set of data sets for the LNKLST concatenation.
  • LPA, (Link Pack Area) which defines the modules to be added to, or deleted from, LPA after IPL.
  • REFRPROT, which indicates that REFR programs are protected. Use the REFRPROT statement type to specify that REFR programs are protected from modification by placing them in key 0, non-fetch protected storage, and page protecting the full pages.
  • NOREFRPROT, which indicates that REFR programs are not protected. 

APF-authorizing libraries is sometimes a necessity when installing new software; if a library that needs authorization doesn’t have it, the result is an error when the system attempts to use that library.  In this example there are three “APF ADD” statements.

SYS1.PARMLIB(PROGAU)
********************************* Top of Data *****************
APF FORMAT(DYNAMIC) 
APF ADD DSNAME(ASM.SASMMOD1) VOLUME(******) 
APF ADD DSNAME(BACU.PRODINFO.WEBA.LOAD) VOLUME(FIN002) 
APF ADD DSNAME(BACU.TDLA.SIZDLOAD) SMS 

In this example:

Line #4:  ADDs the dataset “ASM.SASMMOD1” and sets the volume to ‘******’, which means that the system is to use the volume serial number of the current system residence (SYSRES) volume.  At our shop we have several “SYSRES” volumes and this can change depending on the LPAR.

Line #5:  ADDs the dataset “BACU.PRODINFO.WEBA.LOAD”, specifically using VOLUME(FIN002).

Line #6:  ADDs the dataset “BACU.TDLA.SIZDLOAD”, which is on a volume that is SMS managed.
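
If you want to audit a PROGxx member the same way, the three cases above can be sketched in a few lines of Python. This is only a reading aid under my own assumptions: the `ApfEntry` type and the `parse_apf` and `describe` helpers are hypothetical, and the pattern only understands the simple one-line `APF ADD` form shown in this example.

```python
import re
from typing import NamedTuple, Optional

class ApfEntry(NamedTuple):
    dsname: str
    volume: Optional[str]  # None when the dataset is SMS managed
    sms: bool

# Matches only the one-line form: APF ADD DSNAME(...) VOLUME(...) or ... SMS
APF_ADD = re.compile(
    r"^APF\s+ADD\s+DSNAME\(([^)]+)\)\s+(?:VOLUME\(([^)]+)\)|(SMS))\s*$"
)

def parse_apf(lines):
    """Collect the APF ADD statements, skipping every other line."""
    entries = []
    for line in lines:
        match = APF_ADD.match(line.strip())
        if match:
            entries.append(
                ApfEntry(match.group(1), match.group(2), match.group(3) is not None)
            )
    return entries

def describe(entry):
    if entry.sms:
        return f"{entry.dsname}: SMS-managed volume"
    if entry.volume == "******":
        return f"{entry.dsname}: current SYSRES volume"
    return f"{entry.dsname}: volume {entry.volume}"

member = [
    "APF FORMAT(DYNAMIC)",
    "APF ADD DSNAME(ASM.SASMMOD1) VOLUME(******)",
    "APF ADD DSNAME(BACU.PRODINFO.WEBA.LOAD) VOLUME(FIN002)",
    "APF ADD DSNAME(BACU.TDLA.SIZDLOAD) SMS",
]
entries = parse_apf(member)
for entry in entries:
    print(describe(entry))
```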

 

IEASYSxx

This is the granddaddy of all the members!  As the reference manual puts it:

You can specify system parameters using a combination of IEASYSxx parmlib members and operator responses to the SPECIFY SYSTEM PARAMETERS message. You can place system parameters in the IEASYS00 member or in one or more alternate system parameter lists (IEASYSxx) to provide a fast initialization that requires little or no operator intervention.

In many cases IEASYSxx defines which other members in PARMLIB will be used.  For example, CMD=00 completes the name of the parmlib member COMMNDxx, giving COMMND00, which contains commands to be issued internally during master scheduler initialization.

Here’s an example of IEASYSxx (system parameter list):
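
The naming rule can be sketched in Python as follows. The prefix table is limited to members this post itself names (COMMNDxx, PROGxx, IEAOPTxx, LPALSTxx, SCHEDxx, SMFPRMxx, VATLSTxx); the `member_names` helper is hypothetical, and treating only two-character values as suffixes is a shortcut I'm assuming so that options such as the `L` in `LPA=(00,L)` get skipped.

```python
# IEASYSxx parameter -> parmlib member prefix (only members named in this post).
PREFIXES = {
    "CMD": "COMMND",
    "PROG": "PROG",
    "OPT": "IEAOPT",
    "LPA": "LPALST",
    "SCH": "SCHED",
    "SMF": "SMFPRM",
    "VAL": "VATLST",
}

def member_names(param, value):
    """Turn e.g. ('CMD', '00') into ['COMMND00']."""
    prefix = PREFIXES[param]
    parts = value.strip("()").split(",")
    # Shortcut: a two-character part is a suffix; anything else is an option.
    return [prefix + part for part in parts if len(part) == 2]

print(member_names("CMD", "00"))
print(member_names("PROG", "(EX,AU,LK,LS)"))
print(member_names("LPA", "(00,L)"))
```

Note how a value such as `PROG=(EX,AU,LK,LS)` names several members at once, which is where the PROGAU member from the APF example comes from.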

SYS1.PARMLIB(IEASYS00) 
********************************* Top of Data *****************
ALLOC=00,               ALLOCATION SYSTEM DEFAULTS
CLOCK=00,               PROMPT OPERATOR FOR TOD
CLPA,                   CLEAR LINK PACK AREA - REMOVED VIO
CMB=(COMM,200),         CMB SET TO 200 FOR COMMS DEVICES
CMD=00,                 COMMANDS AUTOMATICALLY ISSUED AT IPL
CON=00,                 CONSOLE CONFIGURATION DEFINITION
COUPLE=00,              CROSS-SYSTEM COUPLING FACILITY (XCF)
CSA=(3200,153600),      3200K (3.125 MB) CSA AND 150MB ECSA
DIAG=00,                CONTROL CS TRACKING AND GFS TRACE
DUMP=NO,                PLACE SVC DUMPS ON DASD DEVICES
FIX=00,                 FIXED LPA LIST
GRS=STAR,               GRS COMPLEX MODE
GRSCNF=00,              GRS CONFIGURATION
GRSRNL=00,              GRS RESOURCE NAME LISTS
IKJTSO=00,              TSO/E COMMANDS AND PROGRAMS
IOS=00,                 MISSING INTERRUPT HANDLER (MIH)
LPA=(00,L),             SPECIFY LPALST00 AS LPA LIST
LOGCLS=L,               SYSLOG
LOGLMT=999999,          MUST BE 6 DIGITS,MAX WTL MESSAGES QUEUED
LOGREC=SYS1.LOGREC.&SYSNAME.,
MAXUSER=600,            (SYS TASKS + INITS + TSOUSERS) < 600
MSTRJCL=00,             MASTER SCHEDULER JCL
OPI=YES,                ALLOW OPERATOR OVERRIDE TO IEASYS00
OPT=00,                 SPECIFY IEAOPT00 (SRM TUNING PARAMETERS)
PAGE=(PAGE.PLPA.&SYSNAME.,
      PAGE.COMMON.&SYSNAME.,
      PAGE.LOCAL1.&SYSNAME.,
      PAGE.LOCAL2.&SYSNAME.,
      PAGE.LOCAL3.&SYSNAME.,L),
PLEXCFG=MULTISYSTEM,   TYPE OF PLEX, ANY ALLOWS GRS-STAR OR MIM
PROD=00,               PRODUCT ENABLEMENT POLICY
PROG=(EX,AU,LK,LS),    EXITS AUTHORIZATION AND LINKLST
REAL=0,                NO V=R STORAGE
RSVNONR=800,           ASVT RESERVED ENTRIES(REPLACEMENTS)
RSVSTRT=5,             ASVT RESERVED ENTRIES(STARTS)
SCH=00,                SVC TABLE SCHED00
SMF=00,                SELECT SMFPRM00, SMF PARAMETERS
SSN=00,                SUBSYSTEM DEFINITIONS
SQA=(16,1200),         SQA=(16*64K),ESQA=(1200*64K)
SVC=00,                INSTALLATION-DEFINED SVCS
UNI=01,                UNICODE CONVERSION SERVICES
VAL=00,                SELECT VATLST00 DEFAULT
VIODSN=SYS1.STGINDEX.&SYSNAME.,
VRREGN=64              DEFAULT REAL-STORAGE REGION SIZE
******************************** Bottom of Data ***************
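
The storage values in the listing are easier to sanity-check once converted to megabytes. Here is a quick Python sketch, assuming the CSA values are in kilobytes and the SQA values are counts of 64K blocks (which is what the SQA comment in the listing says); the helper names are my own.

```python
def csa_kb_to_mb(kilobytes):
    """CSA/ECSA sizes, assumed to be given in kilobytes."""
    return kilobytes / 1024

def sqa_blocks_to_mb(blocks):
    """SQA/ESQA sizes, given as a number of 64K blocks."""
    return blocks * 64 / 1024

csa_mb, ecsa_mb = (csa_kb_to_mb(v) for v in (3200, 153600))   # CSA=(3200,153600)
sqa_mb, esqa_mb = (sqa_blocks_to_mb(v) for v in (16, 1200))   # SQA=(16,1200)

print(f"CSA {csa_mb} MB, ECSA {ecsa_mb} MB")
print(f"SQA {sqa_mb} MB, ESQA {esqa_mb} MB")
```

Under those assumptions 3200K works out to 3.125 MB of CSA and 153600K to 150 MB of ECSA, while the SQA settings come to 1 MB and 75 MB.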

This is just a fragment of what an actual PARMLIB would consist of.  Keep in mind that changes you make to PARMLIB might not take effect until the next IPL or until you issue a refresh command of some kind.  More importantly, PARMLIB changes could seriously impact the system at large!

So please ‘PARMLIB‘ responsibly!

The Watson Dynasty:  10th Anniversary of an IBM Historical Account

Watson_Dynasty_Richard_Tedlow

I read The Watson Dynasty, by Richard S. Tedlow, back in 2003 when it was first published.   Ten years later I’m re-reading this classic IBM account and re-introducing myself to the mindset and philosophies of IBM’s founding father and son, T.J. Watson Sr. and T.J. Watson Jr.

The book is interesting from the point of view of adventures in business and rags-to-riches empire building.  While the book certainly touches on the great innovations and technology that IBM created, its purpose is to recount how IBM grew into the powerhouse it’s known as under the leadership of the Watson father and son.

Thomas J. Watson Senior

Starting with Thomas J. Watson Senior’s life and character, you get a sense right from the introduction that this is not your typical business leader.  The opening example shows him taking what could have been a crippling train accident involving hundreds of his employees, during IBM Day at the second New York World’s Fair in 1940, and turning it into a flourishing opportunity.

I found T.J. Watson’s life a fascinating and relentless battle.   Here’s an excerpt about him at age twenty-one:

‘The only place for him to rest his head at night was a pile of sponges in the basement of a store.  Watson went from rags to riches, but he did not begin life in rags.  He worked his way down before he worked his way up.’

What’s interesting is learning that at the roots of IBM’s history were computation devices such as scales and cash registers.  It was a series of lucky opportunities, hard work and a love of selling that allowed T.J. Watson Sr., at the beginning of his career, to found International Business Machines and turn it into a market leader.  In fact he was so successful, and so cutthroat toward the competition, that he faced a devastating antitrust lawsuit at a former company where he headed sales before founding IBM.

T.J. Watson Sr. is also certainly known for his famous quote,

“Would you like me to give you a formula for… success? It’s quite simple, really. Double your rate of failure.”

This philosophy is captured by Tedlow as he touches on some other gems I found interesting about Senior.

‘Senior was a big tipper…His son asked why.

     “I do this for two reasons, Tom.  First, that fellow…I feel sorry for him.  The second reason is that there is a whole class of people in the world who are in position to poor-mouth you unless you are sensitive to them…They see you in an intimate fashion and can really knock off your reputation.”’

Junior (aka “Terrible Tommy Watson”)

After getting a sense of T.J. Watson Sr. as the “Man of Men”, the book begins to recount the life and character of Junior, who was described by college presidents and administrators as a “predetermined failure”.  It’s difficult not to feel sorry for “Terrible Tommy”, overshadowed by his father’s legacy and depending on his father’s influence to get him into Brown University.

The bickering between father and son only intensifies as Junior follows in his father’s footsteps, working at and eventually leading IBM.

“Terrible Tommy failed at most of the things he tried.”

It’s interesting to see how “Terrible Tommy”, despite other people’s expectations and the battles with his father, later saves IBM from “The Old Man” as the dawn of electronic computers arrives, and later becomes the driving force behind mainframes.

Intangible vs. The Punch Card in Hand

It’s halfway into the book, Chapter 17, before electronic computers are introduced.  Before that, IBM was running a sophisticated business with sophisticated technology, using gears to compute and calculate the numbers for businesses.  This way of doing the job changed drastically once World War II introduced a “computer revolution”.

The Electronic Numerical Integrator and Computer (or ENIAC) project, funded by the Army Ballistic Research Laboratory at the University of Pennsylvania’s Moore School of Engineering, outperformed IBM’s fastest punch card machines by 5,000 additions per second to 4!

While ENIAC at the time was acres of vacuum tubes that attracted moths (which, according to popular lore, is where the term “debugging” comes from), the “electronic brain” was born!

In hindsight it’s interesting to watch the first computers being engineered while IBM tried to convince customers of the validity and utility of this new technology.  Senior had the customer’s interest in mind with a tangible IBM punch card, information that could be held in the hand, while Junior represented the new generation, where information was moving on to magnetic tape and becoming an intangible solution to the storage woes of some of IBM’s biggest customers.  One of them described having 3 floors of typists working on punch cards.

System/360

The true arc of the story, and what will interest many Millennial Mainframers, is the intense gamble IBM took in creating System/360.  It was without a doubt a HUGE undertaking.  Tedlow quite boldly states:

‘The System/360 was one of the two greatest new product introductions in the twentieth-century American business history.  The other was the Model T Ford.’

It goes into detail explaining how System/360 took $5 billion over a period of four years and close to 2000 programmers.  The name 360 was in reference to all points of the compass.

Internal conflict and shaky belief that System/360 would ever succeed threatened the entire project as it slowly progressed.   When it was finally revealed, the earliest customers were Bank of America and NASA, and it proved to be such a compelling and superior product that it brought about another antitrust lawsuit against IBM.

Conclusions

I liked the book.  Be warned, though, that it’s a historical account of IBM and the Watson Dynasty, so the hardcore nerdy technological details are either missing or only broadly covered.  Tedlow, however, does a fantastic job covering the business history of IBM.

Probably the most interesting aspect of this book is that despite IBM being a huge empire making big moves and influencing the business world, at the center of it all was some family drama.

My favorite part obviously is the struggle and calculated risk IBM took pushing System/360 into the world.  System/360 was the ancestor of today’s IBM mainframes, which run the now famous z/OS operating system still used extensively today.

I recommend checking this book out!

(Read more about The Watson Dynasty here:  The Watsons: IBM’s Troubled Legacy )

*** full disclosure: some of the Amazon links have affiliate links to support Sean McBride’s efforts with MillennialMainframer.com ***