A glance back in history to the origins of OS/400 (by Wayne O. Evans)

26 June 2000

This article reviews some of the design decisions that were made early in the development of S/38. This early history is important because much of the architecture of OS/400 was inherited from S/38.

Wayne O Evans Consulting, Inc.; AS/400 Security Consulting and Education; 5677 West Circle Z Street; Tucson AZ 85713-4416; Phone 520-578-7785; Fax 520-578-7786; Email .

This is a work in progress, and I hope to expand it with more stories from others who worked on the S/38 development. June 26, 2000

[ Minor corrections, punctuation, and spelling.—Hanna ]

Over 20 years ago [ circa 1980—Hanna ] when the S/38 was being developed, the world had never heard of Microsoft or Netscape. But the struggle between the computer giant IBM and its competitors for market share was as fierce as today’s battles over the next browser. The IBM Corporation had experienced the loss of hardware sales to “look alike” vendors such as Amdahl. These hardware clones would run the IBM operating system, and customers could get equivalent computing power at less cost by running a mixed shop. Hardware sales represented a major source of revenue to IBM, so this loss of hardware sales to clone vendors was serious business to IBM executives.

On the large mainframe front, IBM had just canceled a project known as FS (Future Systems) that was going to be a revolutionary replacement for the existing line of IBM mainframe computers. After several years of work, meetings, more task force meetings, and many trainloads of paper specifications, the IBM management team determined the project was too ambitious even for IBM and abandoned it in 1975. (Does anyone know the date? This is a guess.)

In the heart of the cornfields in Rochester, Minnesota, the S/3 and its follow-on systems, the S/32 and S/34, were nearing the end of a very successful life. The time for a replacement system was approaching. There was no hardware clone of the S/34, but the fear of a clone vendor replacing existing hardware was high among IBM management’s concerns. The news of the FS demise was slow to reach Rochester, so the group of planners, engineers, and programmers continued on an independent approach to design the system for the future, at least for the small and medium business customer. Everyone wanted bigger, better, faster, and cheaper than the now aging S/34, but just how that would be accomplished was still unknown.

Early in 1975, I joined the core group of fewer than 25 people who were developing the vision of the replacement for the S/34. This small group of talented people came from varied backgrounds. Some came from the design teams of the S/3, S/32, and S/34, while [others] had large S/360 and S/370 backgrounds.

Abstract Machine Interface

When I joined the project there was a revolutionary idea being discussed. The words “objects,” “encapsulation,” and “abstract machine” were new to most system programmers, including me. Rather than having the instruction set of the computer designed by the engineers, the programmers (the real users of the system) were defining the instruction set of an abstract machine. These instructions, rather than being optimized for the hardware, were designed for writing software applications. This abstract machine interface would eventually become known as MI (Machine Interface), or what IBM currently calls TIMI (Technology Independent Machine Interface).

MI defined some very advanced concepts to make programming more efficient and less error-prone. Programmers would write programs for an abstract machine using new concepts, such as not loading data into a hardware register for processing. The abstract machine would also improve the reliability of programs by preventing them from modifying storage to manipulate address registers. Using MI, programs could manipulate pointers, but the abstract machine would not allow a change that would cause a pointer to be corrupted. The interface was being designed to protect programmers from themselves.

On previous hardware, programmers knew the addressing structure of pointers and could even alter memory to change an address (which was often the cause of program reliability problems). On this abstract machine the pointers were encapsulated: the actual pointer content was not presented to application programs. Programs could not create a pointer by modifying storage, and only pointer instructions could be used to modify pointers. Any attempt by the programmer to alter storage that contained a pointer would cause the pointer to become invalid. These abstract machine pointers were much larger than any hardware register, and the system microcode would translate the use of pointers into actual hardware registers.
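The pointer rule described above (only pointer instructions may create or change a pointer, and any ordinary write over pointer storage invalidates it) can be illustrated with a small toy model in Python. This is only a sketch: the class and method names, the 16-byte pointer size, and the bookkeeping dictionary are assumptions made for illustration, not a description of how the S/38 microcode actually tracked pointers.

```python
class PointerError(Exception):
    """Raised when storage is used as a pointer but holds no valid pointer."""


class ToyStorage:
    """Toy storage area that tracks which slots hold valid pointers (hypothetical model)."""

    POINTER_SIZE = 16  # assumed size, chosen only for this illustration

    def __init__(self, size):
        self._size = size
        self._bytes = bytearray(size)
        self._pointers = {}  # start offset -> target name, valid pointers only

    def _check(self, offset, length):
        if offset < 0 or length < 0 or offset + length > self._size:
            raise ValueError("access outside the storage area")

    def _invalidate(self, offset, length):
        # Drop every pointer whose storage overlaps the written range.
        for start in list(self._pointers):
            if start < offset + length and offset < start + self.POINTER_SIZE:
                del self._pointers[start]

    def set_pointer(self, offset, target):
        """Stand-in for a 'pointer instruction': the only way to create a pointer."""
        self._check(offset, self.POINTER_SIZE)
        self._invalidate(offset, self.POINTER_SIZE)
        self._pointers[offset] = target

    def write_bytes(self, offset, data):
        """Ordinary data write: allowed, but it invalidates any pointer it touches."""
        self._check(offset, len(data))
        self._bytes[offset:offset + len(data)] = data
        self._invalidate(offset, len(data))

    def resolve(self, offset):
        """Use the storage at offset as a pointer; fails if it was overwritten."""
        try:
            return self._pointers[offset]
        except KeyError:
            raise PointerError(f"no valid pointer at offset {offset}") from None


store = ToyStorage(64)
store.set_pointer(0, "ORDERS")        # created through the sanctioned route
print(store.resolve(0))               # the pointer is usable
store.write_bytes(4, b"\xff")         # a program scribbles over part of the pointer
try:
    store.resolve(0)
except PointerError as err:
    print("invalidated:", err)
```

The design point the sketch tries to capture is that validity is tracked by the machine, outside the bytes a program can write, so a program can destroy a pointer but cannot forge one.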

The separation of the actual machine hardware from the abstract machine is responsible for the successful migration from CISC hardware to RISC hardware. Other platforms are busy rewriting programs to get to 32- or 64-bit addressing (some platforms are projecting completion in 2005). What is truly amazing is that every AS/400 application is 64-bit enabled without any modifications to customer or OS/400 programs. [ As long as observability is maintained in the compiled object. Otherwise, objects had to be recompiled for the CISC-to-RISC conversion. I recall there was an additional conversion from V4R3 to V5 or V6 OS/400 that required observability in the compiled object.—Hanna ]

A second major concept of this abstract machine interface was to move many of the traditional operating system functions (task management, security, and even database) into the microcode. Because these functions were implemented close to the hardware interface, they could be programmed efficiently. More important to IBM was that this integration of operating system function into the hardware microcode would make copying the S/38 hardware more difficult. This fact alone ensured that the S/38 project got funding from IBM management.

Today our experience with AS/400 proves this abstract machine concept makes a lot of sense, but to the system programmers of the 1970s this idea was radical.

The big debate

This idea of an abstract machine interface was not without its critics. There were several very qualified technical professionals who argued (with great passion) that such an abstract machine interface might be academically possible but would introduce so much overhead that program execution would be poor. Much effort (and strong emotion) was devoted to discussing, and proving or disproving, the performance of direct hardware or microcode implementations.

I remember the day the issue was solved. A meeting of key technical staff was held one cold winter day in Rochester, MN. The proponents and opponents of the abstract machine were gathered in the conference room of Glen Henry, the programming system manager. Glen was an extremely intelligent individual but, like so many truly gifted people, he lacked polished human interaction skills. When Glen Henry saw the obvious answer, he told you so with few words and without any sugar coating. Glen proclaimed the end of the debate. With great emotion he explained that the abstract machine interface (later to be known as MI) was the only choice. Glen explained how this abstract machine would protect IBM’s investment from the clone hardware vendors. He concluded with “either get with the program or I will be happy to sign the transfer of any individual to another area.”

We can look back now and see that this abstract machine interface has been one of the most significant architectural advantages of the AS/400. Separation of the software from the actual hardware allowed radical changes to the physical hardware with no modification to the software. Users of AS/400 were able to make the transition from CISC to RISC without any disruption in their applications because of the abstract machine interface.

Why the AS/400 Names are 10 Characters Long

Once the decision about the abstract machine was made, the less critical design decisions could be made. Since this was a new machine, all of the design parameters were up for question. One of the decisions that had to be made was the length of object names. One proposal was an 8-character name that would be compatible with the S/34. Others proposed “self documenting names” of up to the 32 characters supported by the MI. The decision was made when the database design team determined that they needed to be able to store the file, library, and member name in the 32 characters allowed by the MI. The 10-character name was determined by dividing the maximum name length (32) by 3, so S/38 names (and AS/400 names) are limited to 10 characters.
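The arithmetic behind the 10-character limit is easy to check. The snippet below is only an illustration; the constant names are mine, and the two values come straight from the paragraph above.

```python
MI_NAME_LENGTH = 32  # maximum name length supported by the MI (from the text)
NAME_PARTS = 3       # file, library, and member names share that space

# Integer division gives the longest equal-length name that fits three times.
max_part_length, unused = divmod(MI_NAME_LENGTH, NAME_PARTS)
print(max_part_length, unused)  # 10 characters per name, 2 characters left over
```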

Selection of Command Naming

The consistency and structure of the CL command language is well recognized. The verb-object structure of command names did not just happen but resulted from a problem discovered during early development. The commands were developed by individual teams of developers, who named their commands in inconsistent ways, so the “terminology police” needed to impose very strict verb-object naming conventions and the use of common abbreviations across the entire operating system. These consistent naming conventions were extended from the operating system to also include the Licensed Program Products. A verb-object structure was not enough, because it became apparent that it was difficult to recognize the breaks between abbreviations, so the standards were extended to require 3-character abbreviations, with the exception of the last word in the command name. The thousands of AS/400 users who find the CL language easy to learn and use have the “terminology police” to thank for selecting and enforcing command naming standards.
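The benefit of fixed 3-character abbreviations is that the breaks in a command name can be found mechanically. The following Python sketch shows the idea; the function name is hypothetical, and the sample commands (CPYF, CRTLIB, WRKACTJOB) are chosen by me as illustrations of the convention. It assumes every abbreviation except the last is exactly 3 characters, which is the rule stated above and may not hold for every command.

```python
def split_command_name(name, width=3):
    """Split a CL-style command name into abbreviations of `width` characters.

    Assumes every abbreviation except the last is exactly `width` characters
    long; the final piece is whatever remains.
    """
    if not name or not name.isalnum():
        raise ValueError("command name must be non-empty and alphanumeric")
    name = name.upper()
    return [name[i:i + width] for i in range(0, len(name), width)]


for command in ("CPYF", "CRTLIB", "WRKACTJOB"):
    print(command, "->", split_command_name(command))
```

With variable-length abbreviations the same split would require a dictionary of known abbreviations, which is exactly the recognition problem the standard was meant to remove.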

Prompter is Born

Usability tests showed that users could not remember all the command names, keywords, and values. Initial efforts were to construct individual screens for each command and to try to keep them consistent. As the number of commands grew, it became apparent that IBM couldn’t afford the cost of individually created screen interfaces, so IBM planners tried to prioritize the commands so that there would be screens for some and not others. With the number of commands growing by the day, the user interface department could not develop all these screens or even keep track of them, so they proposed automating the generation of those screens based on the stored command definition objects.

The final result was the revolutionary command prompter, which has proven invaluable for over 1,500 IBM commands and is also available for customer commands. Using the question mark in front of a command name provides an easy way for CL programs to prompt for input.
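The idea of generating prompt screens from stored command definitions, rather than hand-building one screen per command, can be sketched in Python. Everything here is a hypothetical model for illustration: the class names, the screen layout, and the sample definition are assumptions, not the actual format of a command definition object.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Parameter:
    keyword: str       # keyword as typed on the command line
    prompt: str        # descriptive text shown to the user
    default: str = ""  # value used when the user supplies nothing


@dataclass(frozen=True)
class CommandDefinition:
    name: str
    title: str
    parameters: tuple


def render_prompt(cmd):
    """Build a generic prompt screen from the definition alone."""
    lines = [f"{cmd.title} ({cmd.name})", ""]
    for p in cmd.parameters:
        label = f"{p.prompt} ".ljust(30, ".")
        lines.append(f"{label} {p.keyword:<10} {p.default}".rstrip())
    return "\n".join(lines)


def build_command(cmd, answers):
    """Turn the user's answers back into a keyword-style command string."""
    parts = [cmd.name]
    for p in cmd.parameters:
        value = answers.get(p.keyword, p.default)
        if value:
            parts.append(f"{p.keyword}({value})")
    return " ".join(parts)


# Illustrative definition only; the parameter list is deliberately abbreviated.
copy_file = CommandDefinition(
    name="CPYF",
    title="Copy File",
    parameters=(
        Parameter("FROMFILE", "From file"),
        Parameter("TOFILE", "To file"),
        Parameter("MBROPT", "Replace or add records", "*NONE"),
    ),
)

print(render_prompt(copy_file))
print(build_command(copy_file, {"FROMFILE": "ORDERS", "TOFILE": "ORDERSBAK"}))
```

Because `render_prompt` knows nothing about any particular command, adding a new command costs only a new definition, which is the economy the planners were after.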

System Review

Soon after the FS project was canceled, IBM management began to focus on the development efforts in Rochester. The question of the day was how this small group in the cornfields of Minnesota (known by the code name Pacific) could hope to accomplish what the massive FS design teams from the east and west coasts could not. A system review was held to determine the status of the Pacific project, and the future of the project depended on its outcome. Lead technical people from the former FS project reviewed the hardware and software design. All efforts in Rochester were focused on telling a good story about our progress. The detailed hardware and software plans were prepared and explained to the review team. Sometimes the ink was not dry on the latest design when it was presented as fact to the review team. This was a time of major stress because the design team was aware that the fate of the system was in the hands of this review team. The Pacific project survived the review, and S/38 and AS/400 are the result.

The review served an important role in solidifying the design of the operating system. Prior to the system review, individual designers would change (“improve”) their designs on a weekly basis. The system review forced the design team to solidify and document design decisions.

Implementation Challenges

The staffing of the engineering and programming group began to grow rapidly after the successful design review. The IBM facilities at the Rochester site could not hold all of the people, so the complete Pacific project (engineering and programming) was moved to an abandoned department store in Rochester. This kept the hardware and software design groups together and allowed developers to walk down the hall to talk with designers in related areas. I believe this ease of communication was one of the reasons that the project in Rochester was successful where the multi-site attempts to implement FS failed.

Figure 1. Implementation layers

CPF: Operating System
MI: Machine Interface / Abstract Machine
VMC: Virtual Micro Code
HMC: Hardware Micro Code

As shown in Figure 1, there were three design teams (hardware, microcode, and operating system), all focusing on design and implementation. These design teams closely mirrored the machine architecture. The communication between these three groups was excellent, but often you thought that you were in the Tower of Babel. Different groups used the same data processing term (such as queue) to describe a similar concept, but the details of the function were vastly different. This terminology conflict could be very confusing: when designers from different areas talked, everyone thought they understood, but there was often a vast difference in actual understanding. When the operating system programmers and engineers talked, they often needed to get one of the few individuals who could serve as a translator to explain the subtle differences.

Simulator

The development of the hardware, microcode, and operating system all proceeded in parallel. In the early phases of the project there was no hardware available for the programming team, so a simulator of the machine hardware instructions was developed on the S/370. This simulator allowed the microcode developers to design and test their functions in the absence of physical hardware. The simulation of the hardware instructions was slow at best. I was the lead developer of the command analyzer, and simply syntax checking a CL command on the simulator would take 30 to 45 minutes when the computers were not loaded. If multiple simulations were running, the same test could require up to 2 hours. As a result, many other people and I started coming to work off-hours to get better response time. (If you call 30 minutes to syntax check a command better response time.) Frequently the programs would fail, and it was a challenge to determine whether the cause was your code, a microcode bug, or a problem with the simulator. The programming language used to develop the operating system was PL/MI, a derivative of PL/I. There was a similar language, PL/S, that produced code for the S/370, so our development team used the S/370 to do much of the initial program debugging on the command analyzer. This was essential because the CL command analyzer needed to be done early so that other development teams could create the individual commands in the S/38 operating system, called Control Program Facility (CPF).

The S/38 command analyzer has the unique feature that both the operating system and user installations use the same facility to create commands, on the S/38 and now on the AS/400. Because the same facility was used for both operating system and user-defined commands, user commands have the same features as operating system commands. Of special note is the capability to prompt for both user and IBM-supplied commands.

Testing on Hardware

Finally, engineering began to make some of the hardware available, and that became a new challenge. The tools used to step through code on the simulator did not exist on the hardware, so you needed to learn a whole new method of debugging a program. Debug tools did not exist during the early stages, so setting hardware address stops became the method to test your programs.

The most used program on the system was a utility called 2 to 1 copy. All of the microcode and operating system would fit on a single disk (Piccolo drive). The first step was a 15- to 30-minute copy of the entire system from drive 2 to drive 1. Then you would add your programs to drive 1 and, if you were lucky, shut down the system so you could copy drive 1 back to drive 2. Frequently the system would hang, and the only alternative was to start over by running the 2 to 1 copy again. The disk copy was running about 80% of the time, leaving only 20% for actual testing.

Every week the developers would get a new driver with the latest fixes and more of the microcode function complete. The save and restore functions were not working, so the drive build team would roll a disk drive up to your system, connect the cables, and run the famous 1 to 2 copy, but this would be a reload of the complete system. Each week the system became more stable.

ITSWITCH saves the day

The most useful program other than the 2 to 1 copy was a tool called ITSWITCH. If it were not for this program, the S/38 operating system, CPF, would never have been completed. This program had many functions, including setting the entry point table and determining the compile date and time of programs. The most critical function was the capability to trap an unexpected error and allow the operator to take manual recovery steps. Two terminals would be attached to each system: one used for testing and the other to run ITSWITCH.

Since the command analyzer code was done early, I was able to move on to the development of a demonstration of system function. This became the demonstration of S/38 function that was used at the first-day announcement in 1978. The demonstration was a simple order entry application updating the database and displaying the results. By today’s standards it was a very simple application, but during the development of CPF this was a big application. Imagine running multiple terminals all accessing the same file and allowing updates to the files. Everything was carefully scripted to be sure that no order quantity ever reduced the amount in stock to less than zero and that no unexpected conditions occurred. The demonstration looked very impressive, but it was fragile. The system had a printer attached, but if you attempted to print and do order entry at the same time the system would hang. The demonstration was carefully rehearsed to show users order entry and then move the audience to the printer to look at output (while no order entry was running). All the time ITSWITCH was monitored behind the scenes for a potential system failure.

S/38 Missed First Customer Ship

This application helped convince management that S/38 was ready for announcement before it actually was. As a result, the announcement of the system was approved, and then the first customer ship date was changed because the system reliability did not meet minimal standards. This was a black eye for the development team but an essential delay, because customers did not have ITSWITCH to keep their systems up and running.

Reflecting back on the S/38 on announcement day, there were a lot of very elegant features, such as the integrated database, subfiles, user-defined CL commands, and a CL language that could be compiled into a program. But there were a lot of features missing as well; for example, the communications capability was not added until release 2.

Design Flaws

CL

One of the design mistakes the CL team made became apparent during the development of CPF release 2. If IBM added a keyword to an existing CL command, every program that used that command would need to be recompiled. This caused a redesign of the method used by compiled CL programs, which required customers to recompile all CL programs when release 2 became available. There was a lot of interest in S/38, but many customers had not yet made the purchase decision, so only a few early adopters were affected by the forced recompile of programs.

Copy File Interactive

In the first release of CPF, IBM shipped what would now be called a wizard to simplify the copy file function. This copy file interactive function would prompt the user with simplified copy file choices, and based on the user’s responses to those prompts, other prompts were presented. It was implemented as multiple CL commands that would prompt the user for responses and simplify the CPYF command. As the number of options grew, this alternative became impossible, so CPYFINT (copy file interactive) was withdrawn in release 2. (Does anyone know the exact date copy file interactive was dropped?)

Main Memory

The cost of main memory was significant, so the S/38 did not have the large memory sizes supported today on the AS/400. The early hardware systems were shipped with 512K of memory, and there was a significant design effort to get the entry design point down to 256K. The performance of S/38 and AS/400 is significantly improved by the addition of main storage. It is hard to believe that early systems could run (slowly) in 512K.

Why Messages Start with CPF

AS/400 users may not recognize that the name of the S/38 operating system was Control Program Facility (CPF). This abbreviation was used on all messages, and user programs included it when monitoring for messages. It became apparent that the number of messages required was larger than anticipated, so additional variations were created using the first two letters, CP. Next time you get a CPF message from the AS/400, recognize the early efforts of the talented S/38 design team that left the AS/400 a valuable heritage.

Conclusion

Much press was made about the successful conversion of S/36 and S/38 in record time to make the AS/400. In my view the technical challenges faced by the early developers of S/38 were much larger than those faced in the conversion of S/38 to AS/400. Most of the fundamental design ground rules for the architecture were already established by the S/38, and the AS/400 function could be developed and tested on the existing S/38 hardware.

Hanna Goodbar

The Pythia