Burroughs MCP: COBOL Programming

It's time for the final language supported by Burroughs MCP under time-sharing: COBOL. As always we will implement the TPK algorithm. However, this turned out to be quite a challenge.

Two views of COBOL

COBOL - the COmmon Business-Oriented Language - is probably the most successful language originating in the 1950s. Designed by a committee to standardise business processing, it was adopted by several computer manufacturers and powered many systems in the finance, retail and government space. Although not used for much new development today, it is still widely in use: the Wikipedia article mentions that as of 2020 "COBOL ran background processes 95% of the time a credit or debit card was swiped". It has an English-like syntax intended to make it easy to read, and has powerful I/O facilities.

On the other hand, it has a reputation for being verbose and over-complicated, with many keywords and quirks in how it processes code. It is rarely taught or studied in academia, and Jean Sammet, historian and one of the language's designers, said:

little attempt was made to cater to the professional programmer, in fact people whose main interest is programming tend to be very unhappy with COBOL

Personally, I have never used COBOL, nor do I know anyone who has. I am intrigued to learn more, but the size of the language makes it rather daunting. But before we continue, there is another question to answer.

Is this a suitable language to implement TPK?

TPK is a mathematical algorithm: it does floating point calculations and relies on functions such as sqrt and exponentiation. This is rather outside of COBOL's business-orientated domain:

  • COBOL uses integer and fixed precision numbers, with floating point support only standardised in the 1980s.
  • Mathematical functions like sqrt were also not a standard part of the language until then.

We also do not have contemporary documentation for COBOL on the B5500; the closest is the B7000/B6000 Series COBOL Reference Manual from 1977. Its reserved word table starting on p404 notes compatibility with the B5700, saying that SQRT was available but floating point numbers (using the COMP-4 type) were not, which seems to be confirmed by my testing.

The plan

So to make this a little more tractable, I will do the following:

  • Given the lack of floating point numbers, only allow positive integers of two digits as input, and display output as integers.
  • Use an LLM to help set up the initial code as a way to get to grips with the language.

(A note on AI: I do not use AI for writing this blog. I will sometimes experiment with using AI to write code, but will clearly mark when it is used.)

Unsurprisingly, the LLM said it could not write COBOL for that specific machine, but would write generic COBOL for that era. It knew that sqrt was supported, though.

The code it produced had some formatting and syntax errors when I tried it on CANDE. By consulting the 1977 manual, and reading existing Burroughs batch COBOL programs from the CUBE tape, I was able to get it working - but with no guarantees on the quality of the code.

TPK in COBOL

1000 * TPK PROGRAM IN COBOL FOR BURROUGHS MCP
1100 IDENTIFICATION DIVISION.
1200     PROGRAM-ID. TPK
1300     AUTHOR.     RUPERT LANE AND GEMINI
1400 ENVIRONMENT DIVISION.
1500 CONFIGURATION SECTION.
1600     SOURCE-COMPUTER. B-5500.
1700     OBJECT-COMPUTER. B-5500.
1800 INPUT-OUTPUT SECTION.
1900 FILE-CONTROL.
2000     SELECT PRINT-FILE ASSIGN TO PRINTER.
2100 DATA DIVISION.
2200 FILE SECTION.
2300     FD PRINT-FILE;
2400         DATA RECORDS ARE PRT1.
2500     01 PRT1.
2600          05  PRT2         PIC X(120).
2700 WORKING-STORAGE SECTION.
2800     01 NUMBER-TABLE.
2900         05 INPUT-NUMBER   PIC 99.
3000         05 NUMBER-X       PIC 99 OCCURS 11 TIMES.
3100     01 SUBSCRIPTS.
3200         05 NDX            PIC 99.
3300     01 CALCULATION-FIELDS.
3400         05 CURRENT-NUM    PIC 99.
3500         05 RESULT         PIC 9999999.
3600     01 DISPLAY-FIELDS.
3700         05 DISPLAY-RESULT PIC 999.
3800 PROCEDURE DIVISION.
3900 MAIN SECTION.
4000 000-MAIN-LOGIC.
4100     DISPLAY "PLEASE ENTER 11 NUMBERS"
4200     PERFORM 100-ACCEPT-NUMBERS
4300         VARYING NDX FROM 1 BY 1
4400         UNTIL NDX GREATER THAN 11.
4500     DISPLAY "RESULTS ARE".
4600     PERFORM 200-PROCESS-NUMBERS
4700         VARYING NDX FROM 11 BY -1
4800         UNTIL NDX LESS THAN 1.
4900     STOP RUN.
5000 100-ACCEPT-NUMBERS.
5100     ACCEPT INPUT-NUMBER.
5200     MOVE INPUT-NUMBER TO NUMBER-X(NDX).
5300 200-PROCESS-NUMBERS.
5400     MOVE NUMBER-X(NDX) TO CURRENT-NUM.
5500     COMPUTE RESULT = SQRT(ABS(CURRENT-NUM))
5600         + (5 * (CURRENT-NUM ** 3)).
5700     IF RESULT GREATER THAN 400
5800         DISPLAY "TOO LARGE"
5900     ELSE
6000         MOVE RESULT TO DISPLAY-RESULT
6100         DISPLAY DISPLAY-RESULT
6200     END-IF.
6300 END-OF-JOB.

So this is indeed quite verbose - about double the size of the other implementations of TPK I have done. But even without any knowledge of COBOL it is fairly readable.

As in the Algol and Fortran examples, the line numbers are used for entry to CANDE only and are not used by the program itself.

The code is made up of four main divisions - identification, environment, data and procedure. Each contains one or more sections.

The line format is slightly relaxed compared to batch COBOL: division and section headers, along with labels, have to start in column 1, but other lines can be free format.

The identification division contains structured comments about the purpose and authorship of the code.

The environment division is where machine specific details are supposed to be put, such as computer type and selection of peripheral devices. The file-control part sets up the printer as an output device: although I don't use this explicitly, the compiler refuses to work without something being defined here.

The data division contains file and working-storage sections. The file section defines the output format for the printer and is again unused but required. The working-storage section defines variables - these are all global. Variables can be grouped into structures, so NUMBER-TABLE contains INPUT-NUMBER and NUMBER-X. The former is a single variable, the latter is an array introduced by the OCCURS keyword.

The PIC - or picture - keyword defines the variable type. 9 denotes a single digit 0-9, so INPUT-NUMBER PIC 99 means this variable can accept 00-99. RESULT contains the output of the TPK formula so needs to have a capacity of 7 digits to support TPK(99). DISPLAY-RESULT is only 3 digits as results greater than 400 will not be printed.
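The field sizes can be checked with a little arithmetic. The following is a rough Python sketch, not part of the COBOL program: the function name tpk and the variable names are my own, and it assumes the fractional part of the result is simply dropped when stored in an integer field.

```python
import math

def tpk(t: int) -> float:
    """TPK formula as used in the listing: sqrt(|t|) + 5 * t**3."""
    return math.sqrt(abs(t)) + 5 * t ** 3

# Largest value RESULT has to hold for a two-digit unsigned input (PIC 99).
# Assumption: only the integer part is kept.
largest = int(tpk(99))
digits_needed = len(str(largest))

# Largest input whose result is not greater than 400, i.e. still printed.
largest_printed_input = max(t for t in range(100) if tpk(t) <= 400)

print(largest, digits_needed)                                   # 4851504 7
print(largest_printed_input, int(tpk(largest_printed_input)))   # 4 322
```

So only inputs 0 to 4 produce a printed value, the largest being 322, which fits comfortably in the three digits of DISPLAY-RESULT.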

The program code is in the procedure division, with control starting at the top. This prompts the user with DISPLAY and then executes two loops with PERFORM, each of which takes a label as the code to execute.

100-ACCEPT-NUMBERS uses ACCEPT to get numbers (must be two digits) and store them. ACCEPT NUMBER-X(NDX) would seem to be the obvious way to get and store in the array, but this gives a syntax error so I have to use the simple variable INPUT-NUMBER and then MOVE it into the array.

200-PROCESS-NUMBERS calculates TPK and prints the result. Functions are not really part of COBOL of this era, so we use the COMPUTE statement, with SQRT and ABS being non-standard Burroughs extensions (the exponentiation operator ** is part of COBOL's standard arithmetic expressions).
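For comparison, here is a minimal Python sketch of the control flow described above. It is not a translation produced by any tool, and run_tpk is a name I made up. It assumes that PERFORM ... VARYING tests its UNTIL condition before each pass (the usual COBOL behaviour), that COMPUTE drops the fractional part when storing into RESULT, and that a PIC 999 field is displayed with leading zeros. I have not confirmed the last two on the B5500.

```python
import math

def run_tpk(inputs):
    """Mirror the two PERFORM ... VARYING loops of the listing.

    Returns the lines the program would DISPLAY after the prompt.
    Assumes truncation to an integer and zero-padded 3-digit output.
    """
    number_x = [0] * 12          # index 0 unused, like the 1-based NUMBER-X
    pending = iter(inputs)

    # PERFORM 100-ACCEPT-NUMBERS VARYING NDX FROM 1 BY 1 UNTIL NDX > 11
    ndx = 1
    while not ndx > 11:          # UNTIL is tested before each pass
        number_x[ndx] = next(pending)
        ndx += 1

    lines = ["RESULTS ARE"]
    # PERFORM 200-PROCESS-NUMBERS VARYING NDX FROM 11 BY -1 UNTIL NDX < 1
    ndx = 11
    while not ndx < 1:
        current = number_x[ndx]
        result = int(math.sqrt(abs(current)) + 5 * current ** 3)
        lines.append("TOO LARGE" if result > 400 else f"{result:03d}")
        ndx -= 1
    return lines

out = run_tpk(range(11))         # the inputs 0, 1, ..., 10
print("\n".join(out))
```

With the inputs 0 to 10 this prints RESULTS ARE, then TOO LARGE six times (for 10 down to 5), followed by 322, 136, 041, 006 and 000.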

Compiling and running the program

One interesting observation is that the COBOL compiler takes longer to run compared to the other languages - on retro-b5500, which runs close to the original hardware speed, it takes 20s for COBOL to compile and 10s for Algol. (simh runs as fast as possible but the difference is still noticeable). This probably comes from the size and complexity of the language, and may partially explain why interactive COBOL was never a great success on time-sharing systems.

The compiler also seems less polished than the Algol one, with several crashes occurring as I worked on the program. For example, if you delete the END-OF-JOB. statement and then compile, it seems to start reading uninitialised memory:

**ERROR @??????: SEQUENCE NUMBER TRUNCATION -??-
**ERROR @??????: CARD TRUNCATION ??????

-EOF NO LABEL 1S002 RUPERT , S = 27, A = 164

Source and a transcript of its execution can be found on GitHub.

Further information

Apart from the B7000/B6000 Series COBOL Reference Manual mentioned above there is also Efficient B6700 COBOL from 1981 at the Charles Babbage Institute collection.

There is surprisingly little material about learning COBOL on the Internet - possibly because of a lack of interest from hobbyists. The University of Limerick had a COBOL course online once, but this is now only available on archive.org.

Questions, corrections, comments

I welcome any questions or comments, and also especially any corrections if I have got something wrong. Please email me at rupert@timereshared.com and I will add it here and update the main text.