ARX (Arnie Rhom's eXtractor)

Thomas Day <tday6@csc.com>

ARX (Arnie Rhom's eXtractor) was in use at the Boeing Company in 1978. I re-wrote the 757 Computer Aided Manufacturing Management Information System using ARX. ARX modules also provided the basis for the AWACS I on-board data storage.

ARX was an inverted list processor for IBM 370 architecture mainframes running VM/CMS. It supported TTY dumb terminals and later IBM 3270s. It was limited by the width of its retrieval path (8 bits) and the fact that it was an in-core processor. The entire database had to be read into memory. This led to the physical segmentation of what was logically a single database. I think that the maximum size for a single database (or segment of a database) was 3 million bytes.

ARX used an SQL-like query language that supported the usual operators plus an exclusive OR (XOR). It also had the 'blitz' language to "make explicit that which was implicit in the data." The blitz language was interpreted and did not have variables. If you wished to store the results for later query retrieval, you had to add a column to the database to hold them. The query report allowed the stacking of fields, so that values from different columns could be placed vertically in a single returned row (a nice feature which I still miss). It also offered a free-form query result language which allowed the mixing of data and literals. ARX supported join and union functions. In fact, because of the necessity of segmenting databases, the join function was essential to ARX.

ARX had a "prune" capability where the user could select a subset of the whole and treat that as a database. The prune could be either virtual (in which case it would be updated as the underlying database was updated) or real (in which case it became a new database). ARX also had an "n-tuple to bit vector" transformation capability, which allowed data to be viewed down the vertical axis as opposed to the more conventional horizontal axis.
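The note above does not say how the "n-tuple to bit vector" transformation worked internally. One plausible reading, sketched here in Python purely as an illustration, is that each distinct value of a column gets a bit vector with one bit per row, so a query can combine whole columns with bitwise operators (including the XOR mentioned earlier). The function names `to_bit_vectors` and `matching_rows`, the sample rows, and the use of Python integers as bit vectors are all assumptions, not ARX's actual design.

```python
def to_bit_vectors(rows, column):
    """Map each distinct value in one column to an integer used as a bit vector.

    Bit i is set when row i holds that value (assumed representation).
    """
    vectors = {}
    for row_number, row in enumerate(rows):
        value = row[column]
        vectors[value] = vectors.get(value, 0) | (1 << row_number)
    return vectors


def matching_rows(bits):
    """Return the row numbers whose bits are set."""
    return [i for i in range(bits.bit_length()) if (bits >> i) & 1]


# Hypothetical sample data: (city, job) tuples.
rows = [
    ("Seattle", "Engineer"),
    ("Renton", "Planner"),
    ("Seattle", "Planner"),
]
city = to_bit_vectors(rows, 0)
job = to_bit_vectors(rows, 1)

# Rows that are in Seattle or are Planners, but not both.
xor_rows = matching_rows(city["Seattle"] ^ job["Planner"])
```

Viewed this way, a condition on a column becomes a single vertical bit vector, and AND, OR, and XOR between conditions reduce to one bitwise operation each.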

There was no multi-user access, and no relational or data integrity enforcement. ARX was pretty primitive in that regard.

Data was stored in a single file (the ARX file) of variable length records. Each record corresponded to a single attribute, with each occurrence of the attribute being appended to the existing record. The first record in the file was a header record which described the logical layout of the subsequent records. When a new value (occurrence) was added to a record, the entire record would be re-sorted so that the values were alphabetical from left to right.

Users' data was stored in a second variable length file. The header record simply pointed to the data file (the ARX file). Each record (corresponding to an Oracle row) consisted of comma separated pointers to occurrences of values within attributes (records in the ARX file). Thus users' data of 2,4,57,1 meant the 2nd occurrence of the 1st attribute, the 4th occurrence of the 2nd attribute, the 57th occurrence of the 3rd attribute, and the 1st occurrence of the 4th attribute. If data for a particular attribute was unique for each record, then that attribute was stored as a literal within the users' data and did not have a record in the ARX file. The whole goal was to have a particular significant value occur once and only once in the entire stored database.
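The two-file layout can be sketched in Python as follows. This is only an illustration under stated assumptions: the names `build_store` and `decode_row` are hypothetical, in-memory lists stand in for the variable length records, the sample data is invented, and the literal storage of unique-valued attributes is left out.

```python
from typing import List, Tuple


def build_store(rows: List[Tuple[str, ...]]):
    """Return (attribute_records, user_rows) for a list of equal-width rows."""
    width = len(rows[0])
    # One record per attribute: each distinct value stored once, in sorted order.
    attribute_records = [sorted({row[i] for row in rows}) for i in range(width)]
    # Each user row holds 1-based positions into the attribute records.
    user_rows = [
        [attribute_records[i].index(row[i]) + 1 for i in range(width)]
        for row in rows
    ]
    return attribute_records, user_rows


def decode_row(attribute_records, pointers):
    """Follow each pointer back to the value it names."""
    return tuple(attribute_records[i][p - 1] for i, p in enumerate(pointers))


# Hypothetical sample data: (city, job) rows.
rows = [
    ("Seattle", "Engineer"),
    ("Renton", "Planner"),
    ("Seattle", "Planner"),
]
attribute_records, user_rows = build_store(rows)
first = decode_row(attribute_records, user_rows[0])
```

Here "Seattle" and "Planner" each appear once in the attribute records even though two rows use them, which is the "once and only once" goal described above.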

The major shortcoming of this scheme is that when a new value is entered, the attribute record in the ARX file is sorted and the position of every occurrence (potentially) changes. Thus every record in the users' data has to be touched and potentially updated. It was for this reason that the database was limited to in-core processing. That was the only way to get a reasonable response time.
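A small Python sketch makes the cost concrete. It assumes, for illustration only, that an attribute record is a sorted list and that users' data rows hold 1-based positions into it; `insert_value` is a hypothetical name, and a sorted insertion stands in for the full re-sort ARX performed.

```python
import bisect


def insert_value(attribute_record, user_rows, column, value):
    """Add a value to a sorted attribute record and repair the pointers.

    Returns (pointer_for_value, number_of_user_rows_rewritten).
    """
    pos = bisect.bisect_left(attribute_record, value)
    if pos < len(attribute_record) and attribute_record[pos] == value:
        return pos + 1, 0  # value already stored; nothing moves
    attribute_record.insert(pos, value)
    new_pointer = pos + 1
    rewritten = 0
    # Every pointer at or beyond the insertion point now refers one slot later,
    # so every user row must be examined.
    for row in user_rows:
        if row[column] >= new_pointer:
            row[column] += 1
            rewritten += 1
    return new_pointer, rewritten


# Hypothetical data: one attribute (city) and three user rows pointing into it.
cities = ["Renton", "Seattle"]
user_rows = [[2], [1], [2]]
pointer, rewritten = insert_value(cities, user_rows, 0, "Everett")
```

Because "Everett" sorts ahead of both existing values, all three user rows have to be rewritten for a single insertion. On disk that would mean a pass over the whole users' data file per new value, which is consistent with the in-core restriction described above.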

There was no explicit commit. Data was committed by hitting the Return or Enter key. There were no redo logs or rollback segments. Writes from the memory version of the database to the physical version on disk were handled by the operating system. ARX consisted of 80+ modules (written in FORTRAN, I believe) that were executed from the VM/CMS command line.

Arnie Rhom wrote it. Tony Casserino promoted it heavily and I wrote the users' manual and ended up as the product sponsor. It was only used internally to the Boeing Company. It supported the on-line (TTY to IBM mainframe) telephone directory, the personnel system, the FAA reporting system and a multitude of other applications.

DISCLAIMER: After all these years, I cannot guarantee the accuracy of all the above statements. They are, to the best of my ability, an accurate statement of what I understood at the time.

