Cardiff University | Prifysgol Caerdydd ORCA
Online Research @ Cardiff 

The classification of programming languages by usage

Doyle, John R. and Stretch, D. D. 1987. The classification of programming languages by usage. International Journal of Man-Machine Studies 26 (3) , pp. 343-360. 10.1016/S0020-7373(87)80068-8

Full text not available from this repository.


Relationships between 16 programming languages have been investigated using data from 1062 U.K. software firms. The number of firms which use both of a given pair of languages is recorded for all pairings of the 16 languages. Above average co-occurrence of a pair is taken as evidence of relationship between the two languages. Alternatively, the number of firms which use neither of a given pair of languages is recorded for all pairs of languages. The two methods of deriving similarity matrices we call the AND analysis (relationship by co-occurrence) and NOR (relationship by co-absence), by analogy with the Boolean operators. The AND and NOR similarity matrices first undergo separate quasi Chi-square fits to remove the size-contributions; the residuals (observed minus expected values) are then used as the raw input to a simple hierarchical clustering algorithm. Separate AND and NOR analyses reveal a consistent picture of inter-language relationships. Subjectively labelled, the broadest dichotomy seems to be between traditional languages, quite often considered clumsy (such as BASIC, COBOL, FORTRAN, Assembler…) and more modern, elegant languages (such as the Algol family and APL). Business vs scientific seems to be a secondary dichotomy. Dependence and dominance relationships can be examined by an XOR analysis: counting when one language of a pair is used while the other is not. Relative dominance (when the size-effect has been removed) is modelled by a simple directed graph, with five sub-groups of languages as the nodes. Some other similarity measures that might be used to relate programming languages are discussed in the Introduction, any of which may contribute to similarity by usage. Finally, the general method of analysis is applicable to many different situations in which binary data about co-occurrence of events is gathered across a large number of elements.
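The pipeline described in the abstract (pairwise AND, NOR and XOR counts, removal of the size effect, then clustering on the residuals) can be illustrated with a minimal sketch. Several details here are assumptions rather than the authors' procedure. The abstract does not specify the "quasi Chi-square fit", so the expected count for a pair is taken from a plain independence model (number of firms times the product of the two marginal usage rates). The "simple hierarchical clustering algorithm" is stood in for by greedy average-linkage merging. The eight-firm data set and the function names `pair_counts`, `residuals` and `cluster` are hypothetical and exist only for illustration.

```python
from itertools import combinations

# Hypothetical toy data: one set of languages per firm (NOT the paper's data).
firms = [
    {"COBOL", "BASIC"}, {"COBOL", "BASIC"}, {"COBOL", "BASIC"},
    {"Pascal", "APL"}, {"Pascal", "APL"}, {"Pascal", "APL"},
    {"BASIC", "Pascal"}, {"COBOL"},
]

def pair_counts(usage):
    """Count AND (both used), NOR (neither used) and directed XOR per pair."""
    langs = sorted(set().union(*usage))
    and_obs, nor_obs, xor_obs = {}, {}, {}
    for a, b in combinations(langs, 2):
        and_obs[(a, b)] = sum(1 for f in usage if a in f and b in f)
        nor_obs[(a, b)] = sum(1 for f in usage if a not in f and b not in f)
        # XOR is kept directional: (a, b) means "a used while b is not".
        xor_obs[(a, b)] = sum(1 for f in usage if a in f and b not in f)
        xor_obs[(b, a)] = sum(1 for f in usage if b in f and a not in f)
    return langs, and_obs, nor_obs, xor_obs

def residuals(usage, langs, observed, absent=False):
    """Observed minus expected counts under an ASSUMED independence model."""
    n = len(usage)
    rate = {l: sum(1 for f in usage if l in f) / n for l in langs}
    if absent:  # for NOR, use the rate of NOT using each language
        rate = {l: 1 - r for l, r in rate.items()}
    return {(a, b): obs - n * rate[a] * rate[b]
            for (a, b), obs in observed.items()}

def cluster(langs, sim):
    """Greedy average-linkage merging on a similarity table (an assumption)."""
    clusters = [(l,) for l in langs]
    merges = []

    def avg(c1, c2):
        vals = [sim[tuple(sorted((a, b)))] for a in c1 for b in c2]
        return sum(vals) / len(vals)

    while len(clusters) > 1:
        # Pick the pair of clusters with the highest average residual.
        i, j = max(combinations(range(len(clusters)), 2),
                   key=lambda ij: avg(clusters[ij[0]], clusters[ij[1]]))
        merges.append((clusters[i], clusters[j]))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return merges

langs, and_obs, nor_obs, xor_obs = pair_counts(firms)
and_res = residuals(firms, langs, and_obs)
nor_res = residuals(firms, langs, nor_obs, absent=True)
merges = cluster(langs, and_res)

for left, right in merges:
    print(left, "+", right)
```

On this toy data the AND residuals pair APL with Pascal and BASIC with COBOL before joining the two groups. The directed XOR counts (`xor_obs[("Pascal", "APL")]` against `xor_obs[("APL", "Pascal")]`) give the raw asymmetry on which a dominance analysis would build. Passing `nor_res` to `cluster` instead runs the co-absence variant.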

Item Type: Article
Date Type: Publication
Status: Published
Schools: Business (Including Economics)
Subjects: H Social Sciences > H Social Sciences (General)
Q Science > QA Mathematics > QA75 Electronic computers. Computer science
Publisher: Elsevier
ISSN: 0020-7373
Last Modified: 05 Nov 2019 03:30

Citation Data

Cited 6 times in Scopus.
