Comparative linguistics (originally comparative philology) is a branch of historical linguistics that is concerned with comparing languages in order to establish their historical relatedness. Languages may be related by convergence through borrowing or by genetic descent.
Genetic relatedness implies a common origin or proto-language, and comparative linguistics aims to construct language families, to reconstruct proto-languages and specify the changes that have resulted in the documented languages. In order to maintain a clear distinction between attested and reconstructed forms, comparative linguists prefix an asterisk to any form that is not found in surviving texts.
The fundamental technique of comparative linguistics is to compare the phonological systems, morphological systems, syntax and lexicon of two or more languages using a technique known as the comparative method. In principle, every difference between two related languages should be explicable to a high degree of plausibility, and systematic changes, for example in phonological or morphological systems, are expected to be highly regular (i.e. consistent). In practice, the comparison may be more restricted, e.g. just to the lexicon. In some methods it may be possible to reconstruct an earlier proto-language. Although the proto-languages reconstructed by the comparative method are hypothetical, a reconstruction may have predictive power. The most notable example of this is Saussure's proposal that the Indo-European consonant system contained laryngeals, a type of consonant attested in no Indo-European language known at the time. The hypothesis was vindicated with the discovery of Hittite, which proved to have exactly the consonants Saussure had hypothesized in the environments he had predicted.
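The expectation of regular correspondences can be illustrated with a small tally. The Python sketch below counts how often each pair of word-initial letters co-occurs in five well-known Latin and English cognate pairs. The word list, the function name `initial_correspondences` and the use of spelling in place of a phonemic transcription are all assumptions made for illustration; an actual application of the comparative method aligns whole sound sequences and weighs far more evidence.

```python
from collections import Counter

# Well-known Latin/English cognate pairs, given in ordinary spelling.
# Spelling stands in for phonemic transcription here (a simplification).
COGNATES = [
    ("pater", "father"),
    ("piscis", "fish"),
    ("pes", "foot"),
    ("duo", "two"),
    ("decem", "ten"),
]


def initial_correspondences(pairs):
    """Count how often each pair of word-initial letters co-occurs."""
    return Counter((a[0], b[0]) for a, b in pairs)


counts = initial_correspondences(COGNATES)
for (latin, english), n in counts.most_common():
    print(f"Latin {latin} : English {english}  ({n} pairs)")
```

On this toy list, Latin p lines up with English f three times and Latin d with English t twice. Recurrence of this kind, rather than surface resemblance between individual words, is what the method treats as evidence.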
Where languages are derived from a very distant ancestor, and are thus more distantly related, the comparative method becomes impracticable. In particular, attempting to relate two reconstructed proto-languages by the comparative method has not generally produced results that have met with wide acceptance. A number of methods based on statistical analysis of vocabulary, such as lexicostatistics and mass comparison, have been developed to overcome this limitation. The former uses lexical cognates, like the comparative method, but the latter uses only lexical similarity. The theoretical basis of such methods is that vocabulary items can be matched without a detailed language reconstruction and that comparing enough vocabulary items will negate individual inaccuracies. Thus they can be used to determine relatedness but not to determine the proto-language.
The earliest method of language comparison was the comparative method, which was developed over many years, culminating in the nineteenth century. It uses a long word list and detailed study. However, it has been criticized, for example as being subjective. In the twentieth century an alternative method, lexicostatistics, was developed; it is mainly associated with Morris Swadesh but is based on earlier work. It uses a short word list of basic vocabulary in the various languages for comparisons. A further method, developed by Joseph Greenberg, is mass comparison. It has also been criticized, mainly because lexical comparison is considered to be less fundamental than cognate comparison. Recently, computerised statistical hypothesis testing methods have been developed which are related to both the comparative method and lexicostatistics.
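The core measurement in lexicostatistics, the share of a basic word list that is cognate between two languages, can be sketched as follows. This is a minimal illustration, not Swadesh's procedure: the four-item word list, the dictionary layout (meaning mapped to a cognate-class label) and the function name `shared_cognate_fraction` are assumptions introduced here. The cognate judgments reflect the usual view that English hand, water and two are cognate with German Hand, Wasser and zwei, while dog and Hund are not.

```python
def shared_cognate_fraction(lang_a, lang_b):
    """Return the fraction of shared meanings whose words fall in the same cognate class.

    Each argument maps a meaning to a cognate-class label; two words count
    as cognate when their labels are equal.
    """
    meanings = lang_a.keys() & lang_b.keys()
    if not meanings:
        raise ValueError("the two word lists have no meanings in common")
    same = sum(1 for m in meanings if lang_a[m] == lang_b[m])
    return same / len(meanings)


# Toy word lists: labels are arbitrary identifiers for cognate classes.
english = {"hand": "A", "water": "A", "two": "A", "dog": "A"}
german = {"hand": "A", "water": "A", "two": "A", "dog": "B"}

fraction = shared_cognate_fraction(english, german)
print(f"shared cognates: {fraction:.0%}")
```

Note that the cognate labels themselves have to come from prior linguistic judgment; the computation only summarises them.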
An outgrowth of lexicostatistics is glottochronology, which proposed a mathematical formula for establishing the date when two languages separated, based on the percentage of a core vocabulary of 100 (earlier 200) items that are cognate in the languages being compared. Glottochronology has met with continued scepticism, and is seldom applied today. Even more controversial is mass lexical comparison, which disavows any ability to date developments, aiming simply to show which languages are more and less close to each other, in a method similar to those used in cladistics in evolutionary biology. However, since mass comparison eschews the use of reconstruction and other traditional tools, it is flatly rejected by the majority of historical linguists.
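The text does not give the glottochronological formula itself. The form usually cited is t = ln(c) / (2 ln(r)), where c is the fraction of the core vocabulary that is cognate, r is the assumed fraction of core vocabulary a language retains per millennium, and t is the separation time in millennia. The sketch below assumes that form and uses 0.86 as the default retention rate, a value commonly quoted for the 100-item list; both the formula and the constant are assumptions brought in for illustration, and the function name `divergence_time` is hypothetical.

```python
import math


def divergence_time(shared_fraction, retention_rate=0.86):
    """Estimate separation time in millennia as ln(c) / (2 * ln(r)).

    shared_fraction (c): fraction of core vocabulary that is cognate, in (0, 1].
    retention_rate (r): assumed fraction retained per millennium, in (0, 1).
    """
    if not 0.0 < shared_fraction <= 1.0:
        raise ValueError("shared_fraction must be in (0, 1]")
    if not 0.0 < retention_rate < 1.0:
        raise ValueError("retention_rate must be in (0, 1)")
    return math.log(shared_fraction) / (2.0 * math.log(retention_rate))


# If each language independently retains 86% per millennium, two languages
# separated for one millennium are expected to share 0.86 * 0.86 of the list.
print(divergence_time(0.86 ** 2))
```

The factor of 2 reflects the assumption that both languages have been losing vocabulary independently since the split. The result depends heavily on the assumed constant rate of retention.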
Recently, more sophisticated tree- and network-based cladistic methods have been used to investigate the relationships between languages. These are considered by many to show promise but are not wholly accepted.
Such vocabulary-based methods can establish only degrees of relatedness and cannot be used to derive the features of a proto-language, apart from the fact of the shared items of compared vocabulary. These approaches have been challenged for their methodological problems: without a reconstruction, or at least a detailed list of phonological correspondences, there can be no demonstration that two words in different languages are cognate. However, lexical methods can be validated statistically and by their consistency with independent findings of history, archaeology and population genetics.
There are other branches of linguistics that involve comparing languages, which are not, however, part of comparative linguistics:
There is also a wide body of publications containing language comparisons that are considered pseudoscientific by linguists; see pseudoscientific language comparison.