Languages change over time. Eventually, they may change so much that no similarity to the original remains. Estimates vary, but one commonly cited opinion is that if a group of Americans were sent to a distant galaxy, after 10,000 years their descendants would be speaking a language no more similar to English than to Chinese or Arabic. However, other work (e.g. by Isidore Dyen and Sergei Starostin) indicates that words have widely differing expected life spans: a specialized word like "goshawk" might on average last a mere millennium or two, whereas extremely common words like "I" and "you" last so long that their life span cannot even be estimated without reconstructions going further back in time than those that are universally accepted.
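The contrast between short-lived and long-lived words can be illustrated with a small calculation. The sketch below assumes, purely for illustration, that each word has a fixed chance of surviving each millennium; the three rates are hypothetical values, not figures from Dyen or Starostin.

```python
def retained_fraction(rate_per_millennium, millennia):
    """Expected fraction of words still in use, assuming a constant
    per-millennium survival rate (a simplifying assumption)."""
    return rate_per_millennium ** millennia


# Hypothetical survival rates per millennium, chosen only for illustration.
HYPOTHETICAL_RATES = {
    "specialized word (e.g. 'goshawk')": 0.60,
    "average word": 0.85,
    "very common word (e.g. 'I', 'you')": 0.99,
}

for label, rate in HYPOTHETICAL_RATES.items():
    share = retained_fraction(rate, 10)  # 10 millennia = 10,000 years
    print(f"{label}: {share:.1%} expected to survive 10,000 years")
```

Under these assumed rates, almost none of the specialized words would survive 10,000 years, while most of the very common ones would. This is the sense in which a single cut-off date for all vocabulary can mislead.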
Historical linguists construct family trees, an idea pioneered by the 19th-century historical linguist August Schleicher. The basis for the trees is the comparative method: languages presumed to be related are compared with one another, and linguists look for regular sound correspondences, guided by what is generally known about how languages can change. They then use these correspondences to reconstruct the best hypothesis about the common ancestor language from which the attested languages are descended.
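The search for regular sound correspondences can be sketched in a few lines of code. The example below tallies how the first sound of some well-known Latin and English cognates lines up. It is a toy illustration only: it works on spellings rather than phonetic transcriptions, looks only at word-initial position, and the helper names (`initial_sound`, `initial_correspondences`) are invented for this sketch.

```python
from collections import Counter

# Well-known Latin / English cognate pairs, given in ordinary spelling.
COGNATES = [
    ("pater", "father"),
    ("pes", "foot"),
    ("piscis", "fish"),
    ("tres", "three"),
    ("tu", "thou"),
    ("tenuis", "thin"),
    ("cornu", "horn"),
    ("centum", "hundred"),
    ("cor", "heart"),
]

# Spellings that stand for a single sound and must not be split.
DIGRAPHS = ("th",)


def initial_sound(word):
    """Return the first sound of a word, treating listed digraphs as one unit."""
    for digraph in DIGRAPHS:
        if word.startswith(digraph):
            return digraph
    return word[0]


def initial_correspondences(pairs):
    """Count how often each pairing of initial sounds occurs across the pairs."""
    return Counter((initial_sound(a), initial_sound(b)) for a, b in pairs)


counts = initial_correspondences(COGNATES)
for (latin, english), n in counts.most_common():
    print(f"Latin {latin}- : English {english}-  ({n} pairs)")
```

What matters is that each pairing recurs: Latin p- lines up with English f-, t- with th-, and c- with h- across several unrelated meanings. A recurring pattern of this kind, rather than any single lookalike, is what counts as evidence of common descent.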
Use of the comparative method is validated by its application to languages whose common ancestor is known. Thus, when the method is applied to the Romance languages (which include French, Spanish, Portuguese, Italian, and Romanian), the reconstructed common ancestor language comes out rather similar to Latin: not the Classical Latin of Horace and Cicero, but Vulgar Latin, the colloquial Latin spoken in various dialects in the late Roman Empire.
The comparative method can be used to reconstruct languages for which no written records exist, either because none were preserved or because the speakers did not write. Thus, the Germanic languages (which include German, Dutch, English, Norwegian, Swedish, Danish, Faroese, Icelandic, and the extinct Gothic) can be compared to reconstruct Proto-Germanic, a language that was probably contemporaneous with Latin and for which no records are preserved.
Germanic and Latin (more precisely, Proto-Italic, the ancestor of Latin and a few of its neighbors) are themselves related, being co-descended from Proto-Indo-European, spoken perhaps 5,000 years ago. Scholars have reconstructed Proto-Indo-European on the basis of data from its daughter branches: Germanic, Italic, Celtic, Greek, Baltic, Slavic, Albanian, Armenian, Indo-Iranian, and the two extinct branches Tocharian and Anatolian.
The comparative method allows us to distinguish true linguistic descent (that is, the passing of a language from parents to children, down through the generations) from resemblance due to cultural contact. For example, a large share of the vocabulary of Persian (Farsi) is taken from Arabic, as a result of the Arab conquest of Iran in the 7th century and much subsequent cultural contact. Yet Persian is Indo-European, being a member of the Indo-Iranian branch that also includes Sanskrit and many of the languages of modern India. The clue that Persian is Indo-European is that its core vocabulary generally has Indo-European cognates (as in mâdar 'mother'), and its essential grammatical elements are likewise Indo-European (as in bûd 'was', which includes elements related to English "be" and the English past tense ending "-ed").
The comparative method has been successfully used to reconstruct some very large language families, notably Austronesian (which includes Hawaiian, Tagalog, Indonesian, and Malagasy) and Niger-Congo (the majority of the languages of modern Africa). Once the various changes in the daughter branches have been worked out, and a fair amount of the core vocabulary and grammar of the protolanguage is understood, scholars will quite generally agree that a genetic relationship has been established.
Vastly more controversial are hypotheses about relatedness which are not supported by application of the comparative method. Scholars who attempt to probe deeper than the comparative method supports (for example, by tabulating similarities found by mass comparison without setting up sound correspondences) are often accused of scholarly wishful thinking. The problem is that any two languages have a huge number of opportunities to resemble one another just by accident, so merely pointing out isolated resemblances has little evidentiary value. A famous example is the Persian word for "bad", which is pronounced (more or less) just like English "bad". It can be shown that the resemblance between these two words is completely accidental, and has nothing to do with the (rather remote) genetic connection between English and Persian. For further examples, see False cognate.
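The point about accidental resemblance can be made concrete with a little probability. The sketch below assumes that each compared word pair has some small, independent chance of looking alike by accident; the figures used (1,000 word pairs, a 1% chance per pair) are hypothetical and chosen only to show the shape of the argument.

```python
from math import comb


def expected_matches(n_pairs, p_match):
    """Expected number of chance lookalikes among n_pairs comparisons."""
    return n_pairs * p_match


def prob_at_least(k, n_pairs, p_match):
    """Binomial probability of k or more chance lookalikes,
    assuming every comparison is independent (a simplification)."""
    return sum(
        comb(n_pairs, i) * p_match**i * (1 - p_match) ** (n_pairs - i)
        for i in range(k, n_pairs + 1)
    )


# Hypothetical figures for illustration only.
N_PAIRS = 1000   # word pairs compared between two unrelated languages
P_MATCH = 0.01   # assumed chance that any one pair looks alike by accident

print(f"Expected chance lookalikes: {expected_matches(N_PAIRS, P_MATCH):.1f}")
print(f"P(at least one):  {prob_at_least(1, N_PAIRS, P_MATCH):.4f}")
print(f"P(at least five): {prob_at_least(5, N_PAIRS, P_MATCH):.4f}")
```

Under these assumptions, a handful of striking lookalikes is what one should expect even between unrelated languages. This is why isolated resemblances carry so little weight compared with recurring sound correspondences.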
Since supporting distant genetic relationships is so difficult, and the methodology for finding and proving such relationships is not well established (in the way that the comparative method is), the field of locating remote relationships is riven with scholarly controversy. Nevertheless, the temptation to pursue remote relationships remains a powerful lure to many scholars. After all, Proto-Indo-European must have seemed a rather wild hypothesis to many when it was first proposed.
The ultimate in remote reconstruction is the recovery of a Proto-World language. Not all scholars believe that such a language necessarily existed. Moreover, it is difficult to reconcile Proto-World with what we know about prehistory. Pat Ryan and Joseph H. Greenberg have suggested that people coming out of northeast Africa around 50,000 BC spoke Proto-World. But that conflicts with the claim that no relationship would be recognizable after 10,000 years: if that figure is accurate and all languages are nonetheless observably related, the relationship must somehow have formed more recently.
A Dené-Caucasian family has also been postulated, comprising Na-Dené (North America), Sino-Tibetan, Ket (Siberia), Burushaski (Pakistan), Caucasian (Chechen and the languages of Dagestan), and Basque. This language family is highly speculative.
The Nostratic hypothesis was proposed by the Danish linguist Holger Pedersen in 1903. It claims that the Nostratic grouping includes such widely ranging language families as Indo-European, Afro-Asiatic, Uralic, Altaic, Sumerian, Elamo-Dravidian, and Kartvelian; other proponents include different sets of languages. Some have speculated that the Nostratics were refugees from a Black Sea flood of around 5600 BC, and some think this is the origin of the biblical story of Noah's Flood. However, linguists have reached no firm conclusion about the validity of the Nostratic hypothesis. Its proponents, unlike Greenberg, use the traditional comparative method. Critics, however, often describe their comparisons as far-fetched or as involving too many semantic shifts, and some accuse them of simply grouping together the language families most familiar to them while neglecting to compare each of those families with others further afield.
See also Language families and languages.