Is Korean an Altaic language?

Typologically: yes. Genetically, no.

Typology refers to the structure of a language and, as is well known, modern Korean shares grammatical characteristics with Japanese and Mongolian, as well as with more geographically distant “Altaic” languages such as Turkish. These include a basic subject-object-verb word order, polysyllabic root structure and suffix-based agglutinative morphology (the last being where grammatical particles and verb conjugations are directly attached to the end of, or come after, words).

However, the sense in which the Altaic language “family” was originally conceived, and is still commonly thought of, is as a genetic language group equivalent to Indo-European or Sino-Tibetan, implying that the associated Altaic languages share a hypothesized common ancestor, known as a “proto” language, in this case “proto-Altaic”.

Why can’t the Altaic languages be considered a genetic language family?

When defining a genetic language family, identifying basic vocabulary with shared etymologies (“cognate words”) between the candidate languages is stronger evidence than typological similarities in grammar (the primary shared characteristic of the Altaic languages). The fundamental weakness of the Altaic language hypothesis is simply that the languages involved do not share very much basic vocabulary at all.

How are language families determined and what is Korean if it is not Altaic?

The complete vocabulary (“lexicon”) of any modern language can be understood as having been built up in layers over time in a manner similar to archaeological strata. Any language may include layers of foreign vocabulary such that the given language as a whole becomes a mix of more than one language family (or, in the case of English for example, a mix of separate branches of the same Indo-European family). Of these layers, the genetic family with which a language is ideally associated is that of the oldest recoverable layer.

As is commonly known, the modern Korean language is in fact Sino-Korean and likely has been since the political formation of historical Korea. At least half of the lexicon is “borrowed” Classical Chinese and, on top of that, there is now much modern English vocabulary. The earliest Chinese layers may date to the period of the Han Commanderies, c. 108 BCE (currently a politically and historiographically sensitive topic in Korea), or rather to their subsequent downfall which, according to historical accounts, may have seen “Chinese” refugees enter the peninsula. Prior even to that, the harshness of the Qin dynasty was also said to have caused a refugee influx, conveniently resulting in the establishment of the Jinhan polity, but this latter may equally have been a fictitious Chinese claim. These layers were then followed more definitely by the introduction of Buddhism and Confucianism, which were transmitted in written Chinese, and later reinforced with the ascendancy of Neo-Confucianism from the 14th century onwards. Finally (so far) came early modern Sinic vocabulary, introduced first via Catholic missionaries active in China in the late 18th and early 19th centuries, and then in greater volume from Sino-Japanese during the late 19th and early 20th centuries. It should be noted that much of the vocabulary in these last strata consisted of direct translations of European biblical, ideological and technical terms rather than originally Chinese terms though, in a similar manner, much of the early Buddhist Chinese vocabulary was also translated or transliterated from the original Indic Buddhist languages. In North Korea there has been a further layer of imported Sinic vocabulary associated with Marxism, which would have first been introduced from Japan and later via Chinese.

However, the Korean part of Sino-Korean (aka the Korean language), what Koreans today refer to as “pure Korean” (순우리말 sun-uri-mal where, somewhat ironically, sun meaning “pure” is itself Chinese 純), understood as the earliest layer of the Korean language into which all the subsequent layers of Chinese were borrowed, can be identified as “Koreanic.” Thus there is a language family termed Koreanic of which only the Korean language survives. When there is only a single language attesting a language family, that language may be described as an “isolate”, so the modern Korean language is an isolate of Koreanic; historically, too, there are no other known Koreanic languages, that is, anything more distinct than regional dialectal variations.

Because Chinese is, of course, as traceably old as Korean, it is not impossible to argue that the Sino-Korean language is a Sinic language classifiable under the Sino-Tibetan language family; the Sinic vocabulary in Sino-Korean, together with that in Sino-Japanese, is useful in helping to reconstruct early Chinese phonology. However, the important thing in terms of taxonomy is that we can be certain that there was a prehistoric era when a Koreanic language directly ancestral to modern Korean was being spoken before it came into contact with ancient Chinese. What “pure Korean” nationalists today tend to misunderstand is that this period would have been much earlier than the formation of any “Korean” polity or cultural identity, and geographically limited to only a small region, possibly the southeast of the peninsula. Only a tiny minority of the ancestors of the post-Silla-expansion population of the peninsula would ever have spoken this ancient Koreanic tongue, whilst others, including the populations of Goguryeo and Baekje, would have spoken entirely different, quite likely non-Koreanic languages which would already have been infused with Chinese vocabulary before coming into contact with Koreanic. Historically, though, it was Koreanic which spread and either replaced or absorbed the other languages of the peninsula, such that it was Koreanic which borrowed Chinese and other vocabulary into its lexicon, meaning the oldest original stratum of the surviving Korean language is Koreanic and not Chinese. Of course, there would also have been regions and periods when Koreanic vocabulary was borrowed into other languages, and in those cases it would not have been the oldest stratum, but those languages or idioms ultimately perished or, for example, may have survived outside of the peninsula, as is potentially the case with Japanese. But, in any event, this is why it is reasonable to term the modern (Sino-)Korean language Koreanic.

The assumption, then, is that Koreanic would be a branch of the Altaic language family collateral to other Altaic language groups (Turkic, Mongolic and Tungusic), all descended from a single proto-Altaic language (as the earliest hypothesized recoverable layer). “Recoverable” means that the former existence of an extinct language can be confidently postulated and some basic vocabulary reconstructed: this is the work of comparative linguists who use the “comparative method” of linguistics to accomplish, or at least attempt, this.

How does the comparative method work?

To scientifically prove that two or more languages are descended from a common ancestor, it is not enough to simply find words which appear similar, although that tends to be the starting point; rather, consistent sound correspondences have to be established. The theory is that when two languages split and are subsequently isolated from one another, over time the pronunciation of certain sounds in each language will naturally evolve and change in different directions. The key phenomenon exploited by comparative linguists is that the sound changes are internally consistent within each language, so not just one word changes its pronunciation by chance but the same sounds (certain consonants or vowels in certain positions, for example) change in the same manner in all words in which they occur. Additionally, however, there are other changes or exceptions which may affect pronunciation, including the influence of secondary (or multiple) borrowings between genetically related languages which have split, and this all muddies the waters.
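The idea of a regular correspondence can be sketched in a few lines of code. The following is only a minimal illustration, not a real comparative analysis: the word pairs are well-known Latin and English cognates chosen for this example (they are not taken from the discussion above), spelling is used as a crude stand-in for actual sounds, and only the first sound of each word is compared. The function names and the threshold of two occurrences are likewise assumptions made for the sketch.

```python
from collections import Counter

# Illustrative data (an assumption for this sketch): familiar Latin/English
# cognate pairs. Spelling stands in for pronunciation, which is a simplification.
PAIRS = [
    ("pater", "father"),
    ("pes", "foot"),
    ("piscis", "fish"),
    ("tres", "three"),
    ("tu", "thou"),
    ("tenuis", "thin"),
]

# Letter sequences that should be treated as a single sound.
DIGRAPHS = ("th",)


def initial_sound(word):
    """Return the word-initial sound, treating listed digraphs as one unit."""
    for digraph in DIGRAPHS:
        if word.startswith(digraph):
            return digraph
    return word[0]


def tally_initial_correspondences(pairs):
    """Count how often each (sound in language A, sound in language B) pairing occurs."""
    return Counter((initial_sound(a), initial_sound(b)) for a, b in pairs)


def recurring(counts, min_count=2):
    """Keep only pairings that recur, since a single match may be chance."""
    return {pair: n for pair, n in counts.items() if n >= min_count}


counts = tally_initial_correspondences(PAIRS)
for (latin, english), n in sorted(recurring(counts).items()):
    print(f"Latin {latin}- : English {english}-  ({n} pairs)")
```

The point of the tally is that the pairing of Latin p- with English f- is not a one-off resemblance but recurs across several unrelated words, which is what distinguishes a sound correspondence from a chance look-a-like. Real work would of course use phonetic transcriptions, internally reconstructed forms and far more data.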

When trying to identify the sound laws dictating regular correspondences to other genetically related languages, the other secondary influences on a given pronunciation need to be accurately identified, mainly in order to disregard them. This understanding of the historical development of a language allows for “internal reconstruction” of its vocabulary; that is, before comparing a look-a-like cognate word in one language to another, it is necessary to establish as far back as possible the original shape of the word. Two words which happen to look similar in two languages today (even if the two languages are in fact related) may in the past have been quite different from one another and only come to appear similar by coincidental or secondary processes, in which case they cannot be considered indicative of a genetic relationship.

When attempting to identify potential cognates between two or more candidate languages, focus needs to be directed on basic vocabulary items as these are most likely to be the oldest parts of the language, whilst any more complex or conceptually abstract words are more likely to be new or borrowed from neighbouring languages. Basic vocabulary may include the numerals 1-9, body parts, weather, natural geographic features (river, mountain etc.), native flora and fauna and primary colours, but even in these cases there is often secondary borrowing from other languages so nothing is certain without rigorous investigation. Potential cognates should also have relatively similar meanings as otherwise it is simply too easy to find look-a-like words in other languages: for example, if the word for “tree” and word for “sea” are similar this is more likely to be a coincidence, but the words for “lake” and “sea” obviously could have evolved from whichever word referred to a body of water depending on whether the homeland of the proto language was beside a lake or ocean.

A key challenge in establishing genetic cognates is that it is ultimately very difficult to prove whether look-a-like words in two or more languages are the results of borrowing or genetic affinity. In fact, if words look too similar it should raise suspicion that they are borrowings as it implies they are, in relative terms, more recent and have had less time to change. For this reason, the better proof of a genetic relationship between languages comes through words which on the surface do not look alike but can still be connected through sound laws.

Aside from politics and racial theory, the reason it is useful to establish a genetic relationship between two surviving or historically recorded languages is that, through the theory of regular sound changes (which have to be identified), vocabulary from ancestral languages going back to a common proto ancestral language can be deduced and reconstructed. In this way the comparative method is a natural science from which predictions can be made; reconstructed vocabulary items, always marked with an asterisk * prefix in academic papers, are the predictions, which may ultimately be proven only through the discovery of ancient texts containing the older languages. By contrast, there is currently no productive or known theory relating to the typology of languages; it is simply descriptive.

The fundamental weakness of the Altaic hypothesis:

With this in mind, we can return to the idea of the genetic Altaic language hypothesis and why Korean cannot be classified as such. In short, there are two problems: one is that the original genetic relationship between the “core” Altaic language groups of Turkic, Mongolic and Tungusic has not been satisfactorily established through the comparative method, so there is no Altaic language family within which Koreanic could be included; the other is that Koreanic shares little to no basic vocabulary with any of the said core groups.

Turkic, Mongolic and Tungusic share much secondary vocabulary, most likely owing to areal contact (meaning they have interacted in close geographic proximity, allowing for the borrowing of vocabulary into one another’s languages). In particular, there is shared vocabulary between Turkic and Mongolic, and between Mongolic and Tungusic, but less so between Turkic and Tungusic, which is all indicative of the processes of areal contact rather than the three language groups having a shared genealogy. However, the corpora of proposed Altaic cognates have usually been built up on the premise that Turkic, Mongolic and Tungusic are equal candidate branches of Altaic; this means that when hypothesized Altaic cognates are sought in Koreanic, Japonic (Japanese) or other languages, there are three language sources from which to pick the most convenient look-a-like word. Through this “omni-comparative” approach, which untrained linguists tend to adopt and willing believers accept, many secondary borrowed items are misidentified as genetic cognates, but they are not supported by regular sound changes and so the situation remains that there is little proven shared basic vocabulary between the core language groups, and especially so with Koreanic.
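Why having three sources to pick from matters can be shown with some simple arithmetic. The sketch below rests on assumptions made purely for illustration: that a given Korean word has some fixed probability of resembling an unrelated word of similar meaning in any one source language by chance, that this probability is 5% (a hypothetical figure, not a measured one), and that the source languages can be treated as independent of one another.

```python
def chance_of_lookalike(p_single, n_sources):
    """Probability of at least one chance look-a-like across n independent sources."""
    return 1 - (1 - p_single) ** n_sources


# Hypothetical per-language probability of a chance resemblance,
# chosen only to illustrate the effect; it is not an empirical estimate.
P_SINGLE = 0.05

for n in (1, 2, 3):
    print(f"{n} source language(s): {chance_of_lookalike(P_SINGLE, n):.1%}")
```

Under these assumptions the chance of finding at least one convenient look-a-like rises from 5% with a single comparison language to roughly 14% with three, before any genuine relationship is involved. Whatever the true figures may be, widening the pool of candidate sources inflates the number of apparent matches, which is why recurring sound correspondences rather than raw counts of similar words are required as proof.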

Why, then, do the Altaic languages appear so seductively similar?!

Despite the lack of a genetic relationship, there has been a close cultural relationship and long early history of interaction between the speakers of the Altaic languages, which are now spread in an expansive arc across the central Eurasian steppe. Intensive borrowing of vocabulary between the core languages (as mentioned, particularly between Turkic and Mongolic, and Mongolic and Tungusic, but not Turkic and Tungusic) and their similar grammatical structures tell us that their homelands were once in closer proximity. From relatively early on (by the 1930s) this has generally been agreed to have been around the region of southern Manchuria, and not the Altai mountains after which the proposed language family was evocatively named when it was initially suggested to have originated from the central area of its contemporary known spread (that is, at a time when the Uralic languages were also thought to be a part of the Altaic complex; see more on this in the next post).

However, once more in contrast, Koreanic shares very little borrowed vocabulary, indicating that it was isolated from the other proposed Altaic languages particularly early on; there may have been later interaction with the Tungusic Jurchenic branch (ancestral to Manchu) during the Three Kingdoms period of Korean history (assuming Koreanic was the primary language of Silla and Jurchenic was spoken to the north in the continental territory of Goguryeo) and, as a much more secondary and ultimately quite limited influence, historical contact with Mongolian under the Yuan dynasty. Otherwise Koreanic appears to have evolved and survived in relative isolation.

So, it must have been at a prehistoric stage of settlement in the peninsula that Koreanic speakers interacted with those speaking other languages of the Altaic typology. It is not known under what circumstances this occurred; for example, it may have been that Koreanic entered into the peninsula from the north having previously passed through, or evolved alongside, Altaic languages in southern Manchuria; or there may have been other Altaic type languages, subsequently lost, spread across the peninsula which came into contact with Koreanic in its historical southeastern homeland region.

It is not unreasonable to postulate and, in fact, vital to keep in mind that aside from the historically known languages there would have been many other languages and language families which were absorbed or forced into extinction by the languages which subsequently survived. Travelling back in time, the linguistic map would not become reduced to only the few proto-languages that are discussed today: it would be just as complex as ever with any number of languages we have no knowledge of having existed alongside the proto-languages we do know of. The proto-languages discussed today are not the oldest languages, only the oldest recoverable layers of known languages; they in turn belonged to earlier language families which, where necessary, can be termed “pre-proto”.

One explanation for why so many distinct language families arose in the region of Manchuria may be the number of river basins which could support the development of several cultures whilst allowing for their independence; the spread of the Altaic languages, particularly Turkic and Mongolic, westwards would have been enabled through the adoption of nomadic pastoralism which first required horses and was greatly enhanced with the introduction of stirrups that allowed for mounted archery and success in warfare.

The “out of Manchuria” expansion of the languages was subsequent to the initial development of their Altaic typologies, but most of the borrowing of vocabulary would have occurred in the context of historically known interaction during and after the expansion, for example between the Turkic-speaking Xiongnu and the Mongolic Xianbei, when the Xiongnu occupied what later became known as the Mongol steppe and the Xianbei were in the Liaoxi (遼西 “west of the Liao river”) region directly adjacent to their east. Koreanic speakers were more isolated in the peninsula and so there was less borrowing.

How is it that languages could interact enough to influence one another’s structures without imparting vocabulary?

In short, the rules are not known, but interaction (“areal contact”) can occur between languages in any number of different ways depending on such factors as the relative ratios of the populations involved and their respective stages of cultural and political development. Areal contact of languages is directly related to the concept of layers, where one language will expand over another, or may itself survive under an expansion; obvious examples occur in the case of invasions and colonization, but whether it is an entire population expanding over less populated or less politically developed regions, or an elite takeover of an otherwise established civilization, will make a difference to how the languages interact.

A pertinent example of a language being influenced typologically but not lexically (the borrowing of vocabulary) is Mandarin Chinese which, as a northern variant of Chinese that came to dominance during the Manchu Qing dynasty, exhibits many Altaic features absent from other Sino-Tibetan languages, namely more polysyllabic vocabulary, fewer tones and greater use of suffix-based morphology (“morphology” referring to the shape of words) implying it has undergone a process of partial “Altaicization” without absorbing new vocabulary.

The following post will discuss the relationship between Korean and Japanese as well as the historical context of the Altaic hypothesis and reasons for its enduring popularity amongst Koreans today.