Latin alphabet

The Latin alphabet, sometimes called the Roman alphabet or by the name of one of the languages that uses it (e.g. English alphabet), is the most widespread writing system in the world and is based on the alphabet used by the Ancient Romans to write Latin. Apparently it would technically be more proper to refer to it as “Latin script” (unless you are literally referring to the orthography of the Latin language), but I’m more accustomed to referring to it as an alphabet. “Script” to me is more about how you form the letters, as in Uncial, Carolingian, Fraktur…

The Latin alphabet was based on the Etruscan alphabet, which itself was based on the Greek alphabet, which in turn was based on Phoenician. In the Archaic Latin alphabet (used before the 3rd century BCE) there were only 21 letters (from the standpoint of our modern alphabet, it was missing G, J, U, W and Y). K and Z were rarely used, while C stood in for both /k/ and /g/ sounds (probably because Etruscan itself had no voicing distinction for velar consonants). In the 3rd century BCE, Latin removed the letter Z and added the letter G in its place (i.e. seventh in the alphabet), forming it by adding a little bar to C. Once G was added, Latin could distinguish C /k/ from G /g/ in spelling.

Rome conquered Greece in the second century BCE, and in the first century BCE Latin borrowed/reborrowed the letters Y and Z, which it added to the end of the alphabet. As such we arrive at the Classical Latin alphabet, with 23 letters (missing our modern J, U and W).

During the Middle Ages, W was added to the alphabets of the West Germanic languages (before that it was thought of as two consecutive Vs), and spread to a number of neighbouring languages from there. During the Renaissance, the convention was established of I and U denoting vowels, and J and V denoting consonants (previously I/J and U/V had been thought of as alternative ways of writing the same letter). As such we arrive at the “basic” 26-letter Latin alphabet.
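The letter counts above (21, then 23, then 26) can be checked with a small sketch. The function names are made up for illustration, and the exact order of the archaic inventory is my assumption, chosen so that G ends up in the slot Z vacates; only the letters themselves come from the description above.

```python
import string

# Archaic inventory as described above: no G, J, U, W or Y, and Z still
# in its early slot. The ordering here is an assumption for illustration.
ARCHAIC = list("ABCDEFZHIKLMNOPQRSTVX")

def replace_z_with_g(letters):
    """3rd century BCE: Z is dropped and G takes the slot it vacated."""
    return ["G" if ch == "Z" else ch for ch in letters]

def add_greek_letters(letters):
    """1st century BCE: Y and Z are appended to the end of the alphabet."""
    return letters + ["Y", "Z"]

def add_later_letters(letters):
    """Middle Ages and Renaissance: W, J and U become letters in their own right."""
    return sorted(letters + ["J", "U", "W"])

classical = add_greek_letters(replace_z_with_g(ARCHAIC))
modern = add_later_letters(classical)

print(len(ARCHAIC), len(classical), len(modern))
print("".join(modern) == string.ascii_uppercase)
```

Sorting in the last step is just a shortcut that happens to work because the modern order is plain A to Z; it does not model how each letter historically found its position.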

There are, however, a number of other letters that are, or have been, used in various languages’ Latin-based alphabets. For example:

  • Old English used wynn ⟨Ƿ ƿ⟩, eth ⟨Ð ð⟩, and thorn ⟨Þ þ⟩, replaced today by W (for wynn) and “th” (for eth and thorn). Middle English also adopted the Irish letter yogh ⟨Ȝ ȝ⟩, replaced today by “gh”.
  • Icelandic still uses eth ⟨Ð ð⟩ and thorn ⟨Þ þ⟩, and Faroese still uses eth ⟨Ð ð⟩.
  • The ligature ash ⟨Æ æ⟩ is considered an independent letter in Norwegian, Danish, Icelandic and Faroese.
  • The ligature eszett ⟨ß⟩ is used and considered its own letter in German.
  • Several African languages use additional letters, often based on the letters of the International Phonetic Alphabet, like ⟨Ɛ ɛ⟩, ⟨Ŋ ŋ⟩ and ⟨Ɔ ɔ⟩.
  • Turkish, Kazakh and Azerbaijani separate ⟨I i⟩ out into dotted ⟨İ i⟩ and dotless ⟨I ı⟩ forms.
  • Azerbaijani also includes the letter ⟨Ə ə⟩, which is used to represent the phoneme /æ/ (not schwa – just to be confusing!).
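The dotted/dotless split in the list above is a classic trap for software, because generic case conversion pairs ⟨i⟩ with ⟨I⟩. Below is a minimal sketch of casing that follows the Turkish-style pairing instead. The function names `turkic_upper` and `turkic_lower` are hypothetical helpers of my own, not part of any library; Python's built-in `str.upper()` and `str.lower()` are not locale-aware, so the sketch swaps the four letters by hand before calling them.

```python
def turkic_upper(text: str) -> str:
    """Uppercase using the dotted/dotless pairing: i -> İ and ı -> I."""
    # Handle the two i's first, then let str.upper() do everything else.
    return text.replace("i", "İ").replace("ı", "I").upper()

def turkic_lower(text: str) -> str:
    """Lowercase using the dotted/dotless pairing: İ -> i and I -> ı."""
    # Replace İ before I so the two capital forms are never confused.
    return text.replace("İ", "i").replace("I", "ı").lower()

# Generic casing loses the distinction: both lowercase forms become "I".
print("i".upper(), "ı".upper())

# The Turkish-style helpers keep the pairs apart.
print(turkic_upper("istanbul"), turkic_upper("ısı"))
print(turkic_lower("İSTANBUL"), turkic_lower("ISI"))
```

In real code a proper locale-aware library would be the safer choice; this only illustrates why the four forms need to be treated as two separate letters.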

Many languages also count as separate letters graphemes which are basically “one of Latin’s original letters with something added”. For example, in Spanish Ñ is a separate letter from N, with a separate section in the dictionary and everything (while its vowels with accents are not separate letters). Some languages also consider certain digraphs to be their own letters – to use Spanish examples again, traditionally ⟨ch⟩ and ⟨ll⟩ were considered their own letters, but Spanish basically gave up on this with the advent of the computer age and computerised alphabetical sorting.
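To make the sorting point concrete, here is a rough sketch of two Spanish sort keys: one where only Ñ is a separate letter, and one that also treats ⟨ch⟩ and ⟨ll⟩ as letters, placed after C and L respectively. The names `MODERN`, `TRADITIONAL` and `sort_key` are invented for this example, and real software would normally lean on a proper collation library rather than anything like this.

```python
import unicodedata

# Ñ sits between N and O in both orderings.
MODERN = list("abcdefghijklmnñopqrstuvwxyz")
# Traditional ordering: the digraphs ch and ll count as letters of their own.
TRADITIONAL = (list("abc") + ["ch"] + list("defghijkl") + ["ll"]
               + list("mnñopqrstuvwxyz"))

def fold_accents(word):
    """Lowercase and drop accents from vowels, but keep ñ intact."""
    out = []
    for ch in unicodedata.normalize("NFC", word.lower()):
        if ch == "ñ":
            out.append(ch)
        else:
            decomposed = unicodedata.normalize("NFD", ch)
            out.append("".join(c for c in decomposed
                               if not unicodedata.combining(c)))
    return "".join(out)

def sort_key(word, alphabet):
    """Split a word into letters of `alphabet`, preferring digraphs, and rank them."""
    rank = {letter: i for i, letter in enumerate(alphabet)}
    text = fold_accents(word)
    key, i = [], 0
    while i < len(text):
        pair = text[i:i + 2]
        if len(pair) == 2 and pair in rank:
            key.append(rank[pair])
            i += 2
        else:
            # Characters outside the alphabet sort after every letter.
            key.append(rank.get(text[i], len(alphabet)))
            i += 1
    return key

words = ["nube", "ñu", "nadar", "chico", "cuna", "dedo", "llama", "luz", "árbol"]
print(sorted(words, key=lambda w: sort_key(w, MODERN)))
print(sorted(words, key=lambda w: sort_key(w, TRADITIONAL)))
```

Under the modern key "chico" comes before "cuna" and "llama" before "luz"; under the traditional key both pairs flip, which is exactly the behaviour that plain letter-by-letter sorting cannot reproduce without knowing about digraphs.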

While many letters in the Latin alphabet mean relatively consistent things across different languages (like ⟨L l⟩ or ⟨P p⟩), some are much more ambiguous. Some of the most ambiguous ones include:

  • ⟨C c⟩ was originally used in Latin for /k/, but it palatalised before /e i/ in the Romance languages, where it can variously mean /s/, /tʃ/, /θ/ or /ts/ in that position (while still meaning /k/ elsewhere). Some other languages have settled on their own consistent meanings for ⟨C c⟩; for example, it always means /tʃ/ in Indonesian and /ts/ in Albanian and Slavic languages written in the Latin script.
  • ⟨G g⟩ was used in Latin for /g/, but as with ⟨C c⟩ it palatalised before /e i/ in the Romance languages and came to mean /ʒ/ or /dʒ/ (in Spanish it evolved further and came to mean /x/). In other positions, and in non-Romance languages generally, it reliably means /g/, though.
  • ⟨J j⟩ derives from the original Latin letter ⟨I i⟩ and came to mean the semivowel /j/. This is still what it means in most of the Germanic languages, Albanian, and the Slavic languages that use the Latin alphabet. But in the Romance languages and English, ⟨J j⟩ merged with “soft G” (see previous dot point) and denotes the same sound. ⟨J j⟩ also means /ʒ/ in Turkish, and /dʒ/ in Indonesian.
  • ⟨W w⟩ can mean /w/ (as it does in English and Indonesian, for example), /v/ (in languages like German or Polish) or even a vowel like /ʊ/ (as in Welsh and Cornish, although before another vowel it represents /w/).