Albanian is an Indo-European language and the sole surviving member of its branch of the family. It’s descended from one of the Paleo-Balkan languages of antiquity, but it’s not totally clear which one. Traditionally Albanian’s said to be the descendant of Illyrian, primarily because Illyrian was spoken in the same part of the Balkan Peninsula where Albania is located today. However, there’s not enough evidence to rule out descent from a different Paleo-Balkan language, like Thracian or Daco-Moesian. At any rate, Albanian itself is attested from the 15th century onwards, and today is estimated to have 7.5 million speakers, of whom 6 million live in the Balkans.
There are two main dialects of Albanian, Gheg (spoken to the north of the Shkumbin River) and Tosk (spoken to the south). They are mutually intelligible. The dialects seem to have diverged after the region converted to Christianity in the 4th century CE, but before the 6th century. Standard Albanian, based on Tosk, is the official language of Albania and Kosovo, and co-official in Montenegro and North Macedonia. There are centuries-old Albanian communities in countries near Albania, for example the Arbëreshë in Italy, the Arvanites in Greece, and the Arbanasi in Croatia.
The first written mention of the Albanian language is in a Latin document from 13th-century Dubrovnik. The first surviving document in Albanian dates back to 1462, but we know there were earlier documents written in Albanian that have since been lost; a letter from 1332 explicitly mentions the existence of written Albanian. The oldest surviving books in Gheg and Tosk also show that clear orthographic norms already existed by the time they were written. Albanian went without official recognition for almost the entirety of the five centuries it spent under Ottoman rule, until Albanian-medium schools were permitted in 1909.
Albanian has borrowed a huge share of its vocabulary, with some estimates even saying that it has lost as much as 90% of its original vocabulary (the part inherited from PIE), although other estimates say it’s not quite that high. The biggest influences on Albanian’s vocabulary have been Latin and the Romance languages, with 60% of Albanian words estimated to come from one of those sources. There’ve also been many borrowings from the South Slavic languages and Turkish, a smaller number from Greek, and even a really small number from Gothic. Additionally, it seems there were a certain number of borrowings from Albanian into Romanian, but not vice versa. Some of the vocab Albanian did inherit directly from PIE has shifted in meaning (e.g. motër means “sister”, but descends from the PIE word for “mother”).
Standard Albanian has seven vowel phonemes /a ɛ i ɔ u y ə/, with the letter ⟨ë⟩ used for schwa, and 29 consonant phonemes. There’s nothing really crazy among the consonants: it has dental fricatives /θ ð/ like English, distinguishes between a flap /ɾ/ and a trill /r/ like Spanish, and has “light” /l/ and “dark” /ɫ/ as separate phonemes. Its most unusual feature is probably the palatal affricates, voiced /ɟ͡ʝ/ and voiceless /c͡ç/, which are sort of similar to English /gj/ and /kj/.
Albanian retains PIE’s three grammatical genders (masculine, feminine, neuter) and six of its cases (nominative, accusative, genitive, dative, ablative, vocative). The genitive and dative forms are identical, so an extra particle is required in front of the noun to mark the genitive. Some dialects also retain a locative case, which is gone in Standard Albanian. Definite articles are post-posed onto the ends of nouns, as in most other languages of the Balkan sprachbund.
Albanian verbs have a more analytic structure than the synthetic one an earlier stage of the language would’ve had. Verbs can have six moods and eight tenses (three simple and five compound). Like some other languages in the Balkan sprachbund, Albanian has an “admirative” mood, used either to express the speaker’s surprise or to show that they’re relaying something they’ve heard and haven’t witnessed directly. The Albanian word for “not” depends on the verb’s mood. Word order is relatively free, because despite the analytic verb forms the language overall is still not analytic, but the most common order is SVO.