Module:Languages/data/3/c
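This module contains definitions and metadata for language codes. For other related information, see 附录:語言列表 (the list of languages).
This module must not be used directly by other modules or templates. Data should be obtained through Module:languages.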
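Required values
Each entry in the table must contain the following indexed fields:
1
- The "canonical" name of the language, written as canonicalName in older versions of the module. This is the name used in Wiktionary entries and category names.
2
- The language's Wikidata ID (beginning with Q). This replaces the old wikipedia_article property. If the language has no corresponding Wikidata item, it can be set to nil.
Optional values
3
- The family that the language belongs to; see Wiktionary:語系.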
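As an illustration of the indexed fields, the fragment below reproduces one entry from the data on this page. The local table m is declared here only so that the fragment stands alone. Note that the data on this page stores the Wikidata ID as a bare number.

```lua
-- Stand-in for the data table that the module builds and returns.
local m = {}

m["cda"] = {
    "卓尼語",   -- [1] canonical name
    2964447,    -- [2] Wikidata ID, written as a bare number in this module's data
    "sit-tib",  -- [3] family code (optional)
}
```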
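The entry_name and sort_key properties are used for text substitution: they replace or remove certain characters or sets of characters. They work in a similar way, and both are optional. Either one can be a table; sort_key can also be the name of a module that takes an entry name and generates a sort key (used to sort entries on category pages).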
If sort_key is the name of a module, the module must contain a sortkey-generating function named makeSortKey. This function must take the arguments text, lang, sc, where text is the page name (or other text in the language), lang is the language code (not the language object), and sc is the script code (not the script object). The returned value should always be a string, or there will be a module error in the Language:makeSortKey() function.
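A minimal sketch of such a sort-key module follows. The module name (something like Module:xx-sortkey) and the replacement rule are hypothetical and chosen only for illustration; the only parts taken from the description above are the function name makeSortKey, its three arguments, and the requirement to return a string.

```lua
-- Hypothetical sort-key module; the rule below is illustrative only.
local export = {}

-- text: page name (string)
-- lang: language code (string, not the language object)
-- sc:   script code (string, not the script object)
function export.makeSortKey(text, lang, sc)
    -- Drop ASCII hyphens and apostrophes so they are ignored when sorting.
    -- gsub returns two values; the local assignment keeps only the string.
    local key = text:gsub("[%-']", "")
    return key
end

-- In an actual Scribunto module, the file would end with: return export
```

With this sketch, export.makeSortKey("a-b'c", "xx", "Latn") yields "abc".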
If either one is a table, it must contain two tables inside it: one named from and one named to. These two tables are organised pairwise: each element in from is a pattern to identify which characters in the term to replace, while the corresponding element in to defines what to replace them with.
If the replacement is not present or if it is false or nil, it defaults to an empty replacement, meaning that the matching characters are removed altogether. This means that the from list can be longer than the to list, and an empty replacement will be assumed for any elements in from that have no counterpart in to.
The tables can contain literal characters, or the patterns (a type of regular expression) that are used by the standard Scribunto mw.ustring.gsub function. See the Scribunto reference manual for more information.
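The pairwise from/to behaviour can be sketched as below. This is not the code in Module:languages: the helper name applyReplacements is hypothetical, and plain string.gsub stands in for mw.ustring.gsub so that the sketch runs outside Scribunto. string.gsub works byte by byte, so the sketch is only reliable for literal sequences and ASCII patterns, not for character classes over non-ASCII letters.

```lua
-- Apply each pattern in `from` in order; a missing, false or nil
-- counterpart in `to` means the matched text is removed.
local function applyReplacements(text, replacements)
    for i, pattern in ipairs(replacements.from) do
        local repl = replacements.to and replacements.to[i] or ""
        text = string.gsub(text, pattern, repl)
    end
    return text
end

-- `from` is longer than `to`, so the third pattern (a literal hyphen)
-- gets an empty replacement.
local result = applyReplacements("bär-öl", {
    from = {"ä", "ö", "%-"},
    to   = {"a", "o"},
})
-- result is "barol"
```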
At the top of the module, there is a list of combining characters with names. These are provided for convenience and readability, as combining characters generally do not display properly inside the module code (although they do not affect the actual operation of the module).
entry_name
- Defines replacements to create the entry name from the displayed form of a term. This can be used to remove certain diacritical marks according to the customs or standard practice of the language. For example, it is used to remove accent marks from Russian words (ру́сский → русский), or macrons from Latin or Old English words (ōs → os), as these are not used in the normal written form of these languages. This is used by makeEntryName in Module:languages.
sort_key
- Defines replacements to create a category sort key from the page name. The purpose is to remove any characters that are ignored in sorting, and to replace similar characters with identical ones if the sorting rules for that language do not distinguish them. For example, in German, the characters "ä" and "a" are considered equivalent for sorting, and are both treated as "a". The page name is converted to lowercase before applying the replacements, so you should not add uppercase letters to the "from" lists. This is used by makeSortKey in Module:languages.
These are other optional values:
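otherNames
- A table of all names for the language other than the canonical name. The table should include not only synonyms of the language name, but also the names of language varieties that are subsumed under the same category. For example, although Flemish is not a synonym of Dutch, it is considered "part of" Dutch, so the name is listed under Dutch.
type
- The type of the language (which affects how it is handled on Wiktionary). Possible values are:
regular
- This is the default value, so it does not need to be specified. It indicates that terms in the language meet Wiktionary:收录标准 and are therefore allowed in the main namespace. The language may also have reconstructed terms; these should be placed in the Reconstructed namespace and must be prefixed with "*" to mark them as reconstructions. (Note: Chinese Wiktionary does not currently have a Reconstructed namespace.)
reconstructed
- This language does not meet Wiktionary:收录标准, so it is allowed only in the Reconstructed namespace. All terms in such a language are reconstructed and must be prefixed with "*".
appendix-constructed
- This language is attested but does not meet the additional requirements for constructed languages (Wiktionary:收录标准#構建語言). Its terms must therefore be placed in the Appendix namespace; because they are not reconstructed, links to them should not carry the "*" prefix.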
scripts
- A list of script codes; see Wiktionary:Scripts. These represent all the scripts (writing systems) that this language uses in the real world, as well as the ones that Wiktionary uses. The scripts that are used most often on Wiktionary should be first in the list, as this will speed up script detection.
- Many templates and modules detect the script of text in a particular language using the findBestScript function in Module:scripts. This function goes down the list of scripts and counts how many characters in the text belong to each script. If all the characters belong to one script, that script will be returned; otherwise, the script with the most characters will be returned. Thus, script detection will be faster if the most frequently used scripts are first in the list.
translit_module
- The name of a module that is used to generate transliterations of terms, without the Module: prefix. This module must export a function named tr that is defined as follows:
tr(text, lang, sc)
- The three parameters are the text to be transliterated, the language code, and the script code. The function can ignore the language and script codes, but they are provided for cases when a language has more than one script, or when a single function is used to transliterate multiple languages sharing the same script.
ancestors
- A table listing the language codes of the direct ancestors of this language. For example, the ancestor of English is listed as enm (Middle English); ang (Old English, the ancestor of Middle English), gem-pro (Proto-Germanic, the ancestor of Old English), and ine-pro (Proto-Indo-European, the ancestor of Proto-Germanic) are not listed.
- For most languages, only one ancestor code should be given, but multiple ancestors can be listed for pidgins, creoles and mixed languages.
- The ancestor language table should not be included if the language's direct ancestor is the proto-language of the family to which the language belongs. In such a case, if the family code has been provided, Module:languages will automatically add the proto-language as the language's ancestor. For example, Proto-Germanic (gem-pro) belongs to the Indo-European (ine) family, and its direct ancestor is Proto-Indo-European (ine-pro). Because Proto-Indo-European is the proto-language of the Indo-European languages, Proto-Germanic does not need an ancestors table; Proto-Indo-European will be automatically returned as its ancestor by the getAncestors function.
wikimedia_codes
- A table listing the Wikimedia language codes that this language maps to. This is used to translate Wiktionary codes to Wikimedia codes, which are usually the same but there are a few languages where it is different. The language codes must be valid Wikimedia codes (as determined by the wiki software), and if they are not defined in one of the language data modules, they must be defined in Module:wikimedia languages/data.
wikipedia_article
- The name of the Wikipedia article for the language. Should normally only be supplied if the Wikidata id cannot be used.
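The tr(text, lang, sc) contract described under translit_module can be sketched as follows. The three-letter mapping is hypothetical and exists only to make the sketch runnable; real transliteration modules are far more involved. In the data on this page, such a module is referenced under the key translit, for example translit = "Cher-translit".

```lua
-- Hypothetical transliteration module; the mapping is illustrative only.
local export = {}

local map = { ["а"] = "a", ["б"] = "b", ["в"] = "v" }

-- text: text to transliterate; lang: language code; sc: script code.
-- lang and sc are accepted but ignored in this sketch.
function export.tr(text, lang, sc)
    -- Walk the string one UTF-8 character at a time: a non-continuation
    -- byte followed by any continuation bytes (0x80-0xBF).
    -- The outer parentheses discard gsub's second return value.
    return (text:gsub("[^\128-\191][\128-\191]*", function(ch)
        return map[ch] or ch
    end))
end

-- In an actual Scribunto module, the file would end with: return export
```

Characters without an entry in the mapping are passed through unchanged, so export.tr("баб", "xx", "Cyrl") gives "bab".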
local m_lang = require("Module:languages")
local m_langdata = require("Module:languages/data")
local u = require("Module:string utilities").char
local c = m_langdata.chars
local p = m_langdata.puaChars
local s = m_langdata.shared
local m = {}
m["caa"] = {
"喬爾蒂語",
35177,
"myn",
"Latn",
}
m["cab"] = {
"加里富納語",
35490,
"awd-taa",
"Latn",
ancestors = "crb",
}
m["cac"] = {
"丘赫語",
35233,
"myn",
"Latn",
}
m["cad"] = {
"喀多語",
56756,
"cdd",
"Latn",
}
m["cae"] = {
"Laalaa",
35564,
"alv-cng",
"Latn",
}
m["caf"] = {
"南達凱爾語",
12953426,
"ath-nor",
"Latn",
}
m["cag"] = {
"尼瓦克萊語",
3182557,
"sai-mtc",
"Latn",
}
m["cah"] = {
"Cahuarano",
2933175,
"sai-zap",
"Latn",
}
m["caj"] = {
"Chané",
56721,
"awd",
"Latn",
}
m["cak"] = {
"喀克其奎語",
35115,
"myn",
"Latn",
}
m["cal"] = {
"加羅林語",
28427,
"poz-mic",
"Latn",
}
m["cam"] = {
"卡穆希語",
3009690,
"poz-cln",
"Latn",
}
m["can"] = {
"Chambri",
5069707,
"paa-lsp",
"Latn",
}
m["cao"] = {
"Chácobo",
2591202,
"sai-pan",
"Latn",
}
m["cap"] = {
"奇帕亞語",
35235,
"sai-ucp",
"Latn",
}
m["caq"] = {
"卡爾尼科巴語",
35156,
"aav-nic",
"Latn",
}
m["car"] = {
"加勒比語",
56611,
"sai-gui",
"Latn",
sort_key = {remove_diacritics = c.grave .. c.acute .. c.circ .. "`" .. "'%-%s"},
entry_name = {
remove_diacritics = c.acute,
from = {"â", "ê", "î", "ô", "û", "ŷ"},
to = {"à", "è", "ì", "ò", "ù", "ỳ"}
},
}
m["cas"] = {
"奇馬內語",
35950,
"qfa-iso",
"Latn",
}
m["cav"] = {
"卡維內納語",
524102,
"sai-tac",
"Latn",
}
m["caw"] = {
"卡拉瓦亞語",
266417,
"qfa-mix",
"Latn",
}
m["cax"] = {
"溪吉丹諾語",
1844993,
"qfa-iso",
"Latn",
}
m["cay"] = {
"卡尤加語",
32967,
"iro-nor",
"Latn",
}
m["caz"] = {
"Canichana",
2936374,
"qfa-iso",
"Latn",
}
m["cbb"] = {
"Cabiyarí",
3450660,
"awd-nwk",
"Latn",
}
m["cbc"] = {
"卡拉帕納語",
924405,
"sai-tuc",
"Latn",
}
m["cbd"] = {
"Carijona",
3446655,
"sai-tar",
"Latn",
}
m["cbg"] = {
"Chimila",
2963680,
"cba",
"Latn",
}
m["cbi"] = {
"查茨語",
2591329,
"sai-bar",
"Latn",
}
m["cbj"] = {
"Ede Cabe",
33112829,
"alv-ede",
"Latn",
}
m["cbk"] = {
"查瓦卡諾語",
33281,
"crp",
"Latn",
ancestors = "es",
entry_name = {Latn = {remove_diacritics = c.grave .. c.acute .. c.circ .. c.diaer}},
sort_key = {
Latn = {
from = {"ch", "ll", "ñ", "r"},
to = {"c" .. p[1], "l" .. p[1], "n" .. p[1], "r" .. p[1]}
},
},
standardChars = {
Latn = "AaBbCcDdEeFfGgHhIiJjKkLlMmNnÑñOoPpQqRrSsTtUuVvWwXxYyZz",
c.punc
},
}
m["cbl"] = {
"Bualkhaw Chin",
9229830,
"tbq-kuk",
"Latn",
}
m["cbn"] = {
"涅固爾語",
116849,
"mkh-mnc",
"Thai",
ancestors = "omx",
sort_key = "Thai-sortkey",
}
m["cbo"] = {
"Izora",
3915454,
"nic-jer",
"Latn",
}
m["cbq"] = {
"Shuba",
62603062,
"nic-knj",
"Latn",
}
m["cbr"] = {
"Cashibo-Cacataibo",
5359560,
"sai-pan",
"Latn",
}
m["cbs"] = {
"嘉西納瓦語",
2591230,
"sai-pan",
"Latn",
}
m["cbt"] = {
"Chayahuita",
1526525,
"sai-cah",
"Latn",
}
m["cbu"] = {
"Candoshi-Shapra",
642843,
"qfa-iso",
"Latn",
}
m["cbv"] = {
"Cacua",
3192052,
"sai-nad",
"Latn",
ancestors = "mbr",
}
m["cbw"] = {
"Kinabalian",
6410324,
"phi",
"Latn",
}
m["cby"] = {
"Carabayo",
3441762,
"sai-tyu",
"Latn",
}
m["cca"] = {
"Cauca",
5054242,
"sai-chc",
"Latn",
}
m["ccc"] = {
"查米庫羅語",
2155119,
"awd",
"Latn",
}
m["ccd"] = {
"Cafundó",
3331506,
"roa-ibe",
"Latn",
ancestors = "pt",
}
m["cce"] = {
"朝比語",
3437616,
"bnt-bso",
"Latn",
}
m["ccg"] = {
"Chamba Daka",
33120805,
"nic-dak",
"Latn",
}
m["cch"] = {
"阿燦語",
34794,
"nic-kne",
"Latn",
}
m["ccj"] = {
"Kasanga",
35542,
"alv-nyn",
"Latn",
}
m["ccl"] = {
"Cutchi-Swahili",
5196729,
"crp",
"Latn",
ancestors = "sw",
}
m["ccm"] = {
"馬六甲克里奧爾馬來語",
12636092,
"crp",
"Latn",
ancestors = "ms",
}
m["cco"] = {
"科馬爾特佩克-奇南特克語",
2963735,
"omq-chi",
"Latn",
}
m["ccp"] = {
"查克馬語",
32952,
"inc-eas",
"Cakm, Beng, Latn",
ancestors = "inc-obn",
translit = {
Cakm = "Cakm-translit",
--Beng = "Beng-translit",
},
}
m["ccr"] = {
"Cacaopera",
3438338,
"nai-min",
"Latn",
}
m["cda"] = {
"卓尼語",
2964447,
"sit-tib",
}
m["cde"] = {
"Chenchu",
32981,
"dra-tel",
"Telu",
}
m["cdf"] = {
"Chiru",
5102016,
"tbq-kuk",
"Latn, Beng",
}
m["cdh"] = {
"昌貝阿里語",
12953424,
"him",
"Deva, Takr",
translit = {Deva = "hi-translit"},
}
m["cdi"] = {
"楚德里語",
5103788,
"inc-bhi",
"Gujr",
}
m["cdj"] = {
"楚拉希語",
12629039,
"him",
"Deva, Takr",
translit = {Deva = "hi-translit"},
}
m["cdm"] = {
"切彭語",
5091700,
"sit-gma",
"Deva",
}
m["cdn"] = {
"Chaudangsi",
5088056,
"sit-alm",
}
m["cdo"] = {
"閩東語",
36455,
"zhx-com",
"Hants",
generate_forms = "zh-generateforms",
translit = "zh-translit",
sort_key = "Hani-sortkey",
}
m["cdr"] = {
"Cinda-Regi-Tiyal",
35596,
"nic-kmk",
"Latn",
}
m["cds"] = {
"乍得手語",
10322099,
"sgn",
"Latn", -- when documented
}
m["cdy"] = {
"茶洞語",
926742,
"qfa-kms",
}
m["cdz"] = {
"Koda",
6425038,
"mun",
"Beng",
}
m["cea"] = {
"下奇黑利斯語",
6693377,
"sal",
"Latn",
}
m["ceb"] = {
"宿霧語",
33239,
"phi",
"Latn, Tglg",
translit = {
Tglg = "ceb-translit"
},
override_translit = true,
entry_name = {
Latn = {
remove_diacritics = c.grave .. c.acute .. c.circ
}
},
standardChars = {
Latn = "AaBbKkDdEeGgHhIiLlMmNnOoPpRrSsTtUuWwYy",
c.punc
},
sort_key = {Latn = "tl-sortkey"},
}
m["ceg"] = {
"沙馬可可語",
3436637,
"sai-zam",
"Latn",
}
m["cen"] = {
"Cen",
12628777,
"nic-plc",
"Latn",
ancestors = "izr",
}
m["cet"] = {
"Centúúm",
33608,
"qfa-iso",
"Latn",
}
m["cfa"] = {
"Dijim-Bwilim",
3438350,
"alv-wjk",
"Latn",
}
m["cfd"] = {
"Cara",
35048,
"nic-beo",
"Latn",
}
m["cfg"] = {
"Como Karim",
35304,
"nic-jkn",
"Latn",
}
m["cfm"] = {
"法蘭欽語",
56815,
"tbq-kuk",
"Beng, Latn",
}
m["cga"] = {
"Changriwa",
5072105,
"paa-yua",
"Latn",
}
m["cgc"] = {
"卡加揚語",
6346422,
"mno",
"Latn",
}
m["cgg"] = {
"奇加語",
3270727,
"bnt-nyg",
"Latn",
}
m["cgk"] = {
"喬孔卡語",
56604,
"sit-tib",
"Tibt",
ancestors = "xct",
translit = "Tibt-translit",
override_translit = true,
display_text = s["Tibt-displaytext"],
entry_name = s["Tibt-entryname"],
sort_key = "Tibt-sortkey",
}
m["chb"] = {
"奇布查語",
2356431,
"cba",
}
m["chc"] = {
"卡托巴語",
5051602,
"nai-cat",
"Latn",
}
m["chd"] = {
"高地瓦哈卡瓊塔爾語",
2964457,
"nai-tqn",
"Latn",
}
m["chf"] = {
"塔巴斯科瓊塔爾語",
35175,
"myn",
"Latn",
}
m["chg"] = {
"察合臺語",
36831,
"trk-kar",
"Arab",
ancestors = "zkh",
entry_name = {
remove_diacritics = c.kashida .. c.fathatan .. c.dammatan .. c.kasratan .. c.fatha .. c.damma .. c.kasra .. c.shadda .. c.sukun .. c.superalef,
from = {u(0x0671)},
to = {u(0x0627)}
},
}
m["chh"] = {
"Chinook",
6693380,
"nai-ckn",
"Latn",
}
m["chj"] = {
"奧希特蘭奇南特克語",
5100110,
"omq-chi",
"Latn",
}
m["chk"] = {
"楚克語",
33161,
"poz-mic",
"Latn",
}
m["chl"] = {
"卡維拉語",
56438,
"azc-cup",
"Latn",
entry_name = {remove_diacritics = c.acute .. c.macron},
}
-- chm "Mari" is not recognized as a language, but it is a family code
m["chn"] = {
"契努克語",
35173,
"crp",
"Latn, Dupl",
ancestors = "chh, nuk",
}
m["cho"] = {
"喬克托語",
32979,
"nai-mus",
"Latn",
sort_key = {remove_diacritics = c.macronbelow .. "-"},
entry_name = {remove_diacritics = c.acute .. c.dotbelow},
}
m["chp"] = {
"契帕瓦語",
27692,
"ath-nor",
"Latn, Cans",
}
m["chq"] = {
"基奧基特佩克奇南特克語",
5758709,
"omq-chi",
"Latn",
}
m["chr"] = {
"切羅基語",
33388,
"iro",
"Cher",
translit = "Cher-translit",
}
m["cht"] = {
"Cholón",
2591243,
nil,
"Latn",
}
m["chw"] = {
"Chuabo",
5118412,
"bnt-mak",
"Latn",
}
m["chx"] = {
"Chantyal",
4926344,
"sit-tam",
"Deva",
}
m["chy"] = {
"夏延語",
33265,
"alg",
"Latn",
sort_key = {remove_diacritics = c.grave .. c.acute .. c.macron .. c.dotabove .. "-"},
standardChars = "AaÁáÀàĀāȦȧEeÉéÈèĒēĖėHhKkMmNnOoÓóÒòŌōȮȯPpSsŠšTtVvXx" .. c.punc, --umlaut and circumflex not allowed
}
m["chz"] = {
"奧蘇馬辛奇南特克語",
5100111,
"omq-chi",
"Latn",
}
m["cia"] = {
"吉阿吉阿語",
35284,
"poz-mun",
"Hang, Latn, Arab",
}
m["cib"] = {
"Ci Gbe",
12952445,
"alv-gbe",
"Latn",
}
m["cic"] = {
"奇卡索語",
33192,
"nai-mus",
"Latn",
}
m["cid"] = {
"Chimariko",
1294251,
"qfa-iso",
"Latn",
}
m["cie"] = {
"Cineni",
56243,
"cdc-cbm",
"Latn",
}
m["cih"] = {
"奇納里語",
11855245,
"inc",
"Deva",
ancestors = "sa",
}
m["cik"] = {
"奇特庫利金瑙里語",
15615982,
"sit-kin",
}
m["cim"] = {
"辛布里語",
37053,
"gmw-hgm",
"Latn",
ancestors = "bar",
sort_key = {remove_diacritics = c.grave .. c.acute .. c.circ .. c.diaer .. c.ringabove .. c.caron},
}
m["cin"] = {
"粗腰語",
5121095,
"tup",
"Latn",
}
m["cip"] = {
"恰帕內克語",
3364475,
"omq",
"Latn",
}
m["cir"] = {
"Tiri",
7862281,
"poz-cln",
"Latn",
}
m["ciy"] = {
"Chaima",
12628867,
"sai-ven",
"Latn",
}
m["cja"] = {
"西占語",
12645578,
"cmc",
"Latn, Arab, Khmr", -- Western Cham script is not yet available. Also, Arabic script is missing some glyphs.
}
m["cje"] = {
"朱魯語",
2967321,
"cmc",
"Latn",
}
m["cjh"] = {
"上奇黑利斯語",
2962074,
"sal",
"Latn",
}
m["cji"] = {
"查馬拉爾語",
56567,
"cau-and",
"Cyrl",
translit = "cau-nec-translit",
override_translit = true,
display_text = {Cyrl = s["cau-Cyrl-displaytext"]},
entry_name = {Cyrl = s["cau-Cyrl-entryname"]},
}
m["cjk"] = {
"Chokwe",
2422065,
"bnt-clu",
"Latn",
}
m["cjm"] = {
"東占語",
2948019,
"cmc",
"Latn, Cham",
}
m["cjn"] = {
"Chenapian",
5091044,
"paa-spk",
"Latn",
}
m["cjo"] = {
"帕胡納爾-阿舍寧卡語",
3450481,
"awd",
"Latn",
}
m["cjp"] = {
"卡貝卡語",
27878,
"cba",
"Latn",
}
m["cjs"] = {
"紹爾語",
34139,
"trk-ssb",
"Cyrl",
}
m["cjv"] = {
"Chuave",
5115226,
"ngf",
"Latn",
}
m["cjy"] = {
"晉語",
56479,
"zhx",
"Hants",
ancestors = "ltc",
generate_forms = "zh-generateforms",
translit = "zh-translit",
sort_key = "Hani-sortkey",
}
m["ckb"] = {
"中庫爾德語",
36811,
"ku",
"ku-Arab",
translit = "ckb-translit",
entry_name = {remove_diacritics = c.kasra .. c.sukun},
}
m["ckh"] = {
"Chak",
12628870,
"sit-luu",
"Latn",
ancestors = "kdv",
}
m["ckl"] = {
"Cibak",
56279,
"cdc-cbm",
"Latn",
}
m["ckn"] = {
"Kaang Chin",
6343432,
"tbq-kuk",
"Latn",
}
m["cko"] = {
"阿努福語",
34845,
"alv-ctn",
"Latn",
}
m["ckq"] = {
"Kajakse",
3440422,
"cdc-est",
"Latn",
}
m["ckr"] = {
"凱拉克語",
3503002,
"paa-bng",
"Latn",
}
m["cks"] = {
"Tayo",
1133089,
"crp",
"Latn",
ancestors = "fr",
sort_key = s["roa-oil-sortkey"],
}
m["ckt"] = {
"楚科奇語",
33170,
"qfa-ckn",
"Cyrl",
entry_name = {
from = {"['’]"},
to = {"ʼ"}
},
sort_key = {
from = {"ё", "ӄ", "ԓ", "ӈ"},
to = {"е" .. p[1], "к" .. p[1], "л" .. p[1], "н" .. p[1]}
},
}
m["cku"] = {
"科阿薩提語",
35162,
"nai-mus",
"Latn",
}
m["ckv"] = {
"噶瑪蘭語",
716627,
"map",
"Latn",
}
m["ckx"] = {
"Caka",
5018037,
"nic-tvc",
"Latn",
}
m["cky"] = {
"Cakfem-Mushere",
3441199,
"cdc-wst",
"Latn",
}
m["ckz"] = {
"Cakchiquel-Quiché Mixed Language",
5054550,
"qfa-mix",
"Latn",
ancestors = "cak, quc"
}
m["cla"] = {
"Ron",
3440432,
"cdc-wst",
"Latn",
}
m["clc"] = {
"奇爾科廷語",
28535,
"ath-nor",
"Latn",
}
m["cld"] = {
"迦勒底新亞拉姆語",
33236,
"sem-are",
"Syrc",
entry_name = "Syrc-entryname",
}
m["cle"] = {
"萊勞奇南特克語",
6509365,
"omq-chi",
"Latn",
}
m["clh"] = {
"Chilisso",
3250629,
"inc-koh",
}
m["cli"] = {
"查卡里語",
35206,
"nic-gnw",
"Latn",
}
m["clj"] = {
"Laitu Chin",
6474196,
"tbq-kuk",
}
m["clk"] = {
"義都語",
56412,
"sit-gsi",
"Tibt, Deva",
translit = {Tibt = "Tibt-translit"},
override_translit = true,
display_text = {Tibt = s["Tibt-displaytext"]},
entry_name = {Tibt = s["Tibt-entryname"]},
sort_key = {Tibt = "Tibt-sortkey"},
}
m["cll"] = {
"查拉語",
35190,
"nic-gne",
"Latn",
}
m["clm"] = {
"克拉勒姆語",
33404,
"sal",
"Latn",
}
m["clo"] = {
"低地瓦哈卡瓊塔爾語",
2964450,
"nai-tqn",
"Latn",
}
m["clt"] = {
"Lautu Chin",
6502107,
"tbq-kuk",
}
m["clu"] = {
"卡魯亞農語",
32964,
"phi",
"Latn",
}
m["clw"] = {
"楚利姆語",
33125,
"trk-ssb",
"Latn, Cyrl",
}
m["cly"] = {
"東部高地查蒂諾語",
12642078,
"omq-cha",
"Latn",
}
m["cma"] = {
"Maa",
12953680,
"mkh-ban",
"Latn",
}
m["cme"] = {
"基爾馬語",
35074,
"nic-gur",
"Latn",
}
m["cmg"] = {
"古典蒙古語",
5128303,
"xgn-cen",
"Mong, Soyo, Zanb",
translit = {Mong = "Mong-translit"},
display_text = {Mong = s["Mong-displaytext"]},
entry_name = {Mong = s["Mong-entryname"]},
}
m["cmi"] = {
"Emberá-Chamí",
3052042,
"sai-chc",
"Latn",
}
m["cml"] = {
"占巴拉宜安語",
5027893,
"poz-ssw",
"Latn",
}
m["cmm"] = {
"Michigamea",
12636809,
"sio-msv",
"Latn",
}
m["cmn"] = {
"官話",
9192,
"zhx-man",
"Hants, Latn, Bopo",
wikimedia_codes = "zh",
generate_forms = "zh-generateforms",
translit = {
Hani = "zh-translit",
Bopo = "zh-translit",
},
sort_key = {
Hani = "Hani-sortkey",
Latn = {
from = {
-- Sort terms with tone numbers immediately after equivalent terms with diacritics.
"[aeiouv][" .. c.circ .. c.diaer .. "]?[nr]?g?[0-5]",
-- Add temporary breaks between syllables.
"([aeiouvmn][" .. c.circ .. c.diaer .. "]?[" .. c.macron .. c.acute .. c.caron .. c.grave .. "]?n?ŋ?g?r?)([bpmfdtnlgkhjqxzcsywrv']h?[aeiouvmn ])", p[1] .. "([ngr])$", p[1] .. "([ngr][%s%-'" .. p[1] .. "])",
-- Substitute diacritics for syllable-final tone numbers, and add tone 0 where necessary.
c.macron, c.acute, c.caron, c.grave, "([1-4])([^%s%p" .. p[1] .. "]+)", "([^0-5])%f[%z%s%p" .. p[1] .. "]",
-- Substitute "v" shorthand for "ü" for a temporary placeholder, so that the (very rare) "v" initial is not affected by the later shorthand substitutions.
"([^ " .. p[1] .. "])v",
-- Remove temporary breaks.
p[1],
-- Substitute shorthands for full forms, and sort them immediately after equivalent terms.
"%S*[csz]" .. c.circ .. "%S*", "%S*[ŋ" .. p[2] .. "]%S*", "ĉ", "ŝ", "ŋ", p[2], "ẑ",
-- "ê" comes after "e", "ü" comes after "u" and apostrophes are removed (as their function is replaced by tone numbers).
"[" .. c.circ .. c.diaer .. "]", "'",
-- Sort numbered tone 5 after tone 0.
"5!"
},
to = {
"%0!",
"%1" .. p[1] .. "%2", "%1", "%1",
"1", "2", "3", "4", "%2%1", "%10",
"%1" .. p[2],
"",
"%0\"", "%0\"", "ch", "sh", "ng", "ü", "zh",
p[1], "",
"0!!"
}
},
},
}
m["cmo"] = {
"中墨儂語",
33369881,
"mkh-ban",
}
m["cmr"] = {
"Mro Chin",
16889978,
"tbq-kuk",
}
m["cms"] = {
"梅薩比語",
36383,
"ine",
"Latn, Ital, Grek",
}
m["cmt"] = {
"Camtho",
10441336,
"crp",
"Latn",
ancestors = "fly, zu"
}
m["cna"] = {
"羌塘語",
12952322,
"sit-lab",
"Tibt",
translit = "Tibt-translit",
override_translit = true,
display_text = s["Tibt-displaytext"],
entry_name = s["Tibt-entryname"],
sort_key = "Tibt-sortkey",
}
m["cnb"] = {
"Chinbon Chin",
12952327,
"tbq-kuk",
}
m["cnc"] = {
"貢語 (越南)",
5202780,
"tbq-bis",
"Latn",
}
m["cng"] = {
"北羌語",
56559,
"sit-qia",
}
m["cnh"] = {
"哈卡欽語",
3250286,
"tbq-kuk",
}
m["cni"] = {
"亞夏尼加語",
3437230,
"awd",
"Latn",
}
m["cnk"] = {
"庫米欽語",
56308,
"tbq-kuk",
}
m["cnl"] = {
"拉拉納奇南特克語",
12953437,
"omq-chi",
"Latn",
}
m["cno"] = {
"Con",
3440883,
"mkh-pal",
}
m["cnp"] = {
"北部平話",
84302463,
"zhx-pin",
"Hants",
generate_forms = "zh-generateforms",
sort_key = "Hani-sortkey",
}
m["cns"] = {
"中阿斯馬特語",
11732048,
"ngf",
"Latn",
}
m["cnt"] = {
"特佩托圖特拉奇南特克語",
5100113,
"omq-chi",
"Latn",
}
m["cnu"] = {
"Chenoua",
33276,
"ber",
}
m["cnw"] = {
"Ngawn Chin",
6583675,
"tbq-kuk",
}
m["cnx"] = {
"中古康沃爾語",
12642603,
"cel-brs",
"Latn",
ancestors = "oco",
}
m["coa"] = {
"科科斯馬來語",
3441699,
"crp",
"Latn",
ancestors = "ms",
}
m["cob"] = {
"奇科穆塞爾特克語",
3307204,
"myn",
"Latn",
}
m["coc"] = {
"科科帕語",
33044,
"nai-yuc",
"Latn",
}
m["cod"] = {
"科卡馬語",
33317,
"tup",
"Latn",
}
m["coe"] = {
"Koreguaje",
3198924,
"sai-tuc",
"Latn",
}
m["cof"] = {
"薩菲吉語",
2567055,
"sai-bar",
"Latn",
}
m["cog"] = {
"仲語",
3914630,
"mkh-pea",
"Thai, Khmr",
sort_key = {Thai = "Thai-sortkey"},
}
m["coh"] = {
"齊瓊依-齊基哈納-奇考瑪語",
12629011,
"bnt-mij",
"Latn",
}
m["coj"] = {
"Cochimi",
3915551,
"nai-yuc",
"Latn",
}
m["cok"] = {
"聖特雷莎科拉語",
12641754,
"azc",
"Latn",
}
m["col"] = {
"哥倫比亞-韋納奇語",
3324744,
"sal",
"Latn",
}
m["com"] = {
"科曼奇語",
32972,
"azc-num",
"Latn",
}
m["con"] = {
"科梵語",
2669254,
"qfa-iso",
"Latn",
}
m["coo"] = {
"科莫克斯語",
13583746,
"sal",
"Latn",
}
m["cop"] = {
"科普特語",
36155,
"egx",
"Copt",
translit = "Copt-translit",
ancestors = "egx-dem",
entry_name = {remove_diacritics = c.grave .. c.macron .. c.overline .. c.diaer .. "ˋ"},
sort_key = "cop-sortkey",
}
m["coq"] = {
"Coquille",
12953452,
"ath-pco",
"Latn",
}
m["cot"] = {
"Caquinte",
3915557,
"awd",
"Latn",
}
m["cou"] = {
"Wamey",
36935,
"alv-ten",
"Latn",
}
m["cov"] = {
"草苗語",
2936935,
"qfa-tak",
}
m["cow"] = {
"考利茲語",
3001877,
"sal",
"Latn",
}
m["cox"] = {
"Nanti",
15342275,
"awd",
"Latn",
}
m["coy"] = {
"Coyaima",
56450,
"sai-car",
"Latn",
}
m["coz"] = {
"喬喬特克語",
2964262,
"omq-pop",
"Latn",
}
m["cpa"] = {
"帕蘭特拉奇南特克語",
5100112,
"omq-chi",
"Latn",
}
m["cpb"] = {
"Ucayali-Yurúa Ashéninka",
3501858,
"awd",
"Latn",
}
m["cpc"] = {
"Ajyíninka Apurucayali",
3327405,
"awd",
"Latn",
}
m["cpg"] = {
"卡帕多細亞希臘語",
853414,
"grk",
"Grek, fa-Arab",
ancestors = "gkm",
translit = {Grek = "el-translit"},
entry_name = {Grek = {remove_diacritics = c.caron .. c.diaerbelow .. c.brevebelow}},
sort_key = {Grek = s["Grek-sortkey"]},
}
m["cpi"] = {
"洋涇浜英語",
3435078,
"crp",
"Latn, Hant",
ancestors = "en",
sort_key = {Hant = "Hani-sortkey"},
}
m["cpn"] = {
"Cherepon",
35181,
"alv-gng",
"Latn",
}
m["cpo"] = {
"Kpee",
6435722,
"dmn-jje",
}
m["cps"] = {
"卡皮塞尼奧語",
2937525,
"phi",
"Latn",
}
m["cpu"] = {
"Pichis Ashéninka",
7190661,
"awd",
"Latn",
}
m["cpx"] = {
"莆仙語",
56583,
"zhx-com",
"Hants",
generate_forms = "zh-generateforms",
sort_key = "Hani-sortkey",
}
m["cpy"] = {
"South Ucayali Ashéninka",
3501868,
"awd",
"Latn",
}
m["cqd"] = {
"川黔滇苗語",
121627627,
"hmn",
"Latn, Plrd",
}
m["cra"] = {
"Chara",
5073694,
"omv",
"Latn",
}
m["crb"] = {
"Island Carib",
3450735,
"awd-taa",
"Latn",
}
m["crc"] = {
"Lonwolwol",
3259216,
"poz-vnc",
"Latn",
}
m["crd"] = {
"Coeur d'Alene",
32915,
"sal",
"Latn",
}
m["crf"] = {
"Caramanta",
3504195,
"sai-chc",
"Latn",
}
m["crg"] = {
"Michif",
13315,
"qfa-mix",
"Latn",
ancestors = "cr, fr",
}
m["crh"] = {
"克里米亞韃靼語",
33357,
"trk-kcu",
"Latn, Cyrl",
dotted_dotless_i = true,
sort_key = {
Latn = {
from = {
"[ıi]" .. c.breve, -- Convert ĭ into PUA so that the decomposed form does not get caught by the next step. Also cover decomposed forms with ı and i, as decomposed Ĭ is converted to ı + ̆ due to the dotted dotless I logic).
"i", -- Ensure "i" comes after "ı".
"â", "ç", "ğ", "ı", p[3], "ñ", "ö", "ş", "ü"
},
to = {
p[3],
"i" .. p[1],
"a", "c" .. p[1], "g" .. p[1], "i", "i" .. p[2], "n" .. p[1], "o" .. p[1], "s" .. p[1], "u" .. p[1],
}
},
Cyrl = {
from = {"гъ", "ё", "къ", "нъ", "дж"},
to = {"г" .. p[1], "е" .. p[1], "к" .. p[1], "н" .. p[1], "ч" .. p[1]}
},
},
}
m["cri"] = {
"聖多美語",
36536,
"crp",
"Latn",
ancestors = "pt",
}
m["crj"] = {
"南部東克里語",
12953464,
"alg",
"Latn, Cans",
ancestors = "cr",
translit = {Cans = "cr-translit"},
}
m["crk"] = {
"平原克里語",
56699,
"alg",
"Latn, Cans",
ancestors = "cr",
}
m["crl"] = {
"北部東克里語",
12642195,
"alg",
"Latn, Cans",
ancestors = "cr",
translit = {Cans = "cr-translit"},
}
m["crm"] = {
"穆斯克里語",
3446671,
"alg",
"Latn, Cans",
ancestors = "cr",
}
m["crn"] = {
"科拉語",
12953454,
"azc",
"Latn",
}
m["cro"] = {
"克勞語",
1207611,
"sio-mor",
"Latn",
}
m["crq"] = {
"Iyo'wujwa Chorote",
3540927,
"sai-mtc",
"Latn",
}
m["crr"] = {
"卡羅來納阿爾岡昆語",
16113723,
"alg-eas",
"Latn",
}
m["crs"] = {
"塞舌爾克里奧爾語",
34015,
"crp",
"Latn",
ancestors = "fr",
sort_key = s["roa-oil-sortkey"],
}
m["crt"] = {
"Iyojwa'ja Chorote",
3504118,
"sai-mtc",
"Latn",
}
m["crv"] = {
"Chaura",
2605680,
"aav-nic",
}
m["crw"] = {
"遮羅語",
5105629,
"mkh-ban",
"Latn",
}
m["crx"] = {
"達凱爾語",
12953431,
"ath-nor",
"Latn, Cans",
}
m["cry"] = {
"Cori",
35204,
"nic-plc",
"Latn",
}
m["crz"] = {
"Cruzeño",
2967636,
"nai-chu",
"Latn",
}
m["csa"] = {
"奇爾特佩克奇南特克語",
12953435,
"omq-chi",
"Latn",
}
m["csb"] = {
"卡舒比語",
33690,
"zlw-pom",
"Latn",
}
m["csc"] = {
"加泰羅尼亞手語",
35768,
"sgn",
"Latn", -- when documented
}
m["csd"] = {
"清邁手語",
5095211,
"sgn",
}
m["cse"] = {
"捷克手語",
5201809,
"sgn",
"Latn", -- when documented
}
m["csf"] = {
"古巴手語",
5192046,
"sgn",
"Latn", -- when documented
}
m["csg"] = {
"智利手語",
3322112,
"sgn",
"Latn", -- when documented
}
m["csh"] = {
"Asho Chin",
12627282,
"tbq-kuk",
}
m["csi"] = {
"海岸米沃克語",
2981109,
"nai-you",
"Latn",
}
m["csj"] = {
"Songlai Chin",
7561280,
"tbq-kuk",
}
m["csk"] = {
"Jola-Kasa",
3446622,
"alv-jol",
"Latn",
}
m["csl"] = {
"中國手語",
1094190,
"sgn",
}
m["csm"] = {
"中部山地米沃克語",
2944443,
"nai-you",
"Latn",
}
m["csn"] = {
"哥倫比亞手語",
2748229,
"sgn",
"Latn", -- when documented
}
m["cso"] = {
"索奇亞帕姆奇南特克語",
7550388,
"omq-chi",
"Latn",
}
m["csp"] = {
"南部平話",
84302019,
"zhx-pin",
"Hants",
generate_forms = "zh-generateforms",
translit = "zh-translit",
sort_key = "Hani-sortkey",
}
m["csq"] = {
"克羅地亞手語",
3507506,
"sgn",
}
m["csr"] = {
"哥斯達黎加手語",
5174901,
"sgn",
"Latn", -- when documented
}
m["css"] = {
"南奧龍尼語",
25559664,
"nai-you",
"Latn",
}
m["cst"] = {
"北奧龍尼語",
25559666,
"nai-you",
"Latn",
}
m["csv"] = {
"Sumtu Chin",
7638087,
"tbq-kuk",
}
m["csw"] = {
"沼澤克里語",
56696,
"alg",
"Latn, Cans",
ancestors = "cr",
}
m["csy"] = {
"Siyin Chin",
7533375,
"tbq-kuk",
}
m["csz"] = {
"Coos",
3126783,
"nai-coo",
"Latn",
}
m["cta"] = {
"塔塔爾特佩克查蒂諾語",
7687853,
"omq-cha",
"Latn",
}
m["ctc"] = {
"Chetco-Tolowa",
12628946,
"ath-pco",
"Latn",
}
m["ctd"] = {
"梯頂語",
56357,
"tbq-kuk",
"Latn, Pauc",
}
m["cte"] = {
"特皮納帕奇南特克語",
12953443,
"omq-chi",
"Latn",
}
m["ctg"] = {
"吉大港語",
33173,
"inc-eas",
"Beng",
ancestors = "inc-obn",
}
m["cth"] = {
"Thaiphum Chin",
16912048,
"tbq-kuk",
}
m["ctl"] = {
"特拉科亞津特佩克奇南特克語",
12643657,
"omq-chi",
"Latn",
}
m["ctm"] = {
"奇蒂馬查語",
1294227,
"qfa-iso",
"Latn",
}
m["ctn"] = {
"Chhintange",
32994,
"sit-kie",
"Deva",
}
m["cto"] = {
"Emberá-Catío",
3052039,
"sai-chc",
"Latn",
}
m["ctp"] = {
"西部高地查蒂諾語",
32861734,
"omq-cha",
"Latn",
entry_name = {remove_diacritics = "¹²³⁴⁵"},
sort_key = {remove_diacritics = c.acute},
}
m["cts"] = {
"北卡坦端內斯比科拉諾語",
7130477,
"phi",
"Latn",
}
m["ctt"] = {
"Wayanad Chetti",
7975850,
"dra-mal",
"Taml",
}
m["ctu"] = {
"喬爾語",
35179,
"myn",
"Latn",
}
m["ctz"] = {
"薩卡特佩克查蒂諾語",
8063754,
"omq-cha",
"Latn",
}
m["cua"] = {
"戈語",
3441115,
"mkh-ban",
"Latn",
}
m["cub"] = {
"庫貝歐語",
3006705,
"sai-tuc",
"Latn",
}
m["cuc"] = {
"尤斯拉奇南特克語",
7901979,
"omq-chi",
"Latn",
}
m["cug"] = {
"Cung",
35194,
"nic-bbe",
"Latn",
}
m["cuh"] = {
"Chuka",
12952344,
"bnt-kka",
"Latn",
}
m["cui"] = {
"Cuiba",
2980421,
"sai-guh",
"Latn",
}
m["cuj"] = {
"Mashco Piro",
3446596,
"awd",
"Latn",
}
m["cuk"] = {
"庫那語",
12953659,
"cba",
"Latn",
}
m["cul"] = {
"Culina",
2475442,
"auf",
"Latn",
}
m["cuo"] = {
"Cumanagoto",
5193784,
"sai-cpc",
"Latn",
}
m["cup"] = {
"庫佩諾語",
143130,
"azc-cup",
"Latn",
}
m["cuq"] = {
"仡隆語",
2475478,
"qfa-lic",
"Latn",
}
m["cur"] = {
"Chhulung",
5116126,
"sit-kie",
"Deva",
}
m["cut"] = {
"特烏蒂拉奎卡特克語",
12953453,
"omq-cui",
"Latn",
}
m["cuu"] = {
"傣雅語",
3441122,
"qfa-tak",
"Latn",
}
m["cuv"] = {
"Cuvok",
3515056,
"cdc-cbm",
"Latn",
}
m["cuw"] = {
"Chukwa",
12629033,
"sit-kic",
}
m["cux"] = {
"特佩烏希拉奎卡特克語",
20527242,
"omq-cui",
"Latn",
}
m["cuy"] = {
"Cuitlatec",
2030998,
"qfa-iso",
"Latn",
}
m["cvg"] = {
"Chug",
47683644,
"sit-khb",
}
m["cvn"] = {
"國家山谷奇南特克語",
12953442,
"omq-chi",
"Latn",
}
m["cwa"] = {
"Kabwa",
6344537,
"bnt-lok",
"Latn",
}
m["cwb"] = {
"Maindo",
11002891,
"bnt-mak",
"Latn",
ancestors = "chw",
}
m["cwd"] = {
"森林克里語",
56305,
"alg",
"Latn, Cans",
ancestors = "cr",
}
m["cwe"] = {
"Kwere",
779632,
"bnt-ruv",
"Latn",
}
m["cwg"] = {
"徹翁語",
646718,
"mkh-asl",
"Latn",
}
m["cwt"] = {
"Kuwaataay",
35699,
"alv-jol",
"Latn",
}
m["cya"] = {
"諾帕拉查蒂諾語",
15616302,
"omq-cha",
"Latn",
}
m["cyb"] = {
"Cayubaba",
3183382,
"qfa-iso",
"Latn",
}
m["cyo"] = {
"庫約農語",
33153,
"phi",
"Latn",
}
m["czh"] = {
"徽語",
56546,
"zhx",
"Hants", -- ?
ancestors = "ltc",
generate_forms = "zh-generateforms",
sort_key = "Hani-sortkey",
}
m["czk"] = {
"Knaanic",
56384,
"zlw",
"Hebr",
ancestors = "zlw-ocs",
entry_name = {Hebr = {remove_diacritics = u(0x0591) .. "-" .. u(0x05BD) .. u(0x05BF) .. "-" .. u(0x05C5) .. u(0x05C7) .. c.CGJ}},
}
m["czn"] = {
"森松特佩克查蒂諾語",
603106,
"omq-cha",
"Latn",
}
m["czo"] = {
"閩中語",
56435,
"zhx-inm",
"Hants",
generate_forms = "zh-generateforms",
sort_key = "Hani-sortkey",
}
m["czt"] = {
"佐通語",
8074599,
"tbq-kuk",
"Latn",
}
return m_lang.finalizeLanguageData(m_lang.addDefaultTypes(m, true))