lingtypology: Download typological databases' data
lingtypology provides the ability to download data from the typological databases described below: WALS, Grambank, AUTOTYP, PHOIBLE, AfBo, SAILS, ABVD and UraLex.
All database functions share the same naming pattern: database_name.feature. Each function takes feature as its first argument and returns a data frame with a language column that can be passed to the map.feature() function. By default, these functions drop the rows that can't be mapped; to keep those rows, set the na.rm argument to FALSE.
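As a rough sketch of this shared workflow, the snippet below wraps the download-then-map pattern in one function. The helpers `find_feature_column()` and `map_database_feature()` are hypothetical and not part of lingtypology, and the sketch assumes that the feature column is named after the requested feature id, possibly in a different case.

```r
# Hypothetical helper: locate the column that holds a feature, ignoring case,
# since the returned column name may differ in case from the requested id.
find_feature_column <- function(df, feature_id) {
  hit <- names(df)[tolower(names(df)) == tolower(feature_id)]
  if (length(hit) != 1) {
    stop("Expected exactly one column matching '", feature_id, "'")
  }
  hit
}

# Hypothetical wrapper around the pattern described above.
# db_function is any database function, e.g. lingtypology::wals.feature.
map_database_feature <- function(db_function, feature_id, ...) {
  df <- db_function(feature_id)   # needs an internet connection
  column <- find_feature_column(df, feature_id)
  lingtypology::map.feature(languages = df$language,
                            features = df[[column]],
                            label = df$language,
                            ...)
}
```

A call such as `map_database_feature(lingtypology::wals.feature, "1a")` would then download one feature and map it in a single step.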
WALS feature names can be typed in lower case. The function preserves the coordinates from WALS, so you can map the languages using either the WALS coordinates or the coordinates from lingtypology.
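The two coordinate options might look as follows. `map_wals()` is a hypothetical helper, and the column names `latitude` and `longitude` are assumptions about the data frame that `wals.feature()` returns.

```r
# Hypothetical helper: map one WALS feature with either coordinate source.
map_wals <- function(feature_id, use_wals_coordinates = TRUE, ...) {
  df <- lingtypology::wals.feature(feature_id)   # needs an internet connection
  # Match the feature column case-insensitively (e.g. "1a" vs "1A").
  column <- names(df)[tolower(names(df)) == tolower(feature_id)]
  if (length(column) != 1) {
    stop("No unique column found for feature '", feature_id, "'")
  }
  if (use_wals_coordinates) {
    # Coordinates preserved from WALS (assumed columns: latitude, longitude).
    lingtypology::map.feature(languages = df$language,
                              features = df[[column]],
                              latitude = df$latitude,
                              longitude = df$longitude,
                              label = df$language,
                              ...)
  } else {
    # No coordinates supplied: map.feature() uses lingtypology's own.
    lingtypology::map.feature(languages = df$language,
                              features = df[[column]],
                              label = df$language,
                              ...)
  }
}
```

Calling `map_wals("1a")` and `map_wals("1a", use_wals_coordinates = FALSE)` side by side is one way to see where the two coordinate sets differ.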
Don't forget to cite the source (modify in case of using individual chapters):
Dryer, Matthew S. & Haspelmath, Martin (eds.) 2013. The World Atlas of Language Structures Online. Leipzig: Max Planck Institute for Evolutionary Anthropology.
(Available online at https://wals.info/, Accessed on 2024-03-13.)
@book{wals,
address = {Leipzig},
editor = {Matthew S. Dryer and Martin Haspelmath},
publisher = {Max Planck Institute for Evolutionary Anthropology},
title = {WALS Online},
url = {https://wals.info/},
year = {2013}
}
Grambank feature names can be typed in lower case. The function preserves the coordinates from Grambank, so you can map the languages using either the Grambank coordinates or the coordinates from lingtypology.
Grambank v.1.0.3
Don't forget to cite the source (modify in case of using individual chapters):
Hedvig Skirgård et al., Grambank reveals the importance of genealogical constraints on linguistic diversity and highlights the impact of language loss. Sci. Adv. 9, eadg6175 (2023). DOI: 10.1126/sciadv.adg6175
@article{
doi:10.1126/sciadv.adg6175,
author = {Hedvig Skirgård and Hannah J. Haynie and Damián E. Blasi and Harald Hammarström and Jeremy Collins and Jay J. Latarche and Jakob Lesage and Tobias Weber and Alena Witzlack-Makarevich and Sam Passmore and Angela Chira and Luke Maurits and Russell Dinnage and Michael Dunn and Ger Reesink and Ruth Singer and Claire Bowern and Patience Epps and Jane Hill and Outi Vesakoski and Martine Robbeets and Noor Karolin Abbas and Daniel Auer and Nancy A. Bakker and Giulia Barbos and Robert D. Borges and Swintha Danielsen and Luise Dorenbusch and Ella Dorn and John Elliott and Giada Falcone and Jana Fischer and Yustinus Ghanggo Ate and Hannah Gibson and Hans-Philipp Göbel and Jemima A. Goodall and Victoria Gruner and Andrew Harvey and Rebekah Hayes and Leonard Heer and Roberto E. Herrera Miranda and Nataliia Hübler and Biu Huntington-Rainey and Jessica K. Ivani and Marilen Johns and Erika Just and Eri Kashima and Carolina Kipf and Janina V. Klingenberg and Nikita König and Aikaterina Koti and Richard G. A. Kowalik and Olga Krasnoukhova and Nora L. M. Lindvall and Mandy Lorenzen and Hannah Lutzenberger and Tânia R. A. Martins and Celia Mata German and Suzanne van der Meer and Jaime Montoya Samamé and Michael Müller and Saliha Muradoglu and Kelsey Neely and Johanna Nickel and Miina Norvik and Cheryl Akinyi Oluoch and Jesse Peacock and India O. C. Pearey and Naomi Peck and Stephanie Petit and Sören Pieper and Mariana Poblete and Daniel Prestipino and Linda Raabe and Amna Raja and Janis Reimringer and Sydney C. Rey and Julia Rizaew and Eloisa Ruppert and Kim K. Salmon and Jill Sammet and Rhiannon Schembri and Lars Schlabbach and Frederick W. P. Schmidt and Amalia Skilton and Wikaliler Daniel Smith and Hilário de Sousa and Kristin Sverredal and Daniel Valle and Javier Vera and Judith Voß and Tim Witte and Henry Wu and Stephanie Yam and Jingting Ye and Maisie Yong and Tessa Yuditha and Roberto Zariquiey and Robert Forkel and Nicholas Evans and Stephen C. 
Levinson and Martin Haspelmath and Simon J. Greenhill and Quentin D. Atkinson and Russell D. Gray},
title = {Grambank reveals the importance of genealogical constraints on linguistic diversity and highlights the impact of language loss},
journal = {Science Advances},
volume = {9},
number = {16},
pages = {eadg6175},
year = {2023},
doi = {10.1126/sciadv.adg6175},
URL = {https://www.science.org/doi/abs/10.1126/sciadv.adg6175},
eprint = {https://www.science.org/doi/pdf/10.1126/sciadv.adg6175}}
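library(lingtypology)
# Download the Grambank feature to be mapped (needs an internet connection);
# the id is typed in lower case, as described above.
df <- grambank.feature("gb042")
map.feature(df$grambank.name,
            features = df$`GB042`,
            latitude = df$latitude,
            longitude = df$longitude,
            label = df$language,
            title = "Is there productive overt morphological singular marking on nouns?")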
Warning: Language Ancient Greek is absent in our version of the Glottolog database. Did you mean Maniot Greek, Ionic-Attic Ancient Greek, West Ancient Greek, Northwestern Ancient Greek?
Warning: Language Batu (Indonesia) is absent in our version of the Glottolog database. Did you mean Bati (Indonesia)?
Warning: Language Cakfem-Mushere-Jibyal is absent in our version of the Glottolog database. Did you mean Cakfem-Mushere?
Warning: Language Central Dusun is absent in our version of the Glottolog database. Did you mean Central Bunun, Central Nusu?
Warning: Language Colloquial Jakarta Indonesian is absent in our version of the Glottolog database. Did you mean Basilectal Colloquial Jakarta Indonesian, Acrolectal Colloquial Jakarta Indonesian?
Warning: Language Colloquial Malay is absent in our version of the Glottolog database. Did you mean Central Malay?
Warning: Language Finallig is absent in our version of the Glottolog database. Did you mean Kivalliq, Fingallian, Tinauli?
Warning: Language Francisco León Zoque is absent in our version of the Glottolog database. Did you mean Franciso León?
Warning: Language Guahibo is absent in our version of the Glottolog database. Did you mean Guahiboan, Nuclear Guahiboan, Central Guahibo, Guahibo-Playero?
Warning: Language Indonesian is absent in our version of the Glottolog database. Did you mean Micronesian, Indonesian Sign, Indonesian Bajau, Peranakan Indonesian, Standard Indonesian, Basilectal Colloquial Jakarta Indonesian, Riau Indonesian, Acrolectal Colloquial Jakarta Indonesian, Standard Malay-Indonesian?
Warning: Language Karok is absent in our version of the Glottolog database. Did you mean Karuk, Karoka, Karo, Tarok, Barok, Karon?
Warning: Language Kulon-Pazeh is absent in our version of the Glottolog database. Did you mean Kulon, Pazeh, Kalondama, Kulina Pano, Lundayeh, Kompane, Kuto-Kute, Kunabe, Eggon-Ake, Kohnadeh, Kalounaye, Kondazi, Long Bleh, Kulamanen, Koron Panda, Konawe, Kele-Poke, Mulonga?
Warning: Language Lowland Tarahumara is absent in our version of the Glottolog database. Did you mean Chinatú Tarahumara?
Warning: Language N||ng-Danster !Ui is absent in our version of the Glottolog database. Did you mean Danster !Ui, ...? [long suggestion list truncated]
Warning: Language North Levantine Arabic is absent in our version of the Glottolog database. Did you mean South Levantine Arabic?
Warning: Language Nuclear Nias is absent in our version of the Glottolog database. Did you mean Nuclear Iau, Nuclear Zia, Nuclear Nisu, Nuclear Ndau?
Warning: Language Nuclear Western Fijian is absent in our version of the Glottolog database. Did you mean Nuclear Eastern Fijian?
Warning: Language Para Naga is absent in our version of the Glottolog database. Did you mean Maram Naga, Paranawa, Lama Naga?
Warning: Language Ritarungo is absent in our version of the Glottolog database. Did you mean Ritharrngu?
Warning: Language Samba Daka is absent in our version of the Glottolog database. Did you mean Sama Daka?
Warning: Language South Efate is absent in our version of the Glottolog database. Did you mean South Efatic, North Efate?
Warning: Language Southern Lengua is absent in our version of the Glottolog database. Did you mean Southern Lendu?
Warning: Language Toba-Maskoy is absent in our version of the Glottolog database. Did you mean Tamako, Tolomako, Taumako?
Warning: Language Turkic Khalaj is absent in our version of the Glottolog database. Did you mean Bukit Malay?
The AUTOTYP features are listed on the GitHub page. Feature names can be written in a more human-readable way, with spaces. You can also use a module name to download all variables in that module (e.g. Gender).
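A minimal sketch of a module download is given below. `download_autotyp()` is a hypothetical wrapper; only the module name Gender comes from the text above, and the names of the returned columns are not assumed here but simply printed.

```r
# Hypothetical wrapper: download AUTOTYP variables by feature or module name.
download_autotyp <- function(features = "Gender") {
  # A module name such as "Gender" should fetch every variable in the module.
  df <- lingtypology::autotyp.feature(features)   # needs an internet connection
  # Show which variables were downloaded before choosing one to map.
  message("Downloaded columns: ", paste(names(df), collapse = ", "))
  df
}
```

The printed column names show which variables the module contains, and any of them can then be passed to `map.feature()` together with the `language` column.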
Don't forget to cite a source:
Bickel, Balthasar, Nichols, Johanna, Zakharko, Taras, Witzlack-Makarevich, Alena, Hildebrandt, Kristine, Rießler, Michael, Bierkandt, Lennart, Zúñiga, Fernando & Lowe, John B. 2022. The AUTOTYP database (v1.1.0). https://doi.org/10.5281/zenodo.5931509
@misc{AUTOTYP,
author = {
Bickel, Balthasar and
Nichols, Johanna and
Zakharko, Taras and
Witzlack-Makarevich, Alena and
Hildebrandt, Kristine and
Rie{\ss}ler, Michael and
Bierkandt, Lennart and
Z{\'u}{\~n}iga, Fernando and
Lowe, John B
},
doi = {10.5281/zenodo.6793367},
title = {The AUTOTYP database (v1.1.1)},
url = {https://doi.org/10.5281/zenodo.6793367},
year = {2022}
}
I used only four features from PHOIBLE: the number of phonemes, the number of consonants, the number of tones and the number of vowels. If you need only a subset of them, specify it in the features argument. Since the PHOIBLE database contains a lot of duplicated information, there is also a source argument that restricts the data to a particular source.
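The sketch below shows one way to check how much duplication a download contains. Both helpers are hypothetical, and the value "all" for `source` is an assumption about what the argument accepts; the function's help page lists the valid values.

```r
# Hypothetical helper: list languages that occur more than once in a
# data frame, e.g. because several sources describe the same language.
repeated_languages <- function(df) {
  unique(df$language[duplicated(df$language)])
}

# Hypothetical wrapper: download PHOIBLE data restricted to one source.
# The default "all" is an assumption about the accepted values.
download_phoible <- function(source = "all") {
  lingtypology::phoible.feature(source = source)   # needs an internet connection
}
```

Running `repeated_languages(download_phoible())` would list the languages that appear under more than one source, which can help when deciding whether to restrict `source`.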
Don't forget to cite a source:
Moran, Steven & McCloy, Daniel & Wright, Richard (eds.) 2014. PHOIBLE Online. Leipzig: Max Planck Institute for Evolutionary Anthropology. (Available online at https://phoible.org/, Accessed on ...)
A BibTeX entry for LaTeX users is
@book{phoible,
address = {Leipzig},
editor = {Steven Moran and Daniel McCloy and Richard Wright},
publisher = {Max Planck Institute for Evolutionary Anthropology},
title = {PHOIBLE Online},
url = {https://phoible.org/},
year = {2014}
}
The AfBo database has a lot of features that distinguish affix functions, but you can also call the function without any arguments to get the whole database. This takes no extra time, since the function downloads the whole database to your PC in either case. The main distinction is that this database provides recipient and donor languages, so other column names should be used.
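Since the column names differ from the other databases, a first step could be to download everything and look for the recipient and donor columns. Both helpers below are hypothetical, and the search pattern only assumes that those column names contain the words "recipient" or "donor".

```r
# Hypothetical helper: find the columns that describe recipient or donor
# languages, whatever their exact names are.
recipient_donor_columns <- function(df) {
  grep("recipient|donor", names(df), ignore.case = TRUE, value = TRUE)
}

# Hypothetical wrapper: bare call that downloads the whole AfBo database.
download_afbo <- function() {
  df <- lingtypology::afbo.feature()   # needs an internet connection
  message("Language columns: ",
          paste(recipient_donor_columns(df), collapse = ", "))
  df
}
```

The reported columns can then be passed to `map.feature()` in place of the usual `language` column.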
Don't forget to cite a source:
Seifart, Frank. 2013. AfBo: A world-wide survey of affix borrowing. Leipzig: Max Planck Institute for Evolutionary Anthropology. (Available online at https://afbo.info/, Accessed on ...)
A BibTeX entry for LaTeX users is
@book{afbo,
address = {Leipzig},
editor = {Frank Seifart},
publisher = {Max Planck Institute for Evolutionary Anthropology},
title = {AfBo: A world-wide survey of affix borrowing},
url = {https://afbo.info/},
year = {2013}
}
The SAILS database provides a lot of features, so the function works with their ids.
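A sketch of an id-based download follows. `map_sails()` is a hypothetical helper, the id "ics10" is used purely as an illustration, and the sketch assumes that the returned data frame has a column named after the id as well as `language`, `latitude` and `longitude` columns.

```r
# Hypothetical helper: download one SAILS feature by id and map it.
map_sails <- function(feature_id = "ics10", ...) {
  df <- lingtypology::sails.feature(feature_id)   # needs an internet connection
  # Assumption: the values are stored in a column named after the id.
  column <- names(df)[tolower(names(df)) == tolower(feature_id)]
  if (length(column) != 1) {
    stop("No unique column found for feature '", feature_id, "'")
  }
  lingtypology::map.feature(languages = df$language,
                            features = df[[column]],
                            latitude = df$latitude,
                            longitude = df$longitude,
                            label = df$language,
                            ...)
}
```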
Don't forget to cite a source (modify in case of using individual chapters):
Muysken, Pieter, Harald Hammarström, Olga Krasnoukhova, Neele Müller, Joshua Birchall, Simon van de Kerke, Loretta O'Connor, Swintha Danielsen, Rik van Gijn & George Saad. 2016. South American Indigenous Language Structures (SAILS) Online. Leipzig: Online Publication of the Max Planck Institute for Evolutionary Anthropology. (Available at https://sails.clld.org/)
The ABVD database is a lexical database, so it differs from the clld databases. First of all, ABVD has its own language classification ids. Information about the same language from different sources is stored in this database under different ids. So I select several languages and map them, colored by the word with the meaning ‘hand’.
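The selection step could be sketched as below. Both helpers are hypothetical, and the column names `word` (the meaning) and `item` (the form in each language) are assumptions about the data frame that `abvd.feature()` returns.

```r
# Hypothetical helper: keep only the rows for one meaning.
rows_with_meaning <- function(df, meaning) {
  df[!is.na(df$word) & df$word == meaning, , drop = FALSE]
}

# Hypothetical wrapper: download several ABVD languages by their ABVD ids
# and map them, colored by the form used for the given meaning.
map_abvd_word <- function(language_ids, meaning = "hand", ...) {
  df <- lingtypology::abvd.feature(language_ids)   # needs an internet connection
  selected <- rows_with_meaning(df, meaning)
  lingtypology::map.feature(languages = selected$language,
                            features = selected$item,
                            label = selected$language,
                            ...)
}
```

Here `language_ids` stands for a numeric vector of ABVD language ids, which have to be looked up in the ABVD database itself.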
uralex.feature downloads data from the UraLex basic vocabulary dataset. Original language names are stored in the uralex.name variable. Converted language names for map.feature are stored in the language variable.
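To see which names were converted, the two variables can be compared directly. The helper `renamed_languages()` below is hypothetical and relies only on the `uralex.name` and `language` columns mentioned above.

```r
# Hypothetical helper: list the language names that were changed for
# map.feature(), as pairs of original and converted names.
renamed_languages <- function(df) {
  original <- as.character(df$uralex.name)
  converted <- as.character(df$language)
  changed <- !is.na(converted) & original != converted
  unique(data.frame(uralex.name = original[changed],
                    language = converted[changed],
                    stringsAsFactors = FALSE))
}
```

With an internet connection, `renamed_languages(lingtypology::uralex.feature())` would list every name that differs between the two variables.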
Don't forget to cite a source:
Kaj Syrjänen, Jyri Lehtinen, Outi Vesakoski, Mervi de Heer, Toni Suutari, Michael Dunn, Urho Määttä & Unni-Päivä Leino (2018). UraLex basic vocabulary dataset.