Mapping Lexical Dialect Variation in British English Using Twitter.
British English
Twitter
big data
dialectology
lexical variation
social media
sociolinguistics
spatial analysis
Journal
Frontiers in artificial intelligence
ISSN: 2624-8212
Abbreviated title: Front Artif Intell
Country: Switzerland
NLM ID: 101770551
Publication information
Publication date:
2019
History:
received: 02 04 2019
accepted: 12 06 2019
entrez: 18 3 2021
pubmed: 12 7 2019
medline: 12 7 2019
Status:
epublish
Abstract
There is a growing trend in regional dialectology to analyse large corpora of social media data, but it is unclear if the results of these studies can be generalized to language as a whole. To assess the generalizability of Twitter dialect maps, this paper presents the first systematic comparison of regional lexical variation in Twitter corpora and traditional survey data. We compare the regional patterns found in 139 lexical dialect maps based on a 1.8 billion word corpus of geolocated UK Twitter data and the BBC Voices dialect survey. A spatial analysis of these 139 map pairs finds a broad alignment between these two data sources, offering evidence that both approaches to data collection allow for the same basic underlying regional patterns to be identified. We argue that these results license the use of Twitter corpora for general inquiries into regional lexical variation and change.
Identifiers
pubmed: 33733100
doi: 10.3389/frai.2019.00011
pmc: PMC7861259
Publication types
Journal Article
Languages
eng
Pagination
11
Copyright information
Copyright © 2019 Grieve, Montgomery, Nini, Murakami and Guo.