Entropy Estimation Using a Linguistic Zipf-Mandelbrot-Li Model for Natural Sequences.

Zipf–Mandelbrot–Li law; entropy estimation; language models; probabilistic natural sequences

Journal

Entropy (Basel, Switzerland)
ISSN: 1099-4300
Abbreviated title: Entropy (Basel)
Country: Switzerland
NLM ID: 101243874

Publication information

Publication date:
24 Aug 2021
History:
received: 15 07 2021
revised: 14 08 2021
accepted: 19 08 2021
entrez: 28 9 2021
pubmed: 29 9 2021
medline: 29 9 2021
Status: epublish

Abstract

Entropy estimation faces numerous challenges when applied to real-world problems. Our interest is in divergence and entropy estimation algorithms that are capable of rapid estimation for natural sequence data such as human and synthetic languages. Such estimation typically requires a large amount of data; we instead propose an approach based on a new rank-based analytic Zipf-Mandelbrot-Li (ZML) probabilistic model. Unlike previous approaches, which do not consider the nature of the probability distribution in relation to language, we introduce a novel analytic Zipfian model that includes linguistic constraints. This provides more accurate distributions for natural sequences such as natural or synthetic emergent languages. Results are given which indicate the performance of the proposed ZML model. We derive an entropy estimation method which incorporates the linguistic constraint-based Zipf-Mandelbrot-Li model into a new non-equiprobable coincidence counting algorithm, which is shown to be effective for tasks such as entropy rate estimation with limited data.
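
As a rough illustration of why a rank-based model and a non-equiprobable treatment of coincidences matter, the sketch below builds a generic Zipf-Mandelbrot distribution, p(r) proportional to (r + beta)^(-alpha), and compares its Shannon entropy with its collision (Renyi order-2) entropy. This is not the authors' ZML model or their coincidence counting algorithm: the function names and the values of alpha, beta and n_ranks are assumptions chosen only for illustration, and the linguistic constraints described in the abstract are not reproduced here.

```python
import math


def zipf_mandelbrot_pmf(n_ranks, alpha=1.0, beta=1.0):
    """Normalized p(r) proportional to (r + beta)**(-alpha), r = 1..n_ranks.

    alpha and beta are free illustrative parameters (assumptions), not the
    linguistically constrained values derived in the paper.
    """
    if n_ranks < 1:
        raise ValueError("n_ranks must be at least 1")
    weights = [(r + beta) ** (-alpha) for r in range(1, n_ranks + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def shannon_entropy_bits(pmf):
    """Shannon entropy H = -sum p log2 p, in bits."""
    return -sum(p * math.log2(p) for p in pmf if p > 0.0)


def collision_entropy_bits(pmf):
    """Renyi order-2 entropy -log2(sum p^2), where sum p^2 is the
    probability that two independent draws coincide."""
    return -math.log2(sum(p * p for p in pmf))


# Hypothetical vocabulary of 1000 ranked symbols.
pmf = zipf_mandelbrot_pmf(n_ranks=1000, alpha=1.0, beta=1.0)
h_shannon = shannon_entropy_bits(pmf)
h_collision = collision_entropy_bits(pmf)
print(f"Shannon entropy:   {h_shannon:.3f} bits")
print(f"Collision entropy: {h_collision:.3f} bits")
```

For an equiprobable alphabet the two quantities agree, so counting coincidences recovers the Shannon entropy directly. For a Zipfian distribution the collision entropy falls below the Shannon entropy, which is one way to see why a coincidence-based estimator needs a model of the non-equiprobable distribution.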

Identifiers

pubmed: 34573725
pii: e23091100
doi: 10.3390/e23091100
pmc: PMC8468050

Publication types

Journal Article

Languages

eng

Grants

Agency: University of Queensland
ID: 2019002828
Agency: Trusted Autonomous Systems Defence Cooperative Research Centre
ID: 2019002828

Authors

Andrew D Back (AD)

School of Information Technology and Electrical Engineering, The University of Queensland, Brisbane, QLD 4072, Australia.

Janet Wiles (J)

School of Information Technology and Electrical Engineering, The University of Queensland, Brisbane, QLD 4072, Australia.
