Publications and Research

Word-Length Correlations and Memory in Large Texts: A Visibility Network Analysis

Lev Guzmán-Vargas, Instituto Politécnico Nacional
Bibiana Obregón-Quintana, Universidad Nacional Autónoma de México
Daniel Aguilar-Velázquez, Instituto Politécnico Nacional
Ricardo Hernández-Pérez, Instituto Politécnico Nacional
Larry S. Liebovitch, CUNY Queens College

Document Type

Article

Publication Date

11-20-2015

Abstract

We study the correlation properties of word lengths in large texts from 30 ebooks in the English language from the Gutenberg Project (www.gutenberg.org) using the natural visibility graph method (NVG). NVG converts a time series into a graph and then analyzes its graph properties. First, the original sequence of words is transformed into a sequence of values containing the length of each word, and then, it is integrated. Next, we apply the NVG to the integrated word-length series and construct the network. We show that the degree distribution of that network follows a power law, $P(k) \sim k^{-\gamma}$, with two regimes, which are characterized by the exponents $\gamma_s \approx 1.7$ (at short degree scales) and $\gamma_l \approx 1.3$ (at large degree scales). This suggests that word lengths are much more strongly correlated at large distances between words than at short distances between words. That finding is also supported by the detrended fluctuation analysis (DFA) and recurrence time distribution. These results provide new information about the universal characteristics of the structure of written texts beyond that given by word frequencies.
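
The pipeline described in the abstract (word lengths, integration, NVG construction, degree distribution) can be illustrated with a minimal Python sketch. This is not the authors' code. The function names, the regular-expression tokenizer, and the choice to subtract the mean before taking the cumulative sum are all assumptions made for illustration; the abstract only states that the series "is integrated". The graph construction uses the standard natural visibility criterion: two points are linked when the straight line between them passes above every intermediate point.

```python
import re
from collections import Counter


def word_lengths(text):
    # Assumption: a "word" is a run of ASCII letters; the paper's exact
    # tokenization rules are not given in the abstract.
    return [len(w) for w in re.findall(r"[A-Za-z]+", text)]


def integrate(series):
    # Assumption: integration means the cumulative sum of deviations
    # from the mean (the profile commonly used in DFA).
    mean = sum(series) / len(series)
    profile, total = [], 0.0
    for x in series:
        total += x - mean
        profile.append(total)
    return profile


def natural_visibility_graph(y):
    # Points i < j see each other when every intermediate point k lies
    # strictly below the line joining (i, y[i]) and (j, y[j]). This is
    # equivalent to the slope from i to j exceeding the slope from i to
    # every intermediate k, so a running maximum suffices.
    # Cost is O(n^2), fine for a demo but slow for book-length series.
    edges = set()
    n = len(y)
    for i in range(n - 1):
        max_slope = float("-inf")
        for j in range(i + 1, n):
            slope = (y[j] - y[i]) / (j - i)
            if slope > max_slope:
                edges.add((i, j))
                max_slope = slope
    return edges


def degree_distribution(edges, n):
    # Empirical P(k): fraction of the n nodes that have degree k.
    degree = [0] * n
    for i, j in edges:
        degree[i] += 1
        degree[j] += 1
    counts = Counter(degree)
    return {k: c / n for k, c in sorted(counts.items())}
```

A typical call chain would be `profile = integrate(word_lengths(text))`, then `degree_distribution(natural_visibility_graph(profile), len(profile))`. Estimating the exponents of the two power-law regimes would require fitting the resulting distribution on logarithmic axes, which this sketch leaves out.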

Comments

This article originally appeared in Entropy, available at DOI: 10.3390/e17117798

This article is distributed under the terms and conditions of the Creative Commons Attribution license (http://creativecommons.org/licenses/by/4.0/).

Included in

Digital Humanities Commons, Psychology Commons
