TY - JOUR
T1 - Quantitative analysis of culture using millions of digitized books
AU - Michel, Jean-Baptiste
AU - Shen, Yuan Kui
AU - Aiden, Aviva Presser
AU - Veres, Adrian
AU - Gray, Matthew K.
AU - Pickett, Joseph P.
AU - Hoiberg, Dale
AU - Clancy, Dan
AU - Norvig, Peter
AU - Orwant, Jon
AU - Pinker, Steven
AU - Nowak, Martin A.
AU - Aiden, Erez Lieberman
PY - 2011/1/14
Y1 - 2011/1/14
N2 - We constructed a corpus of digitized texts containing about 4% of all books ever printed. Analysis of this corpus enables us to investigate cultural trends quantitatively. We survey the vast terrain of 'culturomics,' focusing on linguistic and cultural phenomena that were reflected in the English language between 1800 and 2000. We show how this approach can provide insights about fields as diverse as lexicography, the evolution of grammar, collective memory, the adoption of technology, the pursuit of fame, censorship, and historical epidemiology. Culturomics extends the boundaries of rigorous quantitative inquiry to a wide array of new phenomena spanning the social sciences and the humanities.
UR - http://www.scopus.com/inward/record.url?scp=78651466764&partnerID=8YFLogxK
U2 - 10.1126/science.1199644
DO - 10.1126/science.1199644
M3 - Article
C2 - 21163965
AN - SCOPUS:78651466764
SN - 0036-8075
VL - 331
SP - 176
EP - 182
JO - Science
JF - Science
IS - 6014
ER -