The Journal of Society for Dance Documentation & History

pISSN: 2383-5214 /eISSN: 2733-4279



Asian Dance Journal Vol.42 pp.191-212
DOI : 10.26861/sddh.2016.42.191

텍스트 마이닝을 기반으로 한 무용학 자료의 빅데이터 분석

Big Data Analysis for Dance Studies Using Text Mining


Lee, Jungmin; Jun, Eunja; Chae, Jungmin


The purpose of this study is to develop interdisciplinary research between dance studies and big data analysis. To this end, text mining, a technique that extracts meaningful information from text, was adopted as the research methodology. The text mining process comprised the collection of original PDF texts on the themes of Chum and Muyong (dance), morphological analysis, user dictionary construction, and social network analysis, with the aim of extracting significant named entities and clarifying the relations between them. The outcomes of the process, namely the extracted text data (10,231 copies in total), a named entity classification table, and a network of named entities, were loaded into the big data analysis system under development. The findings of the study are as follows. First, the morphological analysis identified 25 morpheme types and 24,691 words with a frequency of more than 100. A second morphological analysis was then conducted on sentences containing words such as “Chum” (춤), “Mu” (무), and “dance” (댄스). Counted by part of speech, the items with a frequency of 10 or more comprised 3,057 nouns, 602 proper nouns, 352 verbs, 205 numbers, 135 adjectives, and 35 adverbs. Second, a user dictionary was developed in the form of a taxonomy that stratifies hypernyms and hyponyms. The dictionary contained 2,404 words, which were classified by theme, person, dance piece, genre, theory, function, element, and period. Third, social network analysis revealed that the terms “Muyong,” “Chum,” and “arts and culture” were closely interconnected at the heart of the network. In contrast, “dance” (댄스) lay somewhat away from the center and was the only word connected with the network of dance sports and jazz. This study is significant because it represents the first attempt to apply text mining to written records on dance. In addition, it suggests ways to expand the use of big data analysis in dance studies.
Based on the study, a big data analysis system specialized in dance was developed, and its contents will be updated continuously.
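
The frequency filtering and network steps summarized above can be illustrated with a minimal Python sketch. It is not the authors' system: the toy sentences, the function names, and the use of sentence-level co-occurrence as the network edge are all assumptions made for illustration, and the sketch assumes that a morphological analyzer has already split each sentence into terms.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy corpus: each sentence is a list of terms that a
# morphological analyzer is assumed to have extracted already.
SENTENCES = [
    ["Muyong", "arts and culture", "Chum"],
    ["Chum", "Muyong"],
    ["dance", "jazz", "dance sports"],
    ["Muyong", "arts and culture"],
]


def frequent_terms(sentences, min_freq):
    """Count every term and keep those occurring at least min_freq times."""
    counts = Counter(term for sentence in sentences for term in sentence)
    return {term: n for term, n in counts.items() if n >= min_freq}


def cooccurrence_edges(sentences, vocabulary):
    """Weight an edge between two terms by how many sentences contain both."""
    edges = Counter()
    for sentence in sentences:
        # Deduplicate and sort so each pair has one canonical orientation.
        terms = sorted({term for term in sentence if term in vocabulary})
        for a, b in combinations(terms, 2):
            edges[(a, b)] += 1
    return edges


def weighted_degree(edges):
    """Sum edge weights per term as a simple measure of centrality."""
    degree = Counter()
    for (a, b), weight in edges.items():
        degree[a] += weight
        degree[b] += weight
    return degree


if __name__ == "__main__":
    vocabulary = frequent_terms(SENTENCES, min_freq=1)
    edges = cooccurrence_edges(SENTENCES, vocabulary)
    for term, score in weighted_degree(edges).most_common():
        print(term, score)
```

In this toy corpus, "Muyong", "Chum", and "arts and culture" form one cluster while "dance" links only to "jazz" and "dance sports", which mirrors the shape of the reported finding but is not derived from the study's data.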
