2021 Abstract

2-2. YANG WonSeok; ‘토픽 모델링(topic modeling)’을 활용한 <詩經> 텍스트 분석 방법 연구 / A Study on Text Analysis of The Book of Poetry Using Topic Modeling / 利用主题模型<詩經>文本分析方法硏究 (2021-10-04 11:08)

‘토픽 모델링(topic modeling)’을 활용한 <詩經> 텍스트 분석 방법 연구 

A Study on Text Analysis of The Book of Poetry Using Topic Modeling

利用主题模型<詩經>文本分析方法硏究


  • YANG WonSeok (楊沅錫, Department of Sino-Korean Literature, KOREA UNIVERSITY, South Korea)
  • JUNG SUNG HOON (鄭性勳, Department of Korean Language & Literature, National Mokpo UNIVERSITY, South Korea)  

The purpose of this paper is to present a method for analyzing the text of The Book of Poetry (詩經) using 'topic modeling', and to apply this method to the commentaries written by scholars of the late Joseon dynasty in order to identify trends in them.
As is well known, The Book of Poetry is a collection of songs that were sung in the middle reaches of the Yellow River (Huang He) in China over a span of about 500 years, from the early Western Zhou (西周) period (around 1100 BC) to the mid Spring and Autumn (春秋) period (around 600 BC). It is regarded as a masterpiece of the literature of China and of the wider East Asian cultural sphere that used Chinese characters.
Since its introduction to the Korean Peninsula, many scholars have written commentaries on it, leaving a variety of scholarly achievements. Research on these commentaries has been conducted steadily, but most studies have dealt with individual commentators or with major controversies in the study of The Book of Poetry; no study has covered all, or even a large number, of the commentaries on The Book of Poetry from the late Joseon (朝鮮) period. This is due in part to a lack of researchers able to read the texts. To address this problem, that is, to make a study of all or a large number of the late Joseon commentaries feasible, this paper uses a method called 'topic modeling'.
A 'topic model', a concept from machine learning and natural language processing, is a statistical model for discovering the abstract 'topics' in a set of documents. In other words, it is a text mining technique used to discover the hidden semantic structure of a body of text, and it is accordingly also called a 'probabilistic topic model'. 'Topic modeling' refers to the use of such a topic model. The procedure converts the documents (natural language text) into a document-term matrix (DTM), passes the DTM to latent Dirichlet allocation (LDA), which returns its output as association matrices, and then examines the output matrices and their detailed information.
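The DTM-to-LDA procedure described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not say which software or inference method was used, so the sketch uses a simple collapsed Gibbs sampler written with the standard library only. The function names `build_dtm` and `lda_gibbs`, the hyperparameters `alpha` and `beta`, the number of topics, and the toy tokens are all hypothetical; the toy tokens are invented placeholders and are not taken from The Book of Poetry or its commentaries.

```python
import random
from collections import Counter


def build_dtm(docs):
    """Return (vocab, dtm); dtm[d][v] is the count of term v in document d."""
    vocab = sorted({tok for doc in docs for tok in doc})
    index = {tok: i for i, tok in enumerate(vocab)}
    dtm = [[0] * len(vocab) for _ in docs]
    for d, doc in enumerate(docs):
        for tok, n in Counter(doc).items():
            dtm[d][index[tok]] = n
    return vocab, dtm


def lda_gibbs(dtm, n_topics, n_iter=200, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA to a DTM by collapsed Gibbs sampling.

    Returns (doc_topic, topic_term), two row-normalised matrices.
    """
    rng = random.Random(seed)
    n_docs, n_terms = len(dtm), len(dtm[0])
    # Expand counts back into token occurrences (LDA ignores word order).
    tokens = [[v for v, n in enumerate(row) for _ in range(n)] for row in dtm]
    ndk = [[0] * n_topics for _ in range(n_docs)]   # document-topic counts
    nkv = [[0] * n_terms for _ in range(n_topics)]  # topic-term counts
    nk = [0] * n_topics                             # tokens assigned per topic
    z = []                                          # topic of every token
    for d, doc in enumerate(tokens):
        z.append([])
        for v in doc:
            k = rng.randrange(n_topics)  # random initial assignment
            z[d].append(k)
            ndk[d][k] += 1
            nkv[k][v] += 1
            nk[k] += 1
    for _ in range(n_iter):
        for d, doc in enumerate(tokens):
            for i, v in enumerate(doc):
                k = z[d][i]
                # Remove the token, then resample its topic given all others.
                ndk[d][k] -= 1
                nkv[k][v] -= 1
                nk[k] -= 1
                weights = [
                    (ndk[d][t] + alpha) * (nkv[t][v] + beta)
                    / (nk[t] + n_terms * beta)
                    for t in range(n_topics)
                ]
                k = rng.choices(range(n_topics), weights=weights)[0]
                z[d][i] = k
                ndk[d][k] += 1
                nkv[k][v] += 1
                nk[k] += 1
    doc_topic = [
        [(c + alpha) / (sum(row) + n_topics * alpha) for c in row]
        for row in ndk
    ]
    topic_term = [
        [(c + beta) / (nk[k] + n_terms * beta) for c in nkv[k]]
        for k in range(n_topics)
    ]
    return doc_topic, topic_term


# Invented placeholder tokens (NOT real corpus data), already tokenised.
docs = [
    ["river", "reed", "river", "boat"],
    ["millet", "field", "harvest", "millet"],
    ["reed", "boat", "river"],
    ["harvest", "field", "millet"],
]
vocab, dtm = build_dtm(docs)
doc_topic, topic_term = lda_gibbs(dtm, n_topics=2)

# Inspect the output matrices: the three highest-weighted terms per topic.
for k, row in enumerate(topic_term):
    top = sorted(zip(row, vocab), reverse=True)[:3]
    print(k, [term for _, term in top])
```

Here `doc_topic` gives, for each document, its estimated mixture over topics, and `topic_term` gives, for each topic, its estimated distribution over vocabulary terms. These stand in for the association matrices that are examined in the final step. For classical Chinese text, a tokenisation step (for example, splitting into characters or words) would have to precede `build_dtm`; the abstract does not specify how this is done.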
Analyzing the text of The Book of Poetry with topic modeling is a research method applied here for the first time in the study of Chinese literature in Korea. In particular, it adds quantitative results to findings previously obtained through qualitative analysis, and it is meaningful in that it makes more in-depth research on the text of The Book of Poetry possible.