Visualizing text similarities from a graph-based SOM
DOI:
https://doi.org/10.24297/ijct.v14i7.1889
Keywords:
Clustering, visualization, self-organizing map, text similarity, Google PageRank
Abstract
Text in articles reflects the expert opinion of a large number of people, including the views of the authors. These views are shaped by cultural or community aspects, which makes extracting information from text very difficult. This paper introduces how to use a modified graph-based Self-Organizing Map (SOM) to show text similarities. Text similarities are extracted from an article using Google's PageRank algorithm. Sentences from an input article are represented as a graph model instead of a vector space model. The resulting graph can be shown as a visual animation of the execution of eight well-known graph algorithms, with animation speed control.
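The sentence-graph ranking described above can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not state how edge weights are computed, so the Jaccard word-overlap weighting, the damping factor of 0.85, and the function names `build_sentence_graph` and `pagerank` are all assumptions made for illustration.

```python
import re


def tokenize(sentence):
    """Lowercase a sentence and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", sentence.lower()))


def build_sentence_graph(sentences):
    """Return a symmetric weight matrix with one node per sentence.

    Assumption: edges are weighted by Jaccard word overlap. The paper's
    actual similarity measure is not given in the abstract.
    """
    tokens = [tokenize(s) for s in sentences]
    n = len(sentences)
    weights = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            union = tokens[i] | tokens[j]
            if union:
                w = len(tokens[i] & tokens[j]) / len(union)
                weights[i][j] = weights[j][i] = w
    return weights


def pagerank(weights, damping=0.85, iterations=100, tol=1e-9):
    """Weighted PageRank by power iteration; the returned ranks sum to 1."""
    n = len(weights)
    if n == 0:
        return []
    rank = [1.0 / n] * n
    out_sum = [sum(row) for row in weights]
    for _ in range(iterations):
        # Rank held by nodes with no edges is spread evenly over all nodes.
        dangling = sum(rank[j] for j in range(n) if out_sum[j] == 0)
        new_rank = []
        for i in range(n):
            incoming = sum(
                rank[j] * weights[j][i] / out_sum[j]
                for j in range(n)
                if out_sum[j] > 0
            )
            new_rank.append((1 - damping) / n + damping * (incoming + dangling / n))
        delta = sum(abs(a - b) for a, b in zip(rank, new_rank))
        rank = new_rank
        if delta < tol:
            break
    return rank
```

Calling `pagerank(build_sentence_graph(sentences))` on a list of sentences gives one score per sentence; sentences that share vocabulary with many others score higher than isolated ones.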
The resulting graph is used as input to the SOM, whose clustering algorithm is used to construct knowledge from the text data. We use a visual animation of eight well-known graph methods with animation speed control, and, according to the similarity measure, an adjustable number of the most similar sentences are arranged in visual form. In addition, this paper presents a wide variety of text-search options. We compared our project with well-known clustering and visualization projects in terms of purity, entropy, and F-measure. Our project showed acceptable results and, in most cases, superiority over the other projects.
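The three evaluation measures named above can be sketched as follows. The abstract does not say which variants were used, so this assumes the common textbook definitions: purity as the fraction of items in their cluster's majority class, entropy as the size-weighted base-2 class entropy of each cluster, and F-measure as the class-size-weighted best F1 over clusters. The function names are hypothetical.

```python
import math
from collections import Counter


def contingency(clusters, labels):
    """Map each cluster id to a Counter of the true labels it contains."""
    table = {}
    for cluster, label in zip(clusters, labels):
        table.setdefault(cluster, Counter())[label] += 1
    return table


def purity(clusters, labels):
    """Fraction of items that belong to their cluster's majority class."""
    table = contingency(clusters, labels)
    return sum(max(counts.values()) for counts in table.values()) / len(labels)


def entropy(clusters, labels):
    """Size-weighted average class entropy per cluster (lower is better)."""
    table = contingency(clusters, labels)
    n = len(labels)
    total = 0.0
    for counts in table.values():
        size = sum(counts.values())
        h = -sum((v / size) * math.log2(v / size) for v in counts.values())
        total += (size / n) * h
    return total


def f_measure(clusters, labels):
    """For each class take the best F1 over all clusters, weighted by class size."""
    table = contingency(clusters, labels)
    n = len(labels)
    score = 0.0
    for label, class_size in Counter(labels).items():
        best = 0.0
        for counts in table.values():
            hit = counts.get(label, 0)
            if hit == 0:
                continue
            precision = hit / sum(counts.values())
            recall = hit / class_size
            best = max(best, 2 * precision * recall / (precision + recall))
        score += (class_size / n) * best
    return score
```

Each function takes two parallel sequences: the cluster assigned to each item and its reference class. Higher purity and F-measure and lower entropy indicate a clustering that agrees more closely with the reference classes.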