Google Research

ConceptVector: Text Visual Analytics via Interactive Lexicon Building using Word Embedding

  • Deok Gun Park
  • Seungyeon Kim
  • Jurim Lee
  • Jaegul Choo
  • Nicholas Diakopoulos
  • Niklas Elmqvist
IEEE Transactions on Visualization and Computer Graphics (TVCG) (2017)


Central to many text analysis methods is the notion of a concept: a set of semantically related keywords characterizing a specific object, phenomenon, or theme. Advances in word embedding allow building such concepts from a small set of seed terms. However, naive application of such techniques may result in false positive errors because of the polysemy of human language. To mitigate this problem, we present a visual analytics system called ConceptVector that guides the user in building such concepts and then using them to analyze documents. Document-analysis case studies with real-world datasets demonstrate the fine-grained analysis provided by ConceptVector. To support the elaborate modeling of concepts using user seed terms, we introduce a bipolar concept model and support for irrelevant words. We validate the interactive lexicon building interface via a user study and expert reviews. The quantitative evaluation shows that the bipolar lexicon generated with our methods is comparable to human-generated ones.

Learn more about how we do research

We maintain a portfolio of research projects, providing individuals and teams the freedom to emphasize specific types of work