Abstract
Keyphrase extraction is a fundamental task
in natural language processing that maps
each document to a set of representative
phrases. In this paper, we present an unsupervised
technique (Key2Vec) that leverages
phrase embeddings for ranking keyphrases
extracted from scientific articles. Specifically,
we propose an effective way of processing
text documents to train multi-word
phrase embeddings. These embeddings are used
to build a thematic representation of each scientific
article and to rank the keyphrases extracted from it using
theme-weighted PageRank. Evaluations on
benchmark datasets show that Key2Vec achieves
state-of-the-art results.