The Application of Word2Vec in Text Similarity Calculation


DOI 10.12208/j.aics.20250022
Journal
Advances in International Computer Science
Year, Volume (Issue) 2025, 5(2)
Author
Huang Yunlei
Author affiliation

浙江清桦智控科技有限公司, Hangzhou, Zhejiang

Abstract
With the rapid growth of text data, efficiently computing text similarity has become a key task in natural language processing (NLP). Word2Vec, a word embedding technique, maps words to low-dimensional dense vectors that capture semantic information, which has changed how text similarity is computed. Its two core architectures are the continuous bag-of-words (CBOW) and skip-gram models, and training is made efficient by techniques such as negative sampling and hierarchical softmax. Combined with measures such as cosine similarity and Euclidean distance, Word2Vec has been widely applied in areas such as information retrieval and text classification. Looking ahead, combining Word2Vec with emerging technologies offers broad application prospects.
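As an illustration of the similarity step described in the abstract, the following minimal sketch averages word vectors into a text vector and compares texts with cosine similarity and Euclidean distance. It is not taken from the article: the three-dimensional vectors in `TOY_VECTORS` are hand-written and hypothetical, the helper names are invented for this example, and averaging is only one common way to turn word vectors into a text vector. In practice the vectors would come from a trained Word2Vec model.

```python
import math

# Hypothetical 3-dimensional embeddings, written by hand for illustration only.
# Real Word2Vec vectors are learned from a corpus and are usually much larger.
TOY_VECTORS = {
    "cat": [0.9, 0.1, 0.0],
    "dog": [0.8, 0.2, 0.1],
    "pet": [0.7, 0.3, 0.1],
    "car": [0.0, 0.9, 0.4],
    "road": [0.1, 0.8, 0.5],
}


def text_vector(tokens, vectors):
    """Average the vectors of all tokens that have an embedding."""
    known = [vectors[t] for t in tokens if t in vectors]
    if not known:
        raise ValueError("no token has an embedding")
    dim = len(known[0])
    return [sum(v[i] for v in known) / len(known) for i in range(dim)]


def cosine_similarity(a, b):
    """Cosine of the angle between a and b (0.0 if either is a zero vector)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def euclidean_distance(a, b):
    """Straight-line distance between a and b."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


pets_a = text_vector(["cat", "pet"], TOY_VECTORS)
pets_b = text_vector(["dog", "pet"], TOY_VECTORS)
traffic = text_vector(["car", "road"], TOY_VECTORS)

print("cosine(pets_a, pets_b): ", cosine_similarity(pets_a, pets_b))
print("cosine(pets_a, traffic):", cosine_similarity(pets_a, traffic))
print("dist(pets_a, pets_b):   ", euclidean_distance(pets_a, pets_b))
print("dist(pets_a, traffic):  ", euclidean_distance(pets_a, traffic))
```

With these toy vectors, the two pet-related texts score a higher cosine similarity and a smaller Euclidean distance than the pet and traffic pair. Cosine similarity ignores vector length, while Euclidean distance does not, so the two measures can rank pairs differently on real embeddings.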
Keywords
Word2Vec; text similarity; word embedding; natural language processing; semantic computation
Pages 52-54
  • Cite this article

Huang Yunlei. The Application of Word2Vec in Text Similarity Calculation [J]. Advances in International Computer Science. 2025, 5(2): 52-54.
