Word Embeddings: Computing Similarity

追光女孩儿 · published 2021-08-24 15:44:00

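```python
# Use a model trained on Wikipedia to produce word embeddings.
import tensorflow_hub as hub

# Load the Wiki-words-500 text embedding module from TF Hub.
embed = hub.load("https://tfhub.dev/google/Wiki-words-500/2")

# Each input string is mapped to a single embedding vector.
embeddings = embed(["cat is on the mat", "dog is in the fog"])
english_sentences = ["dog", "Puppies are nice.", "I enjoy taking long walks along the beach with my dog."]
english_embedding = embed(english_sentences)

print(embeddings)
print(english_embedding)
print(english_embedding.shape)  # one row per input sentence
```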
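
The listing above only prints the raw embedding vectors. To turn them into similarity scores, one common option is cosine similarity. The following is a minimal sketch, not part of the original post: `cosine_similarity_matrix` is a hypothetical helper name, the toy arrays stand in for real embeddings, and every input row is assumed to be non-zero.

```python
import numpy as np


def cosine_similarity_matrix(a, b):
    """Return the matrix of cosine similarities between rows of a and rows of b.

    Hypothetical helper for illustration; assumes no row is the zero vector.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    # Scale every row to unit length so the dot product equals the cosine.
    a_unit = a / np.linalg.norm(a, axis=1, keepdims=True)
    b_unit = b / np.linalg.norm(b, axis=1, keepdims=True)
    return a_unit @ b_unit.T


# Toy 2-dimensional vectors standing in for real embeddings.
toy_a = np.array([[1.0, 0.0], [1.0, 1.0]])
toy_b = np.array([[2.0, 0.0], [0.0, 3.0]])

scores = cosine_similarity_matrix(toy_a, toy_b)
print(scores)  # entry [i, j] compares row i of toy_a with row j of toy_b
```

With the real model output, the same helper could be called as `cosine_similarity_matrix(english_embedding.numpy(), embeddings.numpy())`. Normalizing first matters when the vectors are not unit length, because a plain inner product then mixes vector length into the score.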


```python
# Code for the second embedding model (multilingual Universal Sentence Encoder)

import tensorflow_hub as hub
import numpy as np
import tensorflow_text  # not used directly; importing it registers ops the multilingual model needs

# Some texts of different lengths.
english_sentences = ["dog", "Puppies are nice.", "I enjoy taking long walks along the beach with my dog."]
italian_sentences = ["cane", "I cuccioli sono carini.", "Mi piace fare lunghe passeggiate lungo la spiaggia con il mio cane."]
chinese_sentences = ["狗", "狗是友好的", "我喜欢和狗狗一起散步"]

embed = hub.load("https://tfhub.dev/google/universal-sentence-encoder-multilingual/3")

# Compute embeddings.
en_result = embed(english_sentences)
it_result = embed(italian_sentences)
ch_result = embed(chinese_sentences)

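# Compute similarity matrices. A higher score indicates greater similarity.
# Entry [i, j] compares English sentence i with sentence j of the other language.
similarity_matrix_it = np.inner(en_result, it_result)
# Note: despite the "_ja" suffix, this matrix holds the English-vs-Chinese scores.
similarity_matrix_ja = np.inner(en_result, ch_result)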