Questions tagged [word-embedding]

For questions about word embedding, a language modelling technique in natural language processing. Questions can concern particular methods, such as Word2Vec, GloVe, and FastText, or word embeddings and their use in machine learning libraries in general.

1,112 questions

0 votes

0 answers

18 views

How to get multimodal embeddings from CLIP model?

I'm hoping to use CLIP to get a single embedding for rows of multimodal (image and text) data. Say I have the following model: from PIL import Image import torch from transformers import CLIPProcessor,...

T_d

asked Jul 15 at 19:53

-1 votes

0 answers

17 views

Models for getting similarity scores between categories and keywords [closed]

I want to get a similarity score between a category like vehicles and a list of words like headphone, water, truck, and green. The goal would be for each score to be low on words outside the category ...

Kayla Farivar

asked Jul 11 at 23:32

-1 votes

1 answer

22 views

Recreating Text Embeddings From An Example Dataset

I have a list of sentences, and a list of their ideal embeddings on a 25-dimensional vector. I am trying to use a neural network to generate new encodings, but I am struggling. While the model runs ...

slastine

asked Jul 10 at 1:39

1 vote

0 answers

59 views

Is there a way to use CodeBERT to embed source code without natural language in input?

On CodeBERT's GitHub, they provide an example of using an NL-PL pair with the pretrained base model to create an embedding. I am looking to create an embedding using just source code which does not have ...

Armand Mousavi

asked Jun 28 at 20:09

0 votes

0 answers

21 views

Small corpus, want to find associations. Word2Vec?

I'm a psychologist, and I'm diving into the field of AI. I could really use some help for a project. This semester, I discovered Word2Vec and was mesmerized by its capability to find associations. So, ...

Vinicius Fantini Marques Roja

asked Jun 28 at 4:52

0 votes

0 answers

48 views

Retrieve Metadata from the Chroma DB Vector Store

I want to build an LLM application using Langchain, Ollama, RAG and Streamlit. My problem is: in the Streamlit application, after uploading the PDF, it takes so much time to generate and deliver the answer....

Urvesh

asked Jun 27 at 15:31

0 votes

1 answer

16 views

Calculation of document word vector in Python. Sum or average word2vec?

I have some questions about generating a dissimilarity matrix of a bunch of text documents using word vectors. Here I tokenise the text, remove OOV and then sum the word vectors of each word to use as ...

D. Zammit

asked Jun 25 at 10:55

1 vote

3 answers

83 views

Getting GloVe embeddings using gensim, triu not found in scipy.linalg

I am trying to build a sentiment analysis model, using the GloVe word embeddings... I found multiple sources on how to import the embeddings into Python, this one seemed to be the simplest... Trying ...

Mel7

asked Jun 23 at 6:37

0 votes

0 answers

72 views

Vector Embedding using Spark for compute

I have some large Parquet files of data in Iceberg (which I have stored using Spark). My objective now is to pull these down using Spark, convert them into a Spark DataFrame, perform vector embedding ...

Mimis Chlympatsos

asked Jun 17 at 21:18

0 votes

0 answers

49 views

Question about encode_multi_process method of the SentenceTransformer

How can I leverage the encode_multi_process method of the SentenceTransformer class to encode a large list of sentences using multiple GPUs? I tried using the encode_multi_process method of the ...

Alexis López

asked Jun 12 at 13:54

0 votes

0 answers

18 views

Skip-Gram Model description in word2vec explanation article

In his article word2vec Parameter Learning Explained Xin Rong says (page 7): Each output is computed using the same hidden->output matrix: Looking into the word2vec source code I don’t see any “...

Damir Tenishev

2,329

asked Jun 10 at 20:37

1 vote

1 answer

187 views

Cannot access embeddings endpoint on vLLM hosting llama3-8b-instruct

I'm using vLLM to run llama3-8b-instruct on a machine. I can access the chat endpoint, but when I access the embedding endpoint using the following code I get NotFoundError: Error code: 404 - {'detail': '...

Derrick Zhang

21.4k

asked Jun 6 at 9:14

1 vote

1 answer

38 views

Fasttext pre-trained model is not producing OOV word vectors when using gensim downloader

I am having A LOT of trouble when trying to use all the fasttext libraries (in Jupyter with Anaconda3 on Windows 11) that I have found so far but this question is mainly about gensim's implementation. ...

D. Zammit

asked May 27 at 19:40

1 vote

3 answers

179 views

"Deadline" error when embedding video with Google Vertex AI multimodal embedding model

I am currently using Vertex AI's Multimodal Embedding Model (https://cloud.google.com/vertex-ai/generative-ai/docs/embeddings/get-multimodal-embeddings). I was able to get the Image and Text examples ...

theGreenCabbage

4,769

asked May 23 at 20:27

1 vote

0 answers

128 views

How to Convert HTML to Text Suitable for Vector Embedding Models

I would like to convert HTML files to plain text in a manner that preserves the logical structure of the HTML in terms of titles <h1>, subtitles <h2>, sub-subtitles <h3> and let's not ...

Draco

asked May 22 at 19:51
