Questions tagged [large-language-model]
A general tag for large language model (LLM)-related subjects. Please ALWAYS use the more specific tags if available (GPT variants, PaLM, LLaMA, BLOOM, Claude, etc.).
1,557 questions
0 votes, 0 answers, 9 views
RuntimeError: probability tensor contains either `inf`, `nan` or element < 0
I am trying to output the response of the Llama 2 model that I installed locally, but when I try to execute the following lines:
output = model.generate(**inputs, streamer=streamer,
use_cache=True, ...
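This error typically means the sampling distribution degenerated, most often from fp16 overflow producing NaN logits or from an extreme temperature/top_p combination. A minimal sketch of a safer generation call, assuming a standard Hugging Face transformers setup; the model id and decoding settings are illustrative:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Llama-2-7b-chat-hf"  # illustrative; substitute your local path
tokenizer = AutoTokenizer.from_pretrained(model_id)
# Loading in fp32 rules out fp16 overflow producing NaN logits.
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.float32)

inputs = tokenizer("Hello, how are you?", return_tensors="pt")
# Greedy decoding (do_sample=False) skips the sampling step that raises this
# error; if you need sampling, keep temperature and top_p in a moderate range.
output = model.generate(**inputs, do_sample=False, max_new_tokens=64)
print(tokenizer.decode(output[0], skip_special_tokens=True))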
0 votes, 0 answers, 9 views
Training an LLM uses an unexpected amount of GPU memory
I'm training a model with a self-implemented training loop. A 1.5B Qwen2 occupies 40 GB of GPU memory, while the same training with LLaMA-Factory takes only about 24 GB.
I tried to delete some ...
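Gaps like this usually come from optimizer state, gradient dtype, and activations rather than the weights: a 1.5B-parameter model in fp32 with AdamW already needs roughly 1.5B × (4 + 4 + 8) bytes ≈ 24 GB for weights, gradients, and Adam moments before any activations. A hedged sketch of two settings that frameworks like LLaMA-Factory enable by default and a hand-rolled loop may miss:

import torch
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2-1.5B", torch_dtype=torch.bfloat16)
model.gradient_checkpointing_enable()  # recompute activations during backward
model.config.use_cache = False         # the KV cache is wasted memory in training

optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)
# Inside the loop, optimizer.zero_grad(set_to_none=True) frees gradient
# tensors instead of keeping zero-filled ones alive.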
0 votes, 0 answers, 17 views
How to evaluate an LLM response
I am retrieving responses using the QWEN 72B model. I want to validate my responses but do not have ground-truth answers. How can I evaluate my responses without the help of ground-truth answers? I want to use ...
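Without references, a common approach is reference-free scoring, for example an LLM-as-a-judge rubric or consistency checks across multiple samples. A minimal LLM-as-a-judge sketch; the prompt and the llm_call hook are illustrative, not any specific library's API:

import json

JUDGE_PROMPT = """You are grading an answer for faithfulness and relevance.
Question: {question}
Answer: {answer}
Score the answer from 1 (poor) to 5 (excellent).
Reply as JSON: {{"score": <int>, "reason": "<short string>"}}"""

def judge(question: str, answer: str, llm_call) -> dict:
    # llm_call is any str -> str function backed by a strong judge model.
    raw = llm_call(JUDGE_PROMPT.format(question=question, answer=answer))
    return json.loads(raw)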
-1 votes, 0 answers, 10 views
How to remove ``` backticks from SQL queries generated by the Gemini LLM when building an NL2SQL chatbot
I am using an LLM to fetch data from my Postgres DB table.
This is the output that is being generated, even though I have mentioned in the prompt not to add backticks while generating SQL queries.
This is ...
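Prompt instructions alone rarely stop a model from wrapping SQL in a markdown fence; stripping the fence in post-processing is more reliable. A small sketch in plain Python, with no Gemini-specific API assumed:

import re

def strip_sql_fences(text: str) -> str:
    # Remove a surrounding ```sql ... ``` (or bare ```) fence if present.
    match = re.search(r"```(?:sql)?\s*(.*?)\s*```", text, flags=re.DOTALL | re.IGNORECASE)
    return match.group(1).strip() if match else text.strip()

print(strip_sql_fences("```sql\nSELECT * FROM my_table;\n```"))  # SELECT * FROM my_table;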
0 votes, 0 answers, 9 views
Unable to import SentenceTransformer
I am using Colab and trying to import SentenceTransformer:
from sentence_transformers import SentenceTransformer
However, I got this error:
AttributeError Traceback (most ...
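On Colab this is typically a version mismatch between sentence-transformers and its transformers/huggingface_hub dependencies; upgrading and restarting the runtime usually resolves it. An illustrative cell, assuming a stale dependency is the cause:

# Upgrade the stack, then Runtime -> Restart runtime before re-importing:
#   !pip install -U sentence-transformers transformers huggingface_hub
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")  # small illustrative model
print(model.encode(["hello world"]).shape)       # (1, 384)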
-2 votes, 0 answers, 19 views
Training a hybrid model that integrates contextual and numerical features for a classification problem [closed]
I am working on a critical production risk-analysis problem: based on each record, I want to assign a risk rank from 0 to 5. The training set is fairly imbalanced.
> "0.0 964
> 1.0 393
&...
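One common pattern is to embed the text field, concatenate the embedding with the numerical features, and train a standard classifier with class weighting against the imbalance. A hedged sketch with sentence-transformers and scikit-learn; the records, features, and encoder are all hypothetical:

import numpy as np
from sentence_transformers import SentenceTransformer
from sklearn.linear_model import LogisticRegression

encoder = SentenceTransformer("all-MiniLM-L6-v2")   # illustrative text encoder

texts = ["pump vibration above threshold", "routine inspection passed"]  # hypothetical records
numeric = np.array([[0.92, 3.0], [0.10, 0.0]])      # hypothetical numerical features
y = np.array([4, 0])                                # risk ranks on the 0-5 scale

# Concatenate text embeddings with numerical features into one matrix.
X = np.hstack([encoder.encode(texts), numeric])
# class_weight="balanced" counteracts the skewed label distribution.
clf = LogisticRegression(class_weight="balanced", max_iter=1000).fit(X, y)
print(clf.predict(X))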
0 votes, 0 answers, 18 views
Huggingface Trainer CUDA Out Of Memory for 500M Model
I'm training MobiLlama for classification. This model has just 500 million parameters, yet when I fine-tune it on downstream tasks the Trainer keeps giving me a CUDA out-of-memory error.
I faced ...
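Even a 500M model can exhaust a small GPU once gradients, optimizer state, and activations are counted. The usual remedies are a tiny per-device batch with gradient accumulation, mixed precision, and gradient checkpointing; a sketch of TrainingArguments along those lines (values are illustrative):

from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="out",
    per_device_train_batch_size=1,    # smallest per-step batch...
    gradient_accumulation_steps=16,   # ...same effective batch size of 16
    gradient_checkpointing=True,      # recompute activations in backward
    bf16=True,                        # halve activation/gradient memory
    optim="adafactor",                # lighter optimizer state than AdamW
)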
0 votes, 0 answers, 11 views
Defining an agent in LlamaIndex with Mistral 7B throws an AttributeError
I am using LlamaIndex and a locally downloaded Mistral model (mistral-7b-instruct-v0.2.Q4_K_M.gguf). I have created the Python binding for this model using "llama-cpp". On defining the agent ...
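Agent errors like this often come from handing the raw llama-cpp-python object to LlamaIndex, which expects its own LLM wrapper. A hedged sketch of wiring the GGUF file through LlamaIndex's LlamaCPP class; the import path matches recent llama-index releases (the llama-index-llms-llama-cpp package), so verify against your installed version:

from llama_index.llms.llama_cpp import LlamaCPP

llm = LlamaCPP(
    model_path="mistral-7b-instruct-v0.2.Q4_K_M.gguf",
    temperature=0.1,
    context_window=4096,
    model_kwargs={"n_gpu_layers": -1},  # offload all layers when a GPU is present
)
print(llm.complete("Hello").text)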
0 votes, 0 answers, 13 views
'LlamaForCausalLM' object has no attribute 'max_seq_length'
I'm fine-tuning Llama 3 using Unsloth. I trained my model and saved it successfully, but when I tried loading it using AutoPeftModelForCausalLM.from_pretrained and then used TextStreamer from transformers ...
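max_seq_length is an attribute Unsloth attaches to its own wrapper, so a model reloaded through plain PEFT won't carry it. A sketch of reloading through Unsloth instead; the path and length are illustrative, and the calls should be checked against your Unsloth version:

from unsloth import FastLanguageModel

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="path/to/saved_adapter",  # illustrative path to the saved checkpoint
    max_seq_length=2048,
    load_in_4bit=True,
)
FastLanguageModel.for_inference(model)   # switch to Unsloth's fast generation mode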
-1 votes, 0 answers, 11 views
Measuring relevance of the knowledge base to user questions
I have a document that explains the finance policies and processes of some company. The goal is to build a chatbot using the RAG framework on that document to serve employees who have queries related to ...
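A simple proxy for "does the knowledge base cover this question" is cosine similarity between the question embedding and the nearest document chunks; a low best score flags an out-of-scope question. A minimal sketch with sentence-transformers, where the chunks and the 0.5 threshold are illustrative:

from sentence_transformers import SentenceTransformer, util

encoder = SentenceTransformer("all-MiniLM-L6-v2")
chunks = ["Expense reports are due within 30 days.",
          "Travel must be pre-approved by a manager."]   # hypothetical policy chunks
question = "How do I get my travel approved?"

scores = util.cos_sim(encoder.encode(question), encoder.encode(chunks))[0]
best = float(scores.max())
print("covered by the knowledge base" if best > 0.5 else "likely out of scope", best)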
0 votes, 1 answer, 24 views
Error when tracing LLM calls with LangSmith (Failed to get info from https://eu.smith.langchain.com; Failed to batch ingest runs: LangSmithError)
I have an issue with my LangSmith setup. I tried to search the web for this issue but could not find a solution. I followed these steps:
Created a new fresh environment:
conda create --name ...
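The "Failed to get info" message usually points at endpoint or API-key configuration, which LangSmith reads from environment variables. A sketch of the variables involved, assuming the project lives in the EU data region; the values are placeholders and the exact EU endpoint should be verified in the LangSmith docs:

import os

os.environ["LANGCHAIN_TRACING_V2"] = "true"
os.environ["LANGCHAIN_ENDPOINT"] = "https://eu.api.smith.langchain.com"  # EU data region
os.environ["LANGCHAIN_API_KEY"] = "lsv2_..."  # placeholder; use a key created in the EU org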
0 votes, 0 answers, 39 views
Convert a safetensors model (LLaVA) into GGUF format
I want to run LLaVA inference in Ollama, so I need to convert the model to the GGUF file format.
My model is in the safetensors format (trained with LoRA).
It seems that Ollama supports only Llama, but not ...
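llama.cpp ships conversion scripts that read safetensors checkpoints and emit GGUF; for LLaVA the vision tower additionally needs the separate surgery steps in llama.cpp's llava example. A hedged outline of the common path for the language side, with the LoRA merged into the base weights first; the script name matches recent llama.cpp checkouts, so verify locally:

import subprocess

subprocess.run(
    [
        "python", "llama.cpp/convert_hf_to_gguf.py",
        "path/to/merged-model",       # directory with the merged safetensors
        "--outfile", "model.gguf",
        "--outtype", "q8_0",          # illustrative quantization
    ],
    check=True,
)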
-3 votes, 0 answers, 26 views
Integrating web scraping and LLMs [closed]
I wanted to extract some information about a specific drug (let's say Rolvedon) from this site.
I tried using BeautifulSoup and Scrapy, but they seem to be very format-dependent. I want the code to be ...
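A format-agnostic pattern is to strip the page down to plain text and let an LLM pull out the fields, instead of writing per-site selectors. A minimal sketch using requests and BeautifulSoup for fetching; extract_drug_info and the llm_call hook are placeholders for whatever model you use:

import requests
from bs4 import BeautifulSoup

def extract_drug_info(url: str, drug: str, llm_call) -> str:
    # llm_call is a placeholder: any str -> str function backed by an LLM.
    html = requests.get(url, timeout=30).text
    text = BeautifulSoup(html, "html.parser").get_text(" ", strip=True)
    prompt = (f"From the page text below, extract dosage, indications, and "
              f"side effects for {drug}. Say 'not found' if absent.\n\n{text[:8000]}")
    return llm_call(prompt)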
-1 votes, 0 answers, 17 views
Implementing Few-Shot Learning without Prompts for Llama2
I am working with the Llama 2 model. I have successfully set up and fine-tuned the model, and I have also used few-shot prompting with and without LangChain. However, now I am looking for a method ...
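"Few-shot without prompts" usually means adapting weights on a handful of examples rather than packing them into the context, for instance LoRA fine-tuning via the peft library. A hedged sketch of the adapter setup; the hyperparameters and target modules are illustrative:

from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

base = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf")
config = LoraConfig(r=8, lora_alpha=16,
                    target_modules=["q_proj", "v_proj"], task_type="CAUSAL_LM")
model = get_peft_model(base, config)
model.print_trainable_parameters()  # only a tiny fraction of the 7B weights train
# Fine-tune on the handful of labeled examples with a Trainer or custom loop.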
0 votes, 0 answers, 13 views
How does the transformer model's attention mechanism deal with differing sequence lengths?
I am going through the architecture of the transformer and its attention mechanism. The thing I don't get about this mechanism is how it handles sequences of different lengths. For example:
How does ...
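The short answer is padding plus an attention mask: shorter sequences are padded to a common length, and padded positions get a large negative score before the softmax so they receive zero attention weight. A small numerical sketch in PyTorch:

import torch
import torch.nn.functional as F

# Two sequences of lengths 3 and 2, padded to a common length of 3.
scores = torch.randn(2, 3, 3)                  # (batch, query, key) attention scores
mask = torch.tensor([[1, 1, 1], [1, 1, 0]])    # 1 = real token, 0 = padding
scores = scores.masked_fill(mask[:, None, :] == 0, float("-inf"))
weights = F.softmax(scores, dim=-1)
print(weights[1, 0])  # the padded key position receives exactly zero weight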