Newest 'fine-tuning' Questions

0 votes

0 answers

13 views

'LlamaForCausalLM' object has no attribute 'max_seq_length'

I'm fine-tuning Llama 3 using Unsloth. I trained my model and saved it successfully, but when I tried loading it using AutoPeftModelForCausalLM.from_pretrained, then I used TextStreamer from transformer ...

Sarra Ben Messaoud

1

asked yesterday

0 votes

0 answers

10 views

Knowing the format of the dataset a pretrained model was trained on

I am working on a multilingual TTS project, developing a TTS for my regional language using a pretrained model from the Hugging Face Hub. The model I am trying to fine-tune is facebook-mms-tts ...

Injila

1

asked 2 days ago

0 votes

0 answers

19 views

Difference between batch size and train batch size and validation batch size

The fine-tuning job creation required us to specify values for these three kinds of batch sizes in Azure ML Studio, but I do not understand why we are specifying the batch size initially if we ...

S R

11

asked Jul 17 at 7:49

0 votes

0 answers

20 views

Vertex AI Studio: Fine-tuned chat-bison@002 returns results that are not in the training data

I have a training dataset of about 1500 samples in one JSONL file. I tried to fine-tune the chat-bison@002 model, but none of the answers to the test prompt is the desired one. Even when I try to copy a short ...

nogias

583

asked Jul 15 at 13:47

0 votes

1 answer

74 views

RuntimeError: Placeholder storage has not been allocated on MPS device while fine-tuning model on MacBook Pro M2

I'm trying to create a proof of concept (PoC) for a local code assistant by fine-tuning the tiny_starcoder_py-vi06 model on my MacBook Pro with an M2 chip. My dataset looks like this: [ { "...

Varuzhan

43

asked Jul 13 at 9:06

-2 votes

0 answers

18 views

Is there a way to fix the "NameError: name 'GPTQConfig' is not defined" error when fine-tuning a model? [closed]

Whenever I try to run my code, it gives me the error NameError: name 'GPTQConfig' is not defined. I have GPTQConfig defined and I have installed the necessary dependencies. I am trying to ...

Janavi Bhalala

1

asked Jul 12 at 18:34

0 votes

1 answer

42 views

Exception: Cannot load model parameters from checkpoint /home/krish/content/1.2B_last_checkpoint.pt; please ensure that the architectures match

I am fine-tuning the M2M model, with the 1.2B model as the last checkpoint. But while training the model I get this error saying that it cannot load the parameters and that the model architectures should match ...

KRISH MANTRI

1

asked Jul 7 at 13:00

-1 votes

1 answer

75 views

How to prepare data for batch-inference in Azure ML?

The data format (.csv) that I am using for inference produces the error "each data point should be a conversation array" when running the batch scoring job. All the documentation ...

S R

11

asked Jul 5 at 4:55

0 votes

0 answers

75 views

bitsandbytes package support for CUDA 12.4

When running a PEFT fine-tuning program, executing model = get_peft_model(model, peft_config) reports the error: Could not find the bitsandbytes CUDA binary at WindowsPath('D:/Users/1/...

paul qin

1

asked Jul 1 at 8:19

1 vote

0 answers

78 views

Fine-tune Llama 3 on a dataset of message replies (Slack-like)

I want to fine-tune Llama 3 on a dataset structured as a list of messages, subject to the following rules: there are channels, and in each channel there are messages from all sorts of users. ...

Ben

423

asked Jun 29 at 20:35

0 votes

0 answers

13 views

Fine-tuning a model vs. training from scratch

I used two approaches to train a YOLOv8X detection model. In the first approach, I split the dataset into three parts, trained the model from scratch on the first part, then fine-tuned it on the ...

Krilaria

21

asked Jun 27 at 15:46

-1 votes

0 answers

15 views

Model Training Does Not Update .bin File Size Despite Training

I'm experiencing an issue with my Transformer-based model training where the .bin file size does not change after training, despite the loss decreasing over epochs. I suspect the model weights are not ...

Baskan Aqua

1

asked Jun 26 at 10:49

0 votes

0 answers

13 views

Layer "sequential_29" expects 1 input(s), but it received 3 input tensors

I am trying to use GridSearchCV on a trained model. But the following error occurs: Layer "sequential_29" expects 1 input(s), but it received 3 input tensors. Inputs received: [<tf....

Adriana

1

asked Jun 25 at 16:14

0 votes

1 answer

42 views

Different results for the same epoch using different number of total epochs

I am training a machine learning model for an STS task using the Sentence Transformers library. When I was testing it, I noticed that my model generated different results for the same number of epochs ...

Hígor Hahn

1

asked Jun 24 at 22:35

0 votes

0 answers

26 views

Pretrained Model Weights Not Updating During DPO Training

I'm trying to apply DPO to a pre-trained model. However, during the training process, the scores given by the pre-trained model and the fine-tuned model are identical, and the loss remains the same ...

jeash

1

asked Jun 24 at 19:48
