Questions tagged [fine-tuning]
The fine-tuning tag has no usage guidance.
fine-tuning
254
questions
0
votes
0
answers
13
views
'LlamaForCausalLM' object has no attribute 'max_seq_length'
I'm fine-tuning Llama 3 using Unsloth. I trained my model and saved it successfully, but when I tried loading it using AutoPeftModelForCausalLM.from_pretrained and then used TextStreamer from transformers ...
0
votes
0
answers
10
views
Knowing the format of dataset a pretrained model was trained on
I am working on a multilingual TTS project, developing a TTS system for my regional language using a pretrained model from the Hugging Face Hub. The model I am trying to fine-tune is facebook-mms-tts ...
0
votes
0
answers
19
views
Difference between batch size, train batch size, and validation batch size
Creating a fine-tuning job in Azure ML Studio requires us to specify values for these three kinds of batch sizes, but I do not understand why we are specifying the batch size initially if we ...
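For context on what each knob typically does (general fine-tuning behaviour, not Azure-specific documentation): the train batch size determines how many examples contribute to each gradient step, while the validation batch size only affects memory use and evaluation speed, never the learned weights. A minimal sketch of how each value changes the number of steps per epoch:

```python
import math

def steps_per_epoch(num_examples: int, batch_size: int) -> int:
    """Number of passes over batches needed to cover the dataset once."""
    return math.ceil(num_examples / batch_size)

# Hypothetical dataset sizes for illustration.
train_steps = steps_per_epoch(10_000, batch_size=32)  # gradient updates per epoch
val_steps = steps_per_epoch(2_000, batch_size=64)     # forward passes only, no updates

print(train_steps, val_steps)  # 313 32
```

Because validation batches never produce gradient updates, the validation batch size is usually set as large as memory allows, independently of the train batch size.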
0
votes
0
answers
20
views
Vertex AI Studio: Fine-tuned chat-bison@002 returns results that are not in the training data
I have a training dataset of about 1500 samples in one JSONL file. I tried to fine-tune the chat-bison@002 model, but none of the answers to the test prompts is as desired. Even when I try to copy a short ...
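One frequent cause of off-distribution answers is a tuning file that doesn't match the expected schema, so malformed rows are silently skipped or mislearned. As a sanity check — the field names below (context, messages, author, content) follow the shape Vertex AI's supervised-tuning docs describe for chat models, but verify against the current documentation — here is a small stdlib validator sketch:

```python
import json
import os
import tempfile

def validate_chat_jsonl(path):
    """Count rows whose messages alternate user/assistant vs rows that don't."""
    ok = bad = 0
    with open(path, encoding="utf-8") as f:
        for line in f:
            row = json.loads(line)
            authors = [m.get("author") for m in row.get("messages", [])]
            alternates = all(
                a == ("user" if i % 2 == 0 else "assistant")
                for i, a in enumerate(authors)
            )
            if authors and alternates:
                ok += 1
            else:
                bad += 1
    return ok, bad

# Hypothetical one-row tuning file for illustration.
sample = {
    "context": "You are a support assistant.",
    "messages": [
        {"author": "user", "content": "Where is my order?"},
        {"author": "assistant", "content": "Let me check that for you."},
    ],
}
with tempfile.NamedTemporaryFile("w", suffix=".jsonl", delete=False) as f:
    f.write(json.dumps(sample) + "\n")
    path = f.name

result = validate_chat_jsonl(path)
os.remove(path)
print(result)  # (1, 0)
```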
0
votes
1
answer
74
views
RuntimeError: Placeholder storage has not been allocated on MPS device while fine-tuning model on MacBook Pro M2
I'm trying to create a proof of concept (PoC) for a local code assistant by fine-tuning the tiny_starcoder_py-vi06 model on my MacBook Pro with an M2 chip. My dataset looks like this:
[
{ "...
-2
votes
0
answers
18
views
Is there a way to fix the NameError: name 'GPTQConfig' is not defined error when fine-tuning a model? [closed]
Whenever I try to run my code, it gives me a NameError: name 'GPTQConfig' is not defined
error. I have GPTQConfig defined and I have installed the necessary dependencies.
I am trying to ...
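For reference, NameError means Python never saw a binding for that name in the running process, regardless of whether the dependency is installed — a missing or failed import is the usual culprit. A minimal stdlib reproduction (GPTQConfig here is just an unbound name; in recent transformers versions it is importable as `from transformers import GPTQConfig`):

```python
# Reproducing the failure mode: using a name that was never imported or defined.
try:
    cfg = GPTQConfig(bits=4)  # raises NameError: name 'GPTQConfig' is not defined
except NameError as err:
    message = str(err)

print(message)
```

If the import itself raises (for example because of a version mismatch) and is swallowed by a try/except, the name is likewise never bound, producing the same NameError later.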
0
votes
1
answer
42
views
Exception: Cannot load model parameters from checkpoint /home/krish/content/1.2B_last_checkpoint.pt; please ensure that the architectures match
I am fine-tuning the M2M model, with the 1.2B model as the last checkpoint. But while training the model I am getting this error saying that it cannot load the parameters and that the model architectures should match
...
-1
votes
1
answer
75
views
How to prepare data for batch-inference in Azure ML?
The data format (.csv) that I am using for inferencing produces the error
"each data point should be a conversation array" when running the batch scoring job. All the documentation ...
0
votes
0
answers
75
views
Issue with the bitsandbytes package's support for CUDA 12.4
When running the PEFT fine-tuning program, executing the following code:
model = get_peft_model(model, peft_config)
reports the error:
Could not find the bitsandbytes CUDA binary at WindowsPath('D:/Users/1/...
1
vote
0
answers
78
views
Fine-tune Llama 3 on a message-reply-style dataset (Slack)
I want to fine-tune Llama 3 on a dataset whose structure is a list of messages, subject to the rules below:
There are channels.
In each channel there are messages from all sorts of users.
...
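One way to turn a Slack-like export into supervised pairs is to treat each reply as the target and the message it answers as the context. A hypothetical converter — the message schema here (id, channel, user, text, reply_to) is invented for illustration; adapt it to the actual export format:

```python
# Hypothetical Slack-like export: messages keyed by id, replies point at parents.
messages = [
    {"id": 1, "channel": "general", "user": "alice", "text": "How do I reset the VPN?", "reply_to": None},
    {"id": 2, "channel": "general", "user": "bob", "text": "Run `vpn reset` from the toolbox.", "reply_to": 1},
    {"id": 3, "channel": "random", "user": "carol", "text": "Lunch anyone?", "reply_to": None},
]

by_id = {m["id"]: m for m in messages}

# Emit (prompt, response) pairs for every message that replies to another.
pairs = [
    (by_id[m["reply_to"]]["text"], m["text"])
    for m in messages
    if m["reply_to"] is not None
]

print(pairs)
```

Messages without replies (like id 3 above) produce no pair; longer threads can be handled by concatenating the ancestor chain into the prompt instead of using only the direct parent.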
0
votes
0
answers
13
views
Fine-tuning a model vs. training from scratch
I used two approaches to train a YOLOv8X detection model. In the first approach, I split the dataset into three parts, trained the model from scratch on the first part, then fine-tuned it on the ...
-1
votes
0
answers
15
views
Model Training Does Not Update .bin File Size Despite Training
I'm experiencing an issue with my Transformer-based model training where the .bin file size does not change after training, despite the loss decreasing over epochs. I suspect the model weights are not ...
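Worth noting: a checkpoint's file size is a function of the number and dtype of its parameters, not of their values, so a fixed-architecture model that trains correctly still produces a .bin of identical size. Comparing content hashes is a more reliable "did anything change?" check; a stdlib illustration:

```python
import hashlib
import struct

def serialize(weights):
    """Pack a list of floats into bytes, the way a checkpoint stores tensors."""
    return struct.pack(f"{len(weights)}f", *weights)

before = serialize([0.1, 0.2, 0.3])
after = serialize([0.09, 0.25, 0.31])  # values moved, parameter count unchanged

print(len(before) == len(after))  # True: same file size either way
print(hashlib.sha256(before).hexdigest() == hashlib.sha256(after).hexdigest())  # False
```

So unchanged size says nothing; an unchanged hash (or identical tensor values when reloaded) would actually indicate the weights are not being saved.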
0
votes
0
answers
13
views
Layer "sequential_29" expects 1 input(s), but it received 3 input tensors
I am trying to use GridSearchCV on a trained model, but the following error occurs:
Layer "sequential_29" expects 1 input(s), but it received 3 input tensors. Inputs received: [<tf....
0
votes
1
answer
42
views
Different results for the same epoch using different number of total epochs
I am training a machine-learning model for an STS task using the Sentence Transformers library.
While testing it, I noticed that my model generated different results for the same number of epochs ...
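This is expected whenever the learning-rate schedule is parameterized by the total number of training steps: epoch 3 of a 3-epoch run and epoch 3 of a 10-epoch run sit at different points of the decay curve, so the weights diverge even with identical data and seeds. A minimal sketch with a linear decay schedule (the schedule itself is illustrative; Sentence Transformers defaults to a warmup-then-linear-decay scheduler):

```python
def linear_decay_lr(base_lr, step, total_steps):
    """Learning rate decays linearly from base_lr down to 0 over total_steps."""
    return base_lr * (1 - step / total_steps)

# Same absolute step (end of epoch 3 at 100 steps/epoch), different run lengths:
lr_short = linear_decay_lr(2e-5, step=300, total_steps=300)   # 3-epoch run: decayed to 0
lr_long = linear_decay_lr(2e-5, step=300, total_steps=1000)   # 10-epoch run: still 70% of base

print(lr_short, lr_long)  # 0.0 1.4e-05
```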
0
votes
0
answers
26
views
Pretrained Model Weights Not Updating During DPO Training
I'm trying to apply DPO to a pre-trained model. However, during the training process, the scores given by the pre-trained model and the fine-tuned model are identical, and the loss remains the same ...
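Identical scores and a flat loss often mean the policy model's parameters are frozen (requires_grad=False), for instance because the policy accidentally shares the reference model's frozen weights. A quick PyTorch check worth running before training (a generic sketch, not specific to any DPO trainer):

```python
import torch.nn as nn

model = nn.Linear(16, 4)     # stand-in for the policy model
model.requires_grad_(False)  # simulate an accidentally frozen model

trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
total = sum(p.numel() for p in model.parameters())
print(f"trainable: {trainable} / {total}")  # trainable: 0 / 68 -> nothing can update
```

If the count of trainable parameters is 0 (or only covers adapter layers you did not intend to exclude), the optimizer has nothing to move, which matches the symptoms described.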