
Questions tagged [fine-tuning]


0 votes
0 answers
13 views

'LlamaForCausalLM' object has no attribute 'max_seq_length'

I'm fine-tuning llama3 using unsloth. I trained my model and saved it successfully, but when I tried loading it using AutoPeftModelForCausalLM.from_pretrained, then used TextStreamer from transformer ...
Sarra Ben Messaoud
0 votes
0 answers
10 views

Knowing the format of the dataset a pretrained model was trained on

I am working on a multilingual TTS project and developing a TTS for my regional language using a pretrained model from the Hugging Face Hub. The model I am trying to fine-tune is facebook-mms-tts ...
Injila
  • 1
0 votes
0 answers
19 views

Difference between batch size and train batch size and validation batch size

The fine-tuning job creation required us to specify values for these three kinds of batch sizes in Azure ML Studio, but I do not understand why we specify the batch size initially if we ...
S R
  • 11
0 votes
0 answers
20 views

Vertex AI Studio: Fine-tuned chat-bison@002 returns results that are not in the training data

I have a training dataset of about 1500 samples in one JSONL file. I tried to fine-tune the chat-bison@002 model, but none of the answers to the test prompts are the desired ones. Even when I try to copy a short ...
nogias
  • 583
0 votes
1 answer
74 views

RuntimeError: Placeholder storage has not been allocated on MPS device while fine-tuning model on MacBook Pro M2

I'm trying to create a proof of concept (PoC) for a local code assistant by fine-tuning the tiny_starcoder_py-vi06 model on my MacBook Pro with an M2 chip. My dataset looks like this: [ { "...
Varuzhan
-2 votes
0 answers
18 views

Is there a way to fix the "NameError: name 'GPTQConfig' is not defined" error when fine-tuning a model? [closed]

Whenever I try to run my code, it gives me the error NameError: name 'GPTQConfig' is not defined. I have GPTQConfig defined and I have installed the necessary dependencies. I am trying to ...
Janavi Bhalala
0 votes
1 answer
42 views

Exception: Cannot load model parameters from checkpoint /home/krish/content/1.2B_last_checkpoint.pt; please ensure that the architectures match

I am fine-tuning the M2M model, with the 1.2B model as the last checkpoint. But while training the model I am getting this error saying that it cannot load the parameters and that the model architectures should match ...
KRISH MANTRI
-1 votes
1 answer
75 views

How to prepare data for batch inference in Azure ML?

The data format (.csv) that I am using for inference produces the error "each data point should be a conversation array" when running the batch scoring job. All the documentation ...
S R
  • 11
0 votes
0 answers
75 views

bitsandbytes package support for CUDA 12.4

When running the peft fine-tuning program, executing the following code: model = get_peft_model(model, peft_config) reports the error: Could not find the bitsandbytes CUDA binary at WindowsPath('D:/Users/1/...
paul qin
1 vote
0 answers
78 views

Fine-tune llama3 with a message-reply-style dataset (Slack)

I want to fine-tune llama3 on a dataset whose data structure is a list of messages, subject to the rules below: there are channels. In each channel there are messages from all sorts of users. ...
Ben
  • 423
0 votes
0 answers
13 views

Fine-tuning a model vs training from scratch

I used two approaches to train a YOLOv8X detection model. In the first approach, I split the dataset into three parts, trained the model from scratch on the first part, then fine-tuned it on the ...
Krilaria
-1 votes
0 answers
15 views

Model Training Does Not Change .bin File Size

I'm experiencing an issue with my Transformer-based model training where the .bin file size does not change after training, despite the loss decreasing over epochs. I suspect the model weights are not ...
Baskan Aqua
0 votes
0 answers
13 views

Layer "sequential_29" expects 1 input(s), but it received 3 input tensors

I am trying to use GridSearchCV on a trained model. But the following error occurs: Layer "sequential_29" expects 1 input(s), but it received 3 input tensors. Inputs received: [<tf....
Adriana
0 votes
1 answer
42 views

Different results for the same epoch when using a different total number of epochs

I am training a machine learning model for an STS task using the Sentence Transformers library. When I was testing it, I noticed that my model generated different results for the same number of epochs ...
Hígor Hahn
0 votes
0 answers
26 views

Pretrained Model Weights Not Updating During DPO Training

I'm trying to apply DPO to a pre-trained model. However, during the training process, the scores given by the pre-trained model and the fine-tuned model are identical, and the loss remains the same ...
jeash
  • 1
