
Unable to load HF model #915

Open · lea11100 opened this issue Jul 4, 2024 · 3 comments

lea11100 commented Jul 4, 2024

Hey,

Currently, I receive the following error when I try to load a model from Hugging Face:

AttributeError: 'NoneType' object has no attribute 'cuda'

I ran the following code:

from llmware.prompts import Prompt

prompter = Prompt().load_model("nvidia/Llama3-ChatQA-1.5-8B", temperature=0.0, sample=False, from_hf=True)

doberst (Contributor) commented Jul 4, 2024

@lea11100 - let me look into this and I will get back to you. Sorry you ran into this issue. :)

doberst (Contributor) commented Jul 4, 2024

@lea11100 - thanks for raising this - it looks like a bug in the code. We recently shifted to importing torch dynamically, only when needed, and in this code path torch is not getting loaded, which creates the error. I can reproduce it on my end. Will prioritize fixing this - it should be merged into the main branch by tomorrow if you pull, or it will be in the next PyPI release. Will update you once done.
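For context, the failure mode looks roughly like the sketch below. The names (GLOBAL_TORCH, _lazy_load_torch) are hypothetical illustrations of the lazy-import pattern, not llmware internals:

# module-level handle - torch is deliberately not imported at load time
GLOBAL_TORCH = None

def _lazy_load_torch():
    # import torch only when a code path actually needs it
    global GLOBAL_TORCH
    if GLOBAL_TORCH is None:
        import torch
        GLOBAL_TORCH = torch
    return GLOBAL_TORCH

def buggy_path(model):
    # BUG: uses the module-level handle without triggering the lazy import,
    # so GLOBAL_TORCH is still None and this raises
    # AttributeError: 'NoneType' object has no attribute 'cuda'
    if GLOBAL_TORCH.cuda.is_available():
        model = model.cuda()
    return model

def fixed_path(model):
    # FIX: resolve the lazy import before touching any torch attribute
    torch = _lazy_load_torch()
    if torch.cuda.is_available():
        model = model.cuda()
    return model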

doberst (Contributor) commented Jul 6, 2024

@lea11100 - the fix has been merged into the main branch, if you are cloning/pulling from the repo directly. If you prefer pip install, it will be in llmware==0.3.3 (which should be available by tomorrow). A couple of quick tips:

# Option #1 - load the model from_hf directly into Prompt

from llmware.prompts import Prompt

prompter = Prompt().load_model("nvidia/Llama3-ChatQA-1.5-8B", temperature=0.0, sample=False, from_hf=True)

# set the prompt wrapper to 'llama_3_chat' after loading the model
prompter.llm_model.prompt_wrapper = "llama_3_chat"

# Option #2 - alternate (and recommended) approach

from llmware.models import ModelCatalog
from llmware.prompts import Prompt

# register the hf model in the ModelCatalog
ModelCatalog().register_new_hf_generative_model("nvidia/Llama3-ChatQA-1.5-8B",
                                                llmware_lookup_name="my_nvidia_llama3",
                                                context_window=8192, prompt_wrapper="llama_3_chat")

# then load the model using only your short name
prompter = Prompt().load_model("my_nvidia_llama3")
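Once loaded either way, a quick sanity check - a hedged sketch using prompt_main, llmware's standard inference call (the exact keys in the response dict may vary by version):

# run a short inference to confirm the model loads and responds
response = prompter.prompt_main("What is the capital of France?")
print(response["llm_response"])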

Hope this solves the issue - please ping back if you run into any further problems! :)
