Model File Manager #789

Open: patrick-hovsepian wants to merge 11 commits into master

Conversation

@patrick-hovsepian (Contributor) commented:

Create a simple manager to make administration of local models simpler:

  • Configure a list of directories that will be scanned to find gguf files that are ready to be loaded
  • A convenience method for loading a model with default params (see the sketch after this list)
  • Tests
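
A rough sketch of the intended usage (ModelCache, ScanDirectories, and LoadModelAsync are illustrative placeholder names, not the final API):

```csharp
// Illustrative sketch only -- ModelCache, ScanDirectories and LoadModelAsync
// are placeholder names for the proposed API, not its final shape.
using System;
using LLama;

// Configure the directories that will be scanned for *.gguf files.
var cache = new ModelCache(new[] { "/models", "/mnt/shared/models" });

// List the gguf files that were discovered and are ready to be loaded.
foreach (var file in cache.ScanDirectories())
    Console.WriteLine(file);

// Load one of them by file name, with default params applied.
using LLamaWeights weights = await cache.LoadModelAsync("llama-2-7b.Q4_K_M.gguf");
```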
@martindevans (Member) left a comment:


I've left a few comments; the main issue is about being completely clear on ownership and resource disposal. Models are expensive to load and keep loaded, so it's important that resource management is completely clear!
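
For example, a reference-counting pattern along these lines would make the ownership rules explicit (a standalone sketch of the idea, not the PR's actual code):

```csharp
using System;
using System.Threading;

// Sketch: shared ownership of an expensive resource (e.g. model weights).
// The resource is freed only when the last holder calls Dispose.
public sealed class SharedWeights : IDisposable
{
    private int _refCount = 1;

    // Take an additional reference; throws if the resource is already gone.
    public SharedWeights Share()
    {
        if (Interlocked.Increment(ref _refCount) <= 1)
        {
            Interlocked.Decrement(ref _refCount);
            throw new ObjectDisposedException(nameof(SharedWeights));
        }
        return this;
    }

    public void Dispose()
    {
        // A production version should also guard against double-dispose by
        // the same holder (compare the "handle already disposed" commit).
        if (Interlocked.Decrement(ref _refCount) == 0)
        {
            // Free the native model weights here.
        }
    }
}
```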

patrick-hovsepian and others added 4 commits June 11, 2024 19:25
* seed manageR
* model manager init
* interface
* test
* tests
* no default configurator
* Rename class
* handle already disposed
* organization
* disposable and ref counter
@AsakusaRinne (Collaborator) left a comment:


Thank you for your contribution! I left some comments; please let me know if any help is needed.

(Resolved review threads: LLama/Model/ModelCache.cs ×2, LLama/Model/IModelCache.cs, LLama/LLamaWeights.cs ×3)
@AsakusaRinne added the labels enhancement (New feature or request) and benchmark (Trigger benchmark workflow) on Jun 13, 2024.
* separate concerns a bit more
* check
* tweak
(Resolved review thread: LLama/LLamaWeights.cs)
* stash
* note
@patrick-hovsepian (Author) commented:


Essentially for demo purposes; I wanted to see how abstract the interface is.

@martindevans (Member) replied:


If this is just a test, do we want to keep it in this PR? I think @AsakusaRinne was working on HF integrations for model loading, so you might want to check the status of that work and add something in a separate PR?

@patrick-hovsepian (Author) commented on Jun 17, 2024:


Curious what the thoughts are on that.

@AsakusaRinne (Collaborator) replied:


Sorry for the late reply. I agree that it could be removed from this PR and added in a separate PR to the LLama.Experimental project. With HuggingfaceHub, it's easy to download a model from Huggingface, so it would be easy to implement a remote model manager for gguf files; however, the APIs might change in the future if we want to support other formats (.safetensors, .bin) based on GGMLSharp. So I would recommend putting it in LLama.Experimental first.

Comment on lines +34 to +44
/// <summary>
/// Unload and dispose of a model with the given id
/// </summary>
/// <param name="modelId">The id of the model to unload</param>
/// <returns>true if a model with that id was found and unloaded; otherwise false</returns>
public bool UnloadModel(string modelId);

/// <summary>
/// Unload all currently loaded models
/// </summary>
public void UnloadAllModels();
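
(For reference, a minimal hypothetical call site for the two methods above:)

```csharp
using System;

// Hypothetical call site for the interface methods under discussion.
static void Cleanup(IModelCache cache)
{
    // Unload a single model by its id; the bool reports whether it was found.
    if (!cache.UnloadModel("llama-2-7b"))
        Console.WriteLine("no model with that id was loaded");

    // Unload everything, e.g. on application shutdown.
    cache.UnloadAllModels();
}
```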
@patrick-hovsepian (Author) commented:


There are a few challenges with supporting this, and I'm thinking maybe it's better to just drop these methods and have the caller explicitly call Dispose on the weights?

A few questions:

  1. Should multiple instances of the same model be allowed to be loaded? I think yes. If so, we'll need a way to ensure calling unload is specific to a model.
  2. Should we force model aliases in the cache to be unique? This would help with the case where I load multiple instances of a model: calling dispose would then be guaranteed to operate on that same model. If we don't enforce this restriction, unloading the correct model becomes trickier and might require the original model to be passed in, as opposed to the alias.
  3. Are we better off getting rid of this class altogether? My main goal was to have something like IModelRepo, and I quickly went out of scope with some of this. If we think it'll be useful I'm happy to leave it, but I'm struggling a bit to justify it, especially given the unload/dispose challenges.
@martindevans (Member) replied:


> Should multiple instances of the same model be allowed to be loaded? I think yes. If so, we'll need a way to ensure calling unload is specific to a model.

Definitely. You can load a model with different settings (e.g. loras) which affect the output almost as if it's another model.

> Should we force model aliases in the cache to be unique?

Do you mean the modelId string? If so, then I would think loading a model with an ID which is already in the cache would either:

  • throw an error
  • return the existing model

I'd lean towards throwing the error out of those choices.
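
Something along these lines, say (a sketch with illustrative names, assuming a dictionary-backed cache):

```csharp
using System;
using System.Collections.Concurrent;
using LLama;

// Sketch with illustrative names, not the PR's exact code.
public class ModelCache
{
    // Maps modelId -> loaded weights.
    private readonly ConcurrentDictionary<string, LLamaWeights> _loadedModels = new();

    public void Register(string modelId, LLamaWeights weights)
    {
        // Reject duplicate ids up front so each id maps to exactly one model.
        if (!_loadedModels.TryAdd(modelId, weights))
            throw new ArgumentException($"A model with id '{modelId}' is already loaded.", nameof(modelId));
    }
}
```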

> Are we better off getting rid of this class altogether?

The model repo idea is interesting, and I think it's something Rinne has looked at as well. But yeah, maybe it would be better to remove it from this PR and make this PR all about the new shared lifetimes (which I think is a pretty big improvement in itself). The model repo can be done in a follow-up PR, fully taking advantage of the new lifetimes.

@patrick-hovsepian (Author) replied:


If anything, I'd rather keep the model repo and the file system class than this, but I've reworked this further to be more explicit. I haven't been able to finalize what a "good" API for something like this looks like (name aside), but the discussion here has helped add context.

(Resolved review thread: LLama/LLamaWeights.cs)
@patrick-hovsepian (Author) commented:


Any thoughts on this, @AsakusaRinne?

@AsakusaRinne (Collaborator) left a comment:


Overall the code looks good. Since this PR adds many new APIs, could you please add a document introducing the usage of the model cache and model repo? Please refer to #747 to see how to add a doc. :)

Labels: benchmark (Trigger benchmark workflow), enhancement (New feature or request)
3 participants