feat: add support for LM Studio to be used instead of openAI #1280

Open
Rho-9-Official opened this issue Jun 13, 2024 · 5 comments

Comments

@Rho-9-Official

Is there an existing feature or issue for this?

  • I have searched the existing issues

Expected feature

LM Studio is a self-hosted API server for running LLMs, and it is built to act as a drop-in replacement for the OpenAI API.
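
A minimal sketch of what that drop-in behaviour looks like in practice (the port, API key, and model name below are placeholders for whatever LM Studio happens to be serving, not reNgine settings):

# Minimal sketch: the official openai Python client pointed at a local LM Studio server.
# base_url, api_key, and model are placeholders, not anything reNgine defines.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:1234/v1", api_key="lm-studio")

response = client.chat.completions.create(
    model="local-model",  # whichever model LM Studio currently has loaded
    messages=[{"role": "user", "content": "Summarize this finding in one sentence."}],
)
print(response.choices[0].message.content)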

Alternative solutions

I would suggest other solutions, but LM Studio is the only one that comes to mind that fits this use case well and is completely free.

Anything else?

No response


👋 Hi @Rho-9-Official,
Issues are only for reporting bugs/feature requests. Please read the documentation before raising an issue: https://rengine.wiki
For very limited support, questions, and discussions, please join reNgine Discord channel: https://discord.gg/azv6fzhNCE
Please include all the requested and relevant information when opening a bug report. Improper reports will be closed without any response.

@yogeshojha
Owner

Hi @Rho-9-Official

reNgine 2.1.0 has just been released with Ollama support. You can run LLMs locally now.
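
A rough sketch of what that looks like from Python, since Ollama exposes an OpenAI-compatible endpoint; the port below is Ollama's default and the model name is just an example, not a reNgine setting:

# Rough sketch: talking to a locally installed model through Ollama's
# OpenAI-compatible endpoint. Model name and port are examples only.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:11434/v1", api_key="ollama")

response = client.chat.completions.create(
    model="llama2",  # any model previously pulled with `ollama pull llama2`
    messages=[{"role": "user", "content": "Hello from a local model."}],
)
print(response.choices[0].message.content)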

@Rho-9-Official
Author

Rho-9-Official commented Jul 4, 2024

OK, words aren't my strength, and I know you have better things to do than decipher my gibberish 😂, so I used llama2 to help me phrase this.

I was just thinking that supporting drop-in APIs would be a major strength. For example, if I wanted to run reNgine on a small, lightweight laptop, I'd be waiting a month for an AI-generated report! Likewise, sometimes we can't afford to use the OpenAI API (I was under the impression it costs money), which is why I also run LM Studio (amazing software). It allows more flexibility in models, including the option to use LLaMA2 uncensored models, which won't refuse when asked to generate vulnerability reports. It also lets me run those models on a much more powerful machine and access them remotely through OpenAI's Python module by changing the base URL.

Here's an example of how you can use the API:

# Chat with an intelligent assistant in your terminal
from openai import OpenAI

# Point to the local server
client = OpenAI(base_url="http://localhost:8081/v1", api_key="lm-studio")

history = [
    {"role": "system", "content": "You are an intelligent assistant. You always provide well-reasoned answers that are both correct and helpful."},
    {"role": "user", "content": "Hello, introduce yourself to someone opening this program for the first time. Be concise."},
]

while True:
    completion = client.chat.completions.create(
        model="customized-models/zephyr-7B-beta-GGUF",
        messages=history,
        temperature=0.7,
        stream=True,
    )

    new_message = {"role": "assistant", "content": ""}
    
    for chunk in completion:
        if chunk.choices[0].delta.content:
            print(chunk.choices[0].delta.content, end="", flush=True)
            new_message["content"] += chunk.choices[0].delta.content

    history.append(new_message)
    
    # Uncomment to see chat history
    # import json
    # gray_color = "\033[90m"
    # reset_color = "\033[0m"
    # print(f"{gray_color}\n{'-'*20} History dump {'-'*20}\n")
    # print(json.dumps(history, indent=2))
    # print(f"\n{'-'*55}\n{reset_color}")

    print()
    history.append({"role": "user", "content": input("> ")})

This setup makes it super flexible and powerful for various use cases!

You'd obviously need to add the ability to switch between the OpenAI API and a drop-in API. I'd make the change myself, since I'd always use my drop-in endpoint, but I haven't found the relevant files yet.
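
As a rough illustration of the kind of switch I mean (the environment variable names and defaults here are invented for the example and are not actual reNgine settings):

# Hypothetical sketch of an OpenAI / drop-in toggle driven by environment variables.
# LLM_BASE_URL and LLM_API_KEY are invented names for illustration only.
import os
from openai import OpenAI

def build_llm_client() -> OpenAI:
    base_url = os.environ.get("LLM_BASE_URL")  # e.g. http://lm-studio-host:8081/v1
    api_key = os.environ.get("LLM_API_KEY", "lm-studio")
    if base_url:
        # Drop-in server such as LM Studio or Ollama
        return OpenAI(base_url=base_url, api_key=api_key)
    # Fall back to the hosted OpenAI API
    return OpenAI(api_key=os.environ["OPENAI_API_KEY"])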

@yogeshojha
Owner

I agree with you; this can be looked into. There is another feature request for Groq, so I think I can integrate both together.

@yogeshojha yogeshojha reopened this Jul 5, 2024
@yogeshojha yogeshojha self-assigned this Jul 5, 2024
@Rho-9-Official
Author

Rho-9-Official commented Jul 7, 2024

I'd be more than happy to do some testing to figure out which models work best, though I'd hedge my bets on the llama2 7B uncensored model from TheBloke being the winner.

I should also probably test whether the smallest models can even understand the data, and I want to check out some of the summarizer models as well, just to cover the obvious cases.
