
small chat feature #11

Open
not-nullptr opened this issue Aug 12, 2024 · 6 comments
Labels
enhancement New feature or request

Comments

@not-nullptr

would be super cute honestly!! it already has a tuned-in counter and a websocket server running. would be happy to implement this if i knew it was gonna be merged :3

@kennethnym
Owner

that sounds like a really cool idea!! my only worry is that it's gonna be ruined by spam and the likes :(

@not-nullptr
Author

hmm.. well it's already running inference for MusicLM, which is notoriously hard to run. does the server have enough VRAM left for a small LLM for sentiment analysis, maybe gemma2:2b, which is insanely small for the quality? https://ollama.com/library/gemma2:2b
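A moderation check like the one floated here could be sketched roughly as below. The request shape follows ollama's `/api/generate` REST endpoint; the prompt wording, the `isSpamVerdict` helper, and the usage snippet are illustrative assumptions, not a tested moderation policy.

```typescript
// Sketch: classify a chat message with a small local LLM served by ollama.
// The payload shape matches ollama's POST /api/generate API; everything else
// (prompt text, verdict parsing) is a hypothetical starting point.

interface OllamaRequest {
  model: string;
  prompt: string;
  stream: boolean;
}

function buildModerationRequest(message: string): OllamaRequest {
  return {
    model: "gemma2:2b",
    prompt:
      "Reply with exactly one word, SPAM or OK.\n" +
      `Is the following chat message spam or abuse?\n\n${message}`,
    stream: false, // ask for a single JSON response instead of a token stream
  };
}

// ollama's response JSON carries the generated text in a "response" field.
// Treat anything that doesn't clearly say SPAM as allowed, to avoid
// over-censoring normal conversation.
function isSpamVerdict(responseText: string): boolean {
  return responseText.trim().toUpperCase().startsWith("SPAM");
}

// Usage (assumes ollama is serving on its default port 11434):
// const res = await fetch("http://localhost:11434/api/generate", {
//   method: "POST",
//   body: JSON.stringify(buildModerationRequest(userMessage)),
// });
// const { response } = await res.json();
// if (isSpamVerdict(response)) { /* drop the message */ }
```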

@kennethnym
Owner

i have 6gb of vram left, so it should be able to fit a small LLM? but the gpu is basically always at 100% usage because it's constantly churning out new clips, so i don't know if it can handle another model. i definitely CANNOT afford to spin up another gpu 😭

[screenshot: 2024-08-12 at 23:26:40]

@not-nullptr
Author

i've just looked into it, and running an LLM probably isn't worth it. after some local testing, they're way too overbearing and censor regular conversation. might just be worth having an IP rate limit instead
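The IP rate limit suggested here could look something like this minimal sliding-window limiter; the class and method names are hypothetical, and the limits would need tuning for the actual chat volume.

```typescript
// Sketch of a sliding-window rate limiter keyed by IP address.
// Allows at most `limit` messages per `windowMs` milliseconds per IP.

class IpRateLimiter {
  // For each IP, the timestamps (ms) of its recent messages.
  private hits = new Map<string, number[]>();

  constructor(private limit: number, private windowMs: number) {}

  // Returns true if the message is allowed; `now` is injectable for testing.
  allow(ip: string, now: number = Date.now()): boolean {
    const cutoff = now - this.windowMs;
    // Drop timestamps that have fallen out of the window.
    const recent = (this.hits.get(ip) ?? []).filter((t) => t > cutoff);
    if (recent.length >= this.limit) {
      this.hits.set(ip, recent);
      return false; // over the limit: drop or queue the message
    }
    recent.push(now);
    this.hits.set(ip, recent);
    return true;
  }
}
```

On the websocket server this would sit in the message handler: check `limiter.allow(ip)` before broadcasting, and silently drop (or warn on) anything over the limit.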

@kennethnym
Owner

hmm i see, i'll keep this open for now. i do want to implement this in v2, but the main focus of v2 right now is a fine-tuned model + dynamic prompt generation for more variety. thank u for this cute suggestion though!

feel free to drop a PR if u want, but again the model comes first, i'll review it once the model is trained :)

@kennethnym kennethnym added the enhancement New feature or request label Aug 13, 2024
@aryanranderiya

Hey, you could look into Cloudflare Workers AI. You get like 100k requests a day for free, and maybe it's something that'll work out of the box? Not for fine-tuning tho
