janhq / cortex.tensorrt-llm Public

forked from NVIDIA/TensorRT-LLM

Fork 2
Star 33

Code
Issues 13
Pull requests 5
Actions
Projects
Security
Insights

Issues: janhq/cortex.tensorrt-llm

13 Open 8 Closed

Issues list

bug: frequency_penalty Parameter in model.yml Only Functions Correctly with Value 1, Produces Gibberish for other values
Labels: type: bug
#61 opened Jul 18, 2024 by Van-QA

feat: Revamp the README.md file
Labels: P1: important
#55 opened Jul 11, 2024 by irfanpena

[Request] Support for logits_prob
Labels: P2: nice to have, type: feature request
#54 opened Jul 9, 2024 by hiro-v

feat: use batch-manager instead of gpt-runtime
#51 opened Jul 1, 2024 by vansangpfiev

feat: support llama3
#49 opened Jul 1, 2024 by vansangpfiev

feat: Load multiple models
Labels: P1: important
#33 opened Mar 21, 2024 by tikikun

feat: Unload the model
Labels: P1: important
#32 opened Mar 21, 2024 by tikikun

feat: Stop inferencing
Labels: P1: important
#31 opened Mar 21, 2024 by tikikun

feat: Enable the usage of InferenceRequest and stop_words_list
Labels: P1: important
#30 opened Mar 21, 2024 by tikikun

feat: Enable inflight batching in nitro-tensorrt-llm
Labels: P1: important
#29 opened Mar 21, 2024 by tikikun

Github CI windows for tensorrt_llm engine
#28 opened Mar 20, 2024 by hiro-v

bug: tensorRT - Switching between model is causing error satisfyProfile Runtime dimension does not satisfy any optimization profile
#27 opened Mar 18, 2024 by Van-QA

feat: Ultilize free_gpu_memory_fraction to control max VRAM consumption
Labels: type: feature request
#25 opened Mar 16, 2024 by hiro-v
