Alex
1198892888
Add support for Q5_0, Q5_1 and Q8_0 formats; remove Q4_1_O format (#44)
* Remove Q4_3 support
* Add Q5_0, Q5_1, Q8_0 support
* Add clearer message when loading Q4_3 model
* Remove Q4_1_O format
* Fix indentation in .gitmodules
* Simplify sanitizer matrix
2023-04-29 17:39:11 +05:00
Alex
3587ff9e58
Sync ggml with upstream (#38)
* Sync ggml with upstream
* Remove file filters from Actions triggers
* Update ggml
* Add Q4_2 and Q4_3 support
* Improve output of perplexity measuring script
* Add tests for new formats
* Add token limit argument to perplexity measuring script
* Update README
* Update README
* Update ggml
* Use master branch of ggml
2023-04-22 20:25:29 +05:00
saharNooby
c40941d9d0
Add Q4_1_O format
2023-04-07 09:55:39 +04:00
saharNooby
935d16f5db
Move library wrapper to separate file, refactor code
2023-04-02 12:24:40 +04:00
saharNooby
972e28d48d
Implement INT4 conversion and inference
2023-04-01 19:22:01 +04:00