Commit Graph

9 Commits

Author SHA1 Message Date
Alex a3178b20ea
Various improvements (#52)
* Update ggml

* Add link to pre-quantized models in README

* Enable W4 for MSVC

* Fix warnings, clean up code

* Fix LoRA merge script
2023-05-08 14:28:54 +05:00
Alex 1198892888
Add support for Q5_0, Q5_1 and Q8_0 formats; remove Q4_1_O format (#44)
* Remove Q4_3 support

* Add Q5_0, Q5_1, Q8_0 support

* Add more clear message when loading Q4_3 model

* Remove Q4_1_O format

* Fix indentation in .gitmodules

* Simplify sanitizer matrix
2023-04-29 17:39:11 +05:00
Alex 3587ff9e58
Sync ggml with upstream (#38)
* Sync ggml with upstream

* Remove file filters from Actions triggers

* Update ggml

* Add Q4_2 and Q4_3 support

* Improve output of perplexity measuring script

* Add tests for new formats

* Add token limit argument to perplexity measuring script

* Update README

* Update README

* Update ggml

* Use master branch of ggml
2023-04-22 20:25:29 +05:00
Alex 1be9fda248
Add robust automatic testing (#33) 2023-04-20 11:00:35 +05:00
saharNooby 1ecbad3a65 Remove unused files 2023-04-02 12:53:41 +04:00
saharNooby 935d16f5db Move library wrapper to separate file, refactor code 2023-04-02 12:24:40 +04:00
saharNooby 972e28d48d Implement INT4 conversion and inference 2023-04-01 19:22:01 +04:00
saharNooby a1e1d34c93 Add Python wrapper for C library 2023-04-01 16:02:22 +04:00
saharNooby ac03019fcf Move model to separate C library file 2023-04-01 14:38:50 +04:00