Commit Graph

13 Commits

Author SHA1 Message Date
ed barz 099392db01 ??? 2023-06-12 08:00:48 +02:00
ed barz 446a5ecf8c rm ggml 2023-06-12 08:00:15 +02:00
Alex dea929f8ca
Various improvements & upgrade ggml (#75)
* Use types from typing for better compatibility with older Python versions

* Split last double end of line token as per BlinkDL's suggestion

* Fix MSVC warnings

* Drop Q4_2 support

* Update ggml

* Bump file format version for quantization changes

* Apply suggestions
2023-05-27 16:02:24 +05:00
Alex a3178b20ea
Various improvements (#52)
* Update ggml

* Add link to pre-quantized models in README

* Enable W4 for MSVC

* Fix warnings, clean up code

* Fix LoRA merge script
2023-05-08 14:28:54 +05:00
Alex 5eb8f09c14
Various improvements (#47)
* Update ggml

* Pack only rwkv.dll for Windows releases

Test executables are no longer packed.

* Move test code into a separate file

* Remove redundant zeroing

* Refactor chat script
2023-04-30 20:27:14 +05:00
Alex 06dac0f80d
Use main ggml repo (#45) 2023-04-29 21:35:36 +05:00
Alex 1198892888
Add support for Q5_0, Q5_1 and Q8_0 formats; remove Q4_1_O format (#44)
* Remove Q4_3 support

* Add Q5_0, Q5_1, Q8_0 support

* Add a clearer message when loading Q4_3 model

* Remove Q4_1_O format

* Fix indentation in .gitmodules

* Simplify sanitizer matrix
2023-04-29 17:39:11 +05:00
Alex 3587ff9e58
Sync ggml with upstream (#38)
* Sync ggml with upstream

* Remove file filters from Actions triggers

* Update ggml

* Add Q4_2 and Q4_3 support

* Improve output of perplexity measuring script

* Add tests for new formats

* Add token limit argument to perplexity measuring script

* Update README

* Update README

* Update ggml

* Use master branch of ggml
2023-04-22 20:25:29 +05:00
Alex 1be9fda248
Add robust automatic testing (#33) 2023-04-20 11:00:35 +05:00
saharNooby 7b28076243 Fix Q4_1_O optimization 2023-04-18 16:46:27 +04:00
saharNooby 2ef7ee0fac Optimize Q4_1_O by moving outlier multiplication out of the dequantize+dot loop 2023-04-18 09:47:20 +04:00
saharNooby e29da07731 Fix warnings 2023-04-17 18:57:38 +04:00
saharNooby b2bdeb1d95 Use ggml as a submodule 2023-04-17 17:35:58 +04:00