9 Commits

Author SHA1 Message Date
YorkZero 241350fde6
Feature add cublas support (#65)
* chore: add ggml import in the head of rwkv.h

* chore: add ggml import in the head of rwkv.h

* feat: add cublas support

* feat: update rwkv.cpp

* feat: remove unused change

* chore: fix linux build issue

* chore: sync ggml and offload tensor to gpu

* chore: comment out tensors which occurs error on GPU

* chore: update comment and readme

* chore: update ggml to recent

* chore: add more performance test results

* chore: add more performance test results

* chore: fix problem of reading file more than 2 gb

* chore: merge master

* chore: remove unused comment

* chore: fix for comments

* Update README.md

* Update rwkv.cpp

---------

Co-authored-by: Alex <saharNooby@users.noreply.github.com>
2023-05-29 17:10:19 +05:00
Alex dea929f8ca
Various improvements & upgrade ggml (#75)
* Use types from typing for better compatibility with older Python versions

* Split last double end of line token as per BlinkDL's suggestion

* Fix MSVC warnings

* Drop Q4_2 support

* Update ggml

* Bump file format version for quantization changes

* Apply suggestions
2023-05-27 16:02:24 +05:00
Alex 1198892888
Add support for Q5_0, Q5_1 and Q8_0 formats; remove Q4_1_O format (#44)
* Remove Q4_3 support

* Add Q5_0, Q5_1, Q8_0 support

* Add more clear message when loading Q4_3 model

* Remove Q4_1_O format

* Fix indentation in .gitmodules

* Simplify sanitizer matrix
2023-04-29 17:39:11 +05:00
saharNooby e04baa032c Remove reference impl comparison test 2023-04-08 10:01:29 +04:00
PXLKSR 977efba905 we actually build a dylib on macos 2023-04-04 10:19:06 +02:00
hypnopump 0a0cabc4c7 for consistency 2023-04-03 08:27:00 +02:00
hypnopump a64aaa81ec initial addition 2023-04-03 00:52:26 +02:00
saharNooby e0684e8104 Add text generation and chat scripts 2023-04-02 15:03:31 +04:00
saharNooby 935d16f5db Move library wrapper to separate file, refactor code 2023-04-02 12:24:40 +04:00