rwkv.cpp

e9ccfc44fd flask server added master ed barz 2023-07-18 22:27:09 +0200
70e5f07d5f README update ed barz 2023-06-12 08:07:14 +0200
099392db01 ??? ed barz 2023-06-12 08:00:48 +0200
446a5ecf8c rm ggml ed barz 2023-06-12 08:00:15 +0200
3f267fe1b9 ref fix ed barz 2023-06-12 07:58:11 +0200
d67fbad269 remove ref to ggml repo ed barz 2023-06-12 07:55:38 +0200
b88ae59604 Fix bug in world tokenizer (#93) Mathmagician8191 2023-06-11 18:46:54 +1200
82c4ac78f4 Add support for the world tokenizer (#86) Mathmagician8191 2023-06-08 23:37:18 +1200
09ec3145b3 Fix visual bug in quantization (#92) LoganDark 2023-06-07 04:45:21 -0700
5b41cd7e5d Add capability for extra binaries to be built with rwkv.cpp (#87) LoganDark 2023-06-03 03:44:50 -0700
fb6708b555 Fix pytorch storage warnings, fixes #80 (#88) LoganDark 2023-06-03 03:09:51 -0700
3f8bb2c080 Allow creating multiple contexts per model (#83) LoganDark 2023-06-03 03:06:24 -0700
363dfb1a06 File parsing and memory usage optimization (#74) LoganDark 2023-05-31 04:31:19 -0700
241350fde6 Feature add cublas support (#65) YorkZero 2023-05-29 21:10:19 +0900
dea929f8ca Various improvements & upgrade ggml (#75) Alex 2023-05-27 16:02:24 +0500
3ca9c7f785 Move graph building into its own function (#69) LoganDark 2023-05-26 05:30:07 -0700
b61d94aef0 Flush output every token in generate_completions.py (#73) LoganDark 2023-05-26 05:23:58 -0700
83983bbb84 last second move things over in the error enum (#71) LoganDark 2023-05-26 05:22:32 -0700
d26791b5bc Silence PyTorch warnings by using untyped storage (#72) LoganDark 2023-05-26 05:21:18 -0700
7cbfbc55c8 Switch to fstat64 (#70) LoganDark 2023-05-26 05:20:51 -0700
9e2a0de843 Add rwkv_set_print_errors and rwkv_get_last_error (#68) LoganDark 2023-05-24 04:06:52 -0700
1c363e6d5f Fix encoding issue when loading prompt data (#58) 柏园猫 2023-05-14 00:53:54 +0800
a3178b20ea Various improvements (#52) Alex 2023-05-08 14:28:54 +0500
5eb8f09c14 Various improvements (#47) Alex 2023-04-30 20:27:14 +0500
3621172428 punish repetitions & break if END_OF_TEXT & decouple prompts from chat script (#37) Jarrett Ye 2023-04-30 21:50:05 +0800
06dac0f80d Use main ggml repo (#45) Alex 2023-04-29 21:35:36 +0500
1198892888 Add support for Q5_0, Q5_1 and Q8_0 formats; remove Q4_1_O format (#44) Alex 2023-04-29 17:39:11 +0500
c736ef5411 Improve chat_with_bot.py script (#39) Alex 2023-04-22 20:33:58 +0500
3587ff9e58 Sync ggml with upstream (#38) Alex 2023-04-22 20:25:29 +0500
ac663631e1 Improve the prompt & fix chinese display issue & support commands (#34) Jarrett Ye 2023-04-22 15:48:44 +0800
1be9fda248 Add robust automatic testing (#33) Alex 2023-04-20 11:00:35 +0500
7b28076243 Fix Q4_1_O optimization saharNooby 2023-04-18 16:46:27 +0400
2ef7ee0fac Optimize Q4_1_O by moving outlier multiplication out of the dequantize+dot loop saharNooby 2023-04-18 09:47:20 +0400
0a8157d1ee Merge pull request #28 from saharNooby/ggml-to-submodule Alex 2023-04-17 20:18:02 +0500
82e2faa190 Update data type info saharNooby 2023-04-17 19:17:47 +0400
05825d2370 Fix GitHub Actions saharNooby 2023-04-17 19:04:55 +0400
e29da07731 Fix warnings saharNooby 2023-04-17 18:57:38 +0400
38eea116b8 Restore Q4_1_O support saharNooby 2023-04-17 18:53:48 +0400
28e354c183 Delete Makefile and make workflows saharNooby 2023-04-17 17:37:09 +0400
b2bdeb1d95 Use ggml as a submodule saharNooby 2023-04-17 17:35:58 +0400
a96ec01b1a Revert "Replace ggml_1_minus_x with ggml_sub" saharNooby 2023-04-17 16:47:11 +0400
189ad78a0d Replace ggml_1_minus_x with ggml_sub saharNooby 2023-04-17 16:46:55 +0400
2f37c6b019 Fix FP16 lookup table saharNooby 2023-04-17 16:39:43 +0400
678f5233a5 Add LoRA loading support saharNooby 2023-04-15 20:46:30 +0400
e4268a36c8 Update file format documentation saharNooby 2023-04-14 18:59:16 +0400
e84c446d95 Merge pull request #20 from BrutalCoding/patch-1 Alex 2023-04-10 09:48:31 +0500
70f7eece06 fix: Mention of incorrect filename for MacOS cmake build artifact Daniel Breedeveld 2023-04-10 02:01:28 +0800
4f315441ba Merge remote-tracking branch 'origin/master' saharNooby 2023-04-08 19:39:47 +0400
7437e1d860 Clarify that we now have binaries for Linux/MacOS saharNooby 2023-04-08 19:39:31 +0400
5d99741eab Merge pull request #18 from yorkzero831/master Alex 2023-04-08 20:37:01 +0500
5662bf4b4f chore: make the asset file at the root of the zip file YorkZero 2023-04-09 00:32:32 +0900
a3fe1c63d8 chore: align asset file name YorkZero 2023-04-09 00:21:30 +0900
37f890ff3e chore: update github action YorkZero 2023-04-08 23:00:31 +0900
84e0698f2b Merge pull request #16 from saharNooby/outliers-preserving-quantization-PR Alex 2023-04-08 16:51:47 +0500
874826cb20 Update README.md saharNooby 2023-04-08 10:45:42 +0400
85db23c7de Add script that measures perplexity saharNooby 2023-04-08 10:41:16 +0400
e04baa032c Remove reference impl comparison test saharNooby 2023-04-08 10:01:29 +0400
edd57a186c Update README.md saharNooby 2023-04-07 10:16:12 +0400
e26b408ea7 Add Q4_1_O test saharNooby 2023-04-07 10:12:19 +0400
18bf02fea4 Use ggml function for parameter size calculation saharNooby 2023-04-07 10:01:04 +0400
c40941d9d0 Add Q4_1_O format saharNooby 2023-04-07 09:55:39 +0400
ec99bc1765 Do not quantize head saharNooby 2023-04-06 16:26:18 +0400
058b5cd1e6 Show file compression ratio saharNooby 2023-04-04 20:20:34 +0400
fa9ad13a39 Free ggml context when model is garbage collected saharNooby 2023-04-05 15:55:47 +0400
ad3a4ebc57 Add missing labels and symbols for new operators saharNooby 2023-04-06 20:26:31 +0400
d12088e164 Minor formatting changes saharNooby 2023-04-05 15:31:23 +0400
dc679bf971 Merge pull request #14 from hypnopump/update_macos Alexander 2023-04-04 21:42:45 +0500
d3801340f3 streaming output hypnopump 2023-04-04 18:27:14 +0200
a9cb9adfd6 streaming output hypnopump 2023-04-04 18:27:04 +0200
c320573b5e verify instructions can be followed hypnopump 2023-04-04 17:45:55 +0200
f5feb7470b verify instructions can be followed hypnopump 2023-04-04 17:45:06 +0200
b75a805563 working on macos. no point in fp32 if all weights distributed in fp16 hypnopump 2023-04-04 17:39:21 +0200
77e19980e9 Merge pull request #13 from pixelkaiser/rwkv-macos Alexander 2023-04-04 14:24:21 +0500
977efba905 we actually build a dylib on macos PXLKSR 2023-04-04 10:19:06 +0200
aacc8b6872 Minor formatting changes saharNooby 2023-04-03 10:39:28 +0400
4f1df7c89e Merge pull request #9 from hypnopump/more_instructions_works_linux Alexander 2023-04-03 11:35:38 +0500
fa74b016c6 more details for macos/linux hypnopump 2023-04-03 08:33:57 +0200
bea02c4b4c Merge branch 'master' into more_instructions_works_linux Eric Alcaide 2023-04-03 08:29:55 +0200
0a0cabc4c7 for consistency hypnopump 2023-04-03 08:27:00 +0200
6f3fb01913 suggestions hypnopump 2023-04-03 08:25:54 +0200
3535476987 Update README.md: include info about pre-compiled library saharNooby 2023-04-03 09:48:53 +0400
5b2830ed30 Increase memory for overhead from 32 MB to 256 MB saharNooby 2023-04-03 09:32:58 +0400
a64aaa81ec initial addition hypnopump 2023-04-03 00:52:26 +0200
d62a050144 Remove hardcoded memory requirements table saharNooby 2023-04-02 18:37:45 +0400
1262ad0456 Fix build errors and warnings saharNooby 2023-04-02 17:23:39 +0400
f2b1dad22b Add GitHub workflows file saharNooby 2023-04-02 16:56:04 +0400
6b4ebc328a Update README.md saharNooby 2023-04-02 15:28:34 +0400
e0684e8104 Add text generation and chat scripts saharNooby 2023-04-02 15:03:31 +0400
ee46ad208e Add quantization test back, run ggml tests on first context init saharNooby 2023-04-02 13:05:17 +0400
1ecbad3a65 Remove unused files saharNooby 2023-04-02 12:53:41 +0400
935d16f5db Move library wrapper to separate file, refactor code saharNooby 2023-04-02 12:24:40 +0400
38f9d02d52 Fix quantization from FP16 saharNooby 2023-04-01 20:01:06 +0400
972e28d48d Implement INT4 conversion and inference saharNooby 2023-04-01 19:22:01 +0400
b164bf4e27 Allocate memory as needed for specific configuration of model saharNooby 2023-04-01 17:15:23 +0400
a1e1d34c93 Add Python wrapper for C library saharNooby 2023-04-01 16:02:22 +0400
7130a89d1f [FILE FORMAT CHANGED] Reverse dimensions in ggml file (makes it more similar to llama.cpp format) saharNooby 2023-04-01 14:41:30 +0400
ac03019fcf Move model to separate C library file saharNooby 2023-04-01 14:38:50 +0400
f6d45baec0 Support FP16 inference saharNooby 2023-04-01 11:53:49 +0400
fe98c94a63 [FILE FORMAT CHANGED] Use ggml_get_rows to get embedding saharNooby 2023-04-01 11:28:32 +0400
16ec7a5c18 Add fail-fast version of the test saharNooby 2023-04-01 11:15:15 +0400
