Jarrett Ye
|
ac663631e1
|
Improve the prompt & fix chinese display issue & support commands (#34)
* update the prompt
* Fix/chinese display issue
* remove debug code
* support commands (#1)
+reset +gen +i +qq +qa +++ ++ +
* run_rnn before decode
* remove debug code
* deep copy logits
* remove extra print()
* print newline if reach max_tokens_per_generation
* fix typo in init prompt
* Update rwkv/chat_with_bot.py
Co-authored-by: Alex <saharNooby@users.noreply.github.com>
* Update rwkv/chat_with_bot.py
Co-authored-by: Alex <saharNooby@users.noreply.github.com>
* Update rwkv/chat_with_bot.py
Co-authored-by: Alex <saharNooby@users.noreply.github.com>
* Update rwkv/chat_with_bot.py
Co-authored-by: Alex <saharNooby@users.noreply.github.com>
* refine code & type annotation
* add comments for commands
* support change temp & top_p during chat.
* set default language & prompt
---------
Co-authored-by: Alex <saharNooby@users.noreply.github.com>
|
2023-04-22 12:48:44 +05:00 |
Alex
|
1be9fda248
|
Add robust automatic testing (#33)
|
2023-04-20 11:00:35 +05:00 |
saharNooby
|
7b28076243
|
Fix Q4_1_O optimization
|
2023-04-18 16:46:27 +04:00 |
saharNooby
|
2ef7ee0fac
|
Optimize Q4_1_O by moving outlier multiplication out of the dequantize+dot loop
|
2023-04-18 09:47:20 +04:00 |
Alex
|
0a8157d1ee
|
Merge pull request #28 from saharNooby/ggml-to-submodule
Move ggml to submodule
|
2023-04-17 20:18:02 +05:00 |
saharNooby
|
82e2faa190
|
Update data type info
|
2023-04-17 19:17:47 +04:00 |
saharNooby
|
05825d2370
|
Fix GitHub Actions
|
2023-04-17 19:04:55 +04:00 |
saharNooby
|
e29da07731
|
Fix warnings
|
2023-04-17 18:57:38 +04:00 |
saharNooby
|
38eea116b8
|
Restore Q4_1_O support
|
2023-04-17 18:53:48 +04:00 |
saharNooby
|
28e354c183
|
Delete Makefile and make workflows
|
2023-04-17 17:37:09 +04:00 |
saharNooby
|
b2bdeb1d95
|
Use ggml as a submodule
|
2023-04-17 17:35:58 +04:00 |
saharNooby
|
a96ec01b1a
|
Revert "Replace ggml_1_minus_x with ggml_sub"
This reverts commit 189ad78a0d.
|
2023-04-17 16:47:11 +04:00 |
saharNooby
|
189ad78a0d
|
Replace ggml_1_minus_x with ggml_sub
|
2023-04-17 16:46:55 +04:00 |
saharNooby
|
2f37c6b019
|
Fix FP16 lookup table
|
2023-04-17 16:39:43 +04:00 |
saharNooby
|
678f5233a5
|
Add LoRA loading support
|
2023-04-15 20:46:30 +04:00 |
saharNooby
|
e4268a36c8
|
Update file format documentation
|
2023-04-14 18:59:16 +04:00 |
Alex
|
e84c446d95
|
Merge pull request #20 from BrutalCoding/patch-1
fix: Mention of incorrect filename for MacOS cmake build artifact
|
2023-04-10 09:48:31 +05:00 |
Daniel Breedeveld
|
70f7eece06
|
fix: Mention of incorrect filename for MacOS cmake build artifact
Executing the cmake build produces "librwkv.dylib" on MacOS (tested on Ventura 13.3.1)
|
2023-04-10 02:01:28 +08:00 |
saharNooby
|
4f315441ba
|
Merge remote-tracking branch 'origin/master'
|
2023-04-08 19:39:47 +04:00 |
saharNooby
|
7437e1d860
|
Clarify that we now have binaries for Linux/MacOS
|
2023-04-08 19:39:31 +04:00 |
Alex
|
5d99741eab
|
Merge pull request #18 from yorkzero831/master
Update github action to support linux and macos asset uploading
|
2023-04-08 20:37:01 +05:00 |
YorkZero
|
5662bf4b4f
|
chore: make the asset file at the root of the zip file
|
2023-04-09 00:32:32 +09:00 |
YorkZero
|
a3fe1c63d8
|
chore: align asset file name
|
2023-04-09 00:21:30 +09:00 |
YorkZero
|
37f890ff3e
|
chore: update github action
chore: update github action
chore: update github action
|
2023-04-08 23:18:31 +09:00 |
Alex
|
84e0698f2b
|
Merge pull request #16 from saharNooby/outliers-preserving-quantization-PR
Add Q4_1_O quantization format that preserves outliers in weights and does dot in FP32
|
2023-04-08 16:51:47 +05:00 |
saharNooby
|
874826cb20
|
Update README.md
|
2023-04-08 10:45:42 +04:00 |
saharNooby
|
85db23c7de
|
Add script that measures perplexity
|
2023-04-08 10:41:16 +04:00 |
saharNooby
|
e04baa032c
|
Remove reference impl comparison test
|
2023-04-08 10:01:29 +04:00 |
saharNooby
|
edd57a186c
|
Update README.md
|
2023-04-07 10:16:12 +04:00 |
saharNooby
|
e26b408ea7
|
Add Q4_1_O test
|
2023-04-07 10:12:19 +04:00 |
saharNooby
|
18bf02fea4
|
Use ggml function for parameter size calculation
|
2023-04-07 10:01:04 +04:00 |
saharNooby
|
c40941d9d0
|
Add Q4_1_O format
|
2023-04-07 09:55:39 +04:00 |
saharNooby
|
ec99bc1765
|
Do not quantize head
|
2023-04-06 20:30:32 +04:00 |
saharNooby
|
058b5cd1e6
|
Show file compression ratio
|
2023-04-06 20:29:58 +04:00 |
saharNooby
|
fa9ad13a39
|
Free ggml context when model is garbage collected
|
2023-04-06 20:27:33 +04:00 |
saharNooby
|
ad3a4ebc57
|
Add missing labels and symbols for new operators
|
2023-04-06 20:26:31 +04:00 |
saharNooby
|
d12088e164
|
Minor formatting changes
|
2023-04-05 15:31:23 +04:00 |
Alexander
|
dc679bf971
|
Merge pull request #14 from hypnopump/update_macos
Update macOS, better instructions, streaming output
|
2023-04-04 21:42:45 +05:00 |
hypnopump
|
d3801340f3
|
streaming output
|
2023-04-04 18:27:14 +02:00 |
hypnopump
|
a9cb9adfd6
|
streaming output
|
2023-04-04 18:27:04 +02:00 |
hypnopump
|
c320573b5e
|
verify instructions can be followed
|
2023-04-04 17:45:55 +02:00 |
hypnopump
|
f5feb7470b
|
verify instructions can be followed
|
2023-04-04 17:45:06 +02:00 |
hypnopump
|
b75a805563
|
working on macos. no point in fp32 if all weights distributed in fp16
|
2023-04-04 17:39:21 +02:00 |
Alexander
|
77e19980e9
|
Merge pull request #13 from pixelkaiser/rwkv-macos
we actually build a dylib on macos
|
2023-04-04 14:24:21 +05:00 |
PXLKSR
|
977efba905
|
we actually build a dylib on macos
|
2023-04-04 10:19:06 +02:00 |
saharNooby
|
aacc8b6872
|
Minor formatting changes
|
2023-04-03 10:39:28 +04:00 |
Alexander
|
4f1df7c89e
|
Merge pull request #9 from hypnopump/more_instructions_works_linux
Adds instructions and works on linux as well
|
2023-04-03 11:35:38 +05:00 |
hypnopump
|
fa74b016c6
|
more details for macos/linux
|
2023-04-03 08:33:57 +02:00 |
Eric Alcaide
|
bea02c4b4c
|
Merge branch 'master' into more_instructions_works_linux
|
2023-04-03 08:29:55 +02:00 |
hypnopump
|
0a0cabc4c7
|
for consistency
|
2023-04-03 08:27:00 +02:00 |