5eb8f09c14 
								
							 
						 
						
							
							
								
								Various improvements ( #47 )  
							
							
							
							
							* Update ggml
* Pack only rwkv.dll for Windows releases
Test executables are no longer packed.
* Move test code into a separate file
* Remove redundant zeroing
* Refactor chat script 
							
						 
						
							2023-04-30 20:27:14 +05:00  
				
					
						
							
							
								 
						
							
								3621172428 
								
							 
						 
						
							
							
								
								punish repetitions & break if END_OF_TEXT & decouple prompts from chat script ( #37 )  
							
							
							
							
							* punish repetitions & break if END_OF_TEXT
* decouple prompts from chat_with_bot.py
* improve code style
* Update rwkv/chat_with_bot.py
Co-authored-by: Alex <saharNooby@users.noreply.github.com>
* Update rwkv/chat_with_bot.py
Co-authored-by: Alex <saharNooby@users.noreply.github.com>
* add types
* JSON prompt
---------
Co-authored-by: Alex <saharNooby@users.noreply.github.com> 
							
						 
						
							2023-04-30 18:50:05 +05:00  
				
					
						
							
							
								 
						
							
								06dac0f80d 
								
							 
						 
						
							
							
								
								Use main ggml repo ( #45 )  
							
							
							
						 
						
							2023-04-29 21:35:36 +05:00  
				
					
						
							
							
								 
						
							
								1198892888 
								
							 
						 
						
							
							
								
								Add support for Q5_0, Q5_1 and Q8_0 formats; remove Q4_1_O format ( #44 )  
							
							
							
							
							* Remove Q4_3 support
* Add Q5_0, Q5_1, Q8_0 support
* Add clearer message when loading Q4_3 model
* Remove Q4_1_O format
* Fix indentation in .gitmodules
* Simplify sanitizer matrix 
							
						 
						
							2023-04-29 17:39:11 +05:00  
				
					
						
							
							
								 
						
							
								c736ef5411 
								
							 
						 
						
							
							
								
								Improve chat_with_bot.py script ( #39 )  
							
							
							
						 
						
							2023-04-22 20:33:58 +05:00  
				
					
						
							
							
								 
						
							
								3587ff9e58 
								
							 
						 
						
							
							
								
								Sync ggml with upstream ( #38 )  
							
							
							
							
							* Sync ggml with upstream
* Remove file filters from Actions triggers
* Update ggml
* Add Q4_2 and Q4_3 support
* Improve output of perplexity measuring script
* Add tests for new formats
* Add token limit argument to perplexity measuring script
* Update README
* Update README
* Update ggml
* Use master branch of ggml 
							
						 
						
							2023-04-22 20:25:29 +05:00  
				
					
						
							
							
								 
						
							
								ac663631e1 
								
							 
						 
						
							
							
								
								Improve the prompt & fix Chinese display issue & support commands ( #34 )  
							
							
							
							
							* update the prompt
* Fix/chinese display issue
* remove debug code
* support commands (#1 )
+reset +gen +i +qq +qa +++ ++ +
* run_rnn before decode
* remove debug code
* deep copy logits
* remove extra print()
* print newline if max_tokens_per_generation is reached
* fix typo in init prompt
* Update rwkv/chat_with_bot.py
Co-authored-by: Alex <saharNooby@users.noreply.github.com>
* Update rwkv/chat_with_bot.py
Co-authored-by: Alex <saharNooby@users.noreply.github.com>
* Update rwkv/chat_with_bot.py
Co-authored-by: Alex <saharNooby@users.noreply.github.com>
* Update rwkv/chat_with_bot.py
Co-authored-by: Alex <saharNooby@users.noreply.github.com>
* refine code & type annotation
* add comments for commands
* support changing temp & top_p during chat
* set default language & prompt
---------
Co-authored-by: Alex <saharNooby@users.noreply.github.com> 
							
						 
						
							2023-04-22 12:48:44 +05:00  
				
					
						
							
							
								 
						
							
								1be9fda248 
								
							 
						 
						
							
							
								
								Add robust automatic testing ( #33 )  
							
							
							
						 
						
							2023-04-20 11:00:35 +05:00  
				
					
						
							
							
								 
						
							
								7b28076243 
								
							 
						 
						
							
							
								
								Fix Q4_1_O optimization  
							
							
							
						 
						
							2023-04-18 16:46:27 +04:00  
				
					
						
							
							
								 
						
							
								2ef7ee0fac 
								
							 
						 
						
							
							
								
								Optimize Q4_1_O by moving outlier multiplication out of the dequantize+dot loop  
							
							
							
						 
						
							2023-04-18 09:47:20 +04:00  
				
					
						
							
							
								 
						
							
								0a8157d1ee 
								
							 
						 
						
							
							
								
								Merge pull request  #28  from saharNooby/ggml-to-submodule  
							
							
							
							
							Move ggml to submodule 
							
						 
						
							2023-04-17 20:18:02 +05:00  
				
					
						
							
							
								 
						
							
								82e2faa190 
								
							 
						 
						
							
							
								
								Update data type info  
							
							
							
						 
						
							2023-04-17 19:17:47 +04:00  
				
					
						
							
							
								 
						
							
								05825d2370 
								
							 
						 
						
							
							
								
								Fix GitHub Actions  
							
							
							
						 
						
							2023-04-17 19:04:55 +04:00  
				
					
						
							
							
								 
						
							
								e29da07731 
								
							 
						 
						
							
							
								
								Fix warnings  
							
							
							
						 
						
							2023-04-17 18:57:38 +04:00  
				
					
						
							
							
								 
						
							
								38eea116b8 
								
							 
						 
						
							
							
								
								Restore Q4_1_O support  
							
							
							
						 
						
							2023-04-17 18:53:48 +04:00  
				
					
						
							
							
								 
						
							
								28e354c183 
								
							 
						 
						
							
							
								
								Delete Makefile and make workflows  
							
							
							
						 
						
							2023-04-17 17:37:09 +04:00  
				
					
						
							
							
								 
						
							
								b2bdeb1d95 
								
							 
						 
						
							
							
								
								Use ggml as a submodule  
							
							
							
						 
						
							2023-04-17 17:35:58 +04:00  
				
					
						
							
							
								 
						
							
								a96ec01b1a 
								
							 
						 
						
							
							
								
								Revert "Replace ggml_1_minus_x with ggml_sub"  
							
							
							
							
							This reverts commit 189ad78a0d 
							
						 
						
							2023-04-17 16:47:11 +04:00  
				
					
						
							
							
								 
						
							
								189ad78a0d 
								
							 
						 
						
							
							
								
								Replace ggml_1_minus_x with ggml_sub  
							
							
							
						 
						
							2023-04-17 16:46:55 +04:00  
				
					
						
							
							
								 
						
							
								2f37c6b019 
								
							 
						 
						
							
							
								
								Fix FP16 lookup table  
							
							
							
						 
						
							2023-04-17 16:39:43 +04:00  
				
					
						
							
							
								 
						
							
								678f5233a5 
								
							 
						 
						
							
							
								
								Add LoRA loading support  
							
							
							
						 
						
							2023-04-15 20:46:30 +04:00  
				
					
						
							
							
								 
						
							
								e4268a36c8 
								
							 
						 
						
							
							
								
								Update file format documentation  
							
							
							
						 
						
							2023-04-14 18:59:16 +04:00  
				
					
						
							
							
								 
						
							
								e84c446d95 
								
							 
						 
						
							
							
								
								Merge pull request  #20  from BrutalCoding/patch-1  
							
							
							
							
							fix: Mention of incorrect filename for MacOS cmake build artifact 
							
						 
						
							2023-04-10 09:48:31 +05:00  
				
					
						
							
							
								 
						
							
								70f7eece06 
								
							 
						 
						
							
							
								
								fix: Mention of incorrect filename for MacOS cmake build artifact  
							
							
							
							
							Executing the cmake build produces "librwkv.dylib" on MacOS (tested on Ventura 13.3.1) 
							
						 
						
							2023-04-10 02:01:28 +08:00  
				
					
						
							
							
								 
						
							
								4f315441ba 
								
							 
						 
						
							
							
								
								Merge remote-tracking branch 'origin/master'  
							
							
							
						 
						
							2023-04-08 19:39:47 +04:00  
				
					
						
							
							
								 
						
							
								7437e1d860 
								
							 
						 
						
							
							
								
								Clarify that we now have binaries for Linux/MacOS  
							
							
							
						 
						
							2023-04-08 19:39:31 +04:00  
				
					
						
							
							
								 
						
							
								5d99741eab 
								
							 
						 
						
							
							
								
								Merge pull request  #18  from yorkzero831/master  
							
							
							
							
							Update github action to support linux and macos asset uploading 
							
						 
						
							2023-04-08 20:37:01 +05:00  
				
					
						
							
							
								 
						
							
								5662bf4b4f 
								
							 
						 
						
							
							
								
								chore: make the asset file at the root of the zip file  
							
							
							
						 
						
							2023-04-09 00:32:32 +09:00  
				
					
						
							
							
								 
						
							
								a3fe1c63d8 
								
							 
						 
						
							
							
								
								chore: align asset file name  
							
							
							
						 
						
							2023-04-09 00:21:30 +09:00  
				
					
						
							
							
								 
						
							
								37f890ff3e 
								
							 
						 
						
							
							
								
								chore: update github action  
							
							
							
							
							chore: update github action
							
						 
						
							2023-04-08 23:18:31 +09:00  
				
					
						
							
							
								 
						
							
								84e0698f2b 
								
							 
						 
						
							
							
								
								Merge pull request  #16  from saharNooby/outliers-preserving-quantization-PR  
							
							
							
							
							Add Q4_1_O quantization format that preserves outliers in weights and does dot in FP32 
							
						 
						
							2023-04-08 16:51:47 +05:00  
				
					
						
							
							
								 
						
							
								874826cb20 
								
							 
						 
						
							
							
								
								Update README.md  
							
							
							
						 
						
							2023-04-08 10:45:42 +04:00  
				
					
						
							
							
								 
						
							
								85db23c7de 
								
							 
						 
						
							
							
								
								Add script that measures perplexity  
							
							
							
						 
						
							2023-04-08 10:41:16 +04:00  
				
					
						
							
							
								 
						
							
								e04baa032c 
								
							 
						 
						
							
							
								
								Remove reference impl comparison test  
							
							
							
						 
						
							2023-04-08 10:01:29 +04:00  
				
					
						
							
							
								 
						
							
								edd57a186c 
								
							 
						 
						
							
							
								
								Update README.md  
							
							
							
						 
						
							2023-04-07 10:16:12 +04:00  
				
					
						
							
							
								 
						
							
								e26b408ea7 
								
							 
						 
						
							
							
								
								Add Q4_1_O test  
							
							
							
						 
						
							2023-04-07 10:12:19 +04:00  
				
					
						
							
							
								 
						
							
								18bf02fea4 
								
							 
						 
						
							
							
								
								Use ggml function for parameter size calculation  
							
							
							
						 
						
							2023-04-07 10:01:04 +04:00  
				
					
						
							
							
								 
						
							
								c40941d9d0 
								
							 
						 
						
							
							
								
								Add Q4_1_O format  
							
							
							
						 
						
							2023-04-07 09:55:39 +04:00  
				
					
						
							
							
								 
						
							
								ec99bc1765 
								
							 
						 
						
							
							
								
								Do not quantize head  
							
							
							
						 
						
							2023-04-06 20:30:32 +04:00  
				
					
						
							
							
								 
						
							
								058b5cd1e6 
								
							 
						 
						
							
							
								
								Show file compression ratio  
							
							
							
						 
						
							2023-04-06 20:29:58 +04:00  
				
					
						
							
							
								 
						
							
								fa9ad13a39 
								
							 
						 
						
							
							
								
								Free ggml context when model is garbage collected  
							
							
							
						 
						
							2023-04-06 20:27:33 +04:00  
				
					
						
							
							
								 
						
							
								ad3a4ebc57 
								
							 
						 
						
							
							
								
								Add missing labels and symbols for new operators  
							
							
							
						 
						
							2023-04-06 20:26:31 +04:00  
				
					
						
							
							
								 
						
							
								d12088e164 
								
							 
						 
						
							
							
								
								Minor formatting changes  
							
							
							
						 
						
							2023-04-05 15:31:23 +04:00  
				
					
						
							
							
								 
						
							
								dc679bf971 
								
							 
						 
						
							
							
								
								Merge pull request  #14  from hypnopump/update_macos  
							
							
							
							
							Update macOS, better instructions, streaming output 
							
						 
						
							2023-04-04 21:42:45 +05:00  
				
					
						
							
							
								 
						
							
								d3801340f3 
								
							 
						 
						
							
							
								
								streaming output  
							
							
							
						 
						
							2023-04-04 18:27:14 +02:00  
				
					
						
							
							
								 
						
							
								a9cb9adfd6 
								
							 
						 
						
							
							
								
								streaming output  
							
							
							
						 
						
							2023-04-04 18:27:04 +02:00  
				
					
						
							
							
								 
						
							
								c320573b5e 
								
							 
						 
						
							
							
								
								verify instructions can be followed  
							
							
							
						 
						
							2023-04-04 17:45:55 +02:00  
				
					
						
							
							
								 
						
							
								f5feb7470b 
								
							 
						 
						
							
							
								
								verify instructions can be followed  
							
							
							
						 
						
							2023-04-04 17:45:06 +02:00  
				
					
						
							
							
								 
						
							
								b75a805563 
								
							 
						 
						
							
							
								
								Working on macOS; no point in FP32 if all weights are distributed in FP16  
							
							
							
						 
						
							2023-04-04 17:39:21 +02:00  
				
					
						
							
							
								 
						
							
								77e19980e9 
								
							 
						 
						
							
							
								
								Merge pull request  #13  from pixelkaiser/rwkv-macos  
							
							
							
							
							we actually build a dylib on macos 
							
						 
						
							2023-04-04 14:24:21 +05:00