rwkv.cpp

Commit Graph

Author	SHA1	Message	Date
Alex	5d99741eab	Merge pull request #18 from yorkzero831/master: Update github action to support linux and macos asset uploading	2023-04-08 20:37:01 +05:00
YorkZero	5662bf4b4f	chore: make the asset file at the root of the zip file	2023-04-09 00:32:32 +09:00
YorkZero	a3fe1c63d8	chore: align asset file name	2023-04-09 00:21:30 +09:00
YorkZero	37f890ff3e	chore: update github action chore: update github action chore: update github action	2023-04-08 23:18:31 +09:00
Alex	84e0698f2b	Merge pull request #16 from saharNooby/outliers-preserving-quantization-PR: Add Q4_1_O quantization format that preserves outliers in weights and does dot in FP32	2023-04-08 16:51:47 +05:00
saharNooby	874826cb20	Update README.md	2023-04-08 10:45:42 +04:00
saharNooby	85db23c7de	Add script that measures perplexity	2023-04-08 10:41:16 +04:00
saharNooby	e04baa032c	Remove reference impl comparison test	2023-04-08 10:01:29 +04:00
saharNooby	edd57a186c	Update README.md	2023-04-07 10:16:12 +04:00
saharNooby	e26b408ea7	Add Q4_1_O test	2023-04-07 10:12:19 +04:00
saharNooby	18bf02fea4	Use ggml function for parameter size calculation	2023-04-07 10:01:04 +04:00
saharNooby	c40941d9d0	Add Q4_1_O format	2023-04-07 09:55:39 +04:00
saharNooby	ec99bc1765	Do not quantize head	2023-04-06 20:30:32 +04:00
saharNooby	058b5cd1e6	Show file compression ratio	2023-04-06 20:29:58 +04:00
saharNooby	fa9ad13a39	Free ggml context when model is garbage collected	2023-04-06 20:27:33 +04:00
saharNooby	ad3a4ebc57	Add missing labels and symbols for new operators	2023-04-06 20:26:31 +04:00
saharNooby	d12088e164	Minor formatting changes	2023-04-05 15:31:23 +04:00
Alexander	dc679bf971	Merge pull request #14 from hypnopump/update_macos: Update macOS, better instructions, streaming output	2023-04-04 21:42:45 +05:00
hypnopump	d3801340f3	streaming output	2023-04-04 18:27:14 +02:00
hypnopump	a9cb9adfd6	streaming output	2023-04-04 18:27:04 +02:00
hypnopump	c320573b5e	verify instructions can be followed	2023-04-04 17:45:55 +02:00
hypnopump	f5feb7470b	verify instructions can be followed	2023-04-04 17:45:06 +02:00
hypnopump	b75a805563	working on macos. no point in fp32 if all weights distributed in fp16	2023-04-04 17:39:21 +02:00
Alexander	77e19980e9	Merge pull request #13 from pixelkaiser/rwkv-macos: we actually build a dylib on macos	2023-04-04 14:24:21 +05:00
PXLKSR	977efba905	we actually build a dylib on macos	2023-04-04 10:19:06 +02:00
saharNooby	aacc8b6872	Minor formatting changes	2023-04-03 10:39:28 +04:00
Alexander	4f1df7c89e	Merge pull request #9 from hypnopump/more_instructions_works_linux: Adds instructions and works on linux as well	2023-04-03 11:35:38 +05:00
hypnopump	fa74b016c6	more details for macos/linux	2023-04-03 08:33:57 +02:00
Eric Alcaide	bea02c4b4c	Merge branch 'master' into more_instructions_works_linux	2023-04-03 08:29:55 +02:00
hypnopump	0a0cabc4c7	for consistency	2023-04-03 08:27:00 +02:00
hypnopump	6f3fb01913	suggestions	2023-04-03 08:25:54 +02:00
saharNooby	3535476987	Update README.md: include info about pre-compiled library	2023-04-03 09:48:53 +04:00
saharNooby	5b2830ed30	Increase memory for overhead from 32 MB to 256 MB	2023-04-03 09:32:58 +04:00
hypnopump	a64aaa81ec	initial addition	2023-04-03 00:52:26 +02:00
saharNooby	d62a050144	Remove hardcoded memory requirements table	2023-04-02 18:37:45 +04:00
saharNooby	1262ad0456	Fix build errors and warnings	2023-04-02 17:23:39 +04:00
saharNooby	f2b1dad22b	Add GitHub workflows file	2023-04-02 16:56:04 +04:00
saharNooby	6b4ebc328a	Update README.md	2023-04-02 15:28:34 +04:00
saharNooby	e0684e8104	Add text generation and chat scripts	2023-04-02 15:03:31 +04:00
saharNooby	ee46ad208e	Add quantization test back, run ggml tests on first context init	2023-04-02 13:05:17 +04:00
saharNooby	1ecbad3a65	Remove unused files	2023-04-02 12:53:41 +04:00
saharNooby	935d16f5db	Move library wrapper to separate file, refactor code	2023-04-02 12:24:40 +04:00
saharNooby	38f9d02d52	Fix quantization from FP16	2023-04-01 20:01:06 +04:00
saharNooby	972e28d48d	Implement INT4 conversion and inference	2023-04-01 19:22:01 +04:00
saharNooby	b164bf4e27	Allocate memory as needed for specific configuration of model	2023-04-01 17:15:23 +04:00
saharNooby	a1e1d34c93	Add Python wrapper for C library	2023-04-01 16:02:22 +04:00
saharNooby	7130a89d1f	[FILE FORMAT CHANGED] Reverse dimensions in ggml file (makes it more similar to llama.cpp format)	2023-04-01 14:41:30 +04:00
saharNooby	ac03019fcf	Move model to separate C library file	2023-04-01 14:38:50 +04:00
saharNooby	f6d45baec0	Support FP16 inference	2023-04-01 11:53:49 +04:00
saharNooby	fe98c94a63	[FILE FORMAT CHANGED] Use ggml_get_rows to get embedding	2023-04-01 11:28:32 +04:00


316 Commits
