* chore: add ggml import in the head of rwkv.h
* feat: add cublas support
* feat: update rwkv.cpp
* feat: remove unused change
* chore: fix Linux build issue
* chore: sync ggml and offload tensor to gpu
* chore: comment out tensors that cause errors on GPU
* chore: update comment and readme
* chore: update ggml to recent
* chore: add more performance test results
* chore: fix reading files larger than 2 GB
* chore: merge master
* chore: remove unused comment
* chore: address review comments
* Update README.md
* Update rwkv.cpp
---------
Co-authored-by: Alex <saharNooby@users.noreply.github.com>
* Use types from typing for better compatibility with older Python versions
* Split last double end of line token as per BlinkDL's suggestion
* Fix MSVC warnings
* Drop Q4_2 support
* Update ggml
* Bump file format version for quantization changes
* Apply suggestions
* Remove Q4_3 support
* Add Q5_0, Q5_1, Q8_0 support
* Add a clearer message when loading a Q4_3 model
* Remove Q4_1_O format
* Fix indentation in .gitmodules
* Simplify sanitizer matrix
* Revert "Delete SHA256SUMS for now (#416)"
This reverts commit 8eea5ae0e5.
* Remove ggml files until they can be verified
* Remove alpaca json
* Add also model/tokenizer.model to SHA256SUMS + update README
---------
Co-authored-by: Pavol Rusnak <pavol@rusnak.io>
* Update custom.md
* Removed Model section as it is better placed in README.md
* Updates to README.md model section
* Inserted text that was removed from the issue template about obtaining models from FB, plus links to papers describing the various models
* Removed IPFS download links for the Alpaca 7B models, as these appear to be in the old data format and probably shouldn't be directly linked to anyway
* Updated the perplexity section to point to the Perplexity scores discussion (#406)