Commit Graph

335 Commits

Author SHA1 Message Date
saharNooby 3535476987 Update README.md: include info about pre-compiled library 2023-04-03 09:48:53 +04:00
saharNooby 5b2830ed30 Increase memory for overhead from 32 MB to 256 MB 2023-04-03 09:32:58 +04:00
hypnopump a64aaa81ec initial addition 2023-04-03 00:52:26 +02:00
saharNooby d62a050144 Remove hardcoded memory requirements table 2023-04-02 18:37:45 +04:00
saharNooby 1262ad0456 Fix build errors and warnings 2023-04-02 17:23:39 +04:00
saharNooby f2b1dad22b Add GitHub workflows file 2023-04-02 16:56:04 +04:00
saharNooby 6b4ebc328a Update README.md 2023-04-02 15:28:34 +04:00
saharNooby e0684e8104 Add text generation and chat scripts 2023-04-02 15:03:31 +04:00
saharNooby ee46ad208e Add quantization test back, run ggml tests on first context init 2023-04-02 13:05:17 +04:00
saharNooby 1ecbad3a65 Remove unused files 2023-04-02 12:53:41 +04:00
saharNooby 935d16f5db Move library wrapper to separate file, refactor code 2023-04-02 12:24:40 +04:00
saharNooby 38f9d02d52 Fix quantization from FP16 2023-04-01 20:01:06 +04:00
saharNooby 972e28d48d Implement INT4 conversion and inference 2023-04-01 19:22:01 +04:00
saharNooby b164bf4e27 Allocate memory as needed for specific configuration of model 2023-04-01 17:15:23 +04:00
saharNooby a1e1d34c93 Add Python wrapper for C library 2023-04-01 16:02:22 +04:00
saharNooby 7130a89d1f [FILE FORMAT CHANGED] Reverse dimensions in ggml file (makes it more similar to llama.cpp format) 2023-04-01 14:41:30 +04:00
saharNooby ac03019fcf Move model to separate C library file 2023-04-01 14:38:50 +04:00
saharNooby f6d45baec0 Support FP16 inference 2023-04-01 11:53:49 +04:00
saharNooby fe98c94a63 [FILE FORMAT CHANGED] Use ggml_get_rows to get embedding 2023-04-01 11:28:32 +04:00
saharNooby 16ec7a5c18 Add fail-fast version of the test 2023-04-01 11:15:15 +04:00
saharNooby 0fcb7c64c6 Remove reference implementation code and test against pre-created logits 2023-04-01 11:09:24 +04:00
saharNooby bf88e8a246 Update README.md 2023-04-01 10:12:10 +04:00
saharNooby 6fe9486cee Finally, FP32 inference 2023-04-01 10:06:39 +04:00
saharNooby 61c6b1a4e0 Add comparison against reference implementation script, implement state & logits saving 2023-03-31 20:23:42 +04:00
saharNooby d00f28581a Add reference implementation of RWKV RNN 2023-03-31 19:57:16 +04:00
saharNooby 02c9946b57 Update README.md 2023-03-31 19:06:31 +04:00
saharNooby 01d667f066 Implement exp, max, 1_minus_x, sigmoid operators in ggml 2023-03-31 19:04:35 +04:00
saharNooby fe272dc3d3 Minor changes 2023-03-31 10:24:12 +04:00
saharNooby 93c8dcae75 Update README.md 2023-03-30 20:37:09 +04:00
saharNooby 56bf4fc856 Implement time mixing, fix matrix shape mismatch 2023-03-30 20:29:41 +04:00
saharNooby 873cb954d0 Make ln0 work correctly 2023-03-30 20:01:26 +04:00
saharNooby 2f51451561 Initial commit 2023-03-30 17:55:30 +04:00
slaren ed3c680bcd Fix GGML_F32Cx8_STORE in AVX without F16C path (#619) 2023-03-30 11:16:30 +02:00
anzz1 9cbc404ba6 ci : re-enable AVX512 testing (Windows-MSVC) (#584)
* CI: Re-enable AVX512 testing (Windows-MSVC)

Now with 100% less base64 encoding

* plain __cpuid is enough here
2023-03-29 23:44:39 +03:00
Georgi Gerganov b51c717d5c ggml : init time on first ggml_init() call 2023-03-29 22:15:34 +03:00
Georgi Gerganov 0ba76c1e73 llama : fix compile warnings when reading the vocab 2023-03-29 22:13:12 +03:00
Georgi Gerganov cea1c85948 ggml : add ARM_NEON dequantize_row_q4_1() 2023-03-29 22:10:01 +03:00
Georgi Gerganov f202ada131 ggml : add ARM_NEON quantize_row_q4_1() 2023-03-29 22:03:07 +03:00
Georgi Gerganov 3b44d30d9b ggml : add ARM_NEON ggml_vec_dot_q4_1() 2023-03-29 22:03:07 +03:00
Pavol Rusnak 61cbfff5c9 rename convert_ggml_to_pth.py -> convert-ggml-to-pth.py (#600)
to match filenames of other converters
2023-03-29 20:09:25 +02:00
Thérence d9ad104440 Create chat-13B.bat (#592)
* Create chat-13B.bat

Same script as chat-13B.sh, but for Windows users.
Tested and working on Windows 10/11 v22H2

* Apply suggestions from code review

---------

Co-authored-by: anzz1 <anzz1@live.com>
2023-03-29 20:21:09 +03:00
Georgi Gerganov b467702b87 readme : fix typos 2023-03-29 19:38:31 +03:00
Georgi Gerganov 516d88e75c readme : add GPT4All instructions (close #588) 2023-03-29 19:37:20 +03:00
Georgi Gerganov 53635c081c py : add GPT4All conversion script
For now: copy-paste
Too much time for me to deduplicate the python code
2023-03-29 19:29:52 +03:00
Maël Kerbiriou 41318d708e llama : use the same threshold for OpenBLAS and ggml thread limiting (#577) 2023-03-29 19:10:07 +03:00
Tobias Lütke a6956b25a1 add example of re-act pattern (#583)
* add example of re-act pattern

* spelling...

* fixed whitespace in reverse prompt issue
2023-03-29 10:10:24 -05:00
anzz1 83df5639eb Fix GCC warning about binary literal (#595)
0b10101010 -> 0xAA /* 0b10101010 */
2023-03-29 13:20:07 +00:00
anzz1 a5c42c4b13 Fix typo in llama.h (#593) 2023-03-29 13:19:29 +00:00
anzz1 5a5f8b1501 Enable Fused-Multiply-Add (FMA) and F16C/CVT16 vector extensions on MSVC (#375)
* Enable Fused-Multiply-Add (FMA) instructions on MSVC

__FMA__ macro does not exist in MSVC

* Enable F16C/CVT16 vector extensions on MSVC

__F16C__ macro does not exist in MSVC, but is implied with AVX2/AVX512

* MSVC cvt intrinsics

* Add __SSE3__ macro for MSVC too because why not

even though it's not currently used for anything when AVX is defined
2023-03-28 22:44:29 +03:00
anzz1 f1217055ea CI: fix subdirectory path globbing (#546)
- Changes in subdirectories will now be detected properly
- (Windows-MSVC) AVX512 tests temporarily disabled
2023-03-28 22:43:25 +03:00