| Author | Commit | Message | Date |
| --- | --- | --- | --- |
| hypnopump | c320573b5e | verify instructions can be followed | 2023-04-04 17:45:55 +02:00 |
| hypnopump | f5feb7470b | verify instructions can be followed | 2023-04-04 17:45:06 +02:00 |
| hypnopump | b75a805563 | working on macos. no point in fp32 if all weights distributed in fp16 | 2023-04-04 17:39:21 +02:00 |
| Alexander | 77e19980e9 | Merge pull request #13 from pixelkaiser/rwkv-macos; we actually build a dylib on macos | 2023-04-04 14:24:21 +05:00 |
| PXLKSR | 977efba905 | we actually build a dylib on macos | 2023-04-04 10:19:06 +02:00 |
| saharNooby | aacc8b6872 | Minor formatting changes | 2023-04-03 10:39:28 +04:00 |
| Alexander | 4f1df7c89e | Merge pull request #9 from hypnopump/more_instructions_works_linux; Adds instructions and works on linux as well | 2023-04-03 11:35:38 +05:00 |
| hypnopump | fa74b016c6 | more details for macos/linux | 2023-04-03 08:33:57 +02:00 |
| Eric Alcaide | bea02c4b4c | Merge branch 'master' into more_instructions_works_linux | 2023-04-03 08:29:55 +02:00 |
| hypnopump | 0a0cabc4c7 | for consistency | 2023-04-03 08:27:00 +02:00 |
| hypnopump | 6f3fb01913 | suggestions | 2023-04-03 08:25:54 +02:00 |
| saharNooby | 3535476987 | Update README.md: include info about pre-compiled library | 2023-04-03 09:48:53 +04:00 |
| saharNooby | 5b2830ed30 | Increase memory for overhead from 32 MB to 256 MB | 2023-04-03 09:32:58 +04:00 |
| hypnopump | a64aaa81ec | initial addition | 2023-04-03 00:52:26 +02:00 |
| saharNooby | d62a050144 | Remove hardcoded memory requirements table | 2023-04-02 18:37:45 +04:00 |
| saharNooby | 1262ad0456 | Fix build errors and warnings | 2023-04-02 17:23:39 +04:00 |
| saharNooby | f2b1dad22b | Add GitHub workflows file | 2023-04-02 16:56:04 +04:00 |
| saharNooby | 6b4ebc328a | Update README.md | 2023-04-02 15:28:34 +04:00 |
| saharNooby | e0684e8104 | Add text generation and chat scripts | 2023-04-02 15:03:31 +04:00 |
| saharNooby | ee46ad208e | Add quantization test back, run ggml tests on first context init | 2023-04-02 13:05:17 +04:00 |
| saharNooby | 1ecbad3a65 | Remove unused files | 2023-04-02 12:53:41 +04:00 |
| saharNooby | 935d16f5db | Move library wrapper to separate file, refactor code | 2023-04-02 12:24:40 +04:00 |
| saharNooby | 38f9d02d52 | Fix quantization from FP16 | 2023-04-01 20:01:06 +04:00 |
| saharNooby | 972e28d48d | Implement INT4 conversion and inference | 2023-04-01 19:22:01 +04:00 |
| saharNooby | b164bf4e27 | Allocate memory as needed for specific configuration of model | 2023-04-01 17:15:23 +04:00 |
| saharNooby | a1e1d34c93 | Add Python wrapper for C library | 2023-04-01 16:02:22 +04:00 |
| saharNooby | 7130a89d1f | [FILE FORMAT CHANGED] Reverse dimensions in ggml file (makes it more similar to llama.cpp format) | 2023-04-01 14:41:30 +04:00 |
| saharNooby | ac03019fcf | Move model to separate C library file | 2023-04-01 14:38:50 +04:00 |
| saharNooby | f6d45baec0 | Support FP16 inference | 2023-04-01 11:53:49 +04:00 |
| saharNooby | fe98c94a63 | [FILE FORMAT CHANGED] Use ggml_get_rows to get embedding | 2023-04-01 11:28:32 +04:00 |
| saharNooby | 16ec7a5c18 | Add fail-fast version of the test | 2023-04-01 11:15:15 +04:00 |
| saharNooby | 0fcb7c64c6 | Remove reference implementation code and test against pre-created logits | 2023-04-01 11:09:24 +04:00 |
| saharNooby | bf88e8a246 | Update README.md | 2023-04-01 10:12:10 +04:00 |
| saharNooby | 6fe9486cee | Finally, FP32 inference | 2023-04-01 10:06:39 +04:00 |
| saharNooby | 61c6b1a4e0 | Add comparison against reference implementation script, implement state & logits saving | 2023-03-31 20:23:42 +04:00 |
| saharNooby | d00f28581a | Add reference implementation of RWKV RNN | 2023-03-31 19:57:16 +04:00 |
| saharNooby | 02c9946b57 | Update README.md | 2023-03-31 19:06:31 +04:00 |
| saharNooby | 01d667f066 | Implement exp, max, 1_minus_x, sigmoid operators in ggml | 2023-03-31 19:04:35 +04:00 |
| saharNooby | fe272dc3d3 | Minor changes | 2023-03-31 10:24:12 +04:00 |
| saharNooby | 93c8dcae75 | Update README.md | 2023-03-30 20:37:09 +04:00 |
| saharNooby | 56bf4fc856 | Implement time mixing, fix matrix shape mismatch | 2023-03-30 20:29:41 +04:00 |
| saharNooby | 873cb954d0 | Make ln0 work correctly | 2023-03-30 20:01:26 +04:00 |
| saharNooby | 2f51451561 | Initial commit | 2023-03-30 17:55:30 +04:00 |
| slaren | ed3c680bcd | Fix GGML_F32Cx8_STORE in AVX without F16C path (#619) | 2023-03-30 11:16:30 +02:00 |
| anzz1 | 9cbc404ba6 | ci : re-enable AVX512 testing (Windows-MSVC) (#584); CI: Re-enable AVX512 testing (Windows-MSVC). Now with 100% less base64 encoding; plain __cpuid is enough here | 2023-03-29 23:44:39 +03:00 |
| Georgi Gerganov | b51c717d5c | ggml : init time on first ggml_init() call | 2023-03-29 22:15:34 +03:00 |
| Georgi Gerganov | 0ba76c1e73 | llama : fix compile warnings when reading the vocab | 2023-03-29 22:13:12 +03:00 |
| Georgi Gerganov | cea1c85948 | ggml : add ARM_NEON dequantize_row_q4_1() | 2023-03-29 22:10:01 +03:00 |
| Georgi Gerganov | f202ada131 | ggml : add ARM_NEON quantize_row_q4_1() | 2023-03-29 22:03:07 +03:00 |
| Georgi Gerganov | 3b44d30d9b | ggml : add ARM_NEON ggml_vec_dot_q4_1() | 2023-03-29 22:03:07 +03:00 |