saharNooby
aacc8b6872
Minor formatting changes
2023-04-03 10:39:28 +04:00
Alexander
4f1df7c89e
Merge pull request #9 from hypnopump/more_instructions_works_linux
Adds instructions and works on Linux as well
2023-04-03 11:35:38 +05:00
hypnopump
fa74b016c6
more details for macOS/Linux
2023-04-03 08:33:57 +02:00
Eric Alcaide
bea02c4b4c
Merge branch 'master' into more_instructions_works_linux
2023-04-03 08:29:55 +02:00
hypnopump
0a0cabc4c7
for consistency
2023-04-03 08:27:00 +02:00
hypnopump
6f3fb01913
suggestions
2023-04-03 08:25:54 +02:00
saharNooby
3535476987
Update README.md: include info about pre-compiled library
2023-04-03 09:48:53 +04:00
saharNooby
5b2830ed30
Increase memory for overhead from 32 MB to 256 MB
2023-04-03 09:32:58 +04:00
hypnopump
a64aaa81ec
initial addition
2023-04-03 00:52:26 +02:00
saharNooby
d62a050144
Remove hardcoded memory requirements table
2023-04-02 18:37:45 +04:00
saharNooby
1262ad0456
Fix build errors and warnings
2023-04-02 17:23:39 +04:00
saharNooby
f2b1dad22b
Add GitHub workflows file
2023-04-02 16:56:04 +04:00
saharNooby
6b4ebc328a
Update README.md
2023-04-02 15:28:34 +04:00
saharNooby
e0684e8104
Add text generation and chat scripts
2023-04-02 15:03:31 +04:00
saharNooby
ee46ad208e
Add quantization test back, run ggml tests on first context init
2023-04-02 13:05:17 +04:00
saharNooby
1ecbad3a65
Remove unused files
2023-04-02 12:53:41 +04:00
saharNooby
935d16f5db
Move library wrapper to separate file, refactor code
2023-04-02 12:24:40 +04:00
saharNooby
38f9d02d52
Fix quantization from FP16
2023-04-01 20:01:06 +04:00
saharNooby
972e28d48d
Implement INT4 conversion and inference
2023-04-01 19:22:01 +04:00
saharNooby
b164bf4e27
Allocate memory as needed for specific configuration of model
2023-04-01 17:15:23 +04:00
saharNooby
a1e1d34c93
Add Python wrapper for C library
2023-04-01 16:02:22 +04:00
saharNooby
7130a89d1f
[FILE FORMAT CHANGED] Reverse dimensions in ggml file (makes it more similar to llama.cpp format)
2023-04-01 14:41:30 +04:00
saharNooby
ac03019fcf
Move model to separate C library file
2023-04-01 14:38:50 +04:00
saharNooby
f6d45baec0
Support FP16 inference
2023-04-01 11:53:49 +04:00
saharNooby
fe98c94a63
[FILE FORMAT CHANGED] Use ggml_get_rows to get embedding
2023-04-01 11:28:32 +04:00
saharNooby
16ec7a5c18
Add fail-fast version of the test
2023-04-01 11:15:15 +04:00
saharNooby
0fcb7c64c6
Remove reference implementation code and test against pre-created logits
2023-04-01 11:09:24 +04:00
saharNooby
bf88e8a246
Update README.md
2023-04-01 10:12:10 +04:00
saharNooby
6fe9486cee
Finally, FP32 inference
2023-04-01 10:06:39 +04:00
saharNooby
61c6b1a4e0
Add comparison against reference implementation script, implement state & logits saving
2023-03-31 20:23:42 +04:00
saharNooby
d00f28581a
Add reference implementation of RWKV RNN
2023-03-31 19:57:16 +04:00
saharNooby
02c9946b57
Update README.md
2023-03-31 19:06:31 +04:00
saharNooby
01d667f066
Implement exp, max, 1_minus_x, sigmoid operators in ggml
2023-03-31 19:04:35 +04:00
saharNooby
fe272dc3d3
Minor changes
2023-03-31 10:24:12 +04:00
saharNooby
93c8dcae75
Update README.md
2023-03-30 20:37:09 +04:00
saharNooby
56bf4fc856
Implement time mixing, fix matrix shape mismatch
2023-03-30 20:29:41 +04:00
saharNooby
873cb954d0
Make ln0 work correctly
2023-03-30 20:01:26 +04:00
saharNooby
2f51451561
Initial commit
2023-03-30 17:55:30 +04:00
slaren
ed3c680bcd
Fix GGML_F32Cx8_STORE in AVX without F16C path (#619)
2023-03-30 11:16:30 +02:00
anzz1
9cbc404ba6
ci : re-enable AVX512 testing (Windows-MSVC) (#584)
* CI: Re-enable AVX512 testing (Windows-MSVC)
Now with 100% less base64 encoding
* plain __cpuid is enough here
2023-03-29 23:44:39 +03:00
Georgi Gerganov
b51c717d5c
ggml : init time on first ggml_init() call
2023-03-29 22:15:34 +03:00
Georgi Gerganov
0ba76c1e73
llama : fix compile warnings when reading the vocab
2023-03-29 22:13:12 +03:00
Georgi Gerganov
cea1c85948
ggml : add ARM_NEON dequantize_row_q4_1()
2023-03-29 22:10:01 +03:00
Georgi Gerganov
f202ada131
ggml : add ARM_NEON quantize_row_q4_1()
2023-03-29 22:03:07 +03:00
Georgi Gerganov
3b44d30d9b
ggml : add ARM_NEON ggml_vec_dot_q4_1()
2023-03-29 22:03:07 +03:00
Pavol Rusnak
61cbfff5c9
rename convert_ggml_to_pth.py -> convert-ggml-to-pth.py (#600)
to match filenames of other converters
2023-03-29 20:09:25 +02:00
Thérence
d9ad104440
Create chat-13B.bat (#592)
* Create chat-13B.bat
Same script as chat-13B.sh, but for Windows users.
Tested and working on Windows 10/11 22H2
* Apply suggestions from code review
---------
Co-authored-by: anzz1 <anzz1@live.com>
2023-03-29 20:21:09 +03:00
Georgi Gerganov
b467702b87
readme : fix typos
2023-03-29 19:38:31 +03:00
Georgi Gerganov
516d88e75c
readme : add GPT4All instructions (close #588)
2023-03-29 19:37:20 +03:00
Georgi Gerganov
53635c081c
py : add GPT4All conversion script
For now: copy-paste
Deduplicating the Python code would take too much time
2023-03-29 19:29:52 +03:00