Alex
06dac0f80d
Use main ggml repo ( #45 )
2023-04-29 21:35:36 +05:00
Alex
1198892888
Add support for Q5_0, Q5_1 and Q8_0 formats; remove Q4_1_O format ( #44 )
* Remove Q4_3 support
* Add Q5_0, Q5_1, Q8_0 support
* Add more clear message when loading Q4_3 model
* Remove Q4_1_O format
* Fix indentation in .gitmodules
* Simplify sanitizer matrix
2023-04-29 17:39:11 +05:00
Alex
3587ff9e58
Sync ggml with upstream ( #38 )
* Sync ggml with upstream
* Remove file filters from Actions triggers
* Update ggml
* Add Q4_2 and Q4_3 support
* Improve output of perplexity measuring script
* Add tests for new formats
* Add token limit argument to perplexity measuring script
* Update README
* Update README
* Update ggml
* Use master branch of ggml
2023-04-22 20:25:29 +05:00
Alex
1be9fda248
Add robust automatic testing ( #33 )
2023-04-20 11:00:35 +05:00
saharNooby
7b28076243
Fix Q4_1_O optimization
2023-04-18 16:46:27 +04:00
saharNooby
2ef7ee0fac
Optimize Q4_1_O by moving outlier multiplication out of the dequantize+dot loop
2023-04-18 09:47:20 +04:00
saharNooby
82e2faa190
Update data type info
2023-04-17 19:17:47 +04:00
saharNooby
05825d2370
Fix GitHub Actions
2023-04-17 19:04:55 +04:00
saharNooby
678f5233a5
Add LoRA loading support
2023-04-15 20:46:30 +04:00
Daniel Breedeveld
70f7eece06
fix: Mention of incorrect filename for MacOS cmake build artifact
Executing the cmake build produces "librwkv.dylib" on MacOS (tested on Ventura 13.3.1)
2023-04-10 02:01:28 +08:00
saharNooby
7437e1d860
Clarify that we now have binaries for Linux/MacOS
2023-04-08 19:39:31 +04:00
saharNooby
874826cb20
Update README.md
2023-04-08 10:45:42 +04:00
saharNooby
edd57a186c
Update README.md
2023-04-07 10:16:12 +04:00
saharNooby
d12088e164
Minor formatting changes
2023-04-05 15:31:23 +04:00
hypnopump
f5feb7470b
verify instructions can be followed
2023-04-04 17:45:06 +02:00
hypnopump
b75a805563
working on macos. no point in fp32 if all weights distributed in fp16
2023-04-04 17:39:21 +02:00
saharNooby
aacc8b6872
Minor formatting changes
2023-04-03 10:39:28 +04:00
hypnopump
fa74b016c6
more details for macos/linux
2023-04-03 08:33:57 +02:00
Eric Alcaide
bea02c4b4c
Merge branch 'master' into more_instructions_works_linux
2023-04-03 08:29:55 +02:00
hypnopump
6f3fb01913
suggestions
2023-04-03 08:25:54 +02:00
saharNooby
3535476987
Update README.md: include info about pre-compiled library
2023-04-03 09:48:53 +04:00
hypnopump
a64aaa81ec
initial addition
2023-04-03 00:52:26 +02:00
saharNooby
6b4ebc328a
Update README.md
2023-04-02 15:28:34 +04:00
saharNooby
e0684e8104
Add text generation and chat scripts
2023-04-02 15:03:31 +04:00
saharNooby
1ecbad3a65
Remove unused files
2023-04-02 12:53:41 +04:00
saharNooby
935d16f5db
Move library wrapper to separate file, refactor code
2023-04-02 12:24:40 +04:00
saharNooby
38f9d02d52
Fix quantization from FP16
2023-04-01 20:01:06 +04:00
saharNooby
a1e1d34c93
Add Python wrapper for C library
2023-04-01 16:02:22 +04:00
saharNooby
f6d45baec0
Support FP16 inference
2023-04-01 11:53:49 +04:00
saharNooby
0fcb7c64c6
Remove reference implementation code and test against pre-created logits
2023-04-01 11:09:24 +04:00
saharNooby
bf88e8a246
Update README.md
2023-04-01 10:12:10 +04:00
saharNooby
61c6b1a4e0
Add comparison against reference implementation script, implement state & logits saving
2023-03-31 20:23:42 +04:00
saharNooby
02c9946b57
Update README.md
2023-03-31 19:06:31 +04:00
saharNooby
fe272dc3d3
Minor changes
2023-03-31 10:24:12 +04:00
saharNooby
93c8dcae75
Update README.md
2023-03-30 20:37:09 +04:00
saharNooby
2f51451561
Initial commit
2023-03-30 17:55:30 +04:00
Georgi Gerganov
b467702b87
readme : fix typos
2023-03-29 19:38:31 +03:00
Georgi Gerganov
516d88e75c
readme : add GPT4All instructions ( close #588 )
2023-03-29 19:37:20 +03:00
Stephan Walter
b391579db9
Update README and comments for standalone perplexity tool ( #525 )
2023-03-26 16:14:01 +03:00
Georgi Gerganov
348d6926ee
Add logo to README.md
2023-03-26 10:20:49 +03:00
Georgi Gerganov
55ad42af84
Move chat scripts into "./examples"
2023-03-25 20:37:09 +02:00
Georgi Gerganov
4a7129acd2
Remove obsolete information from README
2023-03-25 16:30:32 +02:00
Gary Mulder
f4f5362edb
Update README.md ( #444 )
Added explicit **bolded** instructions clarifying that people need to request access to models from Facebook and never through this repo.
2023-03-24 15:23:09 +00:00
Georgi Gerganov
b6b268d441
Add link to Roadmap discussion
2023-03-24 09:13:35 +02:00
Stephan Walter
a50e39c6fe
Revert "Delete SHA256SUMS for now" ( #429 )
* Revert "Delete SHA256SUMS for now (#416 )"
This reverts commit 8eea5ae0e5.
* Remove ggml files until they can be verified
* Remove alpaca json
* Add also model/tokenizer.model to SHA256SUMS + update README
---------
Co-authored-by: Pavol Rusnak <pavol@rusnak.io>
2023-03-23 15:15:48 +01:00
Gary Mulder
8a3e5ef801
Move model section from issue template to README.md ( #421 )
* Update custom.md
* Removed Model section as it is better placed in README.md
* Updates to README.md model section
* Inserted text that was removed from issue template about obtaining models from FB and links to papers describing the various models
* Removed IPF down links for the Alpaca 7B models as these look to be in the old data format and probably shouldn't be directly linked to, anyway
* Updated the perplexity section to point at Perplexity scores #406 discussion
2023-03-23 11:30:40 +00:00
Georgi Gerganov
93208cfb92
Adjust repetition penalty ..
2023-03-23 10:46:58 +02:00
Georgi Gerganov
03ace14cfd
Add link to recent podcast about whisper.cpp and llama.cpp
2023-03-23 09:48:51 +02:00
Gary Linscott
40ea807a97
Add details on perplexity to README.md ( #395 )
2023-03-22 08:53:54 -07:00
Georgi Gerganov
56817b1f88
Remove temporary notice and update hot topics
2023-03-22 07:34:02 +02:00