Commit Graph

104 Commits

Author SHA1 Message Date
Alex 06dac0f80d Use main ggml repo (#45) 2023-04-29 21:35:36 +05:00
Alex 1198892888 Add support for Q5_0, Q5_1 and Q8_0 formats; remove Q4_1_O format (#44) 2023-04-29 17:39:11 +05:00
* Remove Q4_3 support

* Add Q5_0, Q5_1, Q8_0 support

* Add more clear message when loading Q4_3 model

* Remove Q4_1_O format

* Fix indentation in .gitmodules

* Simplify sanitizer matrix
Alex 3587ff9e58 Sync ggml with upstream (#38) 2023-04-22 20:25:29 +05:00
* Sync ggml with upstream

* Remove file filters from Actions triggers

* Update ggml

* Add Q4_2 and Q4_3 support

* Improve output of perplexity measuring script

* Add tests for new formats

* Add token limit argument to perplexity measuring script

* Update README

* Update README

* Update ggml

* Use master branch of ggml
Alex 1be9fda248 Add robust automatic testing (#33) 2023-04-20 11:00:35 +05:00
saharNooby 7b28076243 Fix Q4_1_O optimization 2023-04-18 16:46:27 +04:00
saharNooby 2ef7ee0fac Optimize Q4_1_O by moving outlier multiplication out of the dequantize+dot loop 2023-04-18 09:47:20 +04:00
saharNooby 82e2faa190 Update data type info 2023-04-17 19:17:47 +04:00
saharNooby 05825d2370 Fix GitHub Actions 2023-04-17 19:04:55 +04:00
saharNooby 678f5233a5 Add LoRA loading support 2023-04-15 20:46:30 +04:00
Daniel Breedeveld 70f7eece06 fix: Mention of incorrect filename for MacOS cmake build artifact 2023-04-10 02:01:28 +08:00
Executing the cmake build produces "librwkv.dylib" on MacOS (tested on Ventura 13.3.1)
saharNooby 7437e1d860 Clarify that we now have binaries for Linux/MacOS 2023-04-08 19:39:31 +04:00
saharNooby 874826cb20 Update README.md 2023-04-08 10:45:42 +04:00
saharNooby edd57a186c Update README.md 2023-04-07 10:16:12 +04:00
saharNooby d12088e164 Minor formatting changes 2023-04-05 15:31:23 +04:00
hypnopump f5feb7470b verify instructions can be followed 2023-04-04 17:45:06 +02:00
hypnopump b75a805563 working on macos. no point in fp32 if all weights distributed in fp16 2023-04-04 17:39:21 +02:00
saharNooby aacc8b6872 Minor formatting changes 2023-04-03 10:39:28 +04:00
hypnopump fa74b016c6 more details for macos/linux 2023-04-03 08:33:57 +02:00
Eric Alcaide bea02c4b4c Merge branch 'master' into more_instructions_works_linux 2023-04-03 08:29:55 +02:00
hypnopump 6f3fb01913 suggestions 2023-04-03 08:25:54 +02:00
saharNooby 3535476987 Update README.md: include info about pre-compiled library 2023-04-03 09:48:53 +04:00
hypnopump a64aaa81ec initial addition 2023-04-03 00:52:26 +02:00
saharNooby 6b4ebc328a Update README.md 2023-04-02 15:28:34 +04:00
saharNooby e0684e8104 Add text generation and chat scripts 2023-04-02 15:03:31 +04:00
saharNooby 1ecbad3a65 Remove unused files 2023-04-02 12:53:41 +04:00
saharNooby 935d16f5db Move library wrapper to separate file, refactor code 2023-04-02 12:24:40 +04:00
saharNooby 38f9d02d52 Fix quantization from FP16 2023-04-01 20:01:06 +04:00
saharNooby a1e1d34c93 Add Python wrapper for C library 2023-04-01 16:02:22 +04:00
saharNooby f6d45baec0 Support FP16 inference 2023-04-01 11:53:49 +04:00
saharNooby 0fcb7c64c6 Remove reference implementation code and test against pre-created logits 2023-04-01 11:09:24 +04:00
saharNooby bf88e8a246 Update README.md 2023-04-01 10:12:10 +04:00
saharNooby 61c6b1a4e0 Add comparison against reference implementation script, implement state & logits saving 2023-03-31 20:23:42 +04:00
saharNooby 02c9946b57 Update README.md 2023-03-31 19:06:31 +04:00
saharNooby fe272dc3d3 Minor changes 2023-03-31 10:24:12 +04:00
saharNooby 93c8dcae75 Update README.md 2023-03-30 20:37:09 +04:00
saharNooby 2f51451561 Initial commit 2023-03-30 17:55:30 +04:00
Georgi Gerganov b467702b87 readme : fix typos 2023-03-29 19:38:31 +03:00
Georgi Gerganov 516d88e75c readme : add GPT4All instructions (close #588) 2023-03-29 19:37:20 +03:00
Stephan Walter b391579db9 Update README and comments for standalone perplexity tool (#525) 2023-03-26 16:14:01 +03:00
Georgi Gerganov 348d6926ee Add logo to README.md 2023-03-26 10:20:49 +03:00
Georgi Gerganov 55ad42af84 Move chat scripts into "./examples" 2023-03-25 20:37:09 +02:00
Georgi Gerganov 4a7129acd2 Remove obsolete information from README 2023-03-25 16:30:32 +02:00
Gary Mulder f4f5362edb Update README.md (#444) 2023-03-24 15:23:09 +00:00
Added explicit **bolded** instructions clarifying that people need to request access to models from Facebook and never through this repo.
Georgi Gerganov b6b268d441 Add link to Roadmap discussion 2023-03-24 09:13:35 +02:00
Stephan Walter a50e39c6fe Revert "Delete SHA256SUMS for now" (#429) 2023-03-23 15:15:48 +01:00
* Revert "Delete SHA256SUMS for now (#416)"

This reverts commit 8eea5ae0e5.

* Remove ggml files until they can be verified
* Remove alpaca json
* Add also model/tokenizer.model to SHA256SUMS + update README

---------

Co-authored-by: Pavol Rusnak <pavol@rusnak.io>
Gary Mulder 8a3e5ef801 Move model section from issue template to README.md (#421) 2023-03-23 11:30:40 +00:00
* Update custom.md

* Removed Model section as it is better placed in README.md

* Updates to README.md model section

* Inserted text that was removed from issue template about obtaining models from FB and links to papers describing the various models

* Removed IPF down links for the Alpaca 7B models as these look to be in the old data format and probably shouldn't be directly linked to, anyway

* Updated the perplexity section to point at Perplexity scores #406 discussion
Georgi Gerganov 93208cfb92 Adjust repetition penalty .. 2023-03-23 10:46:58 +02:00
Georgi Gerganov 03ace14cfd Add link to recent podcast about whisper.cpp and llama.cpp 2023-03-23 09:48:51 +02:00
Gary Linscott 40ea807a97 Add details on perplexity to README.md (#395) 2023-03-22 08:53:54 -07:00
Georgi Gerganov 56817b1f88 Remove temporary notice and update hot topics 2023-03-22 07:34:02 +02:00