* Use types from typing for better compatibility with older Python versions
* Split the last double end-of-line token, as per BlinkDL's suggestion
* Fix MSVC warnings
* Drop Q4_2 support
* Update ggml
* Bump file format version for quantization changes
* Apply suggestions
* Update ggml
* Pack only rwkv.dll for Windows releases
Test executables are no longer packed.
* Move test code into a separate file
* Remove redundant zeroing
* Refactor chat script
* Remove Q4_3 support
* Add Q5_0, Q5_1, Q8_0 support
* Add a clearer message when loading a Q4_3 model
* Remove Q4_1_O format
* Fix indentation in .gitmodules
* Simplify sanitizer matrix