Commit Graph

  • 0fcb7c64c6
    Remove reference implementation code and test against pre-created logits saharNooby 2023-04-01 11:09:24 +0400
  • bf88e8a246
    Update README.md saharNooby 2023-04-01 10:12:10 +0400
  • 6fe9486cee
    Finally, FP32 inference saharNooby 2023-04-01 10:06:39 +0400
  • 61c6b1a4e0
    Add comparison against reference implementation script, implement state & logits saving saharNooby 2023-03-31 20:23:42 +0400
  • d00f28581a
    Add reference implementation of RWKV RNN saharNooby 2023-03-31 19:57:16 +0400
  • 02c9946b57
    Update README.md saharNooby 2023-03-31 19:06:31 +0400
  • 01d667f066
    Implement exp, max, 1_minus_x, sigmoid operators in ggml saharNooby 2023-03-31 19:04:35 +0400
  • fe272dc3d3
    Minor changes saharNooby 2023-03-31 10:24:12 +0400
  • 93c8dcae75
    Update README.md saharNooby 2023-03-30 20:37:09 +0400
  • 56bf4fc856
    Implement time mixing, fix matrix shape mismatch saharNooby 2023-03-30 20:29:41 +0400
  • 873cb954d0
    Make ln0 work correctly saharNooby 2023-03-30 20:01:26 +0400
  • 2f51451561
    Initial commit saharNooby 2023-03-30 17:55:30 +0400
  • ed3c680bcd
    Fix GGML_F32Cx8_STORE in AVX without F16C path (#619) slaren 2023-03-30 11:16:30 +0200
  • 9cbc404ba6
    ci : re-enable AVX512 testing (Windows-MSVC) (#584) anzz1 2023-03-29 23:44:39 +0300
  • b51c717d5c
    ggml : init time on first ggml_init() call Georgi Gerganov 2023-03-29 22:15:34 +0300
  • 0ba76c1e73
    llama : fix compile warnings when reading the vocab Georgi Gerganov 2023-03-29 22:13:12 +0300
  • cea1c85948
    ggml : add ARM_NEON dequantize_row_q4_1() Georgi Gerganov 2023-03-29 22:10:01 +0300
  • f202ada131
    ggml : add ARM_NEON quantize_row_q4_1() Georgi Gerganov 2023-03-29 22:03:02 +0300
  • 3b44d30d9b
    ggml : add ARM_NEON ggml_vec_dot_q4_1() Georgi Gerganov 2023-03-29 21:47:33 +0300
  • 61cbfff5c9
    rename convert_ggml_to_pth.py -> convert-ggml-to-pth.py (#600) Pavol Rusnak 2023-03-29 20:09:25 +0200
  • d9ad104440
    Create chat-13B.bat (#592) Thérence 2023-03-29 19:21:09 +0200
  • b467702b87
    readme : fix typos Georgi Gerganov 2023-03-29 19:38:31 +0300
  • 516d88e75c
    readme : add GPT4All instructions (close #588) Georgi Gerganov 2023-03-29 19:37:20 +0300
  • 53635c081c
    py : add GPT4All conversion script Georgi Gerganov 2023-03-29 19:29:26 +0300
  • 41318d708e
    llama : use the same threshold for OpenBLAS and ggml thread limiting (#577) Maël Kerbiriou 2023-03-29 18:10:07 +0200
  • a6956b25a1
    add example of re-act pattern (#583) Tobias Lütke 2023-03-29 17:10:24 +0200
  • 83df5639eb
    Fix GCC warning about binary literal (#595) anzz1 2023-03-29 16:20:07 +0300
  • a5c42c4b13
    Fix typo in llama.h (#593) anzz1 2023-03-29 16:19:29 +0300
  • 5a5f8b1501
    Enable Fused-Multiply-Add (FMA) and F16C/CVT16 vector extensions on MSVC (#375) anzz1 2023-03-28 22:44:29 +0300
  • f1217055ea
    CI: fix subdirectory path globbing (#546) anzz1 2023-03-28 22:43:25 +0300
  • 7f4c5c6651
    llama : fix linkage with mingw (#551) anzz1 2023-03-28 21:23:09 +0300
  • 2a98bc18ea
    ggml : add AVX2 implementation of quantize_row_q4_1 (#515) slaren 2023-03-28 20:06:03 +0200
  • d0aaff571c
    py : add temporary script to convert old ggml files to newer version (#539) thement 2023-03-28 19:55:42 +0200
  • d0330fd783
    py : add capabiliy to convert from ggml back to torch or hf format for further consumption/training/finetuning (#403) Tai Duc Nguyen 2023-03-28 13:51:29 -0400
  • 99c5b27654
    ggml : refactor quantized processing functions (#509) Stephan Walter 2023-03-28 17:13:01 +0000
  • 692ce3164e
    py : removed unused `model` variable and verified that the code functions correctly with `vocab_only` setting. Also confirmed that the code works as expected after running with reduced memory usage due to deletion of no-longer-needed variable. (#547) DooWoong Lee (David) 2023-03-29 02:02:34 +0900
  • 96f9c0506f
    ci : make ctest verbose, hopefully we see what is wrong with the sanitizer Georgi Gerganov 2023-03-28 20:01:09 +0300
  • d502bc7c9d
    tests : free llama context at the end of the test Georgi Gerganov 2023-03-28 19:51:55 +0300
  • 436e561931
    all : be more strict about converting float to double (#458) Stephan Walter 2023-03-28 16:48:20 +0000
  • 20e1e84884
    deploy : add a Package.swift for SwiftPM support (#393) Jed Fox 2023-03-28 11:39:01 -0500
  • c1f885067c
    ggml : introduce structs for the q4 data blocks (#356) Stephan Walter 2023-03-28 15:56:03 +0000
  • e0670260fb
    gitignore : add "embedding" Georgi Gerganov 2023-03-28 18:34:35 +0300
  • 28ba975aea
    Check the existence of f16_model_path_base in quantize.py (#574) dotpy314 2023-03-28 23:06:28 +0800
  • a6bdc47cba
    Fix usage of F16C intrinsics in AVX code (#563) slaren 2023-03-28 16:26:55 +0200
  • 7b8dbcb78b
    main.cpp fixes, refactoring (#571) anzz1 2023-03-28 17:09:55 +0300
  • 4b8efff0e3
    Add embedding example to Makefile (#540) RJ Adriaansen 2023-03-28 08:11:09 +0200
  • 7e5395575a
    Fix missing ggml link in cmake for examples/* on w64-mingw32 (#542) Marco Matthies 2023-03-27 06:55:26 +0200
  • 34c1072e49
    ci: add debug build to sanitizer build matrix (#527) Erik Scholz 2023-03-26 17:48:40 +0200
  • 939ad2d3a5
    Fix undefined variables in debug build, remove unused variables (#531) Stephan Walter 2023-03-26 15:34:02 +0000
  • 8c2ec5e21d
    Add support for linux/arm64 platform during Docker Builds (#514) Juan Calderon-Perez 2023-03-26 10:48:42 -0400
  • b391579db9
    Update README and comments for standalone perplexity tool (#525) Stephan Walter 2023-03-26 13:14:01 +0000
  • 7a87d31f4f
    [main] fix infinite generation (-n == -1) (#523) anzz1 2023-03-26 16:06:10 +0300
  • 348d6926ee
    Add logo to README.md Georgi Gerganov 2023-03-26 10:20:49 +0300
  • 33e35b8fe8
    Exit from interactive mode if input stream is bad (#491) Harald Fernengel 2023-03-26 07:25:46 +0200
  • 19726169b3
    CI: Run other sanitizer builds even if one fails (#511) anzz1 2023-03-26 00:13:28 +0200
  • f732695cd5
    Clarify console output in convert-pth-to-ggml.py (#512) jp-x-g 2023-03-25 14:53:55 -0700
  • 2f7bf7dd7c
    CMake / CI additions (#497) anzz1 2023-03-25 23:38:11 +0200
  • 34ab526843
    (Windows) Set console to UTF-8 on init (#420) anzz1 2023-03-25 22:29:22 +0200
  • c2b25b6912
    Fix colors enabling on WIN32 Georgi Gerganov 2023-03-25 21:53:39 +0200
  • 79b2b266db
    If n_predict == -1, generate forever Georgi Gerganov 2023-03-25 21:51:41 +0200
  • e2d490dafd
    Inifinite generation via context swapping (#71) Georgi Gerganov 2023-03-25 21:36:22 +0200
  • 03f7e33560
    Cleanup STL headers + fix embedding examples + minor stuff Georgi Gerganov 2023-03-25 20:51:14 +0200
  • 55ad42af84
    Move chat scripts into "./examples" Georgi Gerganov 2023-03-25 20:36:52 +0200
  • 459e93cce0
    Add AVX2 implementation of dequantize_row_q4_1 (#505) slaren 2023-03-25 19:31:48 +0100
  • a316a425d0
    Overhaul the examples structure Georgi Gerganov 2023-03-25 20:26:40 +0200
  • ecbe466a36
    Retire the ggml_mul_mat() branch for transposed src0 (#500) Georgi Gerganov 2023-03-25 19:47:21 +0200
  • 502a400192
    Disable prompt verbosity by default and add option to enable (#480) Georgi Gerganov 2023-03-25 17:16:50 +0200
  • 09aecbf628
    Add AVX2 implementation of dequantize_row_q4_0 (#467) slaren 2023-03-25 16:06:49 +0100
  • 4640eff23d
    Don't interefe with BLAS for large prompts by running only 1 thread Georgi Gerganov 2023-03-25 17:03:10 +0200
  • ab77d76312
    Add longer DAN prompt for testing big batch numbers Georgi Gerganov 2023-03-25 16:47:59 +0200
  • 29b7baab67
    Add timings for the prompt evaluation (#478) slaren 2023-03-25 15:34:23 +0100
  • 4a7129acd2
    Remove obsolete information from README Georgi Gerganov 2023-03-25 16:30:32 +0200
  • 6b6dbc8910
    Remove obsolete assert and fix compiler warning Georgi Gerganov 2023-03-25 16:22:05 +0200
  • 2a2e63ce05
    Fix nasty bug in ggml_compute_forward_mul_mat_f32() and reenable BLAS Georgi Gerganov 2023-03-25 16:09:54 +0200
  • e899bf54b2
    bounds checking for input prefix (#492) anzz1 2023-03-25 14:42:09 +0200
  • fbd4d38c64
    feat: '--in-prefix STRING' option (#426) anzz1 2023-03-25 14:03:19 +0200
  • 58e6c9f36f
    Add support for file load progress reporting callbacks (#434) Jed Fox 2023-03-25 01:26:28 -0400
  • 36d07532ef
    Add missing struct annotation (#483) Doomsdayrs 2023-03-25 01:21:24 -0400
  • 6f1ee4b640
    Fix crash for 65B model with pre-allocated memory (#485) Chris Kuehl 2023-03-24 23:38:14 -0500
  • 8520fc310e
    Disable BLAS altogether - the bug is not just for qunatized mat mul Georgi Gerganov 2023-03-24 23:47:06 +0200
  • b3f460e941
    Disable BLAS branch in mul_mat - seems there is a bug Georgi Gerganov 2023-03-24 23:39:17 +0200
  • 04c6f5ed6f
    Immediately start processing the prompt before user input has been provided (#476) Georgi Gerganov 2023-03-24 23:17:58 +0200
  • 7a9b6c3a8b
    Reduce memory usage and allocate enough memory for largest context (#473) Georgi Gerganov 2023-03-24 23:17:37 +0200
  • 31572d9665
    Temporary bump the memory buffer size - hopefully fix issues from 483bab2e Georgi Gerganov 2023-03-24 18:23:56 +0200
  • f4f5362edb
    Update README.md (#444) Gary Mulder 2023-03-24 15:23:09 +0000
  • 863f65e2e3
    fix instruct mode (#445) rabidcopy 2023-03-24 10:22:39 -0500
  • afd220d9c6
    Properly free llama_context on failure Georgi Gerganov 2023-03-24 17:21:01 +0200
  • 481044d50c
    additional optimizations for POWER9 (#454) Cameron Kaiser 2023-03-24 08:19:26 -0700
  • 563cdc391d
    Support calling mlock() on loaded model data on Linux and macOS (#453) comex 2023-03-24 08:19:05 -0700
  • 8d4a855c24
    Add embedding mode with arg flag. Currently working (#282) Luciano 2023-03-24 08:05:13 -0700
  • b6b268d441
    Add link to Roadmap discussion Georgi Gerganov 2023-03-24 09:13:35 +0200
  • 3cd8dde0d1
    Revert "Fix memory allocation issues and seg faults" Georgi Gerganov 2023-03-24 06:22:28 +0200
  • 4870e455b3
    Fix memory allocation issues and seg faults Georgi Gerganov 2023-03-24 00:11:53 +0200
  • 483bab2e3d
    Avoid the transposed X branch in the Z = X * Y matrix multiplication (#439) Georgi Gerganov 2023-03-23 23:22:01 +0200
  • 404e1da38e
    Fix quantize script not finding models in parent directory (#428) Jed Fox 2023-03-23 16:42:52 -0400
  • 4cc053b6d5
    Remove oboslete command from Docker script Georgi Gerganov 2023-03-23 22:39:44 +0200
  • 0ba5a3a9a5
    Obsolete Georgi Gerganov 2023-03-23 22:32:02 +0200
  • 2e17dfd80a
    Replace EOS with newline to prevent context/memory being flushed by EOS in interactive mode (#333) rabidcopy 2023-03-23 15:22:47 -0500
  • 20a1a4e09c
    Fix GPTQ converter (#423) Timmy Knight 2023-03-23 10:18:13 -1000
  • ad072fc5ad
    Generate library with CMake (#430) nusu-github 2023-03-24 05:16:48 +0900