Commit Graph

  • 10c5ba0548 pass raw tokens back in session context Alex Rozanski 2023-04-02 14:20:56 +0100
  • 3d02e8eec3 fix get context implementation Alex Rozanski 2023-04-02 13:49:15 +0100
  • c554f0eff4 update concurrency model in LlamaSession Alex Rozanski 2023-04-02 13:47:01 +0100
  • a256928a47 support getting current context Alex Rozanski 2023-04-02 12:31:51 +0100
  • d8d4e865cd Add a missing step to the gpt4all instructions (#690) Thatcher Chamberlin 2023-04-02 06:48:57 -0400
  • e986f94829 Added api for getting/setting the kv_cache (#685) Christian Falch 2023-04-02 12:23:04 +0200
  • c0bb1d3ce2 ggml : change ne to int64_t (#626) Marian Cepok 2023-04-02 12:21:31 +0200
  • 6e7801d08d examples : add gpt4all script (#658) Leonardo Neumann 2023-04-02 04:56:20 -0300
  • 81040f10aa llama : do not allocate KV cache for "vocab_only == true" (#682) Stephan Walter 2023-04-02 07:18:53 +0000
  • c4f89d8d73 make : use -march=native -mtune=native on x86 (#609) Fabian 2023-04-02 09:17:05 +0200
  • 5b70e7de4c fix default params for examples/main (#697) Murilo Santana 2023-04-01 23:41:12 -0300
  • 445c9764fb don't output initial prompts Alex Rozanski 2023-04-01 21:13:22 +0100
  • d02fa44c79 fix reverse prompt Alex Rozanski 2023-04-01 21:06:21 +0100
  • 1f9758aac9 update params for Alpaca Alex Rozanski 2023-04-01 19:53:49 +0100
  • 7b065d4d35 remove newlines from Alpaca prompt Alex Rozanski 2023-04-01 19:33:34 +0100
  • 4cb433f01c match implementation of operations to llama.cpp Alex Rozanski 2023-04-01 19:10:32 +0100
  • c5a9f91628 match prompt injection to llama.cpp and use prefix/suffixes Alex Rozanski 2023-04-01 18:52:41 +0100
  • 7d8a713007 actually pass LlamaContext back from LlamaSetupOperation Alex Rozanski 2023-04-01 18:37:58 +0100
  • b20ada19ca move initialization into LlamaSetupOperation Alex Rozanski 2023-04-01 18:29:48 +0100
  • 6bd636ef05 add prompt prefix/suffixes Alex Rozanski 2023-04-01 17:54:00 +0100
  • a717cba844 py: huggingface -> Hugging Face (#686) Ikko Eltociear Ashimine 2023-04-02 01:38:18 +0900
  • d0a7f742e7 readme: replace termux links with homepage, play store is deprecated (#680) rimoliga 2023-04-01 11:57:30 -0300
  • 0d054e292e Show error message when -f fails Slaren 2023-03-31 20:03:48 +0200
  • 921467c9fa fix import name casing Alex Rozanski 2023-03-31 23:56:21 +0100
  • 1cbd4afd71 add prediction state handler Alex Rozanski 2023-03-31 23:55:31 +0100
  • dc285e41f8 add cancellation support Alex Rozanski 2023-03-31 21:58:37 +0100
  • 3525899277 Enable -std= for cmake builds, fix warnings (#598) Stephan Walter 2023-03-31 19:19:16 +0000
  • 1d08882afa Optimize AVX2 ggml_vec_dot_q4_0 (#642) slaren 2023-03-31 17:55:52 +0200
  • 02c5b27e91 Add AVX acceleration (#617) perserk 2023-03-31 16:55:44 +0500
  • cbef542879 py : cleanup the code Pavol Rusnak 2023-03-29 21:31:24 +0200
  • 9733104be5 drop quantize.py (now that models are using a single file) Pavol Rusnak 2023-03-31 00:52:06 +0200
  • 3df890aef4 readme : update supported models Georgi Gerganov 2023-03-30 22:31:54 +0300
  • ee0c40dd6d Introduce GGML migration tool for new file format Justine Tunney 2023-03-30 05:42:56 -0700
  • 6f23ba5ee2 Ensure --mlock works properly with mmap() support Justine Tunney 2023-03-30 01:53:36 -0700
  • 78ca9838ee Make loading weights 10-100x faster Justine Tunney 2023-03-29 13:51:37 -0700
  • a017390358 Initial windows support (untested) Slaren 2023-03-29 22:22:36 +0200
  • ac184d5147 Always initialize mm_addr and mm_length in llama_model Slaren 2023-03-29 08:53:14 +0200
  • 276e5b7811 Unmap the file in llama_free Slaren 2023-03-29 08:31:26 +0200
  • d68c5dc435 Make mmap_file static Slaren 2023-03-29 06:18:18 +0200
  • 64bde3ffd4 Fix ggml_init_params in quantize Slaren 2023-03-29 05:38:57 +0200
  • c03ae8dca1 Add mmap support for model files Slaren 2023-03-29 02:03:43 +0200
  • 3bcc129ba8 cmake : properly invoke CTest (#629) Stephan Walter 2023-03-30 17:56:59 +0000
  • a4755cf288 Remove unused variable (#607) Casey Primozic 2023-03-30 10:53:35 -0700
  • 1f0414feec make : fix darwin f16c flags check (#615) david raistrick 2023-03-30 13:34:45 -0400
  • 77efdf5a50 ggml : fix NEON signs (close #620, #622) Georgi Gerganov 2023-03-30 20:27:32 +0300
  • ed3c680bcd Fix GGML_F32Cx8_STORE in AVX without F16C path (#619) slaren 2023-03-30 11:16:30 +0200
  • 5dbe5d5e04 update default config to use desirable number of threads Alex Rozanski 2023-03-30 10:07:49 +0100
  • 606d0ebd43 reconnect stateChangeHandler in BridgedSession Alex Rozanski 2023-03-30 09:05:23 +0100
  • dfd2ebeb22 expose predict() from Session protocol Alex Rozanski 2023-03-30 09:03:14 +0100
  • 1595e7bb4c remove dependence on common.h and remove gpt_params Alex Rozanski 2023-03-30 09:01:03 +0100
  • 37c0945d54 rename _LlamaSessionConfig to _LlamaSessionParams Alex Rozanski 2023-03-30 08:19:06 +0100
  • 9cbc404ba6 ci : re-enable AVX512 testing (Windows-MSVC) (#584) anzz1 2023-03-29 23:44:39 +0300
  • b51c717d5c ggml : init time on first ggml_init() call Georgi Gerganov 2023-03-29 22:15:34 +0300
  • 0ba76c1e73 llama : fix compile warnings when reading the vocab Georgi Gerganov 2023-03-29 22:13:12 +0300
  • cea1c85948 ggml : add ARM_NEON dequantize_row_q4_1() Georgi Gerganov 2023-03-29 22:10:01 +0300
  • f202ada131 ggml : add ARM_NEON quantize_row_q4_1() Georgi Gerganov 2023-03-29 22:03:02 +0300
  • 3b44d30d9b ggml : add ARM_NEON ggml_vec_dot_q4_1() Georgi Gerganov 2023-03-29 21:47:33 +0300
  • 61cbfff5c9 rename convert_ggml_to_pth.py -> convert-ggml-to-pth.py (#600) Pavol Rusnak 2023-03-29 20:09:25 +0200
  • d9ad104440 Create chat-13B.bat (#592) Thérence 2023-03-29 19:21:09 +0200
  • b467702b87 readme : fix typos Georgi Gerganov 2023-03-29 19:38:31 +0300
  • 516d88e75c readme : add GPT4All instructions (close #588) Georgi Gerganov 2023-03-29 19:37:20 +0300
  • 53635c081c py : add GPT4All conversion script Georgi Gerganov 2023-03-29 19:29:26 +0300
  • 41318d708e llama : use the same threshold for OpenBLAS and ggml thread limiting (#577) Maël Kerbiriou 2023-03-29 18:10:07 +0200
  • a6956b25a1 add example of re-act pattern (#583) Tobias Lütke 2023-03-29 17:10:24 +0200
  • 2b5c5a9e02 rework Swift API to add explicit Llama/Alpaca support Alex Rozanski 2023-03-29 15:52:39 +0100
  • 83df5639eb Fix GCC warning about binary literal (#595) anzz1 2023-03-29 16:20:07 +0300
  • a5c42c4b13 Fix typo in llama.h (#593) anzz1 2023-03-29 16:19:29 +0300
  • d96f78e5db remove unnecessary code from LlamaPredictOperation Alex Rozanski 2023-03-29 13:26:25 +0100
  • 8043a912a0 remove llamaTest Alex Rozanski 2023-03-29 13:17:57 +0100
  • ed854da89e update LlamaPredictOperation to preserve state across runs Alex Rozanski 2023-03-29 03:57:28 +0100
  • 5a5f8b1501 Enable Fused-Multiply-Add (FMA) and F16C/CVT16 vector extensions on MSVC (#375) anzz1 2023-03-28 22:44:29 +0300
  • f1217055ea CI: fix subdirectory path globbing (#546) anzz1 2023-03-28 22:43:25 +0300
  • 7f4c5c6651 llama : fix linkage with mingw (#551) anzz1 2023-03-28 21:23:09 +0300
  • 2a98bc18ea ggml : add AVX2 implementation of quantize_row_q4_1 (#515) slaren 2023-03-28 20:06:03 +0200
  • d0aaff571c py : add temporary script to convert old ggml files to newer version (#539) thement 2023-03-28 19:55:42 +0200
  • d0330fd783 py : add capability to convert from ggml back to torch or hf format for further consumption/training/finetuning (#403) Tai Duc Nguyen 2023-03-28 13:51:29 -0400
  • 99c5b27654 ggml : refactor quantized processing functions (#509) Stephan Walter 2023-03-28 17:13:01 +0000
  • 692ce3164e py : removed unused `model` variable and verified that the code functions correctly with `vocab_only` setting. Also confirmed that the code works as expected after running with reduced memory usage due to deletion of no-longer-needed variable. (#547) DooWoong Lee (David) 2023-03-29 02:02:34 +0900
  • 96f9c0506f ci : make ctest verbose, hopefully we see what is wrong with the sanitizer Georgi Gerganov 2023-03-28 20:01:09 +0300
  • d502bc7c9d tests : free llama context at the end of the test Georgi Gerganov 2023-03-28 19:51:55 +0300
  • 436e561931 all : be more strict about converting float to double (#458) Stephan Walter 2023-03-28 16:48:20 +0000
  • 20e1e84884 deploy : add a Package.swift for SwiftPM support (#393) Jed Fox 2023-03-28 11:39:01 -0500
  • c1f885067c ggml : introduce structs for the q4 data blocks (#356) Stephan Walter 2023-03-28 15:56:03 +0000
  • e0670260fb gitignore : add "embedding" Georgi Gerganov 2023-03-28 18:34:35 +0300
  • 28ba975aea Check the existence of f16_model_path_base in quantize.py (#574) dotpy314 2023-03-28 23:06:28 +0800
  • a6bdc47cba Fix usage of F16C intrinsics in AVX code (#563) slaren 2023-03-28 16:26:55 +0200
  • 7b8dbcb78b main.cpp fixes, refactoring (#571) anzz1 2023-03-28 17:09:55 +0300
  • 696e8563f6 add llama_swift_run_state to manage run state between invocations Alex Rozanski 2023-03-28 11:28:39 +0100
  • d8f706c3c1 Merge remote-tracking branch 'llama.cpp/master' into v2 Alex Rozanski 2023-03-28 10:33:22 +0100
  • 4b8efff0e3 Add embedding example to Makefile (#540) RJ Adriaansen 2023-03-28 08:11:09 +0200
  • 7e5395575a Fix missing ggml link in cmake for examples/* on w64-mingw32 (#542) Marco Matthies 2023-03-27 06:55:26 +0200
  • 34c1072e49 ci: add debug build to sanitizer build matrix (#527) Erik Scholz 2023-03-26 17:48:40 +0200
  • 939ad2d3a5 Fix undefined variables in debug build, remove unused variables (#531) Stephan Walter 2023-03-26 15:34:02 +0000
  • 8c2ec5e21d Add support for linux/arm64 platform during Docker Builds (#514) Juan Calderon-Perez 2023-03-26 10:48:42 -0400
  • b391579db9 Update README and comments for standalone perplexity tool (#525) Stephan Walter 2023-03-26 13:14:01 +0000
  • 7a87d31f4f [main] fix infinite generation (-n == -1) (#523) anzz1 2023-03-26 16:06:10 +0300
  • 348d6926ee Add logo to README.md Georgi Gerganov 2023-03-26 10:20:49 +0300
  • 33e35b8fe8 Exit from interactive mode if input stream is bad (#491) Harald Fernengel 2023-03-26 07:25:46 +0200
  • 5beeb28694 add Actions status badge Alex Rozanski 2023-03-25 23:06:08 +0000
  • 5041ebd2af remove .dockerignore Alex Rozanski 2023-03-25 23:04:25 +0000