diff --git a/README.md b/README.md
index 5e15d7f..db25596 100644
--- a/README.md
+++ b/README.md
@@ -2,7 +2,7 @@
 
 [![License: MIT](https://img.shields.io/badge/license-MIT-blue.svg)](https://opensource.org/licenses/MIT)
 
-A fork of @ggerganov's llama.cpp to use [Facebook's LLaMA](https://github.com/facebookresearch/llama) in Swift.
+A fork of [@ggerganov](https://github.com/ggerganov)'s [llama.cpp](https://github.com/ggerganov/llama.cpp) to use [Facebook's LLaMA](https://github.com/facebookresearch/llama) in Swift.
 
 ## Description
 
@@ -37,14 +37,23 @@ python3 convert-pth-to-ggml.py models/7B/ 1
 
 When running the larger models, make sure you have enough disk space to store all the intermediate files.
 
+## Building
+
+For now, compile from source; other distribution channels will be added shortly.
+
+NB: Be sure to build `llama.framework` in the Release configuration; Debug builds are significantly slower.
+
 ## Usage
 
+In Swift:
+
 ```swift
 let url = ... // URL to the model file, as per llama.cpp
 let llama = LlamaRunner(modelURL: url)
 
 llama.run(
   with: "Building a website can be done in 10 simple steps:",
+  config: LlamaRunner.Config(numThreads: 8, numTokens: 512), // Can also specify `reversePrompt`
   tokenHandler: { token in
     // If printing tokens directly use `terminator: ""` as the tokens include whitespace and newlines.
     print(token, terminator: "")
@@ -71,6 +80,11 @@ llama.run(
 ```
 
+Using the `llamaTest` app:
+
+- Set `MODEL_PATH` in `LlamaTest.xcconfig` to point to your `path/to/ggml-model-q4_0.bin`, then build & run for interactive prompt generation.
+- Be sure to build for Release if you want this to be snappy.
+
 ## Misc
 
 - License: MIT
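
For reference, the `LlamaTest.xcconfig` entry added above would look something like the following sketch (the model path shown is illustrative; substitute the actual location of your converted model file):

```
// LlamaTest.xcconfig
// Path below is an example only; point it at your own converted model.
MODEL_PATH = /path/to/models/7B/ggml-model-q4_0.bin
```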