Besides the official version, llama.cpp has several forks. One of them, the PrismML fork, includes optimizations such as Flash Attention and is notable for its inference speed on ...