```bash
# Download and run a model directly from Hugging Face
llama-cli -hf ggml-org/gemma-4-E2B-it-GGUF --prompt "Write a poem about the Kraken."

# Use a system prompt
llama-cli -hf ggml-org/gemma-4-E2B-it-GGUF -sys "You are Hong Gildong." -p "Who are you?"
```
To run the model behind a convenient interface, you can start up a server with:
```bash
llama-server -hf ggml-org/gemma-4-E2B-it-GGUF
```
This starts a server that lets you access the model either through a web
interface (http://localhost:8080) or through an OpenAI-compatible API endpoint
(http://localhost:8080/v1).
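For example, once the server is running you can send a request to the
OpenAI-compatible chat completions endpoint. The sketch below uses curl; the
`model` field is a placeholder, since llama-server serves whichever model it
was started with.

```bash
# Query the OpenAI-compatible chat completions endpoint exposed by llama-server
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gemma",
    "messages": [
      {"role": "user", "content": "Write a poem about the Kraken."}
    ]
  }'
```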
For more information and instructions on how to use llama.cpp with Gemma,
refer to the official llama.cpp repository:
https://github.com/ggml-org/llama.cpp
[[["Easy to understand","easyToUnderstand","thumb-up"],["Solved my problem","solvedMyProblem","thumb-up"],["Other","otherUp","thumb-up"]],[["Missing the information I need","missingTheInformationINeed","thumb-down"],["Too complicated / too many steps","tooComplicatedTooManySteps","thumb-down"],["Out of date","outOfDate","thumb-down"],["Samples / code issue","samplesCodeIssue","thumb-down"],["Other","otherDown","thumb-down"]],["Last updated 2026-04-16 UTC."],[],[]]