5. Adding a Web UI


By the end of Chapter 4, we had built out all the server features: the translation API, SSE streaming, and model management. But so far, the only way to interact with the server is through curl. In this chapter, we'll add a Web UI so you can translate from the browser.

Here's what the finished screen looks like.

[Screenshot: Web UI]

  • As you type text, tokens appear one by one (with debounce)
  • You can switch models and languages from the header dropdowns
  • Selecting an undownloaded model starts a download with a progress bar (cancellable)

The HTML, CSS, and JavaScript code is minimal. We won't use any CSS framework -- just plain CSS (about 100 lines) for the layout. Since this is a C++ book, we won't go into detailed frontend explanations. We'll just show you "write this, and it does that."

5.1 File Structure

These are the files we'll add in this chapter. We'll place HTML, CSS, and JavaScript in the public/ directory and serve them from the server.

translate-app/
├── public/
│   ├── index.html
│   ├── style.css
│   └── script.js
└── src/
    └── main.cpp      # Add set_mount_point

5.2 Setting Up Static File Serving

Using cpp-httplib's set_mount_point, you can serve a directory directly over HTTP. Create a public/ directory:

mkdir public

Then place a minimal index.html in it:

<!DOCTYPE html>
<html lang="ja">
<head>
  <meta charset="UTF-8">
  <title>Translate App</title>
</head>
<body>
  <h1>Hello!</h1>
</body>
</html>

Add one line of set_mount_point to the server code and rebuild.

// Add inside `main()`, before `svr.listen()`
svr.set_mount_point("/", "./public");

Start the server and open http://127.0.0.1:8080 in your browser -- you should see "Hello!" displayed. Since these are static files, just reload the browser after editing index.html to see the changes. No server restart needed.

5.3 Building the Layout

Replace index.html with the final layout.

<!DOCTYPE html>
<html lang="ja">
<head>
  <meta charset="UTF-8">
  <meta name="viewport" content="width=device-width, initial-scale=1">
  <title>Translate App</title>
  <!-- Set favicon with inline SVG emoji (no image file needed) -->
  <link rel="icon" href="data:image/svg+xml,<svg xmlns='http://www.w3.org/2000/svg' viewBox='0 0 100 100'><text y='.9em' font-size='90'>🌐</text></svg>">
  <link rel="stylesheet" href="/style.css">
</head>
<body>
  <!-- Header: title + model selector + language selector -->
  <header>
    <strong>Translate App</strong>
    <div>
      <!-- Options are dynamically populated by script.js via `GET /models` -->
      <select id="model-select" aria-label="Model"></select>
      <select id="target-lang" aria-label="Target language">
        <option value="ja">Japanese</option>
        <option value="en">English</option>
        <option value="zh">Chinese</option>
        <option value="ko">Korean</option>
        <option value="fr">French</option>
        <option value="de">German</option>
        <option value="es">Spanish</option>
      </select>
    </div>
  </header>

  <!-- Two-column layout: input and translation result -->
  <main>
    <textarea id="input-text" placeholder="Enter text to translate..."></textarea>
    <output id="output-text"></output>
  </main>

  <!-- Modal displayed during model download -->
  <dialog id="download-dialog">
    <h3>Downloading model...</h3>
    <progress id="download-progress" max="100" value="0"></progress>
    <p id="download-status"></p>
    <button id="download-cancel">Cancel</button>
  </dialog>

  <script src="/script.js"></script>
</body>
</html>

Key points about the HTML.

  • The favicon uses an inline SVG emoji, so no image file is needed
  • <dialog> shows download progress. It's a standard HTML element you can display as a modal with showModal()
  • <output> is for displaying translation results. It's an element that semantically represents "computed output"
  • There's no translate button. Translation starts automatically when you type text (implemented in Section 5.4)

Write the CSS to public/style.css. We won't use any CSS framework -- just plain CSS for the layout.

:root {
  --gap: 0.5rem;
  --color-border: #ccc;
  --font: system-ui, sans-serif;
}

* {
  margin: 0;
  padding: 0;
  box-sizing: border-box;
}

html, body {
  height: 100%;
  font-family: var(--font);
}

body {
  display: flex;
  flex-direction: column;
  padding: var(--gap);
  gap: var(--gap);
}

/* Header: title + dropdowns */
header {
  display: flex;
  align-items: center;
  justify-content: space-between;
}

header div {
  display: flex;
  gap: var(--gap);
}

/* Main: two-column layout */
main {
  flex: 1;
  display: grid;
  grid-template-columns: 1fr 1fr;
  gap: var(--gap);
  min-height: 0;
}

#input-text {
  resize: none;
  padding: 0.75rem;
  font-family: var(--font);
  font-size: 1rem;
  border: 1px solid var(--color-border);
  border-radius: 4px;
}

textarea:focus,
select:focus {
  outline: 1px solid #4a9eff;
  outline-offset: -1px;
}

#output-text {
  display: block;
  padding: 0.75rem;
  font-size: 1rem;
  border: 1px solid var(--color-border);
  border-radius: 4px;
  white-space: pre-wrap;
  overflow-y: auto;
}

/* Download modal */
dialog {
  border: 1px solid var(--color-border);
  border-radius: 8px;
  padding: 1.5rem;
  max-width: 400px;
  width: 90%;
  margin: auto;
}

dialog::backdrop {
  background: rgba(0, 0, 0, 0.4);
}

dialog h3 {
  margin-bottom: 0.75rem;
}

dialog progress {
  width: 100%;
  height: 1.25rem;
}

dialog p {
  margin-top: 0.5rem;
  text-align: center;
  color: #666;
}

dialog button {
  display: block;
  margin: 0.75rem auto 0;
  padding: 0.4rem 1.5rem;
  cursor: pointer;
}

/* Block the entire UI during translation or model switching */
body.busy {
  cursor: wait;
}

body.busy select,
body.busy textarea {
  pointer-events: none;
  opacity: 0.6;
}

Key points about the layout.

  • body uses Flexbox for vertical layout, and main takes up the remaining height with flex: 1. The input and output areas extend to the bottom of the window
  • main uses CSS Grid's 1fr 1fr to split into two columns
  • The --gap variable unifies all spacing. The padding above the header, the space between the header and the boxes, and the padding below the boxes are all the same size
  • The body.busy class blocks the UI during translation or model switching. JavaScript toggles it on and off

Reload the browser and you should see the input and output areas side by side. Nothing happens when you type yet, but the layout is complete.

5.4 Connecting the Translation Feature

Now it's time to call the server's API from JavaScript. Create public/script.js.

Reading the SSE Stream

The /translate/stream endpoint we built in Chapter 3 is a POST endpoint. Since the browser's EventSource only supports GET, we'll read SSE using fetch() + ReadableStream. The basic pattern is:

  1. Send a POST request with fetch()
  2. Get a stream with res.body.getReader()
  3. Process lines starting with data: as we read chunks

Chunks can be split in the middle of an SSE line, so we need to buffer them and process line by line.
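To make the buffering idea concrete, here is a minimal sketch of the line-splitting step on its own, separated from fetch() so it can be tried in isolation. The helper name createSseParser is an assumption made for illustration; the chapter's script.js inlines the same logic instead.

```javascript
// Minimal sketch of incremental SSE parsing.
// createSseParser is a hypothetical helper name, not part of script.js.
// Feed it decoded text chunks; it calls onData once per complete "data: " line.
function createSseParser(onData) {
  let buffer = ""; // holds the trailing partial line between chunks

  return function feed(chunk) {
    buffer += chunk;
    const lines = buffer.split("\n");
    buffer = lines.pop(); // last element is incomplete (or "" if the chunk ended on "\n")

    for (const line of lines) {
      if (line.startsWith("data: ")) {
        onData(line.slice(6)); // strip the "data: " prefix
      }
      // Blank separator lines and other SSE fields are ignored in this sketch
    }
  };
}

// Usage: events arrive split across chunk boundaries
const received = [];
const feed = createSseParser((data) => received.push(data));
feed('data: "Hel');
feed('lo"\n\ndata: " wor');
feed('ld"\n\ndata: [DONE]\n\n');
console.log(received);
```

Because the last element of split("\n") is always held back, a payload is only handed to onData once its terminating newline has arrived.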

Auto-translation with Debounce

Instead of a translate button, we trigger translation automatically on text input or language change. We add a 300ms debounce to prevent requests from firing on every keystroke.
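As a rough illustration of what debouncing does, the sketch below wraps the idea in a generic helper. The debounce function and its injectable timers parameter are assumptions for this example only (the parameter exists so the behavior can be checked without actually waiting); the chapter's code uses setTimeout and clearTimeout directly.

```javascript
// Sketch of a reusable debounce helper (hypothetical, not part of script.js).
// `timers` is injectable only so the behavior can be demonstrated without real waiting.
function debounce(fn, delayMs, timers = {
  set: (cb, ms) => setTimeout(cb, ms),
  clear: (handle) => clearTimeout(handle),
}) {
  let handle = null;
  return (...args) => {
    if (handle !== null) timers.clear(handle); // drop the pending call
    handle = timers.set(() => {
      handle = null;
      fn(...args);
    }, delayMs);
  };
}

// Fake timer for demonstration: records the pending callback instead of waiting
let pending = null;
const fakeTimers = {
  set: (cb) => { pending = cb; return 1; },
  clear: () => { pending = null; },
};

const calls = [];
const schedule = debounce((text) => calls.push(text), 300, fakeTimers);
schedule("h");
schedule("he");
schedule("hello"); // only this call survives
pending();         // simulate the 300ms timer firing
console.log(calls);
```

Each new call replaces the pending one, so only the last input within the delay window reaches the wrapped function.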

To cancel the previous translation while typing, we use AbortController. When new input arrives, abort() cancels the previous fetch and starts a new translation. Since we need to pass a cancellation signal to fetch, the SSE reading is written inline.

const inputText = document.getElementById("input-text");
const outputText = document.getElementById("output-text");
const targetLang = document.getElementById("target-lang");

let debounceTimer = null;
let abortController = null;

async function translate() {
  const text = inputText.value.trim();
  if (!text) {
    outputText.textContent = "";
    return;
  }

  // Cancel any in-progress translation
  if (abortController) abortController.abort();
  abortController = new AbortController();
  const { signal } = abortController;

  outputText.textContent = "";
  document.body.classList.add("busy");

  try {
    const res = await fetch("/translate/stream", {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ text, target_lang: targetLang.value }),
      signal,
    });

    if (!res.ok) {
      const err = await res.json();
      throw new Error(err.error || `HTTP ${res.status}`);
    }

    const reader = res.body.getReader();
    const decoder = new TextDecoder();
    let buffer = "";

    while (true) {
      const { done, value } = await reader.read();
      if (done) break;

      buffer += decoder.decode(value, { stream: true });
      const lines = buffer.split("\n");
      buffer = lines.pop();

      for (const line of lines) {
        if (line.startsWith("data: ")) {
          const data = line.slice(6);
          if (data === "[DONE]") return;
          const parsed = JSON.parse(data);
          if (parsed && parsed.error) {
            outputText.textContent = "Error: " + parsed.error;
            return;
          }
          outputText.textContent += parsed;
        }
      }
    }
  } catch (e) {
    if (e.name === "AbortError") return; // Cancelled by new input
    outputText.textContent = "Error: " + e.message;
  } finally {
    document.body.classList.remove("busy");
  }
}

function scheduleTranslation() {
  clearTimeout(debounceTimer);
  debounceTimer = setTimeout(translate, 300);
}

inputText.addEventListener("input", scheduleTranslation);
targetLang.addEventListener("change", scheduleTranslation);

We use fetch directly because we need to pass the AbortController's signal. Since the server can return errors as JSON objects (from the try/catch we added in Chapter 3), we also check for parsed.error.

Reload the browser and try typing some text. After 300ms, tokens should appear one by one. If you change the input, the previous translation is cancelled and a new one begins.
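One subtlety in the code above: when a new call aborts the previous one, the previous call's finally block still runs a moment later and appears to remove the busy class while the new request is already in flight. If that matters for your UI, one option is to track which request is the latest. The sketch below shows the idea in isolation; createRequestTracker is a hypothetical helper, not something the chapter's code defines.

```javascript
// Sketch: track which request is the latest so stale calls can skip cleanup.
// createRequestTracker is a hypothetical helper, not part of script.js.
function createRequestTracker() {
  let current = null; // AbortController of the newest request

  return {
    // Abort the previous request (if any) and start tracking a new one
    begin() {
      if (current) current.abort();
      current = new AbortController();
      return current;
    },
    // True only for the controller returned by the most recent begin()
    isLatest(controller) {
      return controller === current;
    },
  };
}

const tracker = createRequestTracker();
const first = tracker.begin();
const second = tracker.begin(); // aborts `first`

console.log(first.signal.aborted, tracker.isLatest(first));   // true false
console.log(second.signal.aborted, tracker.isLatest(second)); // false true
```

With something like this in place, translate() could remove the busy class in its finally block only when isLatest() returns true for its own controller.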

5.5 Connecting Model Selection

Loading the Model List

When the page loads, we call GET /models to initialize the dropdown.

const modelSelect = document.getElementById("model-select");

// Fetch model list from `GET /models` and build the dropdown
async function loadModels() {
  const res = await fetch("/models");
  const { models } = await res.json();

  modelSelect.innerHTML = ""; // Clear existing options
  for (const m of models) {
    const opt = document.createElement("option");
    opt.value = m.name;
    // Mark undownloaded models with a ⬇ icon to distinguish them
    opt.textContent = m.downloaded
      ? `${m.name} (${m.params})`
      : `${m.name} (${m.params}) ⬇`;
    opt.selected = m.selected; // Select the current model using the `selected` flag from the server
    modelSelect.appendChild(opt);
  }
}

loadModels(); // Run on page load

Undownloaded models are marked with a ⬇ icon to distinguish them.
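If you want to check the labeling logic without a browser, the mapping from the GET /models response to option labels can be pulled out into a pure function. This is only a sketch: toOptionData is a hypothetical name, and the sample data simply mirrors the shape of the Chapter 4 response.

```javascript
// Sketch: turn the `GET /models` response into option descriptors (pure function, no DOM).
// toOptionData is a hypothetical name; the field names follow the Chapter 4 response.
function toOptionData(models) {
  return models.map((m) => ({
    value: m.name,
    // Undownloaded models get a trailing ⬇ marker
    label: m.downloaded ? `${m.name} (${m.params})` : `${m.name} (${m.params}) ⬇`,
    selected: Boolean(m.selected),
  }));
}

// Sample data shaped like the server response (for illustration only)
const sample = [
  { name: "gemma-2-2b-it", params: "2B", size: "1.6 GB", downloaded: true, selected: true },
  { name: "gemma-2-9b-it", params: "9B", size: "5.8 GB", downloaded: false, selected: false },
];
console.log(toOptionData(sample).map((o) => o.label));
```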

Switching Models

Changing the dropdown calls POST /models/select. If a download is needed, a <dialog> with a progress bar appears. The cancel button can abort the download.

As with translation, we use AbortController. Clicking the cancel button calls abort() to disconnect. The server detects the disconnection and aborts the download: the progress callback we pass to download_model returns sink.os.good(), which becomes false once the client is gone (see Chapter 4).

const dialog = document.getElementById("download-dialog");
const progressBar = document.getElementById("download-progress");
const downloadStatus = document.getElementById("download-status");
const downloadCancel = document.getElementById("download-cancel");

let modelAbort = null;

downloadCancel.addEventListener("click", () => {
  if (modelAbort) modelAbort.abort();
});

modelSelect.addEventListener("change", async () => {
  const name = modelSelect.value;
  document.body.classList.add("busy");

  modelAbort = new AbortController();
  const { signal } = modelAbort;

  try {
    const res = await fetch("/models/select", {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ model: name }),
      signal,
    });

    if (!res.ok) {
      const err = await res.json();
      throw new Error(err.error || `HTTP ${res.status}`);
    }

    const reader = res.body.getReader();
    const decoder = new TextDecoder();
    let buffer = "";

    while (true) {
      const { done, value } = await reader.read();
      if (done) break;

      buffer += decoder.decode(value, { stream: true });
      const lines = buffer.split("\n");
      buffer = lines.pop();

      for (const line of lines) {
        if (line.startsWith("data: ")) {
          const data = line.slice(6);
          if (data === "[DONE]") return;
          const event = JSON.parse(data);

          switch (event.status) {
            case "downloading":
              if (!dialog.open) dialog.showModal(); // Show the modal
              progressBar.value = event.progress;   // Update the progress bar
              downloadStatus.textContent = `${event.progress}%`;
              break;
            case "loading":
              // Removing the `value` attribute puts `<progress>` into animated (indeterminate) state
              progressBar.removeAttribute("value");
              downloadStatus.textContent = "Loading model...";
              break;
            case "ready":
              if (dialog.open) dialog.close();
              break;
            case "error":
              if (dialog.open) dialog.close();
              alert("Download failed: " + event.message);
              break;
          }
        }
      }
    }

    await loadModels(); // Refresh the list since the `selected` flag changed
    scheduleTranslation(); // Re-translate with the new model
  } catch (e) {
    if (e.name === "AbortError") {
      // Cancelled -- revert to the original model
      await loadModels();
    } else {
      alert("Error: " + e.message);
    }
  } finally {
    document.body.classList.remove("busy");
    if (dialog.open) dialog.close();
    modelAbort = null;
  }
});

progressBar.removeAttribute("value") puts the <progress> element into an indeterminate (animated) state. We use this while loading the model after the download completes.
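The event handling in the switch statement can also be read as a small state machine. The sketch below expresses it as a pure function so the sequence of states is easy to follow; nextDialogState and the shape of the state object are assumptions for illustration, not code from script.js.

```javascript
// Sketch: map each /models/select SSE event to the dialog state it implies.
// nextDialogState is a hypothetical helper; the statuses follow the switch statement above.
function nextDialogState(state, event) {
  switch (event.status) {
    case "downloading":
      return { open: true, progress: event.progress, text: `${event.progress}%` };
    case "loading":
      // progress: null stands for the indeterminate <progress> state
      return { ...state, progress: null, text: "Loading model..." };
    case "ready":
      return { open: false, progress: 0, text: "" };
    case "error":
      return { open: false, progress: 0, text: "", error: event.message };
    default:
      return state; // ignore unknown statuses
  }
}

// A typical event sequence for a model that still needs downloading
const events = [
  { status: "downloading", progress: 0 },
  { status: "downloading", progress: 42 },
  { status: "loading" },
  { status: "ready" },
];
let state = { open: false, progress: 0, text: "" };
const states = events.map((e) => (state = nextDialogState(state, e)));
console.log(states.map((s) => `${s.open}:${s.progress}`));
```

For a model that is already downloaded, the server sends only "loading" and "ready", so the dialog never opens in this model either.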

5.6 Complete Code

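Complete code (index.html)

```html
<!DOCTYPE html>
<html lang="ja">
<head>
  <meta charset="UTF-8">
  <meta name="viewport" content="width=device-width, initial-scale=1">
  <title>Translate App</title>
  <!-- Set favicon with inline SVG emoji (no image file needed) -->
  <link rel="icon" href="data:image/svg+xml,<svg xmlns='http://www.w3.org/2000/svg' viewBox='0 0 100 100'><text y='.9em' font-size='90'>🌐</text></svg>">
  <link rel="stylesheet" href="/style.css">
</head>
<body>
  <!-- Header: title + model selector + language selector -->
  <header>
    <strong>Translate App</strong>
    <div>
      <!-- Options are dynamically populated by script.js via `GET /models` -->
      <select id="model-select" aria-label="Model"></select>
      <select id="target-lang" aria-label="Target language">
        <option value="ja">Japanese</option>
        <option value="en">English</option>
        <option value="zh">Chinese</option>
        <option value="ko">Korean</option>
        <option value="fr">French</option>
        <option value="de">German</option>
        <option value="es">Spanish</option>
      </select>
    </div>
  </header>

  <!-- Two-column layout: input and translation result -->
  <main>
    <textarea id="input-text" placeholder="Enter text to translate..."></textarea>
    <output id="output-text"></output>
  </main>

  <!-- Modal displayed during model download -->
  <dialog id="download-dialog">
    <h3>Downloading model...</h3>
    <progress id="download-progress" max="100" value="0"></progress>
    <p id="download-status"></p>
    <button id="download-cancel">Cancel</button>
  </dialog>

  <script src="/script.js"></script>
</body>
</html>
```
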
Complete code (style.css)

```css
:root {
  --gap: 0.5rem;
  --color-border: #ccc;
  --font: system-ui, sans-serif;
}

* {
  margin: 0;
  padding: 0;
  box-sizing: border-box;
}

html, body {
  height: 100%;
  font-family: var(--font);
}

body {
  display: flex;
  flex-direction: column;
  padding: var(--gap);
  gap: var(--gap);
}

/* Header: title + dropdowns */
header {
  display: flex;
  align-items: center;
  justify-content: space-between;
}

header div {
  display: flex;
  gap: var(--gap);
}

/* Main: two-column layout */
main {
  flex: 1;
  display: grid;
  grid-template-columns: 1fr 1fr;
  gap: var(--gap);
  min-height: 0;
}

#input-text {
  resize: none;
  padding: 0.75rem;
  font-family: var(--font);
  font-size: 1rem;
  border: 1px solid var(--color-border);
  border-radius: 4px;
}

textarea:focus,
select:focus {
  outline: 1px solid #4a9eff;
  outline-offset: -1px;
}

#output-text {
  display: block;
  padding: 0.75rem;
  font-size: 1rem;
  border: 1px solid var(--color-border);
  border-radius: 4px;
  white-space: pre-wrap;
  overflow-y: auto;
}

/* Download modal */
dialog {
  border: 1px solid var(--color-border);
  border-radius: 8px;
  padding: 1.5rem;
  max-width: 400px;
  width: 90%;
  margin: auto;
}

dialog::backdrop {
  background: rgba(0, 0, 0, 0.4);
}

dialog h3 {
  margin-bottom: 0.75rem;
}

dialog progress {
  width: 100%;
  height: 1.25rem;
}

dialog p {
  margin-top: 0.5rem;
  text-align: center;
  color: #666;
}

dialog button {
  display: block;
  margin: 0.75rem auto 0;
  padding: 0.4rem 1.5rem;
  cursor: pointer;
}

/* Block the entire UI during translation or model switching */
body.busy {
  cursor: wait;
}

body.busy select,
body.busy textarea {
  pointer-events: none;
  opacity: 0.6;
}
```

Complete code (script.js)

```js
// --- DOM Elements ---

const inputText = document.getElementById("input-text");
const outputText = document.getElementById("output-text");
const targetLang = document.getElementById("target-lang");
const modelSelect = document.getElementById("model-select");
const dialog = document.getElementById("download-dialog");
const progressBar = document.getElementById("download-progress");
const downloadStatus = document.getElementById("download-status");
const downloadCancel = document.getElementById("download-cancel");

// --- Model List ---

// Fetch model list from `GET /models` and build the dropdown
async function loadModels() {
  const res = await fetch("/models");
  const { models } = await res.json();

  modelSelect.innerHTML = ""; // Clear existing options
  for (const m of models) {
    const opt = document.createElement("option");
    opt.value = m.name;
    // Mark undownloaded models with a ⬇ icon to distinguish them
    opt.textContent = m.downloaded
      ? `${m.name} (${m.params})`
      : `${m.name} (${m.params}) ⬇`;
    opt.selected = m.selected; // Select the current model using the `selected` flag from the server
    modelSelect.appendChild(opt);
  }
}

loadModels(); // Run on page load

// --- Translation (auto-translation with debounce) ---

let debounceTimer = null;
let abortController = null;

async function translate() {
  const text = inputText.value.trim();
  if (!text) {
    outputText.textContent = "";
    return;
  }

  // Cancel any in-progress translation
  if (abortController) abortController.abort();
  abortController = new AbortController();
  const { signal } = abortController;

  outputText.textContent = "";
  document.body.classList.add("busy");

  try {
    const res = await fetch("/translate/stream", {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ text, target_lang: targetLang.value }),
      signal,
    });

    if (!res.ok) {
      const err = await res.json();
      throw new Error(err.error || `HTTP ${res.status}`);
    }

    const reader = res.body.getReader();
    const decoder = new TextDecoder();
    let buffer = "";

    while (true) {
      const { done, value } = await reader.read();
      if (done) break;

      buffer += decoder.decode(value, { stream: true });
      const lines = buffer.split("\n");
      buffer = lines.pop();

      for (const line of lines) {
        if (line.startsWith("data: ")) {
          const data = line.slice(6);
          if (data === "[DONE]") return;
          const parsed = JSON.parse(data);
          if (parsed && parsed.error) {
            outputText.textContent = "Error: " + parsed.error;
            return;
          }
          outputText.textContent += parsed;
        }
      }
    }
  } catch (e) {
    if (e.name === "AbortError") return; // Cancelled by new input
    outputText.textContent = "Error: " + e.message;
  } finally {
    document.body.classList.remove("busy");
  }
}

function scheduleTranslation() {
  clearTimeout(debounceTimer);
  debounceTimer = setTimeout(translate, 300);
}

inputText.addEventListener("input", scheduleTranslation);
targetLang.addEventListener("change", scheduleTranslation);

// --- Model Selection ---

let modelAbort = null;

downloadCancel.addEventListener("click", () => {
  if (modelAbort) modelAbort.abort();
});

modelSelect.addEventListener("change", async () => {
  const name = modelSelect.value;
  document.body.classList.add("busy");

  modelAbort = new AbortController();
  const { signal } = modelAbort;

  try {
    const res = await fetch("/models/select", {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ model: name }),
      signal,
    });

    if (!res.ok) {
      const err = await res.json();
      throw new Error(err.error || `HTTP ${res.status}`);
    }

    const reader = res.body.getReader();
    const decoder = new TextDecoder();
    let buffer = "";

    while (true) {
      const { done, value } = await reader.read();
      if (done) break;

      buffer += decoder.decode(value, { stream: true });
      const lines = buffer.split("\n");
      buffer = lines.pop();

      for (const line of lines) {
        if (line.startsWith("data: ")) {
          const data = line.slice(6);
          if (data === "[DONE]") return;
          const event = JSON.parse(data);

          switch (event.status) {
            case "downloading":
              if (!dialog.open) dialog.showModal();
              progressBar.value = event.progress;
              downloadStatus.textContent = `${event.progress}%`;
              break;
            case "loading":
              progressBar.removeAttribute("value");
              downloadStatus.textContent = "Loading model...";
              break;
            case "ready":
              if (dialog.open) dialog.close();
              break;
            case "error":
              if (dialog.open) dialog.close();
              alert("Download failed: " + event.message);
              break;
          }
        }
      }
    }

    await loadModels();
    scheduleTranslation(); // Re-translate with the new model
  } catch (e) {
    if (e.name === "AbortError") {
      // Cancelled -- revert to the original model
      await loadModels();
    } else {
      alert("Error: " + e.message);
    }
  } finally {
    document.body.classList.remove("busy");
    if (dialog.open) dialog.close();
    modelAbort = null;
  }
});
```

Complete code (main.cpp)

The only server-side change is the single `set_mount_point` line. Add it before `svr.listen()` in the complete code from Chapter 4.

```cpp
#include <httplib.h>
#include <nlohmann/json.hpp>
#include <cpp-llamalib.h>

#include <algorithm>
#include <csignal>
#include <filesystem>
#include <fstream>
#include <iostream>

using json = nlohmann::json;

// -------------------------------------------------------------------------
// Model definitions
// -------------------------------------------------------------------------

struct ModelInfo {
  std::string name;
  std::string params;
  std::string size;
  std::string repo;
  std::string filename;
};

const std::vector<ModelInfo> MODELS = {
    {
        .name = "gemma-2-2b-it",
        .params = "2B",
        .size = "1.6 GB",
        .repo = "bartowski/gemma-2-2b-it-GGUF",
        .filename = "gemma-2-2b-it-Q4_K_M.gguf",
    },
    {
        .name = "gemma-2-9b-it",
        .params = "9B",
        .size = "5.8 GB",
        .repo = "bartowski/gemma-2-9b-it-GGUF",
        .filename = "gemma-2-9b-it-Q4_K_M.gguf",
    },
    {
        .name = "Llama-3.1-8B-Instruct",
        .params = "8B",
        .size = "4.9 GB",
        .repo = "bartowski/Meta-Llama-3.1-8B-Instruct-GGUF",
        .filename = "Meta-Llama-3.1-8B-Instruct-Q4_K_M.gguf",
    },
};

// -------------------------------------------------------------------------
// Model storage directory
// -------------------------------------------------------------------------

std::filesystem::path get_models_dir() {
#ifdef _WIN32
  auto env = std::getenv("APPDATA");
  auto base = env ? std::filesystem::path(env) : std::filesystem::path(".");
  return base / "translate-app" / "models";
#else
  auto env = std::getenv("HOME");
  auto base = env ? std::filesystem::path(env) : std::filesystem::path(".");
  return base / ".translate-app" / "models";
#endif
}

// -------------------------------------------------------------------------
// Model download
// -------------------------------------------------------------------------

// Abort the download if progress_cb returns false
bool download_model(const ModelInfo &model,
                    std::function<bool(int)> progress_cb) {
  httplib::Client cli("https://huggingface.co");
  cli.set_follow_location(true); // Hugging Face redirects to CDN
  cli.set_read_timeout(std::chrono::hours(1)); // Long timeout for large models

  auto url = "/" + model.repo + "/resolve/main/" + model.filename;
  auto path = get_models_dir() / model.filename;
  auto tmp_path = std::filesystem::path(path).concat(".tmp");

  std::ofstream ofs(tmp_path, std::ios::binary);
  if (!ofs) { return false; }

  auto res = cli.Get(url,
    // content_receiver: receive chunks and write to file
    [&](const char *data, size_t len) {
      ofs.write(data, len);
      return ofs.good();
    },
    // progress: report download progress (return false to abort)
    [&, last_pct = -1](size_t current, size_t total) mutable {
      int pct = total ? (int)(current * 100 / total) : 0;
      if (pct == last_pct) return true; // Skip if same value
      last_pct = pct;
      return progress_cb(pct);
    });

  ofs.close();

  if (!res || res->status != 200) {
    std::filesystem::remove(tmp_path);
    return false;
  }

  // Rename after download completes
  std::filesystem::rename(tmp_path, path);
  return true;
}

// -------------------------------------------------------------------------
// Server
// -------------------------------------------------------------------------

httplib::Server svr;

void signal_handler(int sig) {
  if (sig == SIGINT || sig == SIGTERM) {
    std::cout << "\nReceived signal, shutting down gracefully...\n";
    svr.stop();
  }
}

int main() {
  // Create model storage directory
  auto models_dir = get_models_dir();
  std::filesystem::create_directories(models_dir);

  // Auto-download default model if not present
  std::string selected_model = MODELS[0].filename;
  auto path = models_dir / selected_model;
  if (!std::filesystem::exists(path)) {
    std::cout << "Downloading " << selected_model << "..." << std::endl;
    if (!download_model(MODELS[0], [](int pct) {
          std::cout << "\r" << pct << "%" << std::flush;
          return true;
        })) {
      std::cerr << "\nFailed to download model." << std::endl;
      return 1;
    }
    std::cout << std::endl;
  }

  auto llm = llamalib::Llama{path};

  // LLM inference takes time, so set a longer timeout (default is 5 seconds)
  svr.set_read_timeout(300);
  svr.set_write_timeout(300);

  svr.set_logger([](const auto &req, const auto &res) {
    std::cout << req.method << " " << req.path << " -> " << res.status
              << std::endl;
  });

  svr.Get("/health", [](const httplib::Request &, httplib::Response &res) {
    res.set_content(json{{"status", "ok"}}.dump(), "application/json");
  });

  // --- Translation endpoint (Chapter 2) ------------------------------------

  svr.Post("/translate",
           [&](const httplib::Request &req, httplib::Response &res) {
    auto input = json::parse(req.body, nullptr, false);
    if (input.is_discarded()) {
      res.status = 400;
      res.set_content(json{{"error", "Invalid JSON"}}.dump(),
                      "application/json");
      return;
    }

    if (!input.contains("text") || !input["text"].is_string() ||
        input["text"].get<std::string>().empty()) {
      res.status = 400;
      res.set_content(json{{"error", "'text' is required"}}.dump(),
                      "application/json");
      return;
    }

    auto text = input["text"].get<std::string>();
    auto target_lang = input.value("target_lang", "ja");

    auto prompt = "Translate the following text to " + target_lang +
                  ". Output only the translation, nothing else.\n\n" + text;

    try {
      auto translation = llm.chat(prompt);
      res.set_content(json{{"translation", translation}}.dump(),
                      "application/json");
    } catch (const std::exception &e) {
      res.status = 500;
      res.set_content(json{{"error", e.what()}}.dump(), "application/json");
    }
  });

  // --- SSE streaming translation (Chapter 3) --------------------------------

  svr.Post("/translate/stream",
           [&](const httplib::Request &req, httplib::Response &res) {
    auto input = json::parse(req.body, nullptr, false);
    if (input.is_discarded()) {
      res.status = 400;
      res.set_content(json{{"error", "Invalid JSON"}}.dump(),
                      "application/json");
      return;
    }

    if (!input.contains("text") || !input["text"].is_string() ||
        input["text"].get<std::string>().empty()) {
      res.status = 400;
      res.set_content(json{{"error", "'text' is required"}}.dump(),
                      "application/json");
      return;
    }

    auto text = input["text"].get<std::string>();
    auto target_lang = input.value("target_lang", "ja");

    auto prompt = "Translate the following text to " + target_lang +
                  ". Output only the translation, nothing else.\n\n" + text;

    res.set_chunked_content_provider(
        "text/event-stream",
        [&, prompt](size_t, httplib::DataSink &sink) {
          try {
            llm.chat(prompt, [&](std::string_view token) {
              sink.os << "data: "
                      << json(std::string(token)).dump(
                             -1, ' ', false, json::error_handler_t::replace)
                      << "\n\n";
              return sink.os.good(); // Abort inference on disconnect
            });
            sink.os << "data: [DONE]\n\n";
          } catch (const std::exception &e) {
            sink.os << "data: " << json({{"error", e.what()}}).dump() << "\n\n";
          }
          sink.done();
          return true;
        });
  });

  // --- Model list (Chapter 4) -----------------------------------------------

  svr.Get("/models",
          [&](const httplib::Request &, httplib::Response &res) {
    auto models_dir = get_models_dir();
    auto arr = json::array();
    for (const auto &m : MODELS) {
      auto path = models_dir / m.filename;
      arr.push_back({
          {"name", m.name},
          {"params", m.params},
          {"size", m.size},
          {"downloaded", std::filesystem::exists(path)},
          {"selected", m.filename == selected_model},
      });
    }
    res.set_content(json{{"models", arr}}.dump(), "application/json");
  });

  // --- Model selection (Chapter 4) ------------------------------------------

  svr.Post("/models/select",
           [&](const httplib::Request &req, httplib::Response &res) {
    auto input = json::parse(req.body, nullptr, false);
    if (input.is_discarded() || !input.contains("model")) {
      res.status = 400;
      res.set_content(json{{"error", "'model' is required"}}.dump(),
                      "application/json");
      return;
    }

    auto name = input["model"].get<std::string>();

    auto it = std::find_if(MODELS.begin(), MODELS.end(),
                           [&](const ModelInfo &m) { return m.name == name; });

    if (it == MODELS.end()) {
      res.status = 404;
      res.set_content(json{{"error", "Unknown model"}}.dump(),
                      "application/json");
      return;
    }

    const auto &model = *it;

    // Always respond with SSE (same format whether downloaded or not)
    res.set_chunked_content_provider(
        "text/event-stream",
        [&, model](size_t, httplib::DataSink &sink) {
          // SSE event send helper
          auto send = [&](const json &event) {
            sink.os << "data: " << event.dump() << "\n\n";
          };

          // Download if not yet downloaded (report progress via SSE)
          auto path = get_models_dir() / model.filename;
          if (!std::filesystem::exists(path)) {
            bool ok = download_model(model, [&](int pct) {
              send({{"status", "downloading"}, {"progress", pct}});
              return sink.os.good(); // Abort download on client disconnect
            });
            if (!ok) {
              send({{"status", "error"}, {"message", "Download failed"}});
              sink.done();
              return true;
            }
          }

          // Load and switch to the model
          send({{"status", "loading"}});
          llm = llamalib::Llama{path};
          selected_model = model.filename;

          send({{"status", "ready"}});
          sink.done();
          return true;
        });
  });

  // --- Static file serving (Chapter 5) --------------------------------------

  svr.set_mount_point("/", "./public");

  // Allow graceful shutdown via `Ctrl+C` (`SIGINT`) or `kill` (`SIGTERM`)
  signal(SIGINT, signal_handler);
  signal(SIGTERM, signal_handler);

  std::cout << "Listening on http://127.0.0.1:8080" << std::endl;
  svr.listen("127.0.0.1", 8080);
}
```

5.7 Testing

Rebuild and start the server.

cmake --build build -j
./build/translate-server

Open http://127.0.0.1:8080 in your browser.

  1. Type some text -- after 300ms, tokens appear incrementally
  2. Change the input -- the previous translation is cancelled and a new one starts
  3. Change the language dropdown -- automatic re-translation
  4. Change the model dropdown -- switches immediately if already downloaded
  5. Select an undownloaded model -- a progress bar appears, and Cancel can abort it

Everything we did with curl in Chapter 4 can now be done from the browser.

Next Chapter

The server and Web UI are complete. In the next chapter, we'll wrap this app with webview/webview to make it a desktop application that runs without a browser. We'll embed the static files into the binary so the distributable is a single executable.

Next: Turning It into a Desktop App with WebView