9 ~45 min

Add a chatbot to your site

Wire a chat widget into your deployed site that answers questions about you, using either Vertex AI directly or your MCP server.

Two paths

Pick one. Both end up at the same UX — a small chat panel on your site that knows about you.

| Path | Flow | Tradeoff |
| --- | --- | --- |
| Vertex AI direct | site → /api/chat → Vertex AI (Gemini) with persona.json as system prompt | Simpler. More failure modes (auth, IAM, region, model name). |
| Via your MCP server | site → /api/chat → your MCP server → tools answer | More elegant if step 8 is done. No Gemini call needed for direct tool questions. |

Recommendation: take the Vertex AI direct path on first run. Once it works, swap to the MCP path for round 2.

Configure environment variables

Copy the example file and fill in your project details.

cp .env.example .env
# then edit .env:
GOOGLE_CLOUD_PROJECT=your-project-id
LOCATION=europe-west1

Authenticate

gcloud auth application-default login

This drops application-default credentials on disk. The Vertex AI client picks them up automatically. On Cloud Run the service account is used instead — make sure it has the roles/aiplatform.user role.

Write the chat endpoint

Create api/chat.js (or wire it into your existing Express app). This builds a persona-grounded system prompt from persona.json and calls Gemini through Vertex AI.

// api/chat.js
import { VertexAI } from "@google-cloud/vertexai";
import { readFileSync } from "node:fs";

const persona = JSON.parse(readFileSync("./persona.json", "utf8"));

const vertex = new VertexAI({
  project: process.env.GOOGLE_CLOUD_PROJECT,
  location: process.env.LOCATION || "europe-west1",
});

const model = vertex.getGenerativeModel({
  model: "gemini-2.5-flash", // verify current model name in Vertex AI docs
  systemInstruction: {
    role: "system",
    parts: [{
      text:
        "You answer questions about the person described in the JSON below. " +
        "Be concise. Decline questions that are not about this person.\n\n" +
        JSON.stringify(persona),
    }],
  },
});

export async function chatHandler(req, res) {
  try {
    const { message } = req.body || {};
    if (!message) return res.status(400).json({ error: "missing message" });

    const result = await model.generateContent({
      contents: [{ role: "user", parts: [{ text: message }] }],
    });
    const reply = result.response.candidates?.[0]?.content?.parts?.[0]?.text ?? "";
    res.json({ reply });
  } catch (err) {
    console.error(err);
    res.status(500).json({ error: err.message });
  }
}

Mount it in your server entry point:

import express from "express";
import { chatHandler } from "./api/chat.js";

const app = express();
app.use(express.json());
app.post("/api/chat", chatHandler);
app.use(express.static("public"));
app.listen(process.env.PORT || 3000);
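
The snippets above assume a persona.json sitting next to the server entry point. Its exact shape is up to you — the system prompt just stringifies whatever is in it — so something like this entirely hypothetical example is enough:

```json
{
  "name": "Ada Example",
  "role": "Backend engineer",
  "location": "Berlin",
  "interests": ["distributed systems", "trail running"],
  "links": { "github": "https://github.com/ada-example" }
}
```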

Add the chat widget

Drop this into public/app.js and include it from your HTML with <script src="app.js" defer></script>.

// public/app.js
(function () {
  const btn = document.createElement("button");
  btn.textContent = "Chat";
  btn.style.cssText =
    "position:fixed;bottom:1.5rem;right:1.5rem;padding:.75rem 1.25rem;" +
    "background:#4285F4;color:#fff;border:0;border-radius:999px;" +
    "font:500 .95rem system-ui;cursor:pointer;z-index:9999;";

  const panel = document.createElement("div");
  panel.style.cssText =
    "position:fixed;bottom:5rem;right:1.5rem;width:340px;max-height:60vh;" +
    "display:none;flex-direction:column;background:#fff;border:1px solid #E8EAED;" +
    "border-radius:16px;box-shadow:0 8px 24px rgba(0,0,0,.15);" +
    "font:400 .95rem system-ui;z-index:9999;overflow:hidden;";
  panel.innerHTML =
    '<div id="log" style="flex:1;overflow:auto;padding:1rem;"></div>' +
    '<form id="f" style="display:flex;border-top:1px solid #E8EAED;">' +
    '<input id="i" placeholder="Ask about me..." style="flex:1;border:0;padding:.75rem;font:inherit;outline:none;"/>' +
    '<button style="border:0;background:#4285F4;color:#fff;padding:0 1rem;cursor:pointer;">Send</button>' +
    '</form>';

  document.body.append(btn, panel);
  btn.onclick = () => (panel.style.display = panel.style.display === "flex" ? "none" : "flex");

  const log = panel.querySelector("#log");
  const input = panel.querySelector("#i");
  // escape before inserting as HTML — user and model text are untrusted
  const esc = (s) => s.replace(/[&<>"']/g, (c) => `&#${c.charCodeAt(0)};`);
  panel.querySelector("#f").onsubmit = async (e) => {
    e.preventDefault();
    const text = input.value.trim();
    if (!text) return;
    log.insertAdjacentHTML("beforeend", `<p><b>You:</b> ${esc(text)}</p>`);
    input.value = "";
    try {
      const r = await fetch("/api/chat", {
        method: "POST",
        headers: { "Content-Type": "application/json" },
        body: JSON.stringify({ message: text }),
      });
      const { reply, error } = await r.json();
      log.insertAdjacentHTML("beforeend", `<p><b>Site:</b> ${esc(reply || error || "(empty reply)")}</p>`);
    } catch {
      log.insertAdjacentHTML("beforeend", "<p><b>Site:</b> request failed</p>");
    }
    log.scrollTop = log.scrollHeight;
  };
})();

Test locally

Start the dev server (npm run dev) and hit the endpoint from a second terminal before touching the browser, e.g. curl -s localhost:3000/api/chat -H "Content-Type: application/json" -d '{"message":"What do you do?"}'. You should get back JSON with a reply field; only then open the page and test the widget.

Redeploy

./deploy.sh

This pushes a new revision to Cloud Run with the chat endpoint and widget included. Hit your public URL and verify the chat button is there and answers.

Troubleshooting

"GOOGLE_CLOUD_PROJECT not set"

Cause. .env isn't being loaded, or the dev server was started before you created it.

Fix. Confirm the value is set, then restart:

cat .env
npm run dev

"Permission 'aiplatform.endpoints.predict' denied"

Cause. The service account calling Vertex AI lacks the right role.

Fix. Grant roles/aiplatform.user to the default compute service account:

PROJECT_NUMBER=$(gcloud projects describe $(gcloud config get-value project) --format='value(projectNumber)')
gcloud projects add-iam-policy-binding $(gcloud config get-value project) \
  --member="serviceAccount:${PROJECT_NUMBER}-compute@developer.gserviceaccount.com" \
  --role="roles/aiplatform.user"

"Model not found" / 404 from Vertex AI

Cause. Either the model isn't available in your region, or the model name has changed since the snippet was written.

Fix. Try us-central1 as LOCATION. If that still fails, look up a current Gemini model name in the Vertex AI docs and update the model field.

503 / quota errors

Cause. Fresh projects ship with low Vertex AI quota — a handful of requests per minute.

Fix. Wait a minute between test calls. For real use, request a quota increase in the Cloud Console under IAM & Admin → Quotas.

Key takeaways
  • Two architectures, same UX: direct Vertex AI call vs. routing through your MCP server.
  • Most failures are auth, region, or model-name drift — not your code.
  • Persona JSON in the system prompt is enough to ground a small chatbot. RAG is overkill at this scale.
  • Stream once you have something working. Don't optimize before it's correct.

From chatbot to agent

A chatbot that calls one model is the entry point. The next step is an agent that calls tools, plans, and acts. Google's Agent Development Kit (ADK) is the production framework for that.