9 ~45 min

Add a chatbot to your site

Wire a chat widget into your deployed site that answers questions about you, using either Vertex AI directly or your MCP server.

Two paths

Pick one. Both end up at the same UX — a small chat panel on your site that knows about you.

| Path | Flow | Tradeoff |
| --- | --- | --- |
| Vertex AI direct | site → /api/chat → Vertex AI (Gemini) with persona.json as system prompt | Simpler. More failure modes (auth, IAM, region, model name). |
| Via your MCP server | site → /api/chat → your MCP server → tools answer | More elegant if step 8 is done. No Gemini call needed for direct tool questions. |

Recommendation: take the Vertex AI direct path on first run. Once it works, swap to the MCP path for round 2.

Configure environment variables

Copy the example file and fill in your project details.

cp .env.example .env
# then edit .env:
GOOGLE_CLOUD_PROJECT=your-project-id
LOCATION=europe-west1

Authenticate

gcloud auth application-default login

This drops application-default credentials on disk. The Vertex AI client picks them up automatically. On Cloud Run the service account is used instead — make sure it has the roles/aiplatform.user role.

Write the chat endpoint

Create api/chat.js (or wire it into your existing Express app). This builds a persona-grounded system prompt from persona.json and calls Gemini through Vertex AI.

// api/chat.js
import { VertexAI } from "@google-cloud/vertexai";
import { readFileSync } from "node:fs";

const persona = JSON.parse(readFileSync("./persona.json", "utf8"));

const vertex = new VertexAI({
  project: process.env.GOOGLE_CLOUD_PROJECT,
  location: process.env.LOCATION || "europe-west1",
});

const model = vertex.getGenerativeModel({
  model: "gemini-2.5-flash", // verify current model name in Vertex AI docs
  systemInstruction: {
    role: "system",
    parts: [{
      text:
        "You answer questions about the person described in the JSON below. " +
        "Be concise. Decline questions that are not about this person.\n\n" +
        JSON.stringify(persona),
    }],
  },
});

export async function chatHandler(req, res) {
  try {
    const { message } = req.body || {};
    if (!message) return res.status(400).json({ error: "missing message" });

    const result = await model.generateContent({
      contents: [{ role: "user", parts: [{ text: message }] }],
    });
    const reply = result.response.candidates?.[0]?.content?.parts?.[0]?.text ?? "";
    res.json({ reply });
  } catch (err) {
    console.error(err);
    res.status(500).json({ error: err.message });
  }
}

Mount it in your server entry point:

import express from "express";
import { chatHandler } from "./api/chat.js";

const app = express();
app.use(express.json());
app.post("/api/chat", chatHandler);
app.use(express.static("public"));
app.listen(process.env.PORT || 3000);
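
The snippets above assume a persona.json sitting next to the server entry point. Its exact shape is up to you — the system prompt just stringifies whatever is in it — so something like this entirely hypothetical example is enough:

```json
{
  "name": "Ada Example",
  "role": "Backend engineer",
  "location": "Berlin",
  "interests": ["distributed systems", "trail running"],
  "links": { "github": "https://github.com/ada-example" }
}
```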

Add the chat widget

Drop this into public/app.js and include it from your HTML with <script src="app.js" defer></script>.

// public/app.js
(function () {
  const btn = document.createElement("button");
  btn.textContent = "Chat";
  btn.style.cssText =
    "position:fixed;bottom:1.5rem;right:1.5rem;padding:.75rem 1.25rem;" +
    "background:#4285F4;color:#fff;border:0;border-radius:999px;" +
    "font:500 .95rem system-ui;cursor:pointer;z-index:9999;";

  const panel = document.createElement("div");
  panel.style.cssText =
    "position:fixed;bottom:5rem;right:1.5rem;width:340px;max-height:60vh;" +
    "display:none;flex-direction:column;background:#fff;border:1px solid #E8EAED;" +
    "border-radius:16px;box-shadow:0 8px 24px rgba(0,0,0,.15);" +
    "font:400 .95rem system-ui;z-index:9999;overflow:hidden;";
  panel.innerHTML =
    '<div id="log" style="flex:1;overflow:auto;padding:1rem;"></div>' +
    '<form id="f" style="display:flex;border-top:1px solid #E8EAED;">' +
    '<input id="i" placeholder="Ask about me..." style="flex:1;border:0;padding:.75rem;font:inherit;outline:none;"/>' +
    '<button style="border:0;background:#4285F4;color:#fff;padding:0 1rem;cursor:pointer;">Send</button>' +
    '</form>';

  document.body.append(btn, panel);
  btn.onclick = () => (panel.style.display = panel.style.display === "flex" ? "none" : "flex");

  const log = panel.querySelector("#log");
  const input = panel.querySelector("#i");
  // escape before inserting as HTML — user and model text are untrusted
  const esc = (s) => s.replace(/[&<>"']/g, (c) => `&#${c.charCodeAt(0)};`);
  panel.querySelector("#f").onsubmit = async (e) => {
    e.preventDefault();
    const text = input.value.trim();
    if (!text) return;
    log.insertAdjacentHTML("beforeend", `<p><b>You:</b> ${esc(text)}</p>`);
    input.value = "";
    try {
      const r = await fetch("/api/chat", {
        method: "POST",
        headers: { "Content-Type": "application/json" },
        body: JSON.stringify({ message: text }),
      });
      const { reply, error } = await r.json();
      log.insertAdjacentHTML("beforeend", `<p><b>Site:</b> ${esc(reply || error || "(empty reply)")}</p>`);
    } catch {
      log.insertAdjacentHTML("beforeend", "<p><b>Site:</b> request failed</p>");
    }
    log.scrollTop = log.scrollHeight;
  };
})();

Test locally

Start the dev server (npm run dev) and hit the endpoint from a second terminal before touching the browser, e.g. curl -s localhost:3000/api/chat -H "Content-Type: application/json" -d '{"message":"What do you do?"}'. You should get back JSON with a reply field; only then open the page and test the widget.

Redeploy

./deploy.sh

This pushes a new revision to Cloud Run with the chat endpoint and widget included. Hit your public URL and verify the chat button is there and answers.

Troubleshooting

"GOOGLE_CLOUD_PROJECT not set"

Cause. .env isn't being loaded, or the dev server was started before you created it.

Fix. Confirm the value is set, then restart:

cat .env
npm run dev

"Permission 'aiplatform.endpoints.predict' denied"

Cause. The service account calling Vertex AI lacks the right role.

Fix. Grant roles/aiplatform.user to the default compute service account:

PROJECT_NUMBER=$(gcloud projects describe $(gcloud config get-value project) --format='value(projectNumber)')
gcloud projects add-iam-policy-binding $(gcloud config get-value project) \
  --member="serviceAccount:${PROJECT_NUMBER}-compute@developer.gserviceaccount.com" \
  --role="roles/aiplatform.user"

"Model not found" / 404 from Vertex AI

Cause. Either the model isn't available in your region, or the model name has changed since the snippet was written.

Fix. Try us-central1 as LOCATION. If that still fails, look up a current Gemini model name in the Vertex AI docs and update the model field.

503 / quota errors

Cause. Fresh projects ship with low Vertex AI quota — a handful of requests per minute.

Fix. Wait a minute between test calls. For real use, request a quota increase in the Cloud Console under IAM & Admin → Quotas.

Key takeaways
  • Two architectures, same UX: direct Vertex AI call vs. routing through your MCP server.
  • Most failures are auth, region, or model-name drift — not your code.
  • Persona JSON in the system prompt is enough to ground a small chatbot. RAG is overkill at this scale.
  • Stream once you have something working. Don't optimize before it's correct.

From chatbot to agent

A chatbot that calls one model is the entry point. The next step is an agent that calls tools, plans, and acts. Google's Agent Development Kit (ADK) is the production framework for that.