# Add a chatbot to your site

Wire up a chat widget on your deployed site that answers questions about you, using either Vertex AI directly or your MCP server.
## Two paths

Pick one. Both end up at the same UX — a small chat panel on your site that knows about you.

| Path | Flow | Tradeoff |
|---|---|---|
| Vertex AI direct | site → `/api/chat` → Vertex AI Gemini with `persona.json` as system prompt | Simpler. More failure modes (auth, IAM, region, model name). |
| Via your MCP server | site → `/api/chat` → your MCP server → tools answer | More elegant if step 7 is done. No Gemini call needed for direct tool questions. |
Recommendation: take the Vertex AI direct path on first run. Once it works, swap to the MCP path for round 2.
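The "no Gemini call needed" part of the MCP path can start as plain keyword routing in front of the model. A minimal sketch — the tool names (`get_skills`, `get_projects`, `get_experience`) are hypothetical placeholders for whatever your step-7 MCP server actually exposes:

```js
// Sketch: send obvious questions straight to an MCP-style tool and only
// fall back to the model for everything else. Tool names are hypothetical.
const routes = [
  { pattern: /skill/i, tool: "get_skills" },
  { pattern: /project/i, tool: "get_projects" },
  { pattern: /work|job|experience/i, tool: "get_experience" },
];

function routeMessage(message) {
  const hit = routes.find((r) => r.pattern.test(message));
  return hit ? { kind: "tool", tool: hit.tool } : { kind: "model" };
}
```

`routeMessage("What are your top skills?")` resolves to the `get_skills` tool; anything unmatched falls through to the Gemini call.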
## Configure environment variables

Copy the example file and fill in your project details. Note that Node does not load `.env` by itself: either start the server with `node --env-file=.env` (Node 20.6+) or load `dotenv` in your entry point.

```bash
cp .env.example .env
# then edit .env:
GOOGLE_CLOUD_PROJECT=your-project-id
LOCATION=europe-west1
```
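Missing configuration otherwise surfaces as a confusing failure on the first Vertex AI call. A small fail-fast check at startup makes it an immediate, readable error — `requireEnv` is a hypothetical helper, not part of any SDK:

```js
// Sketch: fail fast on missing configuration instead of failing mid-request.
function requireEnv(name, env = process.env) {
  const value = env[name];
  if (!value) {
    throw new Error(`${name} is not set -- copy .env.example to .env and fill it in`);
  }
  return value;
}

// At the top of your server entry point, e.g.:
// const project = requireEnv("GOOGLE_CLOUD_PROJECT");
```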
## Authenticate

```bash
gcloud auth application-default login
```

This drops application-default credentials on disk. The Vertex AI client picks them up automatically. On Cloud Run the service account is used instead — make sure it has the `roles/aiplatform.user` role.
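Granting that role to the Cloud Run runtime service account might look like the following — a sketch of the standard IAM grant, assuming the service runs as the Compute Engine default service account (substitute your own project ID):

```bash
# Placeholder: substitute your own project ID.
PROJECT_ID=your-project-id
PROJECT_NUMBER=$(gcloud projects describe "$PROJECT_ID" --format="value(projectNumber)")

gcloud projects add-iam-policy-binding "$PROJECT_ID" \
  --member="serviceAccount:${PROJECT_NUMBER}-compute@developer.gserviceaccount.com" \
  --role="roles/aiplatform.user"
```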
## Write the chat endpoint

Create `api/chat.js` (or wire it into your existing Express app). This builds a persona-grounded system prompt from `persona.json` and calls Gemini through Vertex AI.
```js
// api/chat.js
import { VertexAI } from "@google-cloud/vertexai";
import { readFileSync } from "node:fs";

const persona = JSON.parse(readFileSync("./persona.json", "utf8"));

const vertex = new VertexAI({
  project: process.env.GOOGLE_CLOUD_PROJECT,
  location: process.env.LOCATION || "europe-west1",
});

const model = vertex.getGenerativeModel({
  model: "gemini-2.5-flash", // verify current model name in Vertex AI docs
  systemInstruction: {
    role: "system",
    parts: [
      {
        text:
          "You answer questions about the person described in the JSON below. " +
          "Be concise. Decline questions that are not about this person.\n\n" +
          JSON.stringify(persona),
      },
    ],
  },
});

export async function chatHandler(req, res) {
  try {
    const { message } = req.body || {};
    if (!message) return res.status(400).json({ error: "missing message" });
    const result = await model.generateContent({
      contents: [{ role: "user", parts: [{ text: message }] }],
    });
    const reply =
      result.response.candidates?.[0]?.content?.parts?.[0]?.text ?? "";
    res.json({ reply });
  } catch (err) {
    console.error(err);
    res.status(500).json({ error: err.message });
  }
}
```
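The handler reads `persona.json` from the working directory; its schema is entirely up to you, since the prompt simply stringifies whatever is there. A hypothetical minimal shape — every value below is a placeholder:

```json
{
  "name": "Ada Example",
  "summary": "Backend engineer focused on cloud infrastructure.",
  "skills": ["TypeScript", "Node.js", "Cloud Run", "Terraform"],
  "projects": [
    { "name": "mcp-server", "description": "Personal MCP server exposing CV data as tools." }
  ],
  "experience": [
    { "company": "ExampleCorp", "role": "Senior Engineer", "years": "2021-2025" }
  ]
}
```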
Mount it in your server entry point:
```js
import express from "express";
import { chatHandler } from "./api/chat.js";

const app = express();
app.use(express.json());
app.post("/api/chat", chatHandler);
app.use(express.static("public"));
app.listen(process.env.PORT || 3000);
```
## Add the chat widget

Drop this into `public/app.js` and include it from your HTML with `<script src="app.js" defer></script>`.
```js
// public/app.js
(function () {
  // Escape user and model text before inserting it as HTML.
  const esc = (s) =>
    String(s).replace(/[&<>"']/g, (c) =>
      ({ "&": "&amp;", "<": "&lt;", ">": "&gt;", '"': "&quot;", "'": "&#39;" }[c]));

  const btn = document.createElement("button");
  btn.textContent = "Chat";
  btn.style.cssText =
    "position:fixed;bottom:1.5rem;right:1.5rem;padding:.75rem 1.25rem;" +
    "background:#4285F4;color:#fff;border:0;border-radius:999px;" +
    "font:500 .95rem system-ui;cursor:pointer;z-index:9999;";

  const panel = document.createElement("div");
  panel.style.cssText =
    "position:fixed;bottom:5rem;right:1.5rem;width:340px;max-height:60vh;" +
    "display:none;flex-direction:column;background:#fff;border:1px solid #E8EAED;" +
    "border-radius:16px;box-shadow:0 8px 24px rgba(0,0,0,.15);" +
    "font:400 .95rem system-ui;z-index:9999;overflow:hidden;";
  panel.innerHTML =
    '<div id="log" style="flex:1;overflow:auto;padding:1rem;"></div>' +
    '<form id="f" style="display:flex;border-top:1px solid #E8EAED;">' +
    '<input id="i" placeholder="Ask about me..." style="flex:1;border:0;padding:.75rem;font:inherit;outline:none;"/>' +
    '<button style="border:0;background:#4285F4;color:#fff;padding:0 1rem;cursor:pointer;">Send</button>' +
    "</form>";
  document.body.append(btn, panel);

  btn.onclick = () =>
    (panel.style.display = panel.style.display === "flex" ? "none" : "flex");

  const log = panel.querySelector("#log");
  const input = panel.querySelector("#i");

  panel.querySelector("#f").onsubmit = async (e) => {
    e.preventDefault();
    const text = input.value.trim();
    if (!text) return;
    log.insertAdjacentHTML("beforeend", `<p><b>You:</b> ${esc(text)}</p>`);
    input.value = "";
    try {
      const r = await fetch("/api/chat", {
        method: "POST",
        headers: { "Content-Type": "application/json" },
        body: JSON.stringify({ message: text }),
      });
      const { reply, error } = await r.json();
      log.insertAdjacentHTML("beforeend", `<p><b>Site:</b> ${esc(reply || error)}</p>`);
    } catch {
      log.insertAdjacentHTML("beforeend", "<p><b>Site:</b> request failed</p>");
    }
    log.scrollTop = log.scrollHeight;
  };
})();
```
## Test locally

- Restart the dev server: `npm run dev`
- Open http://localhost:3000
- Click the floating Chat button
- Try: "What are your top skills?", "Tell me about your projects.", "Where have you worked?"
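You can also hit the endpoint directly, without the widget — this assumes the dev server from the step above is running on port 3000:

```bash
curl -s -X POST http://localhost:3000/api/chat \
  -H "Content-Type: application/json" \
  -d '{"message":"What are your top skills?"}'
```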
## Redeploy

```bash
./deploy.sh
```

This pushes a new revision to Cloud Run with the chat endpoint and widget included. Hit your public URL and verify the chat button is there and answers.
- "GOOGLE_CLOUD_PROJECT not set" —
.envnot loaded. Check the file, restart the server. - IAM error / 403 on Vertex AI — the service account needs
roles/aiplatform.user. Grant it to the default compute SA in IAM & Admin. - "Model not found in region" — the model isn't available in your
LOCATION. Tryus-central1. - "Model not found" with the right region — Gemini model names drift. Check the current model in the Vertex AI Gemini docs and update
model. - Quota errors — fresh projects have low Vertex AI quota. Request an increase or wait a few minutes between requests during testing.
## Where to take it next

- Switch to MCP-server-backed chat: have `/api/chat` spawn or talk to your MCP server from step 7 instead of calling Vertex directly
- Stream responses with `generateContentStream` and Server-Sent Events for token-by-token UX
- Persist chat history per session (cookie or short-lived KV) and pass it back as `contents`
- Add auth or rate limiting before it gets indexed and abused
## Key takeaways

- Two architectures, same UX: direct Vertex AI call vs. routing through your MCP server.
- Most failures are auth, region, or model-name drift — not your code.
- Persona JSON in the system prompt is enough to ground a small chatbot. RAG is overkill at this scale.
- Stream once you have something working. Don't optimize before it's correct.
A chatbot that calls one model is the entry point. The next step is an agent that calls tools, plans, and acts. Google's Agent Development Kit (ADK) is the production framework for that.
- Building AI Agents with ADK: The Foundation — start here.
- Tools Make an Agent — turn the chatbot into a tool-using assistant.
- MCP, ADK and A2A together — combine all three concepts in one project.