How I made my chatbot understand current events
So here's what I was dealing with. Gemini (the AI model I'm using) has a knowledge cutoff in early 2025. That means if you ask it about anything that happened after that date, it literally doesn't know. It can't tell you today's weather, current stock prices, recent news, or who won yesterday's game.
For a student using an educational chatbot, this is a huge problem. Imagine asking about a recent scientific discovery or current events for a homework assignment, and getting "I don't have that information" or worse, the AI making something up.
Instead of accepting this limitation, I built a system that gives the AI access to current information. The approach is pretty straightforward - when someone asks a question that needs recent data, the chatbot searches the web first, then uses those results to answer.
First thing I needed was a way to figure out which questions actually need current data. I wrote a detection function:
function needsRealTimeData(message) {
  const lower = message.toLowerCase();

  // Words and phrases that usually signal a need for fresh data.
  const realTimeKeywords = [
    'today', 'now', 'current', 'currently', 'latest', 'recent', 'recently',
    'this week', 'this month', 'this year',
    '2024', '2025',
    'news', 'update', 'happening', 'going on', 'breaking',
    'price', 'stock', 'weather', 'score', 'result', 'live',
    'status', 'situation', 'development', 'announcement',
    'just', 'yesterday', 'last week', 'last month'
  ];

  // Match whole words (with an optional plural "s") rather than substrings,
  // so 'now' does not fire on "know" and 'live' does not fire on "deliver".
  const keywordPattern = new RegExp(`\\b(${realTimeKeywords.join('|')})s?\\b`);
  if (keywordPattern.test(lower)) {
    return true;
  }

  // Question shapes that tend to ask about the present. The last two are
  // broad and also match timeless questions such as "when did the war end".
  const currentQuestionPatterns = [
    /what('s| is) (the )?(latest|current|today|new|happening)/i,
    /who (is|are) (the )?(current|now)/i,
    /how (much|many) (is|are|does|cost)/i,
    /when (is|did|will|does)/i,
  ];

  return currentQuestionPatterns.some(pattern => pattern.test(message));
}
One simple but important thing I do is inject the current date into every conversation. This helps the AI understand temporal context:
function getCurrentDateTime() {
  const now = new Date();
  return {
    utc: now.toUTCString(),
    ist: now.toLocaleString('en-IN', { timeZone: 'Asia/Kolkata' }),
    timestamp: now.toISOString(),
    // year, month, day and dayOfWeek are all pinned to UTC so they always
    // describe the same calendar day, whatever the server's local time zone is.
    year: now.getUTCFullYear(),
    month: now.toLocaleString('en-US', { month: 'long', timeZone: 'UTC' }),
    day: now.getUTCDate(),
    dayOfWeek: now.toLocaleString('en-US', { weekday: 'long', timeZone: 'UTC' }),
    unixTimestamp: Math.floor(now.getTime() / 1000)
  };
}
Then I include this in the system prompt:
Current Date: ${dateTime.dayOfWeek}, ${dateTime.month} ${dateTime.day}, ${dateTime.year}
Current Time (IST): ${dateTime.ist}
This seems basic, but it matters. The model has no clock of its own, so without the date it can't resolve words like "today" or "this year", and it might not even realize that a question is time-sensitive.
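The endpoint further down calls a `buildSystemPrompt(dateTime, searchResults)` helper that isn't shown in this post. Here is a minimal sketch of what such a helper might look like. The wording of the instructions and the `title` and `snippet` fields on each result are my assumptions for illustration; only `url` and `trusted` appear elsewhere in the post.

```javascript
// Hypothetical sketch: the real buildSystemPrompt is not shown in this post.
// Assumes each search result looks like { title, url, snippet, trusted }.
function buildSystemPrompt(dateTime, searchResults) {
  const lines = [
    'You are an educational assistant for students.',
    `Current Date: ${dateTime.dayOfWeek}, ${dateTime.month} ${dateTime.day}, ${dateTime.year}`,
    `Current Time (IST): ${dateTime.ist}`
  ];

  if (searchResults && searchResults.length > 0) {
    lines.push('', 'Web search results (prefer sources marked trusted):');
    searchResults.forEach((r, i) => {
      const tag = r.trusted ? ' [trusted]' : '';
      lines.push(`${i + 1}. ${r.title}${tag} (${r.url})`);
      if (r.snippet) lines.push(`   ${r.snippet}`);
    });
    lines.push('', 'Base time-sensitive claims on these results and cite the URL.');
  } else {
    // No search was run, or it returned nothing.
    lines.push('', 'No web results are available. Say so if the question needs newer data than you have.');
  }

  return lines.join('\n');
}
```

Passing `null` for `searchResults` still produces a prompt with the date lines, so the model gets temporal context even when no search is performed.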
Different types of questions need different approaches. I categorize queries to optimize the search:
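function detectCategory(message) {
  const lower = message.toLowerCase();

  // Subject checks run first. The real-time keyword list already contains
  // 'news', 'today', 'latest' and so on, so if that check ran first the
  // 'news' branch could never be reached, and any message sent here because
  // it needs real-time data would always come back as 'realtime'.
  if (/\b(math|calculus|algebra|geometry|equation|formula|theorem)\b/.test(lower)) {
    return 'math';
  }
  if (/\b(biology|chemistry|physics|science|experiment|molecule|atom)\b/.test(lower)) {
    return 'science';
  }
  if (/\b(code|programming|javascript|python|function|algorithm|debug)\b/.test(lower)) {
    return 'programming';
  }
  if (/\b(news|today|current|recent|latest|happening|breaking)\b/.test(lower)) {
    return 'news';
  }
  // Anything else that still looks time-sensitive (prices, weather, scores).
  if (needsRealTimeData(lower)) {
    return 'realtime';
  }
  return 'general';
}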
Not all websites are equally reliable, especially for students. I maintain lists of trusted sources by category:
const TRUSTED_SOURCES = {
  general: [
    'wikipedia.org', 'britannica.com', 'khanacademy.org',
    'coursera.org', 'edu'
  ],
  science: [
    'ncbi.nlm.nih.gov', 'nature.com', 'sciencedirect.com',
    'arxiv.org', 'scientificamerican.com'
  ],
  math: [
    'wolframalpha.com', 'mathworld.wolfram.com', 'brilliant.org'
  ],
  programming: [
    'stackoverflow.com', 'github.com', 'mdn.mozilla.org',
    'w3schools.com', 'geeksforgeeks.org'
  ],
  news: [
    'bbc.com', 'reuters.com', 'apnews.com', 'theguardian.com',
    'cnn.com', 'cnbc.com', 'news'
  ],
};
When search results come back, I flag which ones are from trusted domains:
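const trustedDomains = [
  ...TRUSTED_SOURCES.general,
  ...(TRUSTED_SOURCES[category] || []),
  ...(TRUSTED_SOURCES.news || [])
];

// Compare against the hostname, not the whole URL. A substring check would
// mark any URL that contains "edu" or "news" anywhere as trusted.
const isTrusted = (url) => {
  try {
    const host = new URL(url).hostname.toLowerCase();
    return trustedDomains.some(domain =>
      host === domain || host.endsWith(`.${domain}`)
    );
  } catch {
    return false; // unparseable URL
  }
};

const finalResults = results.map(r => ({
  ...r,
  trusted: isTrusted(r.url),
  provider: 'Serper'
}));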
Here's how everything works together in the actual endpoint:
chatRoutes.post('/message', async (c) => {
  const { message, enableResearch = true } = await c.req.json();
  const sanitizedMessage = sanitizeInput(message);
  const dateTime = getCurrentDateTime();

  // Check if we need real-time data
  let searchResults = null;
  if (enableResearch && needsRealTimeData(sanitizedMessage)) {
    const category = detectCategory(sanitizedMessage);
    const searchQuery = buildSearchQuery(sanitizedMessage, category);
    console.log(`Research enabled: "${searchQuery}" [${category}]`);
    searchResults = await performWebSearch(searchQuery, category, c.env);
    if (searchResults && searchResults.length > 0) {
      console.log(`Using ${searchResults.length} real-time sources`);
    }
  }

  // Build system prompt with current date and search results
  const genAI = new GoogleGenerativeAI(apiKey);
  const model = genAI.getGenerativeModel({
    model: 'gemini-2.5-flash',
    systemInstruction: buildSystemPrompt(dateTime, searchResults)
  });

  // Generate response
  const result = await model.generateContent(sanitizedMessage);
  const text = result.response.text();

  return c.json({
    success: true,
    response: text,
    timestamp: dateTime.timestamp,
    sources: searchResults || [],
    researchPerformed: !!searchResults,
    searchProvider: searchResults ? 'Serper' : null,
    resultCount: searchResults?.length || 0
  });
});
There were a bunch of edge cases I had to handle:
Sometimes the search API takes too long. I added timeout handling:
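const controller = new AbortController();
const timeoutId = setTimeout(() => controller.abort(), 8000);

let response;
try {
  response = await fetch('https://google.serper.dev/search', {
    method: 'POST',
    headers: {
      'X-API-KEY': apiKey,
      'Content-Type': 'application/json'
    },
    body: JSON.stringify({ q: query, num: 5 }),
    signal: controller.signal
  });
} catch (err) {
  // fetch rejects with an AbortError when the 8-second timer fires.
  console.warn(err.name === 'AbortError' ? 'Search timed out' : `Search failed: ${err.message}`);
  return null;
} finally {
  // Clear the timer on every path, not only on success.
  clearTimeout(timeoutId);
}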
What if the search returns nothing? I handle that gracefully:
if (!results || results.length === 0) {
  console.warn('No search results found');
  return null;
}

// In the main handler:
if (searchResults && searchResults.length > 0) {
  console.log(`Using ${searchResults.length} real-time sources`);
} else {
  console.warn('No search results - Check SERPER_API_KEY');
}
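The `results` array in that check has to be built from the search API's JSON response first. A rough sketch of that step follows. It assumes the response has an `organic` array whose items carry `title`, `link` and `snippet` fields, and the function name `normalizeSearchResults` is made up for this example, so check both against the actual response before relying on it.

```javascript
// Sketch only: assumes the search API responds with
// { organic: [{ title, link, snippet }, ...] }.
function normalizeSearchResults(data, limit = 5) {
  const organic = Array.isArray(data?.organic) ? data.organic : [];

  const results = organic
    // Drop entries without a usable link, since the trust check needs a URL.
    .filter(item => item && typeof item.link === 'string' && item.link.length > 0)
    .slice(0, limit)
    .map(item => ({
      title: item.title || '',
      url: item.link,
      snippet: item.snippet || ''
    }));

  // Return null rather than [] so callers can use one truthiness check.
  return results.length > 0 ? results : null;
}
```

Returning `null` for an empty or malformed response keeps the main handler's `searchResults && searchResults.length > 0` check and the `researchPerformed: !!searchResults` flag consistent with each other.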
When someone asks about current events without specifying a year, I automatically add it:
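if (category === 'realtime' || category === 'news') {
  // Use the actual current year instead of a hard-coded one, and skip
  // queries that already name a year.
  const currentYear = new Date().getUTCFullYear();
  if (!/\b(19|20)\d{2}\b/.test(query)) {
    query += ` ${currentYear}`;
  }
}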
This prevents getting outdated results from previous years.
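For context, here is one way that step could sit inside the `buildSearchQuery(message, category)` helper the endpoint calls. Only the year-appending rule comes from this post. The whitespace and punctuation cleanup, and the optional `year` parameter (added so the function can be tested with a fixed year), are assumptions of this sketch.

```javascript
// Hypothetical sketch of buildSearchQuery; only the year rule is from the post.
function buildSearchQuery(message, category, year = new Date().getUTCFullYear()) {
  // Collapse runs of whitespace and drop trailing sentence punctuation.
  let query = message.trim().replace(/\s+/g, ' ').replace(/[?!.]+$/, '');

  if (category === 'realtime' || category === 'news') {
    // Append the year only when the user has not already named one.
    if (!/\b(19|20)\d{2}\b/.test(query)) {
      query += ` ${year}`;
    }
  }

  return query;
}
```

With `year` fixed at 2025, `buildSearchQuery('Who won the match yesterday?', 'news', 2025)` returns `'Who won the match yesterday 2025'`, while a query that already mentions 2024 is left alone.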
If I was starting over or had more time:
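The key insight is that you don't need to retrain the entire model to give it current information. You just need to detect when a question needs fresh data, run a web search for it, and hand the model the current date and the search results in its system prompt.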
This is way more practical than trying to continuously retrain models, and it keeps the chatbot useful for students who need accurate, current information for their assignments.