Breaking Apps with AI: How I Used Passmark to Stress Test Cal.com Automatically
Using the Passmark AI Testing Library by Bug0 to Explore, Interact With, and Stress Test Cal.com

Introduction
Background and Overview
Modern web applications are becoming increasingly dynamic and interactive, with today’s platforms heavily relying on real-time UI updates, dynamic forms, complex authentication flows, interactive components, API-driven navigation, and client-side rendering to deliver fast and responsive user experiences.
As applications grow more complex, traditional software testing becomes more difficult to maintain. Conventional automation frameworks generally rely on hardcoded test scenarios such as:
Click Button A → Validate Page B → Check Element C
While traditional automation remains effective for regression validation, it often struggles when applications behave unpredictably or when exploratory testing is required. This is where AI-powered testing becomes particularly valuable, introducing dynamic exploration, autonomous interaction, adaptive browser behavior, and real-time decision making instead of relying solely on rigid scripted flows.
The concept is simple:
“What if an AI could behave like a QA engineer and explore an application automatically?”
That question became the motivation behind this experiment. In this hands-on project, I used Passmark — an open-source AI testing library by Bug0 — to autonomously stress test Cal.com using browser automation and AI-generated interaction logic.
The objective was not destructive testing or exploitation. Instead, the goal was to observe:
How AI explores a real application
How autonomous browser interaction behaves
What types of UI weaknesses can appear
Whether AI-generated actions can simulate exploratory QA
What is Passmark?
Passmark is an open-source AI testing framework designed to automate exploratory and regression testing using AI-assisted browser interaction. Rather than relying entirely on manually written test cases, Passmark introduces a more autonomous approach to QA automation.
Conceptually, Passmark behaves like:
An AI QA engineer that continuously explores an application.
Instead of following a strict testing script, the AI can observe UI elements, decide what to click, trigger interactions, navigate pages, simulate user behavior, and generate testing reports dynamically during execution. This makes the approach particularly interesting for exploratory testing, UI stress testing, regression validation, DevSecOps experimentation, and autonomous browser interaction research.
Why I Chose Cal.com
I selected Cal.com because it represents a realistic modern SaaS application with multiple interactive workflows, including dynamic scheduling interfaces, interactive calendars, navigation-heavy UI components, authentication systems, form interactions, and modal-based workflows.
These characteristics make it a perfect target for exploratory AI testing. Applications like this often contain hidden edge cases involving:
State synchronization
Timing issues
UI transitions
Form validation
Interaction loops
Unexpected navigation behavior
From a testing perspective, this creates an ideal playground for autonomous browser exploration.
Use Case & Flow Architecture
Setting Up the Foundation
The experiment environment was intentionally lightweight. The testing stack consisted of:
| Component | Purpose |
|---|---|
| Node.js | Runtime environment |
| Playwright | Browser automation |
| Gemini AI | AI-generated testing logic |
| Chromium | Browser execution |
| Passmark Concept | Autonomous exploration approach |
The initial setup process was straightforward.
Installing Dependencies
npm install
npx playwright install
The browser engine used throughout the experiment was Chromium running through Playwright.
Architecture Overview
The testing flow combines AI reasoning with browser automation. Instead of manually defining every testing step, the AI dynamically generates browser actions during runtime.
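The architecture flow looked like this:
Prompt → Gemini → JSON list of actions → keyword matching in the test script → Playwright actions in Chromium → screenshot and HTML report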
This architecture effectively transforms AI-generated instructions into executable browser interactions.
Autonomous Exploration Strategy
One of the most interesting aspects of the experiment is that the browser actions were not hardcoded. Instead, the AI dynamically generated interaction ideas. Examples included:
[
"Click a link",
"Type text into field",
"Reload current page",
"Navigate to a URL",
"Spam an add button"
]
This creates behavior that feels closer to exploratory QA than traditional scripted automation does.
The AI behaves less like:
A rigid automation script
and more like:
A curious QA engineer exploring an application
Test Scenario Design
The experiment focused on several autonomous interaction scenarios.
| Scenario | Objective |
|---|---|
| Random Clicking | Explore unexpected navigation paths |
| Invalid Input | Test validation handling |
| Spam Clicking | Stress interaction logic |
| Reload Action | Observe state persistence |
| Random Navigation | Validate transition handling |
Launching the AI Tester
The testing logic was implemented using Playwright and Google Gemini AI.
The test file, tests/ai-calcom.spec.ts, started by importing the required modules.
import 'dotenv/config';
import { test, expect, Page } from '@playwright/test';
import { GoogleGenAI } from '@google/genai';
These modules provide:
| Module | Purpose |
|---|---|
| dotenv | Environment variable loading |
| Playwright | Browser automation |
| GoogleGenAI | AI action generation |
The .env integration is important because the Gemini API key should never be hardcoded directly into the source code.
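Loading the key from the environment also means it can be missing at runtime. As a minimal sketch, assuming a hypothetical helper named requireEnv that is not part of the original script, the test could fail fast with a clear message instead of sending an undefined key to Gemini:

```typescript
// Hypothetical helper (not in the original script): read a required
// variable from an environment-like object and fail fast if it is missing.
type Env = Record<string, string | undefined>;

function requireEnv(env: Env, name: string): string {
  const value = env[name];
  if (value === undefined || value.trim() === '') {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}

// In the test file this would be called with process.env, for example:
// const apiKey = requireEnv(process.env, 'GOOGLE_GENERATIVE_AI_API_KEY');
```

Passing the environment in as an argument keeps the helper easy to check without touching the real process environment.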
The script raises the test timeout to 120 seconds for the autonomous session.
test.setTimeout(120000);
Without this, Playwright's default 30-second test timeout could end the session before all actions have run.
Initializing Gemini AI
The next step initializes the Gemini model connection.
const ai = new GoogleGenAI({
  apiKey: process.env.GOOGLE_GENERATIVE_AI_API_KEY
});
This creates a connection between the test framework and Google Gemini AI. Instead of manually defining browser actions, the script will request testing instructions dynamically from the AI model.
Starting the Autonomous Test
The main Playwright test block defines the autonomous browser testing session.
test(
  'AI autonomously stress tests Cal.com',
  async ({ page }: { page: Page }) => {
This launches:
Chromium browser instance
Playwright execution context
AI interaction workflow
The page object becomes the primary interface for all browser actions.
The script begins by opening Cal.com.
await page.goto('https://cal.com/', {
  waitUntil: 'domcontentloaded'
});
Playwright:
Launches Chromium
Opens the target application
Waits until the DOM content finishes loading
The script then introduces an intentional stabilization delay.
await page.waitForTimeout(3000);
Without stabilization time, the AI might interact with incomplete UI states.
Designing the AI Prompt
One of the most important parts of the experiment is prompt engineering. The script defines a structured instruction for Gemini.
const prompt = `
You are an AI QA engineer.
Generate 5 SIMPLE browser testing actions.
Rules:
- short sentence only
- maximum 5 words
- realistic browser interaction
- executable in UI testing
Allowed actions:
- clicking
- typing
- reload
- navigation
- repeated clicking
Return ONLY valid JSON array.
Example:
[
"click random button",
"fill invalid input",
"reload page",
"spam submit button",
"random navigation"
]
`;
The prompt intentionally narrows the output into actionable browser interactions.
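A prompt can only ask for these rules; the model may still return something longer. As a small sketch, assuming hypothetical helpers named followsPromptRules and filterActions that are not in the original script, the "maximum 5 words" rule could be re-checked on the returned list before anything is executed:

```typescript
// Hypothetical post-check (not in the original script): keep only actions
// that respect the prompt's rules (non-empty, at most five words).
const MAX_WORDS = 5;

function followsPromptRules(action: string): boolean {
  const words = action.trim().split(/\s+/).filter((word) => word.length > 0);
  return words.length > 0 && words.length <= MAX_WORDS;
}

function filterActions(actions: string[]): string[] {
  return actions.filter(followsPromptRules);
}

// filterActions(['Click a link', 'Open the settings page and then log out'])
// keeps only 'Click a link', because the second entry has eight words.
```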
Generating AI Testing Actions
The next step sends the prompt to Gemini.
const result = await ai.models.generateContent({
  model: 'gemini-2.5-flash',
  contents: prompt
});
Gemini processes the instruction and dynamically generates browser testing ideas. This transforms the system from static automation into AI-assisted exploratory testing.
Processing AI Output
The AI response is extracted from the model output.
const response = result.text || '';
Example AI output:
[
"Click a link",
"Type text into field",
"Reload current page",
"Navigate to a URL",
"Spam an add button"
]
The generated JSON is parsed inside a try/catch block.
actions = JSON.parse(
  response
    .replace(/```json/g, '')
    .replace(/```/g, '')
    .trim()
);
If parsing fails, the script automatically falls back to predefined actions.
actions = [
  'click random button',
  'fill invalid input',
  'spam submit button',
  'reload page',
  'random navigation'
];
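The two snippets above can be combined into one function. This is a sketch under assumptions, not the original implementation: the name parseActions is hypothetical, the single regular expression stands in for the two replace calls, and the extra check that the parsed value is a non-empty array of strings is my addition, since valid JSON is not necessarily a list of actions.

```typescript
// Fallback list taken from the article's predefined actions.
const FALLBACK_ACTIONS: string[] = [
  'click random button',
  'fill invalid input',
  'spam submit button',
  'reload page',
  'random navigation',
];

// Hypothetical helper: turn the raw model response into a list of actions.
function parseActions(response: string): string[] {
  // Strip Markdown code fences (three backticks, optionally followed by
  // "json") that the model may wrap around the JSON array.
  const cleaned = response.replace(/`{3}(?:json)?/g, '').trim();
  try {
    const parsed: unknown = JSON.parse(cleaned);
    // Added assumption: accept only a non-empty array of strings.
    if (
      Array.isArray(parsed) &&
      parsed.length > 0 &&
      parsed.every((item) => typeof item === 'string')
    ) {
      return parsed as string[];
    }
  } catch {
    // Invalid JSON falls through to the fallback list below.
  }
  return [...FALLBACK_ACTIONS];
}
```

With this shape the rest of the test never sees a malformed value: it receives either the model's list or the fallback list.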
Autonomous Action Execution Engine
The core of the experiment is the autonomous execution loop.
Instead of relying on predefined browser flows, the framework dynamically interprets AI-generated testing instructions and converts them into real browser interactions.
The execution engine processes every action generated by Gemini AI one by one.
for (const action of actions) {
  console.log(`Executing: ${action}`);
  try {
    // SAFETY CHECK
    if (page.isClosed()) {
      console.log('Page already closed');
      break;
    }
    const lowerAction = action.toLowerCase();
    // CLICK ACTION
    if (
      lowerAction.includes('click') ||
      lowerAction.includes('button')
    ) {
      const elements =
        await page.locator('button, a').all();
      if (elements.length > 0) {
        const randomElement =
          elements[
            Math.floor(Math.random() * elements.length)
          ];
        try {
          await randomElement.click({
            timeout: 3000,
            force: true
          });
          console.log('Random click executed');
        } catch (error) {
          console.log('Random click failed');
        }
      }
    }
    // INPUT ACTION
    if (
      lowerAction.includes('input') ||
      lowerAction.includes('fill') ||
      lowerAction.includes('type')
    ) {
      const inputs =
        await page.locator('input, textarea').all();
      for (const input of inputs) {
        try {
          await input.fill(
            '@@INVALID_PAYLOAD###'
          );
        } catch {}
      }
      console.log('Invalid input injected');
    }
    // SPAM CLICK ACTION
    if (
      lowerAction.includes('spam') ||
      lowerAction.includes('repeated')
    ) {
      const buttons =
        await page.locator('button').all();
      if (buttons.length > 0) {
        const button = buttons[0];
        for (let i = 0; i < 5; i++) {
          try {
            await button.click({
              force: true
            });
          } catch {}
        }
        console.log('Spam click executed');
      }
    }
    // RELOAD ACTION
    if (
      lowerAction.includes('reload')
    ) {
      await page.reload({
        waitUntil: 'domcontentloaded'
      });
      console.log('Page reloaded');
    }
    // NAVIGATION ACTION
    if (
      lowerAction.includes('navigation')
    ) {
      await page.goto('https://cal.com/', {
        waitUntil: 'domcontentloaded'
      });
      console.log('Navigation executed');
    }
    // RANDOM DELAY
    const randomDelay =
      Math.floor(Math.random() * 2000) + 1000;
    await page.waitForTimeout(randomDelay);
  } catch (error) {
    console.log(`Action failed: ${action}`);
    console.log(error);
  }
}
This block acts as the central decision-making and execution layer of the AI testing framework. The workflow begins when Gemini generates browser actions such as:
[
"Click a link",
"Type text into field",
"Reload current page",
"Navigate to a URL",
"Spam an add button"
]
The framework then loops through every generated instruction dynamically using:
for (const action of actions)
Unlike traditional automation frameworks that depend entirely on predefined flows, this approach allows the browser behavior to change depending on AI-generated decisions.
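Because each branch in the loop is an independent substring check, one instruction can trigger several handlers or none at all. The sketch below isolates that matching step so it can be inspected without a browser. The name matchHandlers and the Handler type are hypothetical; the keyword lists are copied from the conditions in the loop.

```typescript
type Handler = 'click' | 'input' | 'spam' | 'reload' | 'navigation';

// Keyword lists copied from the conditions in the execution loop.
const KEYWORDS: Record<Handler, string[]> = {
  click: ['click', 'button'],
  input: ['input', 'fill', 'type'],
  spam: ['spam', 'repeated'],
  reload: ['reload'],
  navigation: ['navigation'],
};

// Hypothetical helper: list every handler whose keywords appear in the action.
function matchHandlers(action: string): Handler[] {
  const lower = action.toLowerCase();
  return (Object.keys(KEYWORDS) as Handler[]).filter((handler) =>
    KEYWORDS[handler].some((keyword) => lower.includes(keyword)),
  );
}

// matchHandlers('Click a link')        // → ['click']
// matchHandlers('Spam an add button')  // → ['click', 'spam']
// matchHandlers('Navigate to a URL')   // → []
```

The last two cases are worth noticing: "Spam an add button" runs a random click before the spam click because it contains "button", and "Navigate to a URL" matches nothing because the loop looks for "navigation" rather than "navigate". Matching on a shared stem such as "navigat" would be one way to close that gap.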
Before any interaction begins, the script validates whether the browser page is still active.
if (page.isClosed()) {
  break;
}
For click-based actions, the framework collects every button and link currently on the page.
await page.locator('button, a').all();
A random element is then selected:
Math.floor(Math.random() * elements.length)
The selected element is clicked automatically.
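Random selection makes a run hard to replay when something interesting happens. One option, sketched here with a hypothetical helper named pickRandom, is to pass the random source in as a parameter so that a fixed source can be substituted during debugging. The original script calls Math.random directly.

```typescript
// Hypothetical helper: pick one item using an injectable random source.
// `random` must return a number in [0, 1), like Math.random.
function pickRandom<T>(
  items: T[],
  random: () => number = Math.random,
): T | undefined {
  if (items.length === 0) return undefined; // Nothing to choose from.
  return items[Math.floor(random() * items.length)];
}

// pickRandom(['a', 'b', 'c'], () => 0)    // → 'a'
// pickRandom(['a', 'b', 'c'], () => 0.5)  // → 'b'
```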
The AI also performs invalid form interaction using:
await input.fill(
  '@@INVALID_PAYLOAD###'
);
This simulates aggressive or malformed user behavior. This is especially useful when testing modern SaaS applications containing complex forms and interactive workflows.
Another important stress-testing technique implemented in the framework is repeated clicking.
for (let i = 0; i < 5; i++)
The loop clicks the first button on the page five times in quick succession.
This helps simulate:
Impatient users
Rapid interaction behavior
UI flooding scenarios
Repeated interactions can expose:
Race conditions
Duplicate request problems
Debounce weaknesses
State synchronization issues
The framework also performs page reloads and forced navigation.
Reload logic:
await page.reload({
  waitUntil: 'domcontentloaded'
});
Navigation logic:
await page.goto('https://cal.com/', {
  waitUntil: 'domcontentloaded'
});
These actions help evaluate:
Session persistence
UI recovery behavior
State restoration
Navigation stability
To avoid deterministic interaction patterns, the framework introduces random delays between actions.
const randomDelay =
  Math.floor(Math.random() * 2000) + 1000;
Each pause lasts between 1 second and just under 3 seconds, so the timing between interactions looks more like a human user than a perfectly synchronized script. The pauses also give the application state time to settle between interactions.
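The arithmetic behind that delay can be checked in isolation. In this sketch the function name randomDelayMs and its parameters are hypothetical; the defaults reproduce the expression used in the loop.

```typescript
// Hypothetical helper mirroring Math.floor(Math.random() * 2000) + 1000.
// `random` must return a number in [0, 1), like Math.random.
function randomDelayMs(
  random: () => number = Math.random,
  minMs = 1000,
  spreadMs = 2000,
): number {
  // random() * spreadMs lies in [0, spreadMs), so after flooring the
  // result lies in [minMs, minMs + spreadMs - 1].
  return Math.floor(random() * spreadMs) + minMs;
}

// randomDelayMs(() => 0)       // → 1000
// randomDelayMs(() => 0.9999)  // → 2999
```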
Screenshot & Reporting
At the end of execution, the framework captures a final screenshot.
await page.screenshot({
  path: 'final-result.png',
  fullPage: true
});
Final Validation
The script validates that the browser is still within the expected domain.
await expect(page).toHaveURL(/cal\.com/);
This confirms:
Navigation remained valid
Browser session survived
Test execution completed successfully
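A regular expression tested against the whole URL also matches when the text appears only in a path or query string. As a stricter alternative, sketched here with a hypothetical helper named isOnCalCom, the hostname can be compared directly:

```typescript
// Hypothetical helper: true only when the URL's host is cal.com or one of
// its subdomains.
function isOnCalCom(url: string): boolean {
  try {
    const { hostname } = new URL(url);
    return hostname === 'cal.com' || hostname.endsWith('.cal.com');
  } catch {
    return false; // Not a parseable absolute URL.
  }
}

// isOnCalCom('https://cal.com/pricing')            // → true
// isOnCalCom('https://example.com/?next=cal.com')  // → false
```

Inside the test this could be asserted with expect(isOnCalCom(page.url())).toBe(true). Whether subdomains should count as "still on Cal.com" is a judgment call that depends on what the random clicks are allowed to reach.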
Running the Autonomous AI Test
The complete test is executed using:
npx playwright test tests/ai-calcom.spec.ts --headed
The --headed flag visually displays browser activity in real time.
This allows observation of:
AI interactions
Browser movement
Autonomous exploration
UI transitions
Execution Result
The final output demonstrated successful autonomous execution.
Running 1 test using 1 worker
1 tests\ai-calcom.spec.ts:14:5 › AI autonomously stress tests Cal.com
=================================
AI AUTONOMOUS TEST STARTED
=================================
Launching browser...
Generating AI actions...
=================================
GEMINI OUTPUT
=================================
```json
[
"Click a link",
"Type text into field",
"Reload current page",
"Navigate to a URL",
"Spam an add button"
]
```
=================================
EXECUTING ACTIONS
=================================
Executing: Click a link
Random click executed
Executing: Type text into field
Invalid input injected
Executing: Reload current page
Page reloaded
Executing: Navigate to a URL
Executing: Spam an add button
Random click executed
Spam click executed
=================================
FINALIZING TEST
=================================
Screenshot saved: final-result.png
=================================
AI TESTING COMPLETED
=================================
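The run completed end to end using AI-generated instructions and Playwright execution logic. Four of the five generated actions triggered a browser interaction. "Navigate to a URL" logged no action because the navigation branch only matches the substring "navigation", and "Spam an add button" triggered both the click and spam branches because it contains "button".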
Generating the HTML Report
After execution, Playwright automatically generated an HTML report.
npx playwright show-report
This launched a local reporting dashboard:
http://localhost:9323
The report included:
Execution logs
Interaction timelines
Screenshots
Test duration
Browser traces
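Traces and screenshots only appear in the HTML report when the Playwright configuration records them. The project's configuration file is not shown in this post, so the following is an assumed playwright.config.ts rather than the one used in the experiment. It enables the HTML reporter, records a trace and screenshot for every test, and runs on Chromium only.

```typescript
// playwright.config.ts (assumed; the original project's config is not shown)
import { defineConfig, devices } from '@playwright/test';

export default defineConfig({
  testDir: './tests',
  // Produces the report opened by `npx playwright show-report`.
  reporter: 'html',
  use: {
    // Record a trace and a screenshot for every test, passing or failing.
    trace: 'on',
    screenshot: 'on',
  },
  projects: [
    { name: 'chromium', use: { ...devices['Desktop Chrome'] } },
  ],
});
```

Recording on every run makes the report larger. For a single exploratory test that trade-off seems reasonable, since the trace is the main record of what the random actions actually did.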
Demo: AI Testing Cal.com
Once the test started, the browser worked through the AI-generated actions on its own. The script selected random UI elements, filled form fields, triggered reloads, and explored the application without predefined navigation paths.
GitHub repository: Passmark-calcom
Conclusion
This experiment showcased how AI-driven browser automation can greatly improve exploratory testing. By integrating Passmark concepts with Playwright and Gemini AI, it created a workflow that dynamically explores UI flows, simulates unpredictable user interactions, stress tests application behavior, and generates automated execution reports. This approach transforms the testing process from rigid scripted testing into a more realistic user exploration of the application.
Most importantly, the testing process felt less like:
Scripted automation
and more like:
Autonomous exploratory QA
The experiment underscored a crucial point:
"AI testing is not a replacement for QA engineers but an enhancement."
Combining human intuition with AI-driven exploration results in a robust testing approach for modern applications.
#BreakingAppsHackathon #Passmark #Hackathon #Bug0 #GeminiAI #Hashnode



