PersonaGen

A synthetic persona generation API that creates statistically grounded demographic profiles for AI research, testing, and development.

Date
2025
Type
Personal Project
Stack
Next.js (App Router)
TypeScript
PostgreSQL
Tailwind
BetterAuth
Nextra
Bun
Elysia
Drizzle ORM
Railway
OpenRouter
PersonaGen hero page with 'Generate Persona' call-to-action button and empty placeholder showing the 62+ demographic dimensions available for generation.

Project Purpose and Goal

PersonaGen started as a random idea that popped into my head. I thought it had potential and was worth exploring. The concept of creating a primitive building block for GenAI applications felt like something that should exist, even though I wasn't entirely sure how it would be applied or whether it would even prove useful. The idea of programmatically generating statistically grounded personas seemed like an interesting challenge to tackle.

While you could theoretically use GenAI to populate personas, I wasn't happy with two fundamental issues: how long it would take, and how 'random' the results would actually be, given the inherent bias in LLM outputs. My north star was a system that could generate quick, random personas that felt real and plausible. The technical challenge became building an API that could deliver comprehensive demographic profiles in under 100ms while maintaining statistical accuracy across 60+ dimensions, so that each persona felt authentic rather than artificially generated.
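The reason sub-100ms is feasible at all is that nothing in the hot path calls an LLM: each dimension can be drawn from a precomputed distribution table in-process. Here's a minimal sketch of that idea; the dimension name and weights are invented for illustration and are not PersonaGen's actual data or methodology.

```typescript
type Weighted<T> = { value: T; weight: number };

// Precompute a cumulative table once, so each draw is a cheap binary search.
function buildCdf<T>(entries: Weighted<T>[]): { values: T[]; cum: number[] } {
  const values: T[] = [];
  const cum: number[] = [];
  let total = 0;
  for (const { value, weight } of entries) {
    total += weight;
    values.push(value);
    cum.push(total);
  }
  return { values, cum: cum.map((c) => c / total) }; // normalize to [0, 1]
}

function sample<T>(cdf: { values: T[]; cum: number[] }, r = Math.random()): T {
  // Binary search for the first cumulative weight >= r
  let lo = 0, hi = cdf.cum.length - 1;
  while (lo < hi) {
    const mid = (lo + hi) >> 1;
    if (cdf.cum[mid] < r) lo = mid + 1;
    else hi = mid;
  }
  return cdf.values[lo];
}

// Hypothetical marginal distribution for one demographic dimension.
const maritalStatus = buildCdf([
  { value: "single", weight: 34 },
  { value: "married", weight: 48 },
  { value: "divorced", weight: 11 },
  { value: "widowed", weight: 7 },
]);

const persona = { maritalStatus: sample(maritalStatus) };
```

With the cumulative tables built at startup, a full 60-dimension draw is essentially 60 binary searches, which is how latency stays comfortably in the millisecond range.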


Tech Stack and Explanation

I went with Next.js and the App Router since I'm comfortable with the stack and it handles SEO well. Rather than pulling in heavy component libraries, I built a custom Tailwind component system with a neo-brutalist design aesthetic. BetterAuth handles the dashboard authentication; I find it more straightforward than NextAuth for this kind of setup.

Nextra let me keep the documentation in the same repo while still getting proper markdown support and a clean interface. The backend uses Elysia on Bun because persona generation processes around 160,000 data points, and every bit of performance matters when you're aiming for sub-100ms responses. Drizzle with PostgreSQL handles the data layer, and Railway keeps deployment simple.

PersonaGen interface displaying a fully generated synthetic persona with complete demographic profile including age, location, psychology, lifestyle, and physical characteristics.

Problems Encountered

The biggest challenge was managing 60+ demographic dimensions while ensuring every combination felt realistic and statistically sound. With so many interconnected variables, it was crucial that personas didn't veer too far from expected demographic patterns when viewed across larger population subsets. This required complex logic and mathematical relationships between dimensions. I can't reveal too much about the methodology, but it involved significant computational work to maintain authenticity at scale.

To augment and validate the data points, I integrated GenAI through OpenRouter, incorporating synthetic data generation where appropriate. This required extensive A/B testing across different LLMs to ensure the output met my quality and integrity standards. After testing 15-20 models, including Qwen, DeepSeek, GPT, Sonnet, and Mistral, I eventually settled on Grok-3-mini for its balance of cost-effectiveness and performance, but finding the right one took considerable experimentation and prompt engineering.
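OpenRouter exposes an OpenAI-compatible chat-completions endpoint, which is what makes swapping models for A/B testing a one-line change. Here's a sketch of how such an augmentation request might be shaped; the prompt wording and persona fields are illustrative, not PersonaGen's actual prompts.

```typescript
interface ChatMessage {
  role: "system" | "user";
  content: string;
}

// Build the request body for OpenRouter's chat-completions endpoint.
// Swapping the model id is all it takes to A/B test a different LLM.
function buildAugmentationRequest(persona: Record<string, string>) {
  const messages: ChatMessage[] = [
    {
      role: "system",
      content:
        "You expand demographic profiles into short, plausible narrative details. Respond with JSON only.",
    },
    { role: "user", content: `Augment this persona: ${JSON.stringify(persona)}` },
  ];
  return {
    model: "x-ai/grok-3-mini", // the model settled on after testing
    messages,
    temperature: 0.8,
  };
}

// Sending the request (needs an OpenRouter API key in the environment):
// await fetch("https://openrouter.ai/api/v1/chat/completions", {
//   method: "POST",
//   headers: {
//     Authorization: `Bearer ${process.env.OPENROUTER_API_KEY}`,
//     "Content-Type": "application/json",
//   },
//   body: JSON.stringify(buildAugmentationRequest(persona)),
// });
```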

PersonaGen API key generation settings page showing the interface for creating and managing API keys with usage limits and authentication controls.
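A common pattern for API keys like these is to store only a hash of each issued key and compare in constant time on every request. The key format and scheme below are hypothetical, not PersonaGen's actual implementation.

```typescript
import { createHash, randomBytes, timingSafeEqual } from "node:crypto";

// Hypothetical scheme: issue a random key, persist only its SHA-256 hash.
function issueKey(): { key: string; hash: string } {
  const key = `pg_${randomBytes(24).toString("hex")}`; // "pg_" prefix is invented
  return { key, hash: createHash("sha256").update(key).digest("hex") };
}

// Constant-time comparison avoids leaking key material via timing differences.
function verifyKey(presented: string, storedHash: string): boolean {
  const presentedHash = createHash("sha256").update(presented).digest("hex");
  const a = Buffer.from(presentedHash, "hex");
  const b = Buffer.from(storedHash, "hex");
  return a.length === b.length && timingSafeEqual(a, b);
}
```

The plaintext key is shown to the user exactly once at creation; afterwards every request is checked against the stored hash, so a database leak never exposes usable keys.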

Lessons Learnt

Working with data at this scale while trying to simulate human demographics showed me just how complex people really are. You could easily drive yourself crazy trying to model every possible correlation and edge case. At some point you have to accept that you're not creating a simulation of the entire world on your laptop. The biggest challenge was knowing when to stop tweaking and call it good enough.

The most rewarding part wasn't solving the technical challenges, but bringing the entire idea to life. I loved creating the branding, building the website, writing comprehensive documentation, and making it feel like a real product. There's something satisfying about taking a random idea and turning it into something polished that others might actually use. I'm excited to see what people do with it.