I Built an AI That Writes My Blog Posts
A voice-to-blog pipeline using Whisper, GPT-4, and DALL-E. Because typing is overrated.
The Problem
I Have Thoughts. I Hate Typing Them.
Here's the thing about blogging: I actually have plenty of things I want to write about. The problem is the writing part. Sitting down, opening a blank document, staring at the cursor blinking judgmentally at me — it's the worst part of having a personal website.
But talking? Talking I can do. I ramble to myself in the car all the time. What if I could just... record those rambles and have AI turn them into actual content? It sounded like either a brilliant idea or a very lazy one. Maybe both.
The goal was simple: build a system where I could record a 2-minute voice memo, hit a button, and have a fully-formatted blog post with a custom header image appear on my site. No typing, no image hunting, no excuses for not updating my blog.
Spoiler alert: it actually worked. Then I forgot to use it for 90 days and Supabase paused my database. But hey, the technology was sound.
The Stack
What Powered This Thing
- Next.js for the frontend and API routes
- OpenAI Whisper for transcription
- GPT-4 for turning transcripts into posts
- DALL-E 3 for header images
- Supabase for the database, auth, and storage
- TipTap for rich text editing
The Pipeline
Voice → Text → Blog → Image → Done
The architecture was surprisingly straightforward — at least in concept. Each piece of the puzzle handled one specific job, and they all fed into each other like a content assembly line.
How It All Works
1. Record a voice memo in the browser using the MediaRecorder API
2. Upload the audio blob to a Next.js API route (sketched below)
3. Send the audio to OpenAI Whisper for transcription
4. Pass the transcript to GPT-4 with a blog-writing prompt
5. Extract a title and generate a DALL-E image prompt
6. Create the header image with DALL-E 3
7. Save everything to Supabase with timestamps
8. Display the post on the blog with TipTap rich text rendering
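For a sense of what steps 1 and 2 look like in practice, here's a minimal browser-side sketch. The /api/create-post route name and the webm MIME type are my assumptions for illustration, not necessarily what the real app used.

let mediaRecorder;
let chunks = [];

// Ask for the microphone and start buffering audio chunks
async function startRecording() {
  const stream = await navigator.mediaDevices.getUserMedia({ audio: true });
  chunks = [];
  mediaRecorder = new MediaRecorder(stream, { mimeType: "audio/webm" });
  mediaRecorder.ondataavailable = (event) => chunks.push(event.data);
  mediaRecorder.start();
}

// Stop recording, bundle the chunks into one blob, and POST it to the API route
function stopAndUpload() {
  return new Promise((resolve) => {
    mediaRecorder.onstop = async () => {
      const audioBlob = new Blob(chunks, { type: "audio/webm" });
      const formData = new FormData();
      formData.append("audio", audioBlob, "memo.webm"); // field name is an assumption
      const res = await fetch("/api/create-post", { method: "POST", body: formData });
      resolve(await res.json());
    };
    mediaRecorder.stop();
  });
}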
Interactive Demo
See It In Action
Since the actual service is currently hibernating (RIP free tier), here's a simulation of what the pipeline looked like when it was running. The real version was slightly slower because, you know, actual AI processing takes time.
Content Pipeline Demo
const createPost = async (audioBlob) => {
  // 1. Transcribe with Whisper
  const transcript = await whisper.transcribe(audioBlob);

  // 2. Expand with GPT-4
  const content = await gpt4.complete({
    prompt: `Write a blog post based on: ${transcript}`,
    style: "conversational, engaging"
  });

  // 3. Generate header image
  const image = await dalle.generate(content.title);

  // 4. Save to database
  await supabase.from('posts').insert({
    title: content.title,
    content: content.body,
    header_image_url: image.url
  });
};
Features
What It Could Do
Voice-to-Post
Record a quick voice memo and watch it transform into a polished blog post.
AI Writing Assistant
GPT-4 expands your rough thoughts into engaging, well-structured content.
Auto-Generated Images
DALL-E creates unique header images based on your post content.
Rich Text Editor
TipTap-powered editor for manual tweaks and formatting. A minimal usage sketch follows this feature list.
One-Click Publish
From voice memo to live post in under a minute.
Supabase Backend
Real-time database with authentication and storage.
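Because the AI drafts sometimes need a human pass before publishing, the editor piece matters more than it sounds. Here's a minimal sketch of how a TipTap editor could be wired into a Next.js component; the PostEditor name and the onSave handler are hypothetical, but useEditor, EditorContent, and StarterKit are the standard @tiptap/react building blocks.

"use client"; // TipTap runs in the browser, so this is a client component

import { useEditor, EditorContent } from "@tiptap/react";
import StarterKit from "@tiptap/starter-kit";

// Loads an AI-generated post into an editable TipTap instance for manual tweaks
export default function PostEditor({ initialContent, onSave }) {
  const editor = useEditor({
    extensions: [StarterKit],
    content: initialContent, // HTML produced by the pipeline
  });

  if (!editor) return null;

  return (
    <div>
      <EditorContent editor={editor} />
      <button onClick={() => onSave(editor.getHTML())}>Save tweaks</button>
    </div>
  );
}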
Challenges
Things That Went Wrong
Building with AI is fun until you realize these models are simultaneously brilliant and completely unhinged. Here's what I learned the hard way.
Lessons Learned
Was It Worth It?
Absolutely. Even if the project is currently in database purgatory, I learned a ton about building with AI APIs, handling audio in the browser, and the importance of actually using the things you build.
The core idea is sound: AI can genuinely help bridge the gap between "I have thoughts" and "those thoughts are published." The technology is there. The execution worked. The only thing that failed was my commitment to actually recording voice memos.
What I'd Do Differently
- Add a mobile app for easier voice recording on the go
- Set up reminder notifications to actually use the thing
- Use a database that doesn't hibernate (or pay for the premium tier)
- Build in a "draft" mode for posts that need human editing first
If you want to build something similar, the stack is solid: Next.js for the frontend and API routes, OpenAI for the AI magic, and Supabase for the backend. Just remember to actually use it once in a while. Or pay for the premium tier. Or both.
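If you do, here's roughly what the server side could look like with the current OpenAI and Supabase JavaScript SDKs, in place of the placeholder whisper/gpt4/dalle clients in the demo snippet above. The posts table, the prompt, and the title-on-the-first-line convention are illustrative assumptions, not the exact code my project used.

import OpenAI from "openai";
import { createClient } from "@supabase/supabase-js";

const openai = new OpenAI(); // reads OPENAI_API_KEY from the environment
const supabase = createClient(process.env.SUPABASE_URL, process.env.SUPABASE_SERVICE_ROLE_KEY);

// Next.js App Router handler: POST /api/create-post with multipart form data
export async function POST(request) {
  const formData = await request.formData();
  const audioFile = formData.get("audio"); // the blob recorded in the browser

  // 1. Transcribe the voice memo with Whisper
  const transcription = await openai.audio.transcriptions.create({
    file: audioFile,
    model: "whisper-1",
  });

  // 2. Expand the ramble into a post; ask for the title on the first line
  const completion = await openai.chat.completions.create({
    model: "gpt-4",
    messages: [
      {
        role: "system",
        content: "Turn rough voice-memo transcripts into conversational, engaging blog posts. Reply with a title on the first line, then the body.",
      },
      { role: "user", content: transcription.text },
    ],
  });
  const [title, ...bodyLines] = completion.choices[0].message.content.split("\n");

  // 3. Generate a header image with DALL-E 3
  const image = await openai.images.generate({
    model: "dall-e-3",
    prompt: `Header illustration for a blog post titled "${title}"`,
    n: 1,
    size: "1792x1024",
  });

  // 4. Save everything to Supabase ('posts' table name is assumed)
  const { error } = await supabase.from("posts").insert({
    title: title.trim(),
    content: bodyLines.join("\n").trim(),
    header_image_url: image.data[0].url,
  });
  if (error) throw error;

  return Response.json({ title: title.trim() });
}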
Brian Stever