Category: General

What I Learned While Trying to Build a Custom Voice Phone Agent with Twilio, ElevenLabs, and OpenAI
Building a Phone Agent with Twilio and ElevenLabs: What I Learned

I wanted a phone number that people could call, talk to naturally, and get a response in a cloned voice. The stack sounded simple at first, but the actual architecture turned out to have a few very different paths, each with tradeoffs.

What I ended up learning is that Twilio and ElevenLabs can be combined in more than one way, and the choice changes who owns the conversation logic, who handles speech-to-text, and where the voice audio actually comes from.

The Goal

My goal was straightforward:
- Twilio provides the phone number and call routing
- the caller speaks into the phone
- the system transcribes the speech
- an LLM decides the response
- the response is spoken back in a cloned voice
That sounds like one system, but there are really three different ways to build it.

Option 1, Twilio ConversationRelay

ConversationRelay is the lowest-friction path if you want to move fast. Twilio handles the phone side, and it gives you a tight voice loop with strong latency characteristics.

The catch is that it is opinionated. In my testing, it worked well with standard voices, but it did not fit my custom cloned ElevenLabs voice setup the way I wanted.

So the tradeoff is clear:
- strong latency
- simple Twilio integration
- less flexibility around custom voice behavior
If your priority is speed and you are okay with the platform’s voice constraints, this is a solid path.

Option 2, ElevenLabs Agents

ElevenLabs Agents gave me the next best thing: low-latency voice-agent behavior, but with support for custom voices.

This was the first path where my cloned voice actually made sense. I had to get the ElevenLabs side configured correctly, including plan access for the cloned voice, and once that was in place, the agent could speak in my own voice.

What I learned here is important:
- ElevenLabs can own the agent runtime
- ElevenLabs can also own the LLM layer
- Twilio becomes the transport layer for the phone call
That means if you use ElevenLabs Agents, you are not just buying text-to-speech. You are buying the whole voice agent stack.

The upside is simplicity. The downside is that your conversation logic lives more inside ElevenLabs than inside your own app.

Option 3, Build It Yourself

This is the path I originally assumed I would take: Twilio handles calls, my Node app handles STT, OpenAI handles the LLM, and ElevenLabs only does TTS.

That architecture is absolutely valid.

It looks like this:
- Twilio receives the call
- Twilio streams audio to my server
- my server transcribes audio
- OpenAI generates the response
- ElevenLabs synthesizes the response in my cloned voice
- audio is sent back into the call
This gives you the most control, because you own the logic, prompts, tool calls, business rules, and state.

But it also means you own the full media pipeline. That is where complexity shows up:
- audio formats have to match
- streaming has to stay real-time
- you need a bridge between generated audio and the live Twilio call
- latency depends on how well you implement every hop
So this is not automatically “better,” it is just more flexible.

The Key Architectural Difference

The main question is: who owns the conversation?
- If you use ConversationRelay, Twilio manages a lot of the voice plumbing, but you lose flexibility around custom voices.
- If you use ElevenLabs Agents, ElevenLabs manages the voice agent runtime, and custom voices work well.
- If you build it yourself, you own everything, including the LLM, but you also own all the streaming and latency work.
My Takeaway

For a quick, managed setup, ConversationRelay is the easiest.

For custom voice with a managed agent stack, ElevenLabs Agents is the best fit.

For maximum control, roll your own, but expect more work.

The real lesson is that Twilio is the call layer, and ElevenLabs is either the voice layer or the full agent layer depending on how you use it.
June 14, 2026
A Simple Mental Model for Microsoft AI
Microsoft’s AI ecosystem is confusing.

Not because the tools are bad, but because the naming overlaps everywhere. Copilot, Microsoft 365 Copilot, Copilot Studio, Azure AI Foundry, Fabric, Power Apps, Agent 365, Agent Factory , they all sound related, and they are, but Microsoft’s documentation does not always make the relationship clear.

This is my simple mental model after researching the Microsoft AI offering.

1. Copilot: where users experience AI

Start here.

Copilot is the user-facing AI experience.

There are two main versions:

Copilot Consumer

This is the personal version of Copilot.

You use it through:
- web
- Edge
- Windows
- personal Microsoft account
Microsoft 365 Copilot

This is the work version.

It lives inside Microsoft 365 and connects to work tools like:
- chat
- Word
- Excel
- Outlook
- Teams
- files
- meetings
- emails
- tasks
My simple explanation:

Microsoft 365 Copilot is the AI interface for work.

It is where users chat, search, summarize, create, use agents, and work with company context.

2. Agents: how Microsoft lets you extend Copilot

Inside Microsoft 365 Copilot, you can create or use agents.

An agent is a focused AI assistant for a specific task.

The easiest way to understand agent creation is by complexity:
1. New Agent Simple, no-code, built inside Copilot 2. Microsoft Copilot Studio More advanced no-code / low-code agent builder 3. Microsoft 365 Agents SDK / Azure AI Foundry Code-based developer approach
That is the most useful ladder.

New Agent

This is the easiest option.

You click New Agent and create a basic agent without writing code.

Copilot Studio

This is deeper than New Agent.

It is still no-code / low-code, but gives more control. You can build, test, update, publish, and manage agents.

Agents can plug into:
- Microsoft 365 Copilot
- Teams
- websites
- custom apps
My simple explanation:

Copilot Studio is the serious no-code agent builder.

Microsoft 365 Agents SDK / Azure AI Foundry

This is the developer path.

Use this when you need code, custom logic, backend control, models, data connections, security, and more advanced AI systems.

My simple explanation:

Azure AI Foundry is the AI backend for developers.

3. Microsoft-built agents and Frontier features

Microsoft also has built-in or Microsoft-created agents.

Some are marked as Frontier, which basically means early, experimental, or beta-style features.

Example:

App Builder

App Builder is a Frontier agent that creates an app.

My simple explanation:

App Builder is an agent that helps create lightweight business apps.

It seems to combine ideas from:
- Copilot
- Copilot Studio
- Power Apps
You use it from Microsoft 365 Copilot, but the result feels like a lightweight Power Apps-style application.

4. Power Apps: internal business apps

Power Apps is Microsoft’s low-code tool for building internal business apps.

Examples:
- timesheet app
- IT support ticket app
- PTO request app
- inventory tracker
- approval workflow
My simple explanation:

Power Apps is for building internal business apps without traditional software development.

This is different from Copilot Studio.
Copilot Studio = build agents Power Apps = build apps
But now AI is making these areas overlap.

That is part of the confusion.

5. Microsoft Fabric: business data

Microsoft Fabric is the data platform.

It is for storing, processing, analyzing, and connecting business data.

Related ideas:
- OneLake
- Power BI
- analytics
- Fabric IQ
- data agents
- business data for AI
My simple explanation:

Fabric is where business data can live so analytics and AI can use it.

Important note: Fabric does not magically contain all company data. It has to be enabled, set up, governed, and paid for.

6. Work IQ: work context

Work IQ is not really something I think of as a normal product.

It is more like Microsoft’s intelligence layer for work context.

It helps Copilot understand:
- emails
- files
- meetings
- chats
- tasks
- permissions
- tools
- organizational knowledge
My simple explanation:

Work IQ is the context layer behind Microsoft 365 Copilot.

I would not explain it as an app. I would explain it as the thing that helps Copilot understand work.

7. Agent 365: managing agents

Microsoft Agent 365 is for managing agents across an organization.

My simple explanation:

Agent 365 is the control plane for agents.

It is about:
- governance
- security
- discovery
- management
- compliance
- lifecycle
- shadow agents
You do not start here as a normal user. This becomes important when a company has many agents.

8. Agent Factory: enterprise agent program

Microsoft Agent Factory sounds like another product, but I think of it more as Microsoft’s enterprise program for scaling agents.

It connects many Microsoft AI pieces together:
- Microsoft 365 Copilot
- Copilot Studio
- Azure AI Foundry
- Fabric
- GitHub Copilot
- Azure AI Search
- governance tools
My simple explanation:

Agent Factory is Microsoft’s enterprise approach for helping companies build and scale agents.

It is not the same as Agent 365.
Agent 365 = govern agents Agent Factory = scale agent adoption
Quick reference
Copilot = consumer AI assistant Microsoft 365 Copilot = AI assistant for work New Agent = easiest no-code way to create an agent Copilot Studio = advanced no-code / low-code agent builder Microsoft 365 Agents SDK = developer SDK for custom agents Azure AI Foundry = AI backend platform for developers Power Apps = low-code business app builder App Builder = Copilot agent that creates lightweight apps Microsoft Fabric = business data and analytics platform Work IQ = work context layer behind Copilot Agent 365 = governance and management for agents Agent Factory = enterprise program for scaling agent adoption
The shortest version

If you are new to Microsoft AI, think of it like this:
Use AI: Microsoft 365 Copilot Create a simple agent: New Agent Create a serious no-code agent: Copilot Studio Build custom AI with code: Azure AI Foundry / Agents SDK Build internal apps: Power Apps / App Builder Use business data: Microsoft Fabric Manage agents: Agent 365 Scale agents across the company: Agent Factory
This is the mental model that helped me understand the clutter.

Microsoft’s documentation explains the pieces, but not always the map. This is my map.
June 7, 2026
Is the MacBook Pro M5 Worth It for AI Apps?

My 2018 Mac mini is starting to show its age.

It has served me well, especially with the i7 processor and 32 GB of RAM, but it is beginning to struggle with heavier workloads and too many browser tabs. The bigger issue, though, is compatibility. Many of the newer AI development tools, including apps like Codex, are moving away from Intel-based Macs and are no longer supported on older Intel chips.

So I decided it was time for an upgrade.

This is the new MacBook Pro with the M5 chip and 64 GB of memory. Apple has been positioning these machines as capable AI workstations, and I’m curious to see how well that claim holds up in real-world use.

I’m especially interested in experimenting with local AI tools such as Ollama, LM Studio, OpenCode, Open WebUI, and AnythingLLM. Based on my research, this machine should be able to run models such as Qwen3 30B, Gemma 31B, and possibly Llama 3.3 70B, depending on quantization, memory requirements, and performance expectations.

Over the next few weeks, I’ll be testing how practical this setup is for local AI development, coding assistance, and running large language models directly on the MacBook Pro.

I’ll post my findings as I go.

May 25, 2026
Lessons Learned After My WordPress Site Hack

Hi everyone! My last WordPress site was hacked and I lost all my data, so I’m starting fresh.

I’ve learned the hard way that self-hosting isn’t always worth it. Keeping everything updated takes too much time, and it eventually became a security risk.

This time, I’m running my blog on WordPress.com. I needed something low-cost and fully managed so I don’t have to deal with security patches anymore. Let’s see how it goes!

Let’s see how it works out this time.

April 5, 2026