[00:00] (0.16s)
So, I recently added an AI agent to my
[00:02] (2.48s)
daily planning app, Ellie, and it can do
[00:04] (4.32s)
things like time-box your day, bulk-edit
[00:06] (6.72s)
tasks, and basically act as a personal
[00:08] (8.80s)
assistant. This is the first major AI
[00:10] (10.56s)
feature that I've shipped, but getting it to
[00:12] (12.16s)
the finish line was way harder than I
[00:14] (14.16s)
anticipated. There is so much stuff
[00:15] (15.84s)
people don't tell you when it comes to
[00:17] (17.44s)
shipping AI products, and this is what
[00:19] (19.28s)
this video is going to be about. This is
[00:20] (20.72s)
not a tutorial video. I already have a
[00:22] (22.24s)
step-by-step video on my channel about
[00:24] (24.16s)
how to build an AI agent and build this
[00:26] (26.24s)
feature from scratch. So, check that out
[00:27] (27.76s)
if you want the basics. This is a video
[00:29] (29.28s)
about the stuff that is not covered in
[00:31] (31.12s)
those basic tutorials because building
[00:32] (32.80s)
AI features is a bit different than
[00:34] (34.32s)
traditional software. The cost problems,
[00:36] (36.32s)
the security problems, the design
[00:38] (38.00s)
problems. I'm going to share all the
[00:39] (39.28s)
lessons I learned while getting this
[00:40] (40.56s)
feature to the finish line. If you're
[00:42] (42.08s)
planning on building anything with AI,
[00:43] (43.52s)
these are the walls that you're going to
[00:44] (44.64s)
hit, and I want to make sure that you
[00:46] (46.00s)
see them coming.
[00:49] (49.44s)
Let's start with something that I really
[00:50] (50.88s)
didn't keep in mind while I was
[00:52] (52.40s)
building this, and that's cost. When I
[00:54] (54.24s)
was building this thing, I really was
[00:55] (55.52s)
just trying to get it over the finish
[00:56] (56.88s)
line. I kind of had costs in mind, but I
[00:58] (58.88s)
wasn't thinking too much about it. But once
[01:00] (60.64s)
I started getting closer to shipping and
[01:02] (62.32s)
I looked at how much the stuff was
[01:03] (63.84s)
costing, just in my case alone, I had
[01:05] (65.84s)
spent over $30 in a single month.
[01:08] (68.16s)
Problem is that the subscription price
[01:09] (69.44s)
of the app is only $10 a month. So I'd
[01:11] (71.36s)
be losing $20 every single month just
[01:13] (73.68s)
through my own usage. So before
[01:15] (75.12s)
launching, I had to sit down and
[01:16] (76.56s)
seriously figure out how to optimize
[01:18] (78.00s)
this. And I'm going to share some of the
[01:19] (79.20s)
stuff that I learned with you guys. So
[01:20] (80.64s)
the first thing was that the system
[01:21] (81.92s)
prompt was way too long. When you're
[01:23] (83.92s)
developing an app, as you're
[01:24] (84.96s)
encountering issues and edge cases,
[01:26] (86.56s)
you're going to start adding things to
[01:27] (87.68s)
the system prompt to get it to function
[01:29] (89.28s)
the way that you want. And in my case,
[01:30] (90.88s)
my system prompt got really long.
[01:32] (92.88s)
Something I didn't consider was that the
[01:34] (94.24s)
system prompt gets sent every single
[01:36] (96.00s)
time you're sending a message. Even if
[01:37] (97.76s)
I'm just saying hi in the chat, that one
[01:39] (99.76s)
word is going to be sent over, but the
[01:41] (101.36s)
entire system prompt would also be sent
[01:43] (103.36s)
along with it, too. And every subsequent
[01:45] (105.20s)
message would include that system
[01:46] (106.56s)
prompt. And all of this does add up over
[01:48] (108.48s)
time. In my case, I kind of went
[01:49] (109.84s)
overboard. The system prompt was almost
[01:51] (111.60s)
8,000 tokens long. So, I did a lot of
[01:53] (113.60s)
optimizations to cut that down to around
[01:56] (116.00s)
3,000. And I still think that that's
[01:58] (118.00s)
pretty long, and there's a lot more that
[01:59] (119.28s)
I can do, but for the time being, it
[02:00] (120.80s)
seems okay.
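To make that concrete, here's a back-of-the-envelope sketch in TypeScript. The pricing and message volume are illustrative assumptions (GPT-4o mini's published input rate, and the monthly message cap I mention later), not figures from my bill:

```typescript
// Rough estimate of what re-sending the system prompt on every message costs.
const INPUT_COST_PER_1M_TOKENS = 0.15; // e.g. GPT-4o mini input rate; check current pricing
const MESSAGES_PER_MONTH = 1_000;      // a heavy user hitting the monthly cap

function monthlyPromptCost(systemPromptTokens: number): number {
  const totalTokens = systemPromptTokens * MESSAGES_PER_MONTH;
  return (totalTokens / 1_000_000) * INPUT_COST_PER_1M_TOKENS;
}

console.log(monthlyPromptCost(8_000)); // ~$1.20 per heavy user, from the prompt alone
console.log(monthlyPromptCost(3_000)); // ~$0.45 after trimming
```

And that's before you count conversation history, which turned out to be the bigger problem.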
[02:02] (122.40s)
The second mistake I was making was sending the entire
[02:03] (123.76s)
conversation history with each message.
[02:05] (125.76s)
During testing, this was not a problem
[02:07] (127.20s)
because I was sending maybe two to three
[02:08] (128.96s)
messages at a time. So, the whole chat
[02:10] (130.96s)
was really like six messages total. It
[02:12] (132.96s)
wasn't a big deal. But something I
[02:14] (134.16s)
noticed during actual usage and during
[02:15] (135.92s)
the beta testing was that people liked to keep
[02:17] (137.92s)
the chat window open for 2 to 3 days.
[02:20] (140.48s)
And those conversations would end up
[02:21] (141.76s)
being 50 plus messages long. So imagine
[02:23] (143.84s)
sending a single message and then the
[02:25] (145.44s)
entire chat history with 50 messages
[02:27] (147.36s)
gets sent along with that. That would
[02:28] (148.80s)
add up a lot over time. And this is
[02:30] (150.40s)
actually where the bulk of the cost was
[02:32] (152.08s)
coming from. There's a ton of ways to
[02:33] (153.44s)
solve it, but the way that I did it was
[02:34] (154.96s)
using a sliding-window technique where I
[02:37] (157.12s)
only send the last couple messages in
[02:39] (159.44s)
the chat to the LLM for processing. I
[02:41] (161.92s)
had to really play around with what that
[02:43] (163.28s)
window size should be. The optimal amount
[02:44] (164.88s)
really depends on the AI app itself and
[02:46] (166.80s)
what the usage is. And in my case, most
[02:49] (169.12s)
people were using it to just send
[02:50] (170.80s)
one-off instructions to an LLM. They
[02:52] (172.72s)
didn't really need much of the
[02:53] (173.60s)
conversation history to do that. So in
[02:55] (175.20s)
my case, I kept it to the last 10
[02:56] (176.64s)
messages, which seemed to work pretty
[02:58] (178.32s)
well.
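Here's a minimal sketch of that sliding window, assuming a simple message shape; the type and function names are just for illustration, not Ellie's actual code:

```typescript
// Send only the system prompt plus the most recent turns of the conversation.
type ChatMessage = { role: "system" | "user" | "assistant"; content: string };

const WINDOW_SIZE = 10; // the last-10-messages window described above

function buildRequestMessages(
  systemPrompt: string,
  history: ChatMessage[]
): ChatMessage[] {
  const recent = history.slice(-WINDOW_SIZE); // everything older gets dropped
  return [{ role: "system", content: systemPrompt }, ...recent];
}
```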
[03:00] (180.00s)
The big issue with this is: what if they ask about something earlier in the
[03:01] (181.68s)
conversation and it's been cut off? And yes,
[03:03] (183.20s)
that is a real problem. But again, for the use
[03:04] (184.56s)
case of this assistant, I think most
[03:06] (186.32s)
people won't be doing that. But a
[03:07] (187.76s)
technique I might try to do is
[03:08] (188.88s)
summarizing and compressing the earlier
[03:10] (190.64s)
messages so that they are sent in
[03:12] (192.32s)
context, but it doesn't eat up as many
[03:13] (193.60s)
tokens as sending the entire
[03:14] (194.88s)
conversation. This was just the basics.
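Here's a hedged sketch of that summarization idea, reusing the ChatMessage type and WINDOW_SIZE from the snippet above; the prompt wording and model choice are my assumptions:

```typescript
import { generateText } from "ai";
import { openai } from "@ai-sdk/openai";

// Compress everything outside the window into one short synthetic message.
async function compressHistory(history: ChatMessage[]): Promise<ChatMessage[]> {
  const older = history.slice(0, -WINDOW_SIZE);
  const recent = history.slice(-WINDOW_SIZE);
  if (older.length === 0) return recent;

  const { text } = await generateText({
    model: openai("gpt-4o-mini"), // a cheap model is fine for summarizing
    prompt:
      "Summarize this conversation in under 100 words, keeping any facts " +
      "the assistant may need later:\n\n" +
      older.map((m) => `${m.role}: ${m.content}`).join("\n"),
  });

  return [
    { role: "assistant", content: `Summary of earlier chat: ${text}` },
    ...recent,
  ];
}
```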
[03:16] (196.56s)
There's a ton more I plan on exploring
[03:18] (198.08s)
with cost optimization, but these were
[03:19] (199.84s)
the two biggest things that I did to
[03:21] (201.28s)
really get the cost down.
[03:25] (205.44s)
The next thing I really didn't have on
[03:26] (206.96s)
my mind was how I was going to prevent
[03:28] (208.96s)
abuse. Not even intentional abuse, but
[03:30] (210.96s)
people accidentally abusing the system.
[03:32] (212.80s)
So, an example is there was no limit to
[03:35] (215.12s)
what people could put in the chat box. In
[03:37] (217.04s)
theory, someone could just insert an
[03:38] (218.48s)
entire book in there and then I would be
[03:39] (219.84s)
on the hook for that and it would cost
[03:40] (220.88s)
me like $20 for that single message. Or
[03:42] (222.96s)
someone could just spam the chat with a
[03:44] (224.32s)
thousand messages and I would be on the
[03:45] (225.76s)
hook for that too. I had to think
[03:47] (227.04s)
through a couple of these scenarios and
[03:48] (228.48s)
put systems in place to prevent some of
[03:50] (230.56s)
this from happening, whether it be
[03:51] (231.92s)
intentional or non-intentional abuse.
[03:54] (234.00s)
Here's a couple things that I did that
[03:55] (235.28s)
you can implement in your own
[03:56] (236.32s)
application. The first thing I did was
[03:57] (237.84s)
set a message size limit. This is the
[04:00] (240.08s)
max size that a message can be before
[04:01] (241.84s)
it's either truncated or just rejected
[04:03] (243.68s)
by the system. In my case, the max
[04:05] (245.44s)
message size is about 10,000 tokens. And
[04:07] (247.28s)
in real world usage, I have not come
[04:09] (249.36s)
close to that limit. So, I think that's
[04:10] (250.96s)
okay for now.
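As a sketch, that check can be as simple as the following; the 4-characters-per-token heuristic and the helper names are my own shorthand, and a real tokenizer would be more accurate:

```typescript
const MAX_MESSAGE_TOKENS = 10_000; // the cap mentioned above

// Rough token estimate; swap in a real tokenizer (e.g. tiktoken) in production.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

function checkMessageSize(text: string): { ok: boolean; reason?: string } {
  if (estimateTokens(text) > MAX_MESSAGE_TOKENS) {
    return { ok: false, reason: "Message too long, please shorten it." };
  }
  return { ok: true };
}
```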
[04:12] (252.56s)
The second thing was to add some per-user rate limits. This is a
[04:15] (255.44s)
limit on the max number of messages that
[04:17] (257.20s)
a user can send every single day and
[04:19] (259.20s)
every single month. So in my case, I
[04:21] (261.12s)
capped it at 100 messages per day and
[04:23] (263.60s)
1,000 messages per month. And again,
[04:25] (265.28s)
it's really dependent on the AI
[04:26] (266.72s)
application and how users are going to
[04:28] (268.56s)
use it. But in my case, I really can't
[04:30] (270.16s)
see people sending more than 100
[04:31] (271.60s)
messages because again, they're really
[04:32] (272.80s)
just using this to send commands for
[04:34] (274.16s)
Ellie. This isn't like ChatGPT, where
[04:36] (276.00s)
they're going to be sending thousands of
[04:37] (277.28s)
messages a day and having entire
[04:38] (278.64s)
conversations. At least in my case, I
[04:40] (280.48s)
never got close to sending 100 messages
[04:42] (282.24s)
per day. So, I think that's a pretty
[04:43] (283.60s)
good limit for now, but if people
[04:45] (285.04s)
complain, I'm more than happy to raise
[04:46] (286.48s)
that.
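A minimal sketch of those limits, using the caps above; the in-memory Map is a stand-in for whatever database you actually persist counters in:

```typescript
const DAILY_LIMIT = 100;
const MONTHLY_LIMIT = 1_000;

// Per-user counters; in production these would live in a database.
const usage = new Map<
  string,
  { day: string; daily: number; month: string; monthly: number }
>();

function allowMessage(userId: string, now = new Date()): boolean {
  const day = now.toISOString().slice(0, 10); // e.g. "2025-01-31"
  const month = day.slice(0, 7);              // e.g. "2025-01"
  const u = usage.get(userId) ?? { day, daily: 0, month, monthly: 0 };

  if (u.day !== day) { u.day = day; u.daily = 0; }           // new day: reset
  if (u.month !== month) { u.month = month; u.monthly = 0; } // new month: reset

  if (u.daily >= DAILY_LIMIT || u.monthly >= MONTHLY_LIMIT) return false;

  u.daily += 1;
  u.monthly += 1;
  usage.set(userId, u);
  return true;
}
```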
[04:48] (288.08s)
The third thing I did was to build a remote kill switch. This is the
[04:50] (290.00s)
ability for me to turn off the assistant
[04:52] (292.40s)
for a specific user. I did set up some
[04:54] (294.56s)
analytics using a service called PostHog,
[04:56] (296.08s)
so I can see how much money this
[04:58] (298.80s)
app is incurring, and I can even break down
[05:00] (300.40s)
and see how much each specific user is
[05:02] (302.48s)
using. If I see someone racking up a
[05:04] (304.16s)
huge bill and it looks kind of
[05:05] (305.52s)
suspicious to me, what I can do is just
[05:07] (307.12s)
press a button, turn it off for them,
[05:08] (308.72s)
and then I can reach out to them and ask
[05:10] (310.08s)
them, "Hey, just checking what are you
[05:11] (311.68s)
doing with this? Why are you sending
[05:12] (312.88s)
this many messages?" And if it looks
[05:14] (314.32s)
legitimate, I'll turn it back on, and
[05:15] (315.76s)
then if it's not, we'll deal with that.
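Sketched out, the kill switch is just a per-user flag checked before every agent run; the data-layer shape here is hypothetical:

```typescript
// Hypothetical data layer: a user record with a flag that an admin
// dashboard can flip without a deploy.
declare const db: {
  users: { findById(id: string): Promise<{ assistantEnabled?: boolean } | null> };
};

async function isAssistantEnabled(userId: string): Promise<boolean> {
  const user = await db.users.findById(userId);
  return user?.assistantEnabled ?? true; // default to on
}

async function handleChatRequest(userId: string, message: string) {
  if (!(await isAssistantEnabled(userId))) {
    return { error: "The assistant is temporarily disabled for your account." };
  }
  // ...size checks, rate limits, and the agent call go here...
}
```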
[05:16] (316.96s)
I guess on that note, the fourth thing I
[05:18] (318.16s)
did to prevent abuse was that analytics
[05:19] (319.84s)
system. I do recommend adding some sort
[05:22] (322.16s)
of system. And you could either use
[05:23] (323.28s)
PostHog or you could do this manually,
[05:24] (324.64s)
but you should have a way to view at
[05:26] (326.16s)
minimum how many tokens and how much
[05:27] (327.76s)
money your app is consuming. And if
[05:29] (329.28s)
possible, do that on a per user level so
[05:31] (331.36s)
you can see who is using the most. Is
[05:33] (333.12s)
there something weird going on? I'm very
[05:34] (334.64s)
surprised by the number of apps that
[05:35] (335.92s)
don't have that in place on day one.
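For the tracking itself, a sketch with posthog-node might look like this; the event name, properties, and per-token prices are illustrative choices, not PostHog requirements:

```typescript
import { PostHog } from "posthog-node";

const posthog = new PostHog(process.env.POSTHOG_API_KEY!);

// Record tokens and estimated cost per user after every LLM call.
function trackUsage(userId: string, inputTokens: number, outputTokens: number) {
  // Illustrative GPT-4o mini rates; update for whatever models you use.
  const costUsd = (inputTokens * 0.15 + outputTokens * 0.6) / 1_000_000;
  posthog.capture({
    distinctId: userId,
    event: "ai_message_processed",
    properties: { inputTokens, outputTokens, costUsd },
  });
}
```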
[05:40] (340.48s)
So the next learning is to not reinvent
[05:42] (342.32s)
the wheel. After my last video, a bunch
[05:43] (343.84s)
of people reached out to me and said,
[05:44] (344.96s)
"Hey, you know, there are libraries out
[05:46] (346.56s)
there that do a lot of the stuff that
[05:48] (348.16s)
you implemented yourself out of the
[05:49] (349.84s)
box." When I built the application in my
[05:51] (351.68s)
first video, I did everything from
[05:53] (353.28s)
scratch from the streaming to the tool
[05:55] (355.04s)
calling to the max number of tool calls
[05:57] (357.12s)
that could happen in a single loop. All
[05:58] (358.64s)
of that stuff was built manually and
[06:00] (360.08s)
from scratch. Then people pointed me to
[06:01] (361.76s)
the Vercel AI SDK. I'd heard about it
[06:04] (364.08s)
in the past, but I was hesitant because
[06:05] (365.36s)
I didn't want to be locked into
[06:06] (366.56s)
anything. But after doing a lot more
[06:07] (367.76s)
research, I realized that it actually
[06:09] (369.20s)
did a lot of the stuff that I did in my
[06:11] (371.68s)
first video out of the box and way
[06:13] (373.52s)
better than I did it myself. It handled
[06:15] (375.12s)
things like being able to do the
[06:16] (376.32s)
streaming correctly with proper error
[06:17] (377.92s)
handling, tool calling with automatic
[06:19] (379.84s)
retries, and managing the conversation
[06:22] (382.08s)
state. The system that I built kind of
[06:23] (383.60s)
worked, but there were times when the
[06:24] (384.80s)
stream would fail or some of the tool
[06:26] (386.48s)
calls weren't happening consistently,
[06:28] (388.16s)
but I had a suspicion that getting it more
[06:29] (389.60s)
consistent would probably take a lot of
[06:31] (391.04s)
effort. So, I did take a look at the AI
[06:32] (392.80s)
SDK. I did port it over to test and it
[06:35] (395.60s)
actually did solve a lot of the problems
[06:37] (397.12s)
that I was facing. Streaming started
[06:38] (398.64s)
working out of the box very reliably and
[06:40] (400.40s)
the tool calling was way more
[06:42] (402.08s)
consistent, which was a big problem with
[06:43] (403.68s)
the system that I had set up. And the
[06:45] (405.04s)
codebase looked a lot cleaner. So, what
[06:46] (406.64s)
took 100 lines of code in the past ended
[06:48] (408.48s)
up being like 10 lines with the Vercel
[06:50] (410.08s)
AI SDK. The SDK is completely free. It's
[06:52] (412.80s)
open-source. And this is not sponsored
[06:54] (414.32s)
at all. I just wanted to share this
[06:55] (415.60s)
library because that's what I ended up
[06:57] (417.04s)
using in the end. No regrets doing it
[06:58] (418.64s)
the manual way, though. I did learn a
[07:00] (420.08s)
lot in the process, and it really did
[07:01] (421.60s)
confirm why things like the AI SDK do
[07:04] (424.48s)
exist. And I understand how this stuff
[07:06] (426.00s)
works under the hood a lot better, too.
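For reference, here's roughly what that looks like: a sketch against the AI SDK's v4-style streamText API (newer versions rename some options), with a made-up createTask tool standing in for Ellie's real ones:

```typescript
import { streamText, tool } from "ai";
import { openai } from "@ai-sdk/openai";
import { z } from "zod";

// Assumed to exist elsewhere in the app:
declare const messages: { role: "user" | "assistant"; content: string }[];
declare function saveTask(title: string, subtasks: string[]): Promise<void>;

const result = streamText({
  model: openai("gpt-4o-mini"),
  system: "You are Ellie's planning assistant.",
  messages, // e.g. the windowed history from earlier
  tools: {
    createTask: tool({
      description: "Create a task in the user's daily plan",
      parameters: z.object({ title: z.string(), subtasks: z.array(z.string()) }),
      execute: async ({ title, subtasks }) => {
        await saveTask(title, subtasks);
        return { ok: true };
      },
    }),
  },
  maxSteps: 5, // caps the tool-call loop I used to manage by hand
});
// result.textStream can then be piped to the client.
```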
[07:10] (430.40s)
The next lesson was kind of obvious in
[07:11] (431.92s)
hindsight, but not a lot of people talk
[07:13] (433.68s)
about this. You're probably going to be
[07:14] (434.88s)
using multiple models for a lot of
[07:17] (437.04s)
different things. When I started, I
[07:18] (438.48s)
naively thought, I can do all of this
[07:20] (440.24s)
with one model. I'll probably just use
[07:21] (441.52s)
Gemini Flash or something and it'll all
[07:23] (443.20s)
work perfectly. I wasted a ton of time
[07:25] (445.04s)
tweaking the system prompt, trying to
[07:26] (446.56s)
get it to consistently output or call
[07:28] (448.64s)
specific tools, when it turned out it was
[07:30] (450.64s)
actually a problem with the model
[07:31] (451.84s)
itself. Because then when I tried
[07:33] (453.04s)
different models, certain things started
[07:34] (454.80s)
working more consistently. So, that's
[07:36] (456.64s)
something I wish I had a little bit more
[07:37] (457.92s)
of an open mind about going in; it would
[07:39] (459.52s)
have saved a lot of time to know that I
[07:41] (461.36s)
would probably be using different models
[07:43] (463.12s)
for different use cases. So, some
[07:44] (464.80s)
specific examples: I ended up using
[07:46] (466.32s)
GPT-4o mini for a lot of stuff because it
[07:49] (469.12s)
seemed to outperform Gemini Flash in
[07:50] (470.96s)
most cases. There were certain tasks
[07:52] (472.56s)
related to time-boxing, for example,
[07:54] (474.40s)
that it was really struggling with. So,
[07:56] (476.08s)
I had to use GPT-4o for those tasks. And
[07:58] (478.80s)
for something like time-boxing, even
[08:00] (480.64s)
GPT-4o was struggling with it. And my
[08:03] (483.20s)
suspicion is because the time zones were
[08:05] (485.04s)
kind of confusing it. So, after testing
[08:06] (486.48s)
a bunch of models, I actually ended up
[08:07] (487.76s)
using Grok to do the time-boxing stuff.
[08:09] (489.68s)
For some weird reason, Grok was very
[08:11] (491.60s)
consistent at dealing with multiple time
[08:13] (493.84s)
zones. And here's a cool technique that
[08:15] (495.44s)
I learned. You can actually put a layer
[08:16] (496.96s)
before the agent starts that picks
[08:19] (499.28s)
which model to use. So, in my case,
[08:21] (501.36s)
I actually have a layer that's using
[08:22] (502.80s)
Gemini Flash to then choose which model
[08:26] (506.32s)
the agent should be running. So, if it's
[08:28] (508.00s)
a really simple task that doesn't really
[08:29] (509.92s)
involve time, I'll use GPT-4o mini. And if
[08:32] (512.72s)
it's a little bit more complex or
[08:34] (514.00s)
involves time zones, then it switches to
[08:35] (515.68s)
GPT-4o. The big benefit is cost and speed,
[08:38] (518.88s)
because then it can default to the
[08:40] (520.08s)
cheaper faster model for simpler use
[08:41] (521.92s)
cases and then only go to the bigger
[08:44] (524.00s)
more expensive model when needed and
[08:45] (525.60s)
then I specifically have a tool that
[08:47] (527.12s)
calls Grok just for the time-boxing
[08:49] (529.28s)
stuff, and Grok is way more expensive
[08:51] (531.12s)
than GPT-4o, so I only reserve it for that
[08:53] (533.68s)
task when it's needed. I have a feeling
[08:55] (535.28s)
that in the future I'm probably going to
[08:56] (536.48s)
be calling 10 different models here for
[08:58] (538.16s)
a bunch of different use cases. But that
[08:59] (539.68s)
was a really cool technique: using a
[09:01] (541.20s)
very cheap small model to decide which
[09:03] (543.44s)
model to use based on the user's input.
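Here's a hedged sketch of that routing layer; the classification prompt, labels, and model IDs are stand-ins (and in my actual setup, Grok is reached through a dedicated time-boxing tool rather than routed to directly):

```typescript
import { generateText } from "ai";
import { openai } from "@ai-sdk/openai";
import { google } from "@ai-sdk/google";
import { xai } from "@ai-sdk/xai";

// A cheap, fast model classifies the request; the result picks the agent's model.
async function pickModel(userMessage: string) {
  const { text } = await generateText({
    model: google("gemini-1.5-flash"),
    prompt:
      "Classify this request as exactly one of: simple, complex, timezone.\n" +
      `Request: ${userMessage}`,
  });
  const label = text.trim().toLowerCase();
  if (label.includes("timezone")) return xai("grok-2"); // time-zone-heavy work
  if (label.includes("complex")) return openai("gpt-4o"); // heavier reasoning
  return openai("gpt-4o-mini"); // cheap, fast default
}
```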
[09:08] (548.88s)
A couple of smaller observations that I had
[09:10] (550.64s)
that I really wanted to share with you
[09:11] (551.92s)
guys. First is that the form factor
[09:13] (553.68s)
actually does matter, and I wish I'd spent
[09:15] (555.52s)
a little bit more time considering that
[09:17] (557.28s)
when building the product. I originally
[09:18] (558.80s)
built the agent just on the web for the
[09:20] (560.88s)
sake of speed thinking that I'd port it
[09:22] (562.48s)
to iOS later, but I should have thought
[09:24] (564.08s)
a little bit harder about where people
[09:25] (565.92s)
were going to be using this agent. The
[09:27] (567.44s)
main use case that I'm seeing so far and
[09:29] (569.36s)
even for myself personally is dictating
[09:31] (571.68s)
quick commands on my phone on the go.
[09:33] (573.60s)
So, I can say something like, "Can you
[09:35] (575.04s)
create a task for groceries, add bacon,
[09:36] (576.88s)
eggs, and paper towels to the list?" And
[09:38] (578.88s)
it'll just go ahead and do that and
[09:40] (580.48s)
create the task with the relevant
[09:42] (582.48s)
subtasks for me." These actions are so
[09:44] (584.48s)
much nicer with dictation, and it's so
[09:46] (586.00s)
much easier on the mobile version. It's
[09:47] (587.60s)
a small detail, but it's something I
[09:48] (588.96s)
wish I'd considered, because I could have
[09:50] (590.88s)
launched this a little bit earlier, and
[09:52] (592.32s)
probably mobile first if I'd realized
[09:53] (593.92s)
that sooner. The next observation is
[09:55] (595.60s)
actually pretty cool. It's that
[09:56] (596.88s)
personalization and settings are very
[09:59] (599.20s)
different for AI products than
[10:00] (600.88s)
traditional software. In traditional
[10:02] (602.40s)
software like Ellie, for settings, you
[10:04] (604.00s)
can toggle things like when does the
[10:05] (605.36s)
week start for you or when do you want
[10:07] (607.04s)
to start your day. And these are just
[10:08] (608.24s)
dropdowns in the settings menu. But for
[10:10] (610.00s)
AI products, personalization is a little
[10:12] (612.16s)
bit different and actually a lot cooler.
[10:13] (613.92s)
For time-boxing preferences for the
[10:15] (615.68s)
user, instead of having a toggle and
[10:17] (617.28s)
dropdown for everything, I could just
[10:18] (618.88s)
have a text box and have the user input
[10:20] (620.80s)
whatever preferences they want. So they
[10:22] (622.48s)
can say something like, "I like to go to
[10:23] (623.92s)
the gym in the morning. I like to do all
[10:25] (625.60s)
my personal tasks after work, and I need
[10:28] (628.00s)
a 15-minute break in between each
[10:29] (629.92s)
meeting at minimum." Because at the end
[10:31] (631.36s)
of the day, what I'm going to do is take
[10:32] (632.56s)
this text and inject it into the prompt
[10:34] (634.88s)
so that when the AI is coming up with
[10:36] (636.40s)
the schedule, it just factors all that
[10:38] (638.40s)
stuff in.
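In sketch form, that injection is just string assembly; the function and variable names are mine:

```typescript
// Fold the user's free-text preferences into the scheduling prompt.
function buildTimeboxPrompt(basePrompt: string, userPreferences: string): string {
  return [
    basePrompt,
    "When scheduling, respect these user preferences:",
    userPreferences, // e.g. "Gym in the morning; 15-minute breaks between meetings."
  ].join("\n\n");
}
```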
[10:39] (639.84s)
Maybe I'm alone here, but I thought that was really cool, and it made
[10:40] (640.96s)
me think a lot more about how software
[10:42] (642.88s)
is going to be more personalized in the
[10:44] (644.40s)
future. The last observation was how my
[10:46] (646.32s)
chat and my agent compared to general
[10:48] (648.48s)
tools like ChatGPT, because a common
[10:50] (650.40s)
thing that I hear from people is what's
[10:52] (652.00s)
the point of building this if ChatGPT
[10:53] (653.84s)
is just going to add this feature or
[10:55] (655.12s)
Claude's going to add this feature. I've
[10:56] (656.72s)
actually used Claude's and ChatGPT's
[10:58] (658.48s)
calendar integrations to time box and
[11:00] (660.56s)
plan my day. And after using both, I can
[11:02] (662.56s)
say that the experience in Ellie was
[11:04] (664.96s)
completely different than the experience
[11:06] (666.40s)
in ChatGPT, even though in theory they
[11:08] (668.80s)
do the same thing. At the time of
[11:10] (670.16s)
recording, to do something like create a
[11:11] (671.76s)
calendar event in Claude, you type in
[11:13] (673.44s)
the calendar event you want, but it's
[11:14] (674.96s)
going to ask permission to run certain
[11:16] (676.72s)
tools. Then it's going to run the tool.
[11:18] (678.32s)
Then it's going to confirm with you and
[11:19] (679.92s)
then it's going to go create the task.
[11:21] (681.36s)
For some reason, it just feels kind of
[11:22] (682.88s)
clunky and cumbersome with all these
[11:24] (684.32s)
steps. Whereas in Ellie, if I say the
[11:26] (686.16s)
exact same thing, it's just going to do
[11:27] (687.76s)
it and it's just going to make it happen
[11:29] (689.28s)
in one message. I get why they have all
[11:31] (691.04s)
these confirmations. They're building
[11:32] (692.16s)
for a million users, but we are not them
[11:34] (694.32s)
and we can take a little bit more risk
[11:35] (695.84s)
than they can. So, in my case, I did
[11:37] (697.44s)
feel confident enough to just bypass all
[11:39] (699.36s)
of those confirmations and just allow
[11:40] (700.96s)
the tool calls to happen automatically.
[11:42] (702.72s)
And it really does change the nature of
[11:44] (704.88s)
the experience. It feels like there's a
[11:46] (706.24s)
lot less friction and makes me want to
[11:47] (707.84s)
use it compared to using it in ChatGPT
[11:50] (710.08s)
or Claude. I think the best way to think
[11:51] (711.44s)
about it is it's like a general app
[11:53] (713.04s)
versus a hyper-specific app. When I
[11:54] (714.96s)
think about general apps versus focused
[11:56] (716.80s)
apps that really solve a problem for a
[11:58] (718.72s)
specific niche, in most cases the niche app
[12:00] (720.88s)
probably solves the problem a lot better
[12:02] (722.72s)
than the general app. And users usually
[12:04] (724.72s)
can feel that. And I think the same
[12:06] (726.24s)
thing does apply to AI products.
[12:10] (730.80s)
Shipping AI products is pretty hard, but
[12:12] (732.72s)
not in the ways that I expected. There
[12:14] (734.64s)
was a challenge to make sure that the AI
[12:16] (736.08s)
was smart enough and executed things the
[12:18] (738.00s)
way that I envisioned. But there were
[12:19] (739.52s)
also a lot of considerations like cost,
[12:21] (741.68s)
security, the form factor, a lot of
[12:23] (743.60s)
these things that I don't hear a lot of
[12:25] (745.04s)
people talking about that I wish someone
[12:26] (746.72s)
told me earlier. To summarize everything
[12:28] (748.48s)
here, if you're building an AI product,
[12:30] (750.24s)
I recommend tracking cost from day one.
[12:33] (753.04s)
Building safeguards to prevent abuse,
[12:34] (754.96s)
whether it's intentional or
[12:36] (756.08s)
unintentional abuse. Remembering, and
[12:38] (758.00s)
honestly planning to use multiple models
[12:40] (760.48s)
from the start. Considering what the
[12:42] (762.24s)
optimal form factor for your AI is,
[12:44] (764.48s)
whether it be mobile or voice or on the
[12:46] (766.64s)
web, and really weighing that when
[12:48] (768.24s)
you're working on a product roadmap.
[12:49] (769.60s)
Leveraging existing frameworks like the
[12:51] (771.36s)
Vercel AI SDK to make sure you're not
[12:53] (773.36s)
reinventing the wheel. And then figuring
[12:54] (774.96s)
out what your edge is going to be when
[12:56] (776.48s)
you're building your product, especially
[12:58] (778.00s)
comparing against something general like
[12:59] (779.52s)
ChatGPT or Claude, and figuring out a
[13:01] (781.84s)
way to make your app stand out. And the
[13:03] (783.28s)
agent that I showed you at the beginning
[13:04] (784.48s)
of the video, hopefully by the time that
[13:06] (786.16s)
this video is out, it should be launched
[13:07] (787.92s)
and you can actually try it yourself if
[13:09] (789.44s)
you want. If you're building an AI
[13:10] (790.64s)
product, I would love to hear what
[13:11] (791.84s)
you're building and some of the problems
[13:13] (793.12s)
that you've encountered along the way.
[13:14] (794.40s)
If it wasn't for you guys, I would not
[13:15] (795.44s)
have found the AI SDK from Vercel.
[13:17] (797.52s)
Please drop any tips that you have. I
[13:18] (798.96s)
read every single comment. And if you
[13:20] (800.24s)
like this content, check out my
[13:21] (801.52s)
Instagram and TikTok. I post almost
[13:23] (803.12s)
every other day about building
[13:24] (804.24s)
productivity apps. And obviously, if you
[13:25] (805.92s)
like this content, don't forget to
[13:27] (807.36s)
subscribe. But thank you guys so much
[13:28] (808.48s)
for watching and I'll see you guys in
[13:29] (809.92s)
the next video.