[00:00] (0.24s)
Like a lot of people talk about how
[00:01] (1.84s)
we're going to have way fewer software
[00:03] (3.28s)
engineers in the near future. I think it
[00:05] (5.20s)
feels like it's people that hate
[00:06] (6.48s)
software engineers largely speaking that
[00:08] (8.00s)
say this. It feels pessimistic not only
[00:09] (9.76s)
towards these people but I would say
[00:11] (11.36s)
just in terms of what the ambitions for
[00:13] (13.12s)
companies are. I think the ambitions for
[00:14] (14.64s)
a lot of companies is to build a lot
[00:16] (16.08s)
better product and if you now give the
[00:18] (18.32s)
ability for companies to now have a
[00:20] (20.24s)
better return on investment for building
[00:22] (22.00s)
technology right because the cost of
[00:23] (23.92s)
building software has gone down. What
[00:25] (25.44s)
should you be doing? You should be
[00:26] (26.64s)
building more because now the ROI for
[00:28] (28.96s)
software and developers is even higher
[00:30] (30.96s)
because a singular developer can do more
[00:32] (32.72s)
for your business. So technology
[00:34] (34.24s)
actually increases the ceiling of your
[00:35] (35.76s)
company much faster. Windsurf is one of
[00:37] (37.92s)
the popular IDEs that software engineers
[00:39] (39.76s)
use thanks to AI coding capabilities.
[00:42] (42.40s)
But what are the unique engineering
[00:43] (43.76s)
challenges that go into building it and
[00:45] (45.52s)
how could tools like Windsurf change
[00:46] (46.96s)
software engineering? Today I sat down
[00:48] (48.88s)
with Varun Mohan, co-founder and CEO of
[00:50] (50.96s)
Windsurf. We talk about why the Windsurf
[00:54] (54.08s)
team built their own LLMs and how LLMs
[00:56] (56.40s)
for text are missing capabilities
[00:58] (58.00s)
necessary for coding like fill in the
[01:00] (60.08s)
middle. How Windsurf uses a mix of
[01:02] (62.16s)
techniques for many cases like to solve
[01:04] (64.72s)
for search. How they use a combination
[01:06] (66.24s)
of embeddings and keyword-based
[01:07] (67.76s)
searches. Why latency is their number
[01:09] (69.84s)
one challenge and how incorrectly
[01:11] (71.68s)
balancing GPU compute load and memory
[01:13] (73.76s)
load can lead to higher latency for code
[01:15] (75.52s)
suggestions popping up. How Varun thinks
[01:17] (77.68s)
the software engineering field will
[01:19] (79.04s)
evolve and why he stopped worrying about
[01:20] (80.96s)
predictions like 90% of code will be
[01:23] (83.04s)
generated by AI in 6 months. If you want
[01:25] (85.36s)
to understand the engineering that goes
[01:26] (86.64s)
into these next-generation IDEs, then
[01:28] (88.48s)
this episode is for you. If you enjoy
[01:30] (90.64s)
the show, please do subscribe on any
[01:32] (92.08s)
podcast platform and on YouTube. Welcome
[01:35] (95.04s)
to the podcast. Yeah, thanks for having
[01:38] (98.72s)
You've recently launched GPT-4.1 support
[01:42] (102.32s)
in Windsurf which uh by the time this is
[01:45] (105.92s)
out it will have been a few weeks but
[01:48] (108.24s)
what are your initial impressions so far
[01:51] (111.32s)
and in general when you introduce a new
[01:53] (113.92s)
model how do you evaluate like how it's
[01:56] (116.48s)
working for the coding use cases that we
[01:58] (118.64s)
all use? Yeah, maybe I can talk about
[02:01] (121.20s)
the second part and then I can talk
[02:03] (123.36s)
about, you know, GPT-4.1 and the other models
[02:05] (125.76s)
um afterwards. uh basically internally
[02:08] (128.96s)
you know these models have these
[02:10] (130.16s)
non-deterministic properties right they
[02:12] (132.08s)
they sometimes perform uh differently in
[02:14] (134.24s)
different tasks in ways that are
[02:15] (135.28s)
unexpected uh you know you can't just
[02:17] (137.20s)
look at a score on a competitive
[02:19] (139.04s)
programming competition and decide hey
[02:20] (140.72s)
it's going to be awesome for for
[02:22] (142.16s)
programming and you know interestingly
[02:24] (144.32s)
about the company maybe this is this is
[02:26] (146.24s)
going to be helpful context a lot of us
[02:28] (148.00s)
at the company previously worked in
[02:29] (149.28s)
autonomous vehicles and I think in
[02:30] (150.96s)
autonomous vehicles we had a similar
[02:33] (153.28s)
type of behavior where you had a piece
[02:35] (155.36s)
of software the software was very
[02:37] (157.12s)
modular, lots of different pieces. Uh
[02:39] (159.84s)
each piece was machine learning driven,
[02:42] (162.00s)
so there was some non-determinism and
[02:44] (164.00s)
it's very hard to test it in the real
[02:45] (165.60s)
world, right? Actually, it's much harder
[02:47] (167.20s)
than it is to test, uh, I guess Windsurf out
[02:49] (169.36s)
in the real world. It's much harder to
[02:50] (170.56s)
test autonomous vehicle software out in
[02:52] (172.16s)
the real world because if you ship bad
[02:54] (174.16s)
software, you have the chance of hurting
[02:55] (175.52s)
a lot of people, right? Hurting a lot of
[02:57] (177.12s)
people, hurting the, you know, I don't
[02:58] (178.80s)
know, just the general public
[03:00] (180.00s)
infrastructure, right? So in that case
[03:02] (182.08s)
we needed to build really good
[03:03] (183.36s)
simulation evaluation infrastructure in
[03:05] (185.44s)
autonomous vehicles and I guess we
[03:06] (186.72s)
brought that over here as well where hey
[03:09] (189.44s)
if you want to test out a new model we
[03:11] (191.20s)
have evaluation suites and the
[03:12] (192.88s)
evaluation suites not only test
[03:14] (194.72s)
end-to-end software performance, which is
[03:16] (196.48s)
to say you give a high-level task, what is
[03:18] (198.72s)
the you know what is the pass rate of
[03:20] (200.56s)
actually completing the high-level task on a
[03:22] (202.24s)
bunch of unit tests it also tests
[03:24] (204.24s)
retrieval accuracy edit accuracy right
[03:27] (207.04s)
redundant changes all these different
[03:28] (208.88s)
parts of a model that are like negative
[03:30] (210.48s)
behavior. Because for our product, it not
[03:32] (212.64s)
only matters that you pass a test, it
[03:34] (214.64s)
also matters that you didn't go out and
[03:36] (216.40s)
make 10 steps that were unnecessary
[03:38] (218.00s)
because the human is going to be waiting
[03:39] (219.28s)
on the other end for all of those
[03:40] (220.48s)
changes. So we have metrics for
[03:42] (222.80s)
all of these things and we're able to
[03:44] (224.48s)
put each model through like I guess a
[03:46] (226.40s)
suite of tests that give us metrics and
[03:48] (228.64s)
that's like the way we decide, hey, this
[03:50] (230.24s)
is a good model for our end users,
[03:52] (232.40s)
right? And that's the high-level
[03:54] (234.08s)
way that we go about testing.
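To make that concrete, here is a minimal sketch of how such a suite might aggregate per-task results across the dimensions mentioned above: end-to-end pass rate, retrieval accuracy, edit accuracy, and redundant changes. The field names are hypothetical, not Windsurf's internal schema.

```python
# Hedged sketch of evaluation-suite scoring; all names are hypothetical.
from dataclasses import dataclass
from statistics import mean

@dataclass
class TaskResult:
    passed_tests: bool         # end-to-end: did the high-level task pass?
    retrieval_accuracy: float  # were the right files/snippets found?
    edit_accuracy: float       # were the edits themselves correct?
    redundant_steps: int       # unnecessary changes a human must wait on

def summarize(results: list[TaskResult]) -> dict[str, float]:
    """Aggregate per-task results into model-level metrics."""
    return {
        "pass_rate": mean(r.passed_tests for r in results),
        "retrieval_accuracy": mean(r.retrieval_accuracy for r in results),
        "edit_accuracy": mean(r.edit_accuracy for r in results),
        "avg_redundant_steps": mean(r.redundant_steps for r in results),
    }

print(summarize([TaskResult(True, 0.9, 0.8, 2), TaskResult(False, 0.6, 0.7, 5)]))
```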
[03:55] (235.68s)
And like these tests, you know, they sound
[03:57] (237.20s)
great in theory, but in practice, what
[03:59] (239.12s)
does it look like? Like I'm going to
[04:00] (240.96s)
assume you're going to have you know I
[04:02] (242.24s)
we can imagine us engineers who've been
[04:04] (244.80s)
writing you know code uh probably not
[04:06] (246.88s)
autonomous vehicles but similar ones you
[04:08] (248.80s)
know we know our unit test our
[04:10] (250.08s)
integration tests if you do mobile you
[04:11] (251.92s)
know your end to end test I'm assuming
[04:13] (253.76s)
this will be a little bit different but
[04:16] (256.32s)
with some similarities like do you
[04:17] (257.92s)
actually like code some scenarios you
[04:19] (259.76s)
have like example codes example prompts
[04:22] (262.64s)
and and then so I assume you can do a
[04:24] (264.96s)
bit of that but then what else and and
[04:26] (266.96s)
you know how does this all come together
[04:28] (268.48s)
and like how can I imagine in this test
[04:30] (270.32s)
suite is it like one big giant blob that
[04:32] (272.32s)
runs for I don't know how long. Yeah,
[04:34] (274.24s)
one of the aspects of code that is
[04:35] (275.76s)
really good is it can be run right. So
[04:37] (277.60s)
it's not like a very, you know, touchy-feely
[04:39] (279.92s)
kind of thing in the end like a test can
[04:41] (281.92s)
be passed. So what we can do is we can
[04:43] (283.68s)
take a bunch of open source repositories
[04:45] (285.76s)
we can find previous pull requests or
[04:48] (288.40s)
commits that actually not only add tests
[04:50] (290.88s)
but also add the implementations
[04:53] (293.04s)
correspondingly. And what we can do is
[04:55] (295.28s)
instead of just taking the commit
[04:56] (296.72s)
description we can remake what the
[04:58] (298.88s)
description of the commit should have
[05:00] (300.32s)
been like a very high level intent and
[05:02] (302.80s)
then from there it becomes a very I
[05:04] (304.72s)
guess programmatic problem which is to
[05:06] (306.32s)
say hey like first of all find the right
[05:08] (308.96s)
files that you need to go and make
[05:10] (310.32s)
changes to right then there is a ground
[05:12] (312.16s)
truth for that right because the base
[05:13] (313.60s)
code actually has a set of five 10 files
[05:16] (316.00s)
that changes were made to then after
[05:17] (317.76s)
that what is the intent on those files
[05:20] (320.00s)
you can actually go from the ground
[05:21] (321.92s)
truth backwards which is that you know
[05:23] (323.44s)
what what the final change was from the
[05:25] (325.52s)
from the actual code and you can have
[05:27] (327.92s)
the model generate that intent and then
[05:29] (329.68s)
after that you can you can you can see
[05:31] (331.36s)
if the edit given that intent is
[05:33] (333.12s)
correct. So you now have three layers of
[05:34] (334.96s)
tests uh which is that hey did I
[05:37] (337.12s)
retrieve the right things did I have the
[05:38] (338.80s)
highle intent correctly and is the edit
[05:40] (340.88s)
performance good right and then you can
[05:42] (342.72s)
imagine doing much more than just that
[05:44] (344.64s)
and but at a high level now you know
[05:47] (347.28s)
just from a pure commit or a pure actual
[05:49] (349.60s)
ground truth piece of code you now have
[05:51] (351.36s)
multiple metrics that you can go about
[05:52] (352.96s)
and then obviously the final thing you
[05:54] (354.32s)
can actually do is run the code right so
[05:56] (356.32s)
it's not just like a you know when you
[05:58] (358.00s)
when you measure some of these chat
[05:59] (359.36s)
products they actually you know the
[06:01] (361.28s)
evaluation is a little bit different
[06:02] (362.64s)
which is to say the evaluation is you
[06:04] (364.40s)
give it to multiple hum humans in a
[06:06] (366.40s)
blind test in an AB test and you ask
[06:08] (368.40s)
them which one did you like more
[06:10] (370.00s)
obviously for us to quickly evaluate we
[06:12] (372.00s)
can't be giving it to like tens of
[06:13] (373.68s)
thousands of humans in a
[06:15] (375.52s)
second and with this now within like
[06:17] (377.36s)
minutes we can get answers to what is
[06:19] (379.52s)
the performance on tens of thousands of
[06:20] (380.96s)
repositories and tests, basically.
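A hedged sketch of this commit-derived evaluation: a real commit supplies the ground truth (which files were touched, whether the repo's own tests pass), so retrieval can be scored separately from the end-to-end result. The structure and names below are illustrative, not Windsurf's actual harness.

```python
# Illustrative three-layer eval case built from a real commit (hypothetical API).
from dataclasses import dataclass

@dataclass
class CommitCase:
    intent: str                   # regenerated high-level description of the commit
    ground_truth_files: set[str]  # files the real commit actually touched
    tests_pass: bool              # result of running the commit's own tests

def retrieval_recall(retrieved: set[str], truth: set[str]) -> float:
    """Layer 1: did we find the files the real commit changed?"""
    return len(retrieved & truth) / len(truth) if truth else 1.0

case = CommitCase(
    intent="Add input validation to the signup endpoint",
    ground_truth_files={"api/signup.py", "tests/test_signup.py"},
    tests_pass=True,
)
retrieved = {"api/signup.py", "api/login.py"}  # hypothetical model output

print({
    "retrieval_recall": retrieval_recall(retrieved, case.ground_truth_files),
    # Layers 2-3 would score the generated intent and the edit, then run tests:
    "tests_passed": case.tests_pass,
})
```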
[06:23] (383.04s)
This episode is brought to you by Modal, the
[06:24] (384.88s)
cloud platform that makes AI development
[06:26] (386.80s)
simple. Need GPUs without the headache?
[06:29] (389.68s)
With Modal, just add one line of code to
[06:31] (391.60s)
any Python function And boom, it's
[06:33] (393.52s)
running in the cloud on your choice of
[06:34] (394.96s)
CPU or GPU. And the best part, you only
[06:38] (398.40s)
pay for what you use. With sub-second
[06:40] (400.96s)
container start and instant scaling to
[06:42] (402.80s)
thousands of GPUs, it's no wonder
[06:44] (404.56s)
companies like Suno, Ramp, and Substack
[06:46] (406.64s)
already trust Modal for their AI
[06:48] (408.52s)
applications. Getting an H100 is just a
[06:50] (410.80s)
PIP install away. Go to
[06:53] (413.16s)
modal.com/pragmatic to get $30 in free
[06:55] (415.36s)
credits every month. That is
[06:57] (417.44s)
modal.com/pragmatic.
[07:00] (420.64s)
This episode is brought to you by Code
[07:02] (422.32s)
Rabbit, the AI code review platform
[07:04] (424.32s)
transforming how engineering teams ship
[07:06] (426.00s)
faster without sacrificing code quality.
[07:09] (429.20s)
Code reviews are critical but
[07:11] (431.16s)
time-consuming. Code Rabbit acts as your
[07:13] (433.28s)
AI co-pilot, providing instant code
[07:15] (435.28s)
review comments and potential impacts of
[07:17] (437.36s)
every pull request. Beyond just flagging
[07:20] (440.16s)
issues, Code Rabbit provides one-click
[07:22] (442.00s)
fix solutions and lets you define custom
[07:24] (444.24s)
code quality rules using AST grep
[07:26] (446.56s)
patterns, catching subtle issues that
[07:28] (448.72s)
traditional static analysis tools might
[07:30] (450.76s)
miss. Code Rabbit has so far reviewed
[07:33] (453.44s)
more than 5 million pull requests, is
[07:35] (455.44s)
installed on 1 million repositories, and
[07:37] (457.68s)
is used by 50,000 open source projects.
[07:40] (460.64s)
Try Code Rabbit free for one month at
[07:42] (462.72s)
coderabbit.ai using the code
[07:45] (465.32s)
Pragmatic. That is coderabbit.ai
[07:48] (468.96s)
and use the code Pragmatic.
[07:51] (471.52s)
I I really like how much engineering you
[07:54] (474.00s)
can bring in because it's code and
[07:56] (476.56s)
because we have repositories because you
[07:58] (478.80s)
can use all these things that it feels
[08:00] (480.56s)
to me it gives a bit of an edge over
[08:02] (482.08s)
like some of the other use cases just as
[08:04] (484.16s)
you mentioned. No, I think you're I
[08:06] (486.32s)
think you're totally right. Like we
[08:07] (487.84s)
think about this a lot of like what
[08:09] (489.52s)
would have happened if we were to pick a
[08:10] (490.96s)
different sort of category um entirely.
[08:13] (493.44s)
It's just I think the ground truth is
[08:15] (495.28s)
just very hard. you don't even know if
[08:16] (496.88s)
the ground truth is great, right? In
[08:18] (498.88s)
some ways, in some cases, for all we
[08:20] (500.48s)
know, the ground truth is not good. But
[08:22] (502.16s)
in this case, I think it's a lot easier
[08:24] (504.16s)
because of the verifiability or kind of
[08:26] (506.00s)
if you have a good test, it's a lot
[08:27] (507.68s)
easier to verify software
[08:30] (510.32s)
and can can you give us a sense of what
[08:32] (512.56s)
is the team behind Windsurf and and also
[08:34] (514.88s)
how complex this thing is and how did it
[08:37] (517.12s)
even come about because I for all I
[08:39] (519.60s)
know, you know, like a few months ago
[08:41] (521.60s)
when this podcast started, there was no
[08:43] (523.36s)
Windsurf. There was Codeium. We actually
[08:45] (525.68s)
talked a bit about what Codeium was, a
[08:47] (527.36s)
little bit different and then out of
[08:48] (528.88s)
nowhere, boom, Windsurf comes out. A week
[08:51] (531.12s)
later, already in The Pragmatic Engineer,
[08:52] (532.96s)
about 10% of people that we surveyed
[08:55] (535.20s)
were already using it which was a I
[08:57] (537.12s)
think the second-largest, uh, usage of
[08:59] (539.52s)
tools and people were enthusiastic about
[09:01] (541.04s)
it but I assume there's more to this it
[09:03] (543.44s)
didn't just come out of you know like
[09:05] (545.44s)
nothing, right? Yeah, so happy to talk a
[09:08] (548.48s)
little bit about like our our story and
[09:10] (550.16s)
and summarize it so we started the
[09:12] (552.40s)
company now close to four years ago
[09:13] (553.92s)
which substantially before I guess the
[09:16] (556.00s)
you know, the Copilot and ChatGPT sort
[09:18] (558.08s)
of moment. Um we a lot of us at the
[09:20] (560.32s)
company as I mentioned previously worked
[09:21] (561.84s)
I would say on these hard tech problems
[09:24] (564.08s)
you know AR VR uh autonomous vehicles
[09:27] (567.20s)
and I guess at that point what we
[09:28] (568.72s)
started building out and we had a
[09:30] (570.32s)
different company name at that point it
[09:31] (571.68s)
was called Exafunction, uh, we started
[09:34] (574.16s)
building out GPU virtualization systems
[09:36] (576.00s)
so we built out systems to make it very
[09:37] (577.84s)
fast and efficient to run GPU-based
[09:40] (580.48s)
workloads and we would enable companies
[09:42] (582.32s)
to run these GPU workloads on CPUs and
[09:44] (584.80s)
we would transparently offload all GPU
[09:47] (587.28s)
computations to remote machines And that
[09:49] (589.28s)
could be that could be CUDA kernels all
[09:51] (591.20s)
the way down to full-on model calls,
[09:52] (592.88s)
right? We were it was a it was a very
[09:54] (594.56s)
lowle abstraction that we we provided
[09:56] (596.56s)
people and so much so that if another if
[09:58] (598.64s)
the remote machine died, we would be
[10:00] (600.16s)
able to reconstruct the state of what
[10:01] (601.52s)
was on that GPU on another GPU, right?
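A minimal sketch of that recovery idea, assuming a toy worker in place of a real GPU: every call goes through a proxy that keeps a replay log, so when the remote machine dies the log can rebuild the lost state on a fresh worker. The real Exafunction system operated far lower, down at the CUDA-kernel level; this only illustrates the scheme.

```python
# Hedged sketch of transparent offload with replay-based fault recovery.
class Worker:
    def __init__(self):
        self.state, self.alive = [], True
    def run(self, op: str) -> str:
        if not self.alive:
            raise ConnectionError("worker died")
        self.state.append(op)  # stand-in for GPU-side state (weights, buffers)
        return f"result of {op}"

class ReplayingProxy:
    def __init__(self):
        self.worker, self.log = Worker(), []
    def run(self, op: str) -> str:
        self.log.append(op)
        try:
            return self.worker.run(op)
        except ConnectionError:
            self.worker = Worker()        # provision a replacement machine
            for past in self.log[:-1]:    # replay to reconstruct GPU state
                self.worker.run(past)
            return self.worker.run(op)

proxy = ReplayingProxy()
proxy.run("load_weights")
proxy.worker.alive = False                 # simulate the remote machine dying
print(proxy.run("forward_pass(batch_0)"))  # transparently recovered
```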
[10:03] (603.68s)
And the main use case we targeted
[10:05] (605.36s)
were these large scale simulation
[10:06] (606.88s)
workloads uh for these deep learning
[10:08] (608.64s)
workloads that a lot of these robotics
[10:09] (609.92s)
and autonomous vehicle companies had.
[10:11] (611.60s)
And we thought, hey, the world was going
[10:13] (613.12s)
to look like that in the future. A lot
[10:14] (614.56s)
of companies would be running deep
[10:15] (615.68s)
learning workloads. What ended up
[10:17] (617.68s)
happening was in the middle of 2022, I
[10:20] (620.00s)
think text-davinci-003 sort of came out,
[10:22] (622.16s)
which was I guess the uh you know, the
[10:24] (624.84s)
GPT-3 sort of instruction model sort of
[10:27] (627.52s)
came out and I guess that changed a lot
[10:29] (629.28s)
of our priors uh like both me and my
[10:31] (631.52s)
co-founders' priors, which is to say we
[10:33] (633.68s)
thought that the set of models that
[10:34] (634.88s)
would run were going to look a lot more
[10:36] (636.40s)
homogeneous, right? If you were to
[10:38] (638.08s)
imagine in the past the number of
[10:40] (640.08s)
different models that people would run
[10:41] (641.44s)
was very diverse, right? People would
[10:43] (643.12s)
run convolutional neural nets, right?
[10:45] (645.12s)
recurrent neural nets, LSTMs, right?
[10:47] (647.52s)
Graph neural nets, they were there was a
[10:49] (649.36s)
whole suite of different types of
[10:50] (650.72s)
models. We thought in that case, hey, if
[10:52] (652.88s)
we were an infrastructure company, we
[10:54] (654.64s)
can make it a lot easier for these
[10:56] (656.08s)
companies to run these workloads. But
[10:57] (657.76s)
the thing is with text-davinci-003, we actually
[10:59] (659.60s)
thought that actually there would be a
[11:01] (661.12s)
simplification of the set of models that
[11:02] (662.56s)
would run. Why go out and train a very
[11:04] (664.96s)
custom BERT model if you could go out
[11:06] (666.80s)
and just ask a very large general model,
[11:08] (668.80s)
is this a positive or negative
[11:10] (670.16s)
sentiment? And we thought that that was
[11:11] (671.84s)
where the puck was going. I guess like
[11:13] (673.28s)
for us like we believe in scaling laws
[11:14] (674.88s)
and all these things. If it's this good
[11:16] (676.48s)
today, how good is a much smaller model
[11:18] (678.48s)
going to be in two years, it's probably
[11:19] (679.60s)
going to be way better. So what we
[11:21] (681.52s)
decided to do was actually focus on the
[11:22] (682.96s)
application layer. Take the
[11:24] (684.72s)
infrastructure that we had and actually
[11:26] (686.00s)
build out an application. And that was
[11:27] (687.36s)
what Kodium was. So we built out
[11:29] (689.36s)
extensions in all the major IDEs, right?
[11:32] (692.08s)
And and very quickly we were able to get
[11:33] (693.84s)
to get to that point. And we actually
[11:35] (695.68s)
did train our own models and run
[11:37] (697.76s)
them ourselves with our own inference
[11:39] (699.20s)
stack. And the reason why we did that is
[11:40] (700.72s)
at the time the models were not very
[11:42] (702.00s)
good. The open models were not very good
[11:44] (704.00s)
and also for the workload that we had
[11:45] (705.92s)
which was autocomplete. It was a very
[11:47] (707.60s)
weird workload. It's not very similar to
[11:49] (709.28s)
the chat workload. Code is in a very
[11:51] (711.12s)
incomplete state. You need to fill in
[11:52] (712.96s)
code in the middle of a line. There's a
[11:54] (714.56s)
bunch of reasons why this workload is
[11:56] (716.08s)
not very similar and we thought we could
[11:57] (717.68s)
do a much better job. So we provided
[11:59] (719.12s)
that because of our infrastructure
[12:00] (720.40s)
background for free to basically every
[12:02] (722.96s)
developer in the world. There was no way
[12:04] (724.32s)
to pay for the product. And then very
[12:06] (726.24s)
quickly enterprises started to reach
[12:07] (727.68s)
out. we were able to handle the security
[12:09] (729.60s)
requirements and personalization because
[12:11] (731.20s)
the companies not only care about hey
[12:13] (733.28s)
it's fast it's free but is this the best
[12:15] (735.68s)
code for my company right and we were
[12:17] (737.52s)
able to meet that workload and then fast
[12:19] (739.68s)
forward to today and I know that this
[12:21] (741.52s)
is a long answer, what we
[12:24] (744.24s)
felt was agents in the beginning of last
[12:26] (746.08s)
year would be very huge the problem was
[12:27] (747.92s)
the models were not there yet, right?
[12:30] (750.40s)
We had teams inside
[12:32] (752.80s)
the company building these agent use
[12:34] (754.40s)
cases and they were just not good enough
[12:36] (756.32s)
but the middle of last year we were like
[12:37] (757.76s)
hey it's actually going to be good
[12:39] (759.36s)
enough but the problem is the IDE is
[12:41] (761.44s)
going to be a limitation for us because
[12:43] (763.44s)
VS code is not evolving fast enough to
[12:45] (765.92s)
enable us to provide the best experience
[12:47] (767.52s)
for our end users in a world in which
[12:49] (769.44s)
agents were going to write 90% or 95%
[12:51] (771.92s)
of software. Developers would still be in
[12:53] (773.76s)
the loop but the way they would interact
[12:55] (775.52s)
with their IDEs would look markedly
[12:56] (776.96s)
different and that's why we ended up
[12:58] (778.64s)
building out Windsurf, uh, in the first
[13:00] (780.72s)
place we thought that there was a much
[13:02] (782.32s)
higher ceiling on what IDEs could provide
[13:04] (784.56s)
and with the agent product which is
[13:06] (786.16s)
cascade we were able to deliver what we
[13:08] (788.48s)
felt was a premier experience right off
[13:10] (790.24s)
the bat that we couldn't have with VS
[13:11] (791.68s)
Code. How large is the team who's
[13:13] (793.76s)
working on Windsurf and how complex is
[13:15] (795.68s)
Windsurf as a product? Like I'm not
[13:18] (798.40s)
sure how much we can quantify it. Yeah,
[13:21] (801.20s)
you know, I try to be pretty
[13:23] (803.76s)
like you know sort of uh like modest
[13:26] (806.56s)
with some of these things but just
[13:28] (808.08s)
to say we're a pretty small team. So
[13:29] (809.60s)
right now the engineering team is a bit
[13:30] (810.88s)
over 50 people. Um,
[13:34] (814.08s)
maybe that's large
[13:35] (815.60s)
compared to other startups
[13:37] (817.36s)
but if I were to say compared to other
[13:39] (819.04s)
you know large engineering projects in
[13:40] (820.96s)
the grand scheme of things like one of
[13:42] (822.72s)
the books that I read a while ago was
[13:44] (824.64s)
this book called Showstopper, right, and
[13:46] (826.64s)
it's this book about how Microsoft
[13:48] (828.80s)
built Windows NT, uh, right, and it's a
[13:51] (831.36s)
much larger team obviously but operating
[13:53] (833.28s)
systems are a very complex
[13:54] (834.88s)
piece of software but my viewpoint on
[13:56] (836.72s)
this is that this is a very very complex
[13:58] (838.88s)
piece of software in terms of where the
[14:00] (840.96s)
goalpost is which is to say I would say
[14:03] (843.36s)
the goalpost is constantly moving right
[14:05] (845.60s)
one of the one of the goals that I give
[14:07] (847.12s)
to the company is that we should be
[14:08] (848.96s)
reducing the time it takes to build
[14:10] (850.32s)
applications by 99%. Right? And I would
[14:13] (853.20s)
say pre-Windsurf it was probably 20% and
[14:15] (855.68s)
post-Windsurf it was probably over 40%,
[14:17] (857.76s)
but we are very far from 99 right we're
[14:20] (860.56s)
still like you know a 60x away from 99
[14:23] (863.52s)
right like if we if there's a if there's
[14:25] (865.36s)
a 60 units of time and we want to make
[14:27] (867.04s)
it one we're quite far so in my head
[14:29] (869.76s)
there's a lot of different engineering
[14:31] (871.20s)
projects that we have at the company in
[14:32] (872.40s)
fact like I would say over maybe close
[14:34] (874.24s)
to half of the engineering team is
[14:35] (875.84s)
working on projects that have not seen
[14:37] (877.36s)
the light of day right and and that's
[14:40] (880.00s)
like an interesting decision that I
[14:41] (881.60s)
guess we've made Because I think we
[14:43] (883.04s)
cannot be embracing incremental, right?
[14:45] (885.12s)
Like we're not going to win and be a a
[14:47] (887.28s)
valuable company to our customers if all
[14:49] (889.44s)
we're doing is changing the location of
[14:50] (890.88s)
buttons. Like I think people will like
[14:52] (892.56s)
us for great UI, but that cannot be the
[14:54] (894.64s)
only reason why we win. No. And I love
[14:56] (896.80s)
it. I mean this is, you know, when
[14:58] (898.08s)
you're a startup, I think you need to
[14:59] (899.44s)
aim really big. You cannot just do
[15:00] (900.88s)
incremental. You can do incremental
[15:02] (902.24s)
later. Hopefully, you're going to get
[15:03] (903.68s)
there. And what are some interesting
[15:05] (905.52s)
numbers that you can share about the
[15:07] (907.44s)
usage of Windsurf or the load
[15:10] (910.00s)
that you're handling? I'm assuming this
[15:12] (912.48s)
is just going to it's pretty easy to
[15:14] (914.48s)
tell it will be keep going up, right?
[15:16] (916.80s)
That's an easy prediction. No, I I think
[15:19] (919.12s)
you're right. So, one of the interesting
[15:20] (920.56s)
numbers I can provide, or a
[15:22] (922.32s)
handful of numbers is within a couple
[15:23] (923.84s)
months of the product existing, we had
[15:25] (925.44s)
like well over a million developers try
[15:26] (926.96s)
the product. Uh so, it's been growing
[15:29] (929.36s)
quite quickly. Within a month of pricing coming
[15:30] (930.96s)
out, we reached over
[15:33] (933.44s)
sort of eight
[15:35] (935.28s)
figures in ARR. Um, and I
[15:38] (938.64s)
think all of those are kind of
[15:39] (939.92s)
interesting metrics, but also on top
[15:41] (941.84s)
of that sort of we run our own model
[15:43] (943.76s)
still in a lot of places like you can
[15:45] (945.28s)
imagine the fast passive experience is
[15:47] (947.04s)
completely our own model. A lot of the
[15:48] (948.80s)
models to go out and and retrieve parts
[15:51] (951.36s)
of the codebase and find relevant
[15:53] (953.20s)
snippets are our own models. And that
[15:54] (954.88s)
system processes well over sort of 500
[15:57] (957.04s)
billion tokens of code every day right
[15:58] (958.88s)
now. So that system itself is huge. It's
[16:01] (961.36s)
a huge workload, um, that we actually
[16:03] (963.60s)
run. Yeah. And I guess the history
[16:06] (966.40s)
of Windsurf is interesting.
[16:10] (970.00s)
I understand that you've actually
[16:11] (971.92s)
been building your own models for quite
[16:13] (973.68s)
some time. You know, you've not just
[16:15] (975.36s)
started here because I think for for
[16:17] (977.84s)
most engineering teams that will be
[16:19] (979.44s)
daunting and also it's just a lot of
[16:21] (981.28s)
time, right? Like it's it's not
[16:22] (982.48s)
something that you would just like it's
[16:24] (984.08s)
harder to do from scratch. I'll say
[16:25] (985.76s)
that because nothing's impossible
[16:27] (987.04s)
here. I totally agree with you. I think
[16:29] (989.52s)
you know one of the weird things is
[16:31] (991.52s)
because of the time that we started and
[16:33] (993.52s)
the fact that we were like in the very
[16:35] (995.36s)
beginning first of all we had the
[16:36] (996.32s)
infrastructure background but we were
[16:37] (997.92s)
first saying we need to go out and build
[16:40] (1000.08s)
an autocomplete model the best model at
[16:42] (1002.32s)
the time that was open source end of
[16:44] (1004.96s)
2022 was Salesforce CodeGen and I'm not
[16:47] (1007.60s)
saying it was a bad model it was awesome
[16:48] (1008.96s)
that Salesforce did open source that
[16:50] (1010.56s)
model but it was missing a lot of
[16:52] (1012.24s)
capabilities that we needed for our
[16:54] (1014.00s)
product right it was missing fill in the
[16:56] (1016.00s)
middle which feels like a very very
[16:58] (1018.16s)
obvious capability, but the model didn't have it.
[16:59] (1019.92s)
Fill in the middle, what is that? So the
[17:02] (1022.32s)
idea of fill in the middle is basically
[17:04] (1024.16s)
if you look at the task of writing
[17:05] (1025.52s)
software, it's very different than
[17:07] (1027.36s)
chat. And maybe an example of what chat
[17:09] (1029.28s)
is, you're always appending something to
[17:10] (1030.80s)
the very end and maybe adding an
[17:12] (1032.16s)
instruction. But the problem for writing
[17:14] (1034.08s)
code is you're writing code in ways that
[17:16] (1036.08s)
are in the middle of a line, in the
[17:17] (1037.92s)
middle of a snippet of code.
[17:20] (1040.24s)
That kind of stuff. Yeah. In the middle
[17:21] (1041.52s)
of a function. And the problem there is
[17:23] (1043.28s)
actually there's a lot of issues that
[17:24] (1044.80s)
pop up, which is to say actually the
[17:26] (1046.80s)
tokenization. So these models when they
[17:28] (1048.48s)
consume files they actually tokenize the
[17:30] (1050.32s)
files right which is they don't consume
[17:32] (1052.24s)
them byte by byte, they consume them
[17:33] (1053.68s)
token by token. But the fact
[17:35] (1055.60s)
that the code when you write it at any
[17:37] (1057.76s)
given point doesn't tokenize into
[17:39] (1059.76s)
something that looks like in
[17:40] (1060.80s)
distribution and I'll give you an
[17:41] (1061.84s)
example how many times do you think in
[17:44] (1064.08s)
the training data set for these models
[17:45] (1065.52s)
does it see instead of
[17:47] (1067.48s)
return only without the RN probably
[17:50] (1070.32s)
never it probably never sees that so
[17:52] (1072.48s)
it's completely out of distribution but
[17:54] (1074.56s)
we still need to when we see REU predict
[17:57] (1077.20s)
who are going to do RN space a bunch of
[17:59] (1079.04s)
other stuff right it sounds like a very
[18:00] (1080.80s)
small detail but that is actually very
[18:02] (1082.64s)
important if you want to build a product
[18:04] (1084.24s)
and that is a capability that cannot be
[18:06] (1086.48s)
slightly post-trained onto the models
[18:08] (1088.08s)
it's actually something where like you
[18:09] (1089.52s)
need to do a non-trivial amount of
[18:10] (1090.96s)
training on top of a model or
[18:12] (1092.56s)
pre-trained to get that capability and
[18:14] (1094.40s)
it was table stakes for us to provide
[18:15] (1095.92s)
that for our users so that forced us
[18:17] (1097.76s)
very early on to actually build out our
[18:19] (1099.44s)
own models and figure out training
[18:20] (1100.96s)
recipes and make sure we could run them
[18:22] (1102.56s)
at massive scale ourselves for our end users.
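For illustration, here is roughly what a fill-in-the-middle prompt looks like, using the sentinel-token convention popularized by open models such as StarCoder. Windsurf's actual token names and training recipe are not public; this only shows the idea of conditioning on both sides of the cursor.

```python
# Minimal sketch of a fill-in-the-middle (FIM) prompt, assuming
# StarCoder-style sentinel tokens rather than Windsurf's own format.

def build_fim_prompt(prefix: str, suffix: str) -> str:
    """Rearrange (prefix, suffix) so a left-to-right LM can fill the middle."""
    return f"<fim_prefix>{prefix}<fim_suffix>{suffix}<fim_middle>"

# The cursor sits right after "retu" -- a token sequence the model almost
# never sees at training time unless it was explicitly trained on FIM data.
prompt = build_fim_prompt(
    prefix="def is_even(n):\n    retu",
    suffix="\n\nprint(is_even(4))",
)
# A FIM-trained model should complete with something like: "rn n % 2 == 0"
print(prompt)
```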
[18:24] (1104.24s)
And what are other things that are
[18:27] (1107.04s)
unique in terms of building models for
[18:29] (1109.48s)
code as opposed to the usual text
[18:32] (1112.80s)
models. I can think of things like
[18:34] (1114.56s)
you know the brackets for example and
[18:36] (1116.16s)
some languages. Maybe this is just
[18:38] (1118.24s)
naive. You have all seen so many
[18:40] (1120.64s)
more. So what makes code, what
[18:43] (1123.92s)
makes it interesting slash worthwhile to
[18:46] (1126.16s)
build your own model for code? Yeah, I
[18:50] (1130.00s)
think what you said is
[18:52] (1132.80s)
definitely like one thing. the
[18:54] (1134.32s)
fill-in-the-middle capability. I would
[18:55] (1135.76s)
say another thing you can
[18:57] (1137.28s)
do is code is like quite easy to, and
[19:00] (1140.40s)
quite easy is maybe, you know, an
[19:02] (1142.56s)
overstatement but quite easy to parse
[19:04] (1144.72s)
right? You could actually AST-parse code
[19:06] (1146.72s)
you can find relationships of the code
[19:08] (1148.96s)
uh because code is a system that is like
[19:10] (1150.80s)
evolved over time you could actually
[19:12] (1152.64s)
look at the commit history of code to
[19:14] (1154.40s)
see to build a knowledge graph of the
[19:16] (1156.16s)
codebase and you can start putting
[19:19] (1159.12s)
details. Do you do that? Yep,
[19:22] (1162.16s)
yeah, we look at the previous
[19:24] (1164.24s)
commits and And one of the things that
[19:26] (1166.08s)
it enables us to do is build a
[19:27] (1167.60s)
probability distribution of the codebase
[19:29] (1169.12s)
of conditional on you modifying a piece
[19:31] (1171.44s)
of code. What is the probability of you
[19:32] (1172.88s)
modifying another piece of code?
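A hedged sketch of mining that co-change signal from git history; Windsurf's knowledge graph is surely richer than raw pairwise counts, but the conditional probability idea can be illustrated directly:

```python
# Estimate P(file b changes | file a changes) from commit history.
import subprocess
from collections import Counter
from itertools import permutations

def commit_file_sets(repo: str):
    """Yield the set of files touched by each commit in `repo`."""
    log = subprocess.run(
        ["git", "-C", repo, "log", "--name-only", "--pretty=format:@@@"],
        capture_output=True, text=True, check=True,
    ).stdout
    for block in log.split("@@@"):
        files = {line.strip() for line in block.splitlines() if line.strip()}
        if files:
            yield files

def cochange_probabilities(repo: str) -> dict:
    file_counts, pair_counts = Counter(), Counter()
    for files in commit_file_sets(repo):
        file_counts.update(files)
        pair_counts.update(permutations(files, 2))  # both (a, b) and (b, a)
    # P(b | a) ~= co-change count / number of commits touching a
    return {(a, b): n / file_counts[a] for (a, b), n in pair_counts.items()}
```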
[19:34] (1174.56s)
So there's, you know, when you get into the
[19:36] (1176.32s)
weeds, code is very
[19:38] (1178.40s)
information-dense, right? It's testable.
[19:40] (1180.80s)
Um there's a way that it evolves people
[19:43] (1183.04s)
write comments which is which is also
[19:44] (1184.80s)
cool which is to say once a pull request
[19:46] (1186.56s)
gets created people actually say I
[19:48] (1188.16s)
didn't like this code. So there's a lot
[19:49] (1189.60s)
of signal on what good and bad looks
[19:51] (1191.20s)
like within a company. And you can use
[19:53] (1193.36s)
that actually as a way to make to
[19:55] (1195.52s)
automatically make the product much
[19:57] (1197.20s)
better for companies, right? You know,
[19:58] (1198.72s)
one of the things that I think we were
[20:00] (1200.56s)
all of us were talking about I would say
[20:02] (1202.40s)
like a couple years ago, and I
[20:04] (1204.48s)
guess we've been here in the
[20:05] (1205.92s)
space quite a long time. I know couple
[20:07] (1207.76s)
years is not a very long time in most
[20:09] (1209.20s)
categories but in this category it's you
[20:10] (1210.88s)
know, dinosaur years. Um, so, you know,
[20:15] (1215.04s)
one of the things that I think is
[20:17] (1217.04s)
kind of interesting is we in the
[20:18] (1218.56s)
beginning we were saying, hey,
[20:20] (1220.48s)
people would write all these guidelines
[20:22] (1222.24s)
and documentations on how best to use
[20:24] (1224.00s)
the product but the interesting thing is
[20:25] (1225.52s)
code is such a treasure trove you can go
[20:27] (1227.52s)
out and probably make a good first cut
[20:29] (1229.76s)
on what the best way to write software
[20:31] (1231.44s)
is inside JP Morgan Chase inside Dell
[20:34] (1234.08s)
you can go out and do that by using the
[20:35] (1235.76s)
rich history inside the company um so
[20:38] (1238.08s)
there's a lot of things that you can
[20:39] (1239.28s)
start doing autonomously as well if that
[20:41] (1241.12s)
makes sense
[20:42] (1242.40s)
Yeah, one thing I'd love to get
[20:44] (1244.88s)
your take on how it might have
[20:46] (1246.64s)
changed. A year or two ago, when
[20:50] (1250.08s)
Copilot started to become more
[20:51] (1251.68s)
popular, again an earlier version,
[20:53] (1253.84s)
companies like Sourcegraph and others
[20:55] (1255.84s)
have started to build out their
[20:57] (1257.28s)
capabilities there was this debate of
[21:00] (1260.00s)
would it be worth fine-tuning a model on
[21:03] (1263.92s)
my company's codebase talking about
[21:05] (1265.92s)
large companies let's talk JP Morgan or
[21:07] (1267.68s)
or those others and you know there were
[21:09] (1269.20s)
two strands of thought. One said like, oh,
[21:11] (1271.36s)
it's probably worth it because our code
[21:13] (1273.44s)
is so unique it it might be worth it and
[21:15] (1275.52s)
some other people were thinking like it
[21:16] (1276.88s)
might not be worth it because, uh, it
[21:19] (1279.44s)
might be too resource intensive. The
[21:21] (1281.04s)
models are too generic. Did you try this
[21:23] (1283.12s)
out and where did you land in this?
[21:24] (1284.96s)
Because I never got an answer to, you
[21:26] (1286.96s)
know what happened like what what was
[21:28] (1288.80s)
worth it, what was not worth it in the
[21:30] (1290.16s)
end. So for what it's worth, we did
[21:32] (1292.16s)
try it out. Um we built out some crazy
[21:34] (1294.32s)
infrastructure to go out and try it out.
[21:35] (1295.76s)
I you know I guess this will be the
[21:37] (1297.60s)
first place where I talk about the
[21:38] (1298.80s)
actual infrastructure. We built out
[21:40] (1300.56s)
systems. So transformers have these many
[21:42] (1302.96s)
layers, right? And if you were to
[21:44] (1304.80s)
imagine, we actually enabled
[21:46] (1306.96s)
companies to self-host: at some point in the
[21:48] (1308.96s)
past we were enabling companies to
[21:50] (1310.48s)
self-host the system and the fine tuning
[21:52] (1312.88s)
system as well. So at that time you
[21:54] (1314.96s)
had full self-hosting, that is, you built
[21:57] (1317.36s)
this out. We built out self-hosted not
[21:59] (1319.52s)
only deployment but also fine-tuning and
[22:01] (1321.60s)
the way that actually worked was
[22:03] (1323.36s)
actually quite crazy, which was to
[22:05] (1325.68s)
say um okay where do you get the
[22:07] (1327.76s)
capacity to fine-tune a model if you're
[22:09] (1329.84s)
already running it for inference like
[22:11] (1331.28s)
the company may not want to give you so
[22:12] (1332.72s)
many GPUs. So we just said hey why don't
[22:15] (1335.04s)
we use the preemptable time which is to
[22:17] (1337.60s)
say when the model is not running
[22:18] (1338.80s)
inference what if we actually go out and
[22:20] (1340.72s)
back-propagate, do backprops on the
[22:22] (1342.88s)
transformer model while this is
[22:24] (1344.48s)
happening and then what we found was oh
[22:26] (1346.24s)
the backprops take a long time and it
[22:28] (1348.08s)
might cause downtime on the inference
[22:29] (1349.76s)
side. So what we enabled was, we
[22:31] (1351.68s)
enabled the backprop to be
[22:33] (1353.84s)
preemptable on every layer of the
[22:35] (1355.76s)
transformer. That's to say, let's
[22:37] (1357.68s)
say you send an inference request and
[22:39] (1359.28s)
it's doing back
[22:40] (1360.56s)
propagation um and it's on layer 10
[22:42] (1362.88s)
it'll just stop at layer 10 and it will
[22:44] (1364.40s)
continue after your inference request
[22:45] (1365.84s)
completes. So we built a lot of crazy
[22:47] (1367.92s)
systems to actually go out and do this.
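A toy sketch of that layer-granular preemption, with stand-in layers instead of a real transformer: backprop runs one layer at a time and, between layers, yields the GPU to any pending inference request before resuming exactly where it stopped. Everything here is illustrative, not the actual system.

```python
# Hedged sketch of backprop that is preemptible at every layer boundary.
import threading

inference_pending = threading.Event()  # set by the serving path (assumed)

class Layer:
    def backward(self, grad: float) -> float:
        # Stand-in for the real per-layer gradient computation.
        return grad * 0.9

def serve_inference_request():
    print("served inference request")

def preemptible_backward(layers: list[Layer], upstream_grad: float) -> float:
    grad = upstream_grad
    for layer in reversed(layers):
        if inference_pending.is_set():
            serve_inference_request()   # inference wins the GPU first...
            inference_pending.clear()   # ...then training resumes at this layer
        grad = layer.backward(grad)
    return grad

model = [Layer() for _ in range(12)]
inference_pending.set()                 # pretend a request just arrived
print(preemptible_backward(model, upstream_grad=1.0))
```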
[22:50] (1370.32s)
I guess here's the thing we found. We
[22:52] (1372.24s)
found that fine-tuning was a bump but it
[22:54] (1374.32s)
was a very modest bump compared to what
[22:56] (1376.72s)
great personalization and great
[22:58] (1378.32s)
retrieval could do. That's what we
[22:59] (1379.76s)
found. Now does that mean fine-tuning in
[23:02] (1382.00s)
the future is not going to be valuable?
[23:03] (1383.60s)
I think actually per person fine-tuning
[23:05] (1385.68s)
could actually work quite well. I think
[23:07] (1387.76s)
though maybe some of the techniques that
[23:09] (1389.28s)
we do it are going to need to change.
[23:10] (1390.96s)
And here's the way I like to look at it,
[23:12] (1392.40s)
right? Or anytime a system, you know,
[23:14] (1394.72s)
you build a system, there are many ways
[23:16] (1396.24s)
to improve it. Some of them are much
[23:17] (1397.68s)
easier than other ways. And you can
[23:19] (1399.52s)
imagine there's a hill to climb for
[23:21] (1401.20s)
everything. And some hills are much
[23:22] (1402.64s)
easier. And the right strategy to do
[23:24] (1404.16s)
when a hill is much easier and it
[23:25] (1405.68s)
provides a lot of value is climb that
[23:27] (1407.28s)
hill fully before you go out and do
[23:29] (1409.04s)
something that's a lot harder. Because
[23:30] (1410.56s)
when you do the thing that's a lot
[23:31] (1411.76s)
harder, you are like adding some amount
[23:33] (1413.44s)
of tech debt if that's not the right
[23:35] (1415.04s)
solution. Right? If what I described to
[23:37] (1417.04s)
you in terms of like the solution of
[23:38] (1418.56s)
doing backprop on a layer-by-layer basis,
[23:40] (1420.40s)
it's a cool idea, but you could imagine
[23:42] (1422.24s)
it added a lot of technical
[23:43] (1423.84s)
complexity to the software that might
[23:45] (1425.68s)
have been unnecessary if we thought that
[23:47] (1427.52s)
purely doing better retrieval was going
[23:49] (1429.20s)
to be much better. So there's this like
[23:51] (1431.04s)
I guess there's this tight rope to kind
[23:52] (1432.64s)
of, you know, balance on, how
[23:54] (1434.72s)
you decide these things. Now,
[23:57] (1437.36s)
I was asking around I've been using
[23:58] (1438.56s)
Windsurf as well but I'm not a very
[24:00] (1440.40s)
heavy user but I have been asking around
[24:01] (1441.84s)
more heavy users and one of the biggest
[24:03] (1443.96s)
criticisms both of Windsurf but also
[24:07] (1447.20s)
of every tool in this area has been like
[24:09] (1449.20s)
look, I start off, it's good, it works well,
[24:11] (1451.76s)
I have a relatively small codebase, my
[24:13] (1453.44s)
project grows either because windsurf
[24:15] (1455.12s)
generates code or it's just a big
[24:16] (1456.88s)
project and after a while it starts to
[24:18] (1458.88s)
struggle with the context maybe it
[24:21] (1461.36s)
doesn't see, you know, part of it, it gets
[24:23] (1463.92s)
confused, etc. And clearly, as an
[24:27] (1467.28s)
engineer, I understand that it is going
[24:29] (1469.44s)
to be a problem of like you have a
[24:30] (1470.72s)
growing context window, you still want
[24:32] (1472.64s)
to have similar quality. How do you
[24:36] (1476.40s)
tackle this challenge? What what
[24:38] (1478.08s)
progress have you made? And like I
[24:40] (1480.00s)
think this is a bit of a million-dollar
[24:41] (1481.36s)
question in the sense of like if if we
[24:43] (1483.20s)
could somehow have a solution for this,
[24:45] (1485.20s)
we would be better off. Uh where have
[24:47] (1487.20s)
you gotten on this? I'm assuming this is
[24:50] (1490.00s)
a pretty common challenge and difficult.
[24:51] (1491.76s)
I think it's a very hard
[24:53] (1493.20s)
problem. Um, you're totally right.
[24:55] (1495.44s)
There's a lot of things that we can do
[24:56] (1496.80s)
which is to say obviously we need to
[24:58] (1498.64s)
work around the fact that the models
[25:00] (1500.00s)
don't have infinite context and when
[25:01] (1501.84s)
they do have larger and larger context
[25:03] (1503.76s)
you are paying a lot more and you take a
[25:05] (1505.92s)
lot more time right and developers
[25:07] (1507.76s)
usually a lot of the time don't really
[25:09] (1509.36s)
want to wait and you know one of the
[25:10] (1510.80s)
things that we have for our products we
[25:12] (1512.56s)
hate waiting. Yeah. Exactly. But one of
[25:14] (1514.64s)
the things that we have for our products
[25:15] (1515.76s)
that we've learned is if you make a
[25:17] (1517.36s)
developer wait the answer better be 100%
[25:19] (1519.44s)
correct. And I don't think we're at a
[25:20] (1520.88s)
time right now where I can guarantee you
[25:23] (1523.04s)
with like a magic wand that all of our
[25:25] (1525.20s)
cascade responses are 100% correct,
[25:27] (1527.36s)
right? I don't think we're at that right
[25:28] (1528.80s)
now. So there's a lot of a lot of things
[25:30] (1530.72s)
that we need to do that are almost
[25:32] (1532.16s)
approximations, right? Um how do we keep
[25:34] (1534.24s)
a very large context? But despite that,
[25:36] (1536.08s)
we have chats that are so long that how
[25:38] (1538.08s)
do you accurately checkpoint the past
[25:40] (1540.40s)
conversation? But that has some natural
[25:42] (1542.56s)
lossiness attached to it, right? And
[25:44] (1544.64s)
then similarly uh if the codebase gets
[25:46] (1546.88s)
very large how do we get very very
[25:48] (1548.72s)
confident that the retrieval is very
[25:50] (1550.08s)
good and we have evaluations for all of
[25:51] (1551.68s)
these things right this is not something
[25:53] (1553.12s)
which we're like shooting in the dark
[25:54] (1554.80s)
and being like hey yolo let's try a new
[25:56] (1556.80s)
approach and like give it to half of our
[25:58] (1558.32s)
users um but I think you're totally
[26:00] (1560.16s)
right there's no I don't think there's
[26:02] (1562.24s)
like a complete solution for it. What I
[26:04] (1564.16s)
think it's going to be is like a mixture
[26:05] (1565.68s)
of a bunch of things which is to say
[26:07] (1567.44s)
much better checkpointing coupled with
[26:09] (1569.44s)
better usage of context link much faster
[26:11] (1571.60s)
LMS and much better models. So it's
[26:13] (1573.52s)
going to be it's not going to be I think
[26:15] (1575.04s)
a silver bullet and by the way that
[26:16] (1576.56s)
could be tied with hey you know
[26:19] (1579.72s)
understanding the
[26:22] (1582.40s)
codebase much better from the
[26:23] (1583.76s)
perspective of the codebase that already
[26:25] (1585.04s)
existed, being able to use the knowledge graph
[26:27] (1587.28s)
right able to use a lot of the
[26:28] (1588.56s)
dependencies within the codebase a lot
[26:30] (1590.00s)
better so it's a bunch of things that I
[26:31] (1591.76s)
think are going to multiply together to
[26:32] (1592.96s)
solve the problem I don't think there's
[26:34] (1594.08s)
going to be like one silver bullet that
[26:35] (1595.60s)
makes it so you're going to be able to
[26:36] (1596.96s)
have amazingly coherent conversations
[26:38] (1598.80s)
that are very, very long, basically.
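One of those approximations, conversation checkpointing, can be sketched minimally: older turns are collapsed into a lossy summary while recent turns stay verbatim. Windsurf's actual scheme is not public; the summarizer below is a stand-in for what would be a model call.

```python
# Hedged sketch of checkpointing a long chat to fit a context budget.
def checkpoint_history(turns: list[str], summarize, budget_turns: int = 8) -> list[str]:
    """Compress everything but the last `budget_turns` turns into a summary."""
    if len(turns) <= budget_turns:
        return turns
    old, recent = turns[:-budget_turns], turns[-budget_turns:]
    summary = summarize("\n".join(old))  # lossy by nature, as noted above
    return [f"[checkpoint] {summary}"] + recent

# Toy summarizer so the sketch runs; a real one would be an LLM call.
history = [f"turn {i}" for i in range(20)]
print(checkpoint_history(history, summarize=lambda text: f"{text.count('turn')} earlier turns"))
```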
[26:41] (1601.52s)
To be fair, as an engineer, this
[26:43] (1603.68s)
might feel weird but it makes me feel a
[26:45] (1605.52s)
bit better that we're we're actually
[26:47] (1607.60s)
back to talking about like engineering
[26:49] (1609.36s)
step by step as opposed to like okay you
[26:51] (1611.28s)
know having these you know it feels like
[26:52] (1612.88s)
you get a new model not now but early on
[26:54] (1614.96s)
when we got a new model it was like oh
[26:56] (1616.56s)
my gosh, this is magic, and it took a
[26:58] (1618.16s)
while to understand how it works how
[27:00] (1620.48s)
it's broken down etc. You did mention
[27:03] (1623.12s)
your infrastructure. Can you talk a
[27:05] (1625.04s)
little bit about how we can imagine your
[27:08] (1628.00s)
hardware and backend stack if I was
[27:10] (1630.24s)
to join, uh, Windsurf as an engineer?
[27:13] (1633.12s)
Like is it going to be a bunch of, you
[27:15] (1635.12s)
know cloud deployments here and there?
[27:17] (1637.36s)
Do you self-host some of
[27:18] (1638.72s)
your GPUs? Cuz a lot of AI startups who
[27:22] (1642.24s)
are smaller or more modest, they're just
[27:24] (1644.16s)
going to you know platform as a service.
[27:26] (1646.48s)
I it sounds like you might be at the
[27:28] (1648.16s)
scale where maybe you're outgrowing this
[27:29] (1649.60s)
as well. Yeah, I think we
[27:32] (1652.40s)
might have just never done kind of, you
[27:34] (1654.16s)
know, buying off-the-shelf stuff in
[27:36] (1656.32s)
the early part of the company. Maybe due to
[27:37] (1657.76s)
your background, I keep forgetting
[27:40] (1660.08s)
this. Yeah, but even more than the
[27:41] (1661.68s)
background, I think there were cases
[27:42] (1662.72s)
where we could have and maybe
[27:44] (1664.40s)
should have. One of the reasons why we
[27:46] (1666.48s)
also didn't was very quickly we got
[27:48] (1668.24s)
brought into working with very large
[27:49] (1669.84s)
enterprises and I think the more
[27:51] (1671.36s)
dependencies you have in
[27:53] (1673.44s)
your software, it just makes it harder
[27:55] (1675.04s)
and harder for these larger companies to
[27:56] (1676.64s)
integrate the technology, right? um like
[27:58] (1678.64s)
they don't want a ton of subprocessors
[28:00] (1680.48s)
attached to it, right? We recently got
[28:02] (1682.16s)
FedRAMP compliant, FedRAMP High
[28:04] (1684.08s)
compliance. We're the only AI software
[28:06] (1686.48s)
assistant with FedRAMP High compliance
[28:08] (1688.32s)
and the only reason why that's the case
[28:10] (1690.16s)
is we've kept our systems like very
[28:12] (1692.00s)
tight, right? Um and for these
[28:15] (1695.12s)
compliances, I did some but not
[28:16] (1696.56s)
specifically FedRAMP. What do you
[28:18] (1698.00s)
need to prove that you are uh this
[28:20] (1700.16s)
compliant? Yeah, I think basically you
[28:22] (1702.56s)
need to map out, uh, the high
[28:25] (1705.52s)
levels of sort of all the interactions.
[28:27] (1707.84s)
You need to be very methodical about
[28:29] (1709.36s)
releases and how the releases make it
[28:31] (1711.28s)
into the system. You need to be very uh
[28:33] (1713.28s)
methodical about where data is persisted
[28:35] (1715.36s)
at a layer that is like probably much
[28:37] (1717.20s)
deeper than SOC 2. I think like going
[28:39] (1719.20s)
through the SOC 2 versus FedRAMP, uh, I did
[28:42] (1722.24s)
SOC 2 and that was already pretty long.
[28:44] (1724.24s)
So it was already really long. Yeah.
[28:46] (1726.56s)
It's impressive that you did this at
[28:47] (1727.92s)
a startup scale, so congrats. Yeah,
[28:51] (1731.04s)
one of the reasons why was I guess like
[28:52] (1732.64s)
one of our first customers that was a
[28:54] (1734.16s)
large enterprise was, like, Dell,
[28:56] (1736.00s)
right? Which is like not a usual first
[28:58] (1738.08s)
large enterprise and I guess for
[28:59] (1739.68s)
startups. No. For startups, definitely no.
[29:01] (1741.84s)
So it forces down a path of how do we
[29:03] (1743.84s)
build very scalable infrastructure? How
[29:05] (1745.76s)
do we make sure our systems work at a
[29:07] (1747.36s)
codebase that is 100 plus million lines
[29:09] (1749.04s)
of code? What does our GPU provisioning
[29:11] (1751.28s)
need to look like for this large a team?
[29:13] (1753.04s)
It's just forced us to become a lot more
[29:14] (1754.88s)
I guess operationally sound um for these
[29:17] (1757.76s)
kinds of problems if that makes sense.
[29:19] (1759.80s)
Yeah. And how do you deal with
[29:23] (1763.52s)
inference? You're serving
[29:25] (1765.92s)
these systems that serve probably
[29:28] (1768.16s)
billions or hundreds of billions well
[29:29] (1769.68s)
actually hundreds of billions of tokens per
[29:31] (1771.04s)
day, just as you said, with low
[29:33] (1773.00s)
latency. What smart approaches do you use
[29:36] (1776.40s)
to do this? What kind of
[29:38] (1778.16s)
optimizations have you looked into?
[29:40] (1780.32s)
Yeah, I mean like a lot as you can
[29:42] (1782.32s)
imagine. Um, one of the interesting
[29:43] (1783.76s)
things about some of the products that
[29:45] (1785.20s)
we have like the passive experience,
[29:46] (1786.80s)
latency matters a ton in a way that's
[29:48] (1788.56s)
like very different than some of these
[29:50] (1790.08s)
API providers. Like I think for the API
[29:52] (1792.40s)
providers, time to first token is
[29:54] (1794.08s)
important, but it doesn't matter that
[29:55] (1795.36s)
time to first token is 100 milliseconds.
[29:57] (1797.44s)
For us, that's the bar we are
[29:59] (1799.60s)
trying to look for, right? Can we get it
[30:01] (1801.28s)
to sub, you know, a couple hundred
[30:02] (1802.88s)
milliseconds and then hundreds of tokens
[30:04] (1804.80s)
a second, um, for the generation time.
[30:07] (1807.76s)
So much faster than what all of the
[30:09] (1809.36s)
providers are providing in terms of
[30:10] (1810.88s)
throughput as well. um just because of
[30:12] (1812.80s)
how quickly we want this product to kind
[30:14] (1814.72s)
of run. And you can imagine there's a
[30:16] (1816.08s)
lot of things that we we want to do,
[30:17] (1817.60s)
right? How do we run how do we do things
[30:19] (1819.28s)
like speculative decoding? How do we do
[30:20] (1820.88s)
things like model parallelism, right?
[30:22] (1822.56s)
How do we make sure we can actually
[30:24] (1824.32s)
batch requests properly to get the
[30:26] (1826.16s)
maximum utilization of the GPU all the
[30:28] (1828.48s)
while not hurting latency, right? That's
[30:30] (1830.72s)
an important thing. And one of the
[30:32] (1832.08s)
interesting things just to give uh some
[30:33] (1833.92s)
of the listeners some mental model, GPUs
[30:36] (1836.56s)
are are amazing. They have a lot of
[30:38] (1838.16s)
compute. If I were to draw an analogy to
[30:40] (1840.64s)
CPUs, GPUs have over sort
[30:43] (1843.84s)
of two orders of magnitude more compute
[30:45] (1845.68s)
than a CPU, right? It might actually be
[30:47] (1847.44s)
more on the on the more recent GPUs, but
[30:49] (1849.20s)
keep that in mind. But GPUs only have an
[30:51] (1851.68s)
order of magnitude more memory bandwidth
[30:53] (1853.36s)
than a CPU. So what that actually means
[30:55] (1855.28s)
is if you do things that are not compute
[30:57] (1857.36s)
intense, you will be memory bound,
[30:59] (1859.36s)
right? So that necessarily means to get
[31:01] (1861.36s)
the most out of the compute of your
[31:02] (1862.88s)
processor, you need to be doing a lot of
[31:04] (1864.40s)
things in parallel. But if you need to
[31:06] (1866.40s)
wait to do a lot of things in parallel,
[31:08] (1868.24s)
you're going to be hurting the latency.
[31:09] (1869.60s)
So there's there's all of these
[31:11] (1871.36s)
different trade-offs that we need to
[31:12] (1872.56s)
make to ensure a quality of experience
[31:14] (1874.88s)
for our users that we think is high for
[31:16] (1876.72s)
the product.
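A back-of-the-envelope model of that compute-versus-bandwidth trade-off, using assumed round numbers for a modern datacenter GPU: at small batch sizes a decode step is bounded by streaming the weights from memory, so batching raises utilization, but only by making individual requests wait.

```python
# Hedged roofline-style sketch with assumed hardware numbers (not exact):
# ~1e15 FLOP/s of compute vs ~3e12 bytes/s of HBM bandwidth.
PEAK_FLOPS = 1e15        # FLOP/s (assumed)
PEAK_BANDWIDTH = 3e12    # bytes/s (assumed)

def decode_step_time(params: float, batch: int, bytes_per_param: float = 2.0) -> float:
    """Lower-bound time for one token-decode step over the whole model."""
    # Every step must stream all weights from memory once...
    memory_time = params * bytes_per_param / PEAK_BANDWIDTH
    # ...while each sequence in the batch does ~2 FLOPs per parameter.
    compute_time = 2 * params * batch / PEAK_FLOPS
    return max(memory_time, compute_time)

# For a hypothetical 10B-parameter model, batch 1 is pure memory streaming;
# compute only catches up at batches in the hundreds.
for batch in (1, 8, 64, 512):
    print(batch, f"{decode_step_time(10e9, batch) * 1e3:.2f} ms")
```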
[31:18] (1878.64s)
And we've obviously mapped out all of these. We've seen how, hey,
[31:20] (1880.80s)
like if we change the latency by this
[31:22] (1882.88s)
much, what does this change in terms of
[31:24] (1884.48s)
people's willingness to use the product?
[31:26] (1886.08s)
And and it's very stark, right? Like a
[31:27] (1887.92s)
10 millisecond increase in latency
[31:29] (1889.84s)
affects people's willingness to use the
[31:31] (1891.52s)
product materially, right? It's
[31:33] (1893.28s)
percentage points that we're talking
[31:34] (1894.48s)
about. So these are all parts of the
[31:36] (1896.24s)
inference stack that we've needed to
[31:37] (1897.60s)
optimize. And is latency important
[31:40] (1900.64s)
enough, or does the location
[31:43] (1903.12s)
factor into this? Like, physically, how
[31:45] (1905.52s)
close people, you know, using Windsurf are
[31:48] (1908.56s)
to wherever your servers and
[31:50] (1910.88s)
then your GPUs are running. You need to
[31:53] (1913.20s)
worry about that as well. You do need to
[31:55] (1915.04s)
worry about that like speed of light
[31:56] (1916.32s)
starts mattering.
[31:58] (1918.08s)
Interestingly, you know, this is not
[31:59] (1919.52s)
something I would have expected, but we
[32:00] (1920.96s)
do have users in India and
[32:02] (1922.96s)
interestingly, the speed of light is not
[32:04] (1924.88s)
actually what is bottlenecking
[32:07] (1927.20s)
their performance. It's actually the
[32:09] (1929.20s)
local network. So, just the time it
[32:10] (1930.88s)
takes for the packet to get from maybe
[32:12] (1932.80s)
like from their home to the major ISP is
[32:16] (1936.40s)
actually somehow there's a lot of
[32:18] (1938.08s)
congestion there and that's the kind of
[32:19] (1939.92s)
stuff that we need to kind of deal with.
[32:21] (1941.60s)
Um, but by the way, that is something
[32:23] (1943.20s)
that we just cannot solve right now. So
[32:25] (1945.20s)
you're totally right. The data center
[32:26] (1946.64s)
placement matters. Like for instance, if
[32:28] (1948.48s)
you put a data center in Sydney and
[32:30] (1950.48s)
you have people in Europe, they're not
[32:31] (1951.92s)
going to be happy about the latency. So
[32:33] (1953.60s)
we do think about like where the
[32:34] (1954.96s)
location of our GPUs are to make sure
[32:36] (1956.80s)
that we do have good performance. But
[32:38] (1958.64s)
there are some places where there are
[32:39] (1959.84s)
some issues that even we can't get
[32:41] (1961.20s)
around basically.
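For a sense of the floor that physics alone sets here, a quick propagation-delay estimate; the distances and the fiber speed below are approximations:

```python
# Rough lower bound on round-trip time from propagation delay alone.
# Light in fiber travels at about two-thirds of c; distances are rough.

SPEED_IN_FIBER_KM_S = 200_000  # ~2/3 the speed of light in vacuum

def min_rtt_ms(distance_km):
    """Best-case round trip, before any routing or congestion."""
    return 2 * distance_km / SPEED_IN_FIBER_KM_S * 1000

print(min_rtt_ms(16_000))  # Sydney to Europe: ~160 ms at minimum
print(min_rtt_ms(100))     # nearby data center: ~1 ms
```

No, the last time I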
[32:44] (1964.00s)
heard this complaint before Windsurf
[32:45] (1965.92s)
was from someone who uses Windsurf and
[32:48] (1968.32s)
other tools a lot. He said that
[32:50] (1970.40s)
specifically
[32:52] (1972.16s)
for one of the tools he can tell that
[32:55] (1975.12s)
the data centers are far away because
[32:57] (1977.20s)
it's just slow. Cloud development
[32:58] (1978.88s)
environments had the exact same thing
[33:00] (1980.96s)
because they were similar, right?
[33:02] (1982.56s)
I'm not sure they're as popular
[33:04] (1984.24s)
right now, but there was a time where it
[33:05] (1985.52s)
looked like they might be the future: you
[33:06] (1986.96s)
just log on to your remote environment,
[33:09] (1989.28s)
which is running on CPUs or GPUs
[33:11] (1991.04s)
somewhere else. Again, I think it
[33:13] (1993.76s)
has to do with typing: when I'm
[33:15] (1995.28s)
using it, I'm just used to
[33:17] (1997.04s)
sub-second, probably sub few hundred
[33:19] (1999.04s)
millisecond responses. I just
[33:21] (2001.36s)
notice it, right? You feel
[33:23] (2003.12s)
it's slow and it
[33:25] (2005.28s)
just bothers you.
[33:28] (2008.48s)
No, I agree. I think if I had to
[33:30] (2010.72s)
see, every time I typed a
[33:32] (2012.72s)
keystroke, the key show up a couple
[33:33] (2013.84s)
hundred milliseconds later, I
[33:35] (2015.60s)
would rage quit. That
[33:37] (2017.36s)
would be a terrible experience for me.
[33:39] (2019.44s)
How do you deal with indexing of the
[33:42] (2022.08s)
code? So you're going to be
[33:43] (2023.36s)
indexing; it depends on the code
[33:44] (2024.88s)
base, it'll be more or less, but if you
[33:46] (2026.24s)
add it up I'm sure we're talking
[33:47] (2027.52s)
billions of lines or a lot more, and
[33:50] (2030.24s)
for your enterprise customers you
[33:52] (2032.08s)
might actually have
[33:53] (2033.76s)
hundreds of millions or even more
[33:55] (2035.68s)
lines of code. Is there anything
[33:59] (2039.36s)
novel or interesting that you're using
[34:01] (2041.20s)
or is it just kind of tried and proven
[34:04] (2044.24s)
things for example that search engines
[34:06] (2046.08s)
might use? It's a little bit of
[34:08] (2048.40s)
both, to be honest, and what I mean by
[34:10] (2050.24s)
that is it's not a very clean answer. We
[34:13] (2053.12s)
try approaches that are embedding based.
[34:14] (2054.96s)
We have approaches that are keyword
[34:16] (2056.32s)
based on the indexing side.
[34:18] (2058.80s)
Interestingly, one of the
[34:20] (2060.40s)
approaches that we've taken that's very
[34:21] (2061.92s)
different from search, and maybe
[34:23] (2063.36s)
systems like Google actually do this, is
[34:25] (2065.60s)
we don't only look at the
[34:28] (2068.40s)
retrieval; we do a lot of computation at
[34:31] (2071.04s)
retrieval time. So what that means is
[34:33] (2073.12s)
let's say you want to
[34:34] (2074.72s)
ask a question; one of the
[34:36] (2076.56s)
things that you can do is
[34:37] (2077.68s)
ask it to an embedding store and
[34:40] (2080.00s)
get a bunch of locations. What we found
[34:41] (2081.68s)
was the recall of that operation was
[34:43] (2083.36s)
quite low, and one of the
[34:46] (2086.08s)
reasons why that happens is embedding
[34:49] (2089.20s)
search is a little bit lossy. Like, let's
[34:50] (2090.88s)
say I was to go to a codebase and ask,
[34:53] (2093.04s)
hey, give me all cases where this
[34:55] (2095.92s)
Spring Boot version X type
[34:58] (2098.64s)
function was there. I don't think
[35:00] (2100.72s)
anyone would believe embedding search
[35:02] (2102.24s)
would be comprehensive right because
[35:04] (2104.32s)
it's just not. You're taking
[35:06] (2106.00s)
something that is very high
[35:07] (2107.04s)
dimensionality and reducing it to
[35:08] (2108.80s)
something very low dimensionality
[35:10] (2110.48s)
without any knowledge of the question.
[35:12] (2112.24s)
That's like the most important thing. So
[35:14] (2114.08s)
it needs to somehow encode everything
[35:16] (2116.08s)
that could be relevant for all the
[35:18] (2118.24s)
possible questions. So instead what we
[35:20] (2120.00s)
decided to do is take a variety of
[35:21] (2121.68s)
approaches to retrieve a large amount of
[35:24] (2124.00s)
data. That could include the
[35:25] (2125.20s)
knowledge graph, that could
[35:26] (2126.56s)
include the dependencies from the
[35:27] (2127.68s)
abstract syntax tree, that could include
[35:29] (2129.60s)
keyword search, that could
[35:31] (2131.52s)
include embedding search, and you
[35:32] (2132.88s)
fuse them all together. Then after
[35:34] (2134.16s)
that we throw compute at this and
[35:35] (2135.92s)
actually go out and process large chunks
[35:38] (2138.72s)
of the codebase at inference time and
[35:42] (2142.24s)
say, hey, these are the
[35:44] (2144.00s)
most relevant snippets. This gives us
[35:46] (2146.00s)
much higher precision and recall on
[35:48] (2148.32s)
the retrieval side. And
[35:50] (2150.08s)
by the way, that is very
[35:51] (2151.76s)
important for an agent, because imagine
[35:53] (2153.28s)
if an agent doesn't have
[35:54] (2154.80s)
access to and doesn't deeply understand the
[35:56] (2156.64s)
codebase, all the while the codebase is
[35:58] (2158.24s)
much larger than the context length of
[35:59] (2159.60s)
what an agent is able to take in, right?
[36:01] (2161.36s)
So optimizing the
[36:03] (2163.28s)
precision and recall of the system is
[36:04] (2164.72s)
something that we spent a lot
[36:05] (2165.92s)
of time on and built a lot of systems for.
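As a sketch of what fusing several retrievers and then spending compute at query time can look like, here is a minimal Python illustration. It is not Windsurf's implementation; the retriever callables and the expensive score_snippet scorer are hypothetical stand-ins:

```python
# Minimal sketch of fused retrieval plus compute-heavy reranking.
# Illustration only: `retrievers` and `score_snippet` are hypothetical.

from collections import defaultdict

def reciprocal_rank_fusion(result_lists, k=60):
    """Fuse several ranked lists of snippet IDs into one ranking.

    Each retriever (keyword, embedding, AST dependencies, ...) covers a
    different 'circle'; fusing them raises overall recall.
    """
    scores = defaultdict(float)
    for results in result_lists:
        for rank, snippet_id in enumerate(results):
            scores[snippet_id] += 1.0 / (k + rank + 1)
    return sorted(scores, key=scores.get, reverse=True)

def retrieve(query, retrievers, score_snippet, budget=200, top_n=20):
    # Step 1: cast a wide net with every cheap retriever.
    candidates = reciprocal_rank_fusion([r(query) for r in retrievers])

    # Step 2: spend compute at query time -- score a large candidate set
    # with an expensive model to get high precision on the final list.
    scored = [(score_snippet(query, sid), sid) for sid in candidates[:budget]]
    scored.sort(reverse=True)
    return [sid for _, sid in scored[:top_n]]
```

The design point matches the description above: the cheap retrievers are different recall circles that get unioned, and the expensive scoring pass at inference time buys back precision.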
[36:08] (2168.08s)
It's interesting, because
[36:09] (2169.72s)
it shows how well suited code is:
[36:13] (2173.04s)
you can more easily work with it,
[36:14] (2174.64s)
especially with certain keywords. For
[36:16] (2176.00s)
example, in some languages I can imagine
[36:17] (2177.68s)
you can even
[36:18] (2178.80s)
list all the keywords that are
[36:20] (2180.40s)
pretty common, and you can decide if
[36:21] (2181.92s)
something is a keyword or something special,
[36:24] (2184.00s)
and if it's a keyword, you can
[36:25] (2185.68s)
just handle it directly. And it's
[36:27] (2187.60s)
interesting how you can combine the
[36:28] (2188.88s)
old school from
[36:31] (2191.52s)
before LLMs and then add the best parts
[36:34] (2194.08s)
of LLMs, but not forget
[36:36] (2196.64s)
what worked before.
[36:39] (2199.52s)
That's right. I just wonder if
[36:41] (2201.84s)
there's any other industry that
[36:44] (2204.48s)
has this: we have
[36:46] (2206.24s)
this lower-dimensionality space
[36:48] (2208.32s)
in terms of the grammar and all
[36:50] (2210.64s)
these things. We understand the usage
[36:52] (2212.16s)
pretty well, and then the users are power
[36:54] (2214.32s)
users, the same
[36:56] (2216.72s)
people who could actually build
[36:58] (2218.48s)
this tool. Yeah. I
[37:00] (2220.80s)
feel like Google's
[37:03] (2223.12s)
system is probably ridiculously complex
[37:04] (2224.96s)
and sophisticated, for obvious reasons:
[37:07] (2227.04s)
for one, they've been
[37:08] (2228.96s)
doing this for so long, and they've
[37:10] (2230.40s)
been at the top
[37:12] (2232.16s)
for such a long time. And
[37:14] (2234.08s)
on top of that, the
[37:16] (2236.40s)
monetary value they get from delivering
[37:18] (2238.24s)
great search is so high, given ads, that
[37:21] (2241.04s)
they are incentivized to throw a lot of
[37:22] (2242.64s)
compute even at query time,
[37:25] (2245.20s)
right, to make sure that the quality of
[37:26] (2246.72s)
suggestions is really good. So I
[37:28] (2248.72s)
assume they're doing a lot of
[37:30] (2250.32s)
tactics there. Obviously, I'm not privy
[37:31] (2251.92s)
to all the details of the system.
[37:34] (2254.16s)
So, I don't know. Well, it's
[37:35] (2255.76s)
interesting, because I would have
[37:36] (2256.88s)
agreed with you until recently, but
[37:38] (2258.80s)
there are some search engines that are
[37:40] (2260.32s)
getting really good results. So I
[37:43] (2263.12s)
wonder if Google is less focused on
[37:45] (2265.44s)
the actual haystack and the needle
[37:47] (2267.44s)
and maybe more on revenue, or maybe
[37:48] (2268.88s)
they're doing it invisibly. I'm
[37:50] (2270.40s)
sure they're doing an amazing job under
[37:51] (2271.84s)
the hood, by the way, but I wonder if
[37:53] (2273.60s)
some of that knowledge has been commoditized,
[37:54] (2274.96s)
but you know, we'll see. But moving on
[37:57] (2277.12s)
from indexing, in terms of databases,
[37:59] (2279.84s)
what kind of databases do you use and
[38:01] (2281.68s)
what challenges are they giving you?
[38:03] (2283.28s)
Again, I'm
[38:05] (2285.20s)
assuming you're not just going to be
[38:06] (2286.56s)
happy with the usual let's throw
[38:08] (2288.08s)
everything in Postgres, or,
[38:11] (2291.60s)
actually, you might be able to; I don't
[38:13] (2293.04s)
know, it sounds like these days Postgres
[38:15] (2295.04s)
can be used surprisingly well for
[38:16] (2296.80s)
even embeddings. Yeah, you know, I
[38:19] (2299.20s)
think we do a combination of
[38:21] (2301.52s)
things. So we do some amount of
[38:22] (2302.88s)
local indexing. We do some remote
[38:24] (2304.88s)
indexing as well. Local indexing as in
[38:27] (2307.52s)
on the user machine? On the user's
[38:29] (2309.28s)
machine. That helps us in some
[38:31] (2311.28s)
ways. The benefit of that is it helps you
[38:33] (2313.04s)
build up some understanding
[38:35] (2315.04s)
of the codebase.
[38:36] (2316.80s)
The problem is that understanding
[38:38] (2318.56s)
changes very quickly as the user starts
[38:40] (2320.08s)
changing code and checking out new
[38:42] (2322.16s)
branches, and you don't want to
[38:44] (2324.00s)
say all of your information
[38:45] (2325.44s)
about the codebase needs to be thrown
[38:46] (2326.64s)
away because of that. So it's good to
[38:47] (2327.92s)
have some information about
[38:49] (2329.52s)
the user's history and what
[38:51] (2331.84s)
they've done locally
[38:53] (2333.20s)
stored. In terms of remote, I think
[38:55] (2335.28s)
it would be a lot simpler than people
[38:56] (2336.96s)
would imagine. One of the complexities
[38:58] (2338.72s)
of our product, the reason why the
[39:00] (2340.32s)
product is very complex is actually the
[39:02] (2342.16s)
fact that we need to run all of this GPU
[39:04] (2344.24s)
infrastructure, right? That's actually a
[39:06] (2346.08s)
large chunk of the complexity because if
[39:07] (2347.68s)
you were to look at our QPS, our QPS is
[39:09] (2349.92s)
high, but it is not like tens of
[39:11] (2351.84s)
thousands of QPS, right? Actually, it
[39:14] (2354.80s)
doesn't need to be that high
[39:16] (2356.16s)
because in some ways
[39:18] (2358.72s)
each of the queries that
[39:20] (2360.80s)
is happening is actually a really
[39:22] (2362.08s)
expensive query. It's doing trillions of
[39:23] (2363.84s)
operations remotely. So actually the
[39:25] (2365.52s)
complexity of the problem is how do you
[39:27] (2367.12s)
optimally do that process, right? So we
[39:30] (2370.24s)
can actually get away with things like
[39:31] (2371.44s)
Postgres. In fact, I would
[39:33] (2373.44s)
say I like to keep things pretty simple
[39:35] (2375.12s)
if it's possible to keep things
[39:36] (2376.72s)
very simple, and we should not be rolling
[39:38] (2378.88s)
any type of our own database. I
[39:40] (2380.88s)
think databases are very very complex
[39:42] (2382.72s)
pieces of technology. I think we're good
[39:44] (2384.32s)
engineers but we're definitely not good
[39:45] (2385.60s)
enough to build our own database on the
[39:47] (2387.52s)
side. And then for local
[39:49] (2389.44s)
indexing, what database do you use?
[39:52] (2392.08s)
Yeah, we have our own combination:
[39:53] (2393.84s)
a sort of SQL-based database. We
[39:55] (2395.76s)
have a local SQL database, and then
[39:57] (2397.44s)
some embedding databases
[39:59] (2399.76s)
that we store locally as well.
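As one plausible shape for such a local index, here is a minimal sketch pairing a SQLite full-text table with a table of embedding vectors on the user's machine. The schema and the embed callable are hypothetical, not Windsurf's actual on-disk format:

```python
# Minimal sketch of a local code index: SQLite FTS5 for keyword search
# plus a table of packed embedding vectors. Illustration only.

import sqlite3, array

db = sqlite3.connect("local_index.db")
db.executescript("""
    CREATE VIRTUAL TABLE IF NOT EXISTS chunks USING fts5(path, body);
    CREATE TABLE IF NOT EXISTS vectors (
        rowid INTEGER PRIMARY KEY,  -- matches the chunks rowid
        vec   BLOB                  -- packed float32 embedding
    );
""")

def index_chunk(path, body, embed):
    # embed() is a hypothetical function returning a list of floats.
    cur = db.execute("INSERT INTO chunks (path, body) VALUES (?, ?)",
                     (path, body))
    vec = array.array("f", embed(body))  # serialize floats to a blob
    db.execute("INSERT INTO vectors (rowid, vec) VALUES (?, ?)",
               (cur.lastrowid, vec.tobytes()))

def keyword_search(query, limit=10):
    # FTS5 handles tokenization and ranking on the keyword side.
    return db.execute(
        "SELECT path FROM chunks WHERE chunks MATCH ? LIMIT ?",
        (query, limit)).fetchall()
```

What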
[40:02] (2402.16s)
is your view on the value of embedding
[40:04] (2404.64s)
databases? This has been an ongoing
[40:06] (2406.08s)
debate since ChatGPT
[40:08] (2408.56s)
became big. Again, there were two
[40:10] (2410.56s)
schools of thought. One is we do need
[40:12] (2412.64s)
embedding databases,
[40:14] (2414.48s)
because they can give us
[40:15] (2415.76s)
vector search. They can give us all
[40:16] (2416.96s)
these other features that LLMs and
[40:19] (2419.04s)
embeddings will need. And the other
[40:20] (2420.72s)
school of thought is, well, let's just
[40:21] (2421.92s)
expand relational databases. We add a
[40:23] (2423.60s)
few extra indexes and boom, we're done.
[40:25] (2425.76s)
You're more of a
[40:27] (2427.76s)
user of this, but you're a heavy user
[40:29] (2429.52s)
at Windsurf and
[40:32] (2432.04s)
Codeium. What pros and cons are you
[40:34] (2434.72s)
seeing? I'm just trying to get you to
[40:36] (2436.40s)
like go to one direction the other. It's
[40:38] (2438.72s)
a good question. So my our viewpoint on
[40:41] (2441.36s)
embeddings are probably that they are
[40:42] (2442.80s)
not uh they don't solve uh a problem by
[40:45] (2445.20s)
themselves. They actually just do not.
[40:47] (2447.44s)
So the answer is going to be mixed. And
[40:49] (2449.12s)
then the question is why do we even do
[40:50] (2450.48s)
it in the first place? Right? And I
[40:52] (2452.24s)
think it really boils down to it's a
[40:53] (2453.84s)
it's a recall problem, right? When you
[40:55] (2455.60s)
want to do good retrieval, you need the
[40:57] (2457.76s)
input to what you're willing to consider
[41:00] (2460.08s)
to be large and high recall, right? And
[41:03] (2463.04s)
if you were to think about it, the
[41:04] (2464.32s)
problem is if you only have something
[41:05] (2465.84s)
like keyword search and you have a very
[41:07] (2467.92s)
very large uh sort of uh codebase
[41:10] (2470.64s)
actually what happens if the user typos
[41:12] (2472.40s)
something, right? Then your recall is
[41:14] (2474.08s)
going to be bad. So I the way I like to
[41:16] (2476.48s)
think about it is each of these
[41:17] (2477.44s)
approaches keyword search right um like
[41:20] (2480.00s)
sort of knowledge graph based retrieval
[41:21] (2481.68s)
all of them they're all like different
[41:22] (2482.96s)
circles and what you're trying to do is
[41:25] (2485.04s)
get get something where the union of
[41:26] (2486.56s)
these circles is going to give you the
[41:28] (2488.56s)
highest recall uh ultimately for the
[41:30] (2490.64s)
retrieval query and I think embedding
[41:32] (2492.16s)
can give you good recall because it is
[41:34] (2494.80s)
able to summarize or actually able to
[41:37] (2497.60s)
distill somewhat of semantic information
[41:39] (2499.36s)
about the about the chunk of code the a
[41:41] (2501.76s)
or the file or the directory and all
[41:43] (2503.68s)
this other stuff. So what I would say is
[41:45] (2505.36s)
it's a tool in the toolkit, right? It's
[41:47] (2507.84s)
not like you cannot build our product
[41:50] (2510.08s)
entirely with an embedding system, but
[41:51] (2511.68s)
also does the embedding system help? I
[41:53] (2513.12s)
think it actually does help, right? It
[41:54] (2514.24s)
does improve our recall metrics and our
[41:55] (2515.68s)
precision metrics.
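The union-of-circles argument reduces to simple set arithmetic; the sets in this toy example are invented purely to show the calculation, not measured from any real system:

```python
# Toy illustration of the 'union of circles' recall argument: each
# retriever alone misses relevant chunks, but their union covers more.

relevant = {"a", "b", "c", "d", "e"}   # ground-truth relevant chunks
keyword = {"a", "b", "x"}              # misses typo'd or renamed code
embedding = {"b", "c", "d", "y"}       # lossy, but semantic
knowledge_graph = {"e"}                # dependency-based hits

def recall(retrieved):
    return len(retrieved & relevant) / len(relevant)

print(recall(keyword))                                 # 0.4
print(recall(embedding))                               # 0.6
print(recall(keyword | embedding | knowledge_graph))   # 1.0
```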
[41:58] (2518.00s)
So I talked with your head of
[42:00] (2520.64s)
research Nicholas Moy and he told me
[42:03] (2523.12s)
about a really interesting challenge
[42:04] (2524.40s)
that you're facing, which he called
[42:06] (2526.00s)
the split-brain situation. He
[42:09] (2529.04s)
was basically saying that it's
[42:10] (2530.32s)
almost like the team and everyone on
[42:12] (2532.08s)
the team needs to have two brains. One
[42:13] (2533.36s)
is just being aggressively in the
[42:15] (2535.28s)
present, shipping improvements as you
[42:16] (2536.72s)
go, but also hold a long-term vision
[42:18] (2538.80s)
where you're building for the long run.
[42:21] (2541.08s)
How do you do this? Like how did you
[42:23] (2543.52s)
start doing it, and how do you keep
[42:25] (2545.12s)
doing it? You did mention earlier,
[42:26] (2546.64s)
right, that half the team is working on
[42:28] (2548.32s)
other stuff, but do you
[42:29] (2549.92s)
split people, so people
[42:31] (2551.68s)
focus on short-term or long-term, or
[42:34] (2554.48s)
does everyone, including you, juggle these
[42:36] (2556.56s)
things in your head day to day? It's an
[42:39] (2559.04s)
interesting one. Yeah, I don't want to
[42:40] (2560.72s)
give myself that much credit
[42:42] (2562.40s)
here. I think our engineers are
[42:44] (2564.64s)
probably the ones who should be given most
[42:46] (2566.16s)
of the credit here. But I think in terms
[42:47] (2567.60s)
of like maybe company strategic
[42:49] (2569.20s)
direction, both me and my co-founder,
[42:51] (2571.44s)
the CTO of the company, try to
[42:54] (2574.64s)
think a lot about how we disrupt
[42:56] (2576.60s)
ourselves right? Um because I think it's
[42:59] (2579.28s)
very easy to get into the state where
[43:01] (2581.28s)
hey I added this cool button I added
[43:03] (2583.28s)
this way to control X with a knob
[43:05] (2585.84s)
and you keep going down this path and
[43:07] (2587.60s)
yeah your users get very happy but what
[43:09] (2589.92s)
happens if tomorrow I told you users
[43:11] (2591.84s)
don't need to do that and it's an
[43:13] (2593.76s)
amazing experience and it's like a
[43:15] (2595.60s)
better experience. Your users are going
[43:17] (2597.12s)
to feel like, why do I need to do
[43:18] (2598.64s)
this? So here's the thing. Users are
[43:20] (2600.56s)
right up to a certain point, right? They
[43:23] (2603.76s)
will not be able to see it, and by the
[43:26] (2606.48s)
way, if they can, then we should not be
[43:28] (2608.08s)
doing this. They will not be able to see
[43:29] (2609.44s)
exactly what the future solution should
[43:31] (2611.04s)
be, right? If our users can see the
[43:32] (2612.64s)
future solution better than we can, like
[43:34] (2614.96s)
we should just pack up our bags and
[43:36] (2616.40s)
leave, right, at that point. Like what
[43:38] (2618.00s)
are we actually doing here? So I think
[43:39] (2619.84s)
basically, you know, you have this
[43:41] (2621.60s)
tension here where you need to
[43:43] (2623.28s)
build features to make the product more
[43:45] (2625.60s)
usable today, right? And our users are
[43:47] (2627.76s)
100% right. They understand this. They
[43:49] (2629.76s)
face pain through many different
[43:51] (2631.68s)
axes that we don't and we should listen
[43:53] (2633.28s)
to them. But also at the same time, we
[43:55] (2635.52s)
might have an opinionated stance on
[43:57] (2637.36s)
where coding and where these models and
[43:59] (2639.36s)
where this product can go that we should
[44:01] (2641.04s)
go out and build towards and we should
[44:03] (2643.28s)
be expending a large amount
[44:06] (2646.16s)
of our engineering capital on that. And
[44:09] (2649.44s)
can you talk about some of the
[44:11] (2651.28s)
bets that you're making? You know, not
[44:13] (2653.76s)
necessarily giving away everything, but
[44:15] (2655.84s)
some promising directions that
[44:17] (2657.76s)
might or might not work out or even in
[44:19] (2659.36s)
the past some promising directions
[44:20] (2660.72s)
that maybe did not work out. Yeah, I'll
[44:22] (2662.32s)
tell you a lot of them. Yeah. So,
[44:24] (2664.40s)
we failed a lot. And I think
[44:26] (2666.56s)
failing is great. And one of the
[44:28] (2668.56s)
things that I tell our engineers is
[44:30] (2670.16s)
like engineering is not like a factory
[44:32] (2672.24s)
building, right? It's actually, you
[44:34] (2674.24s)
know, you have a hypothesis, you go in
[44:35] (2675.92s)
and you shouldn't be penalized if you
[44:37] (2677.52s)
failed. Actually, I love the idea of
[44:40] (2680.48s)
hey, an idea sounds interesting. We
[44:42] (2682.24s)
tried it and it didn't work because we
[44:43] (2683.92s)
at least learned something and learning
[44:45] (2685.68s)
something is awesome. And I'll give you
[44:46] (2686.80s)
an example. The agent work that we did,
[44:48] (2688.64s)
we didn't even start at the beginning of
[44:50] (2690.32s)
last year. We started even before the
[44:51] (2691.68s)
beginning of last year. It was not
[44:53] (2693.12s)
working for many months. And actually
[44:55] (2695.28s)
Nick Moy, who you
[44:57] (2697.28s)
probably spoke with, was the one who was
[44:58] (2698.80s)
working on some of this stuff. And
[45:00] (2700.80s)
you know for a long time a lot of what
[45:02] (2702.56s)
he was doing was just not working. And
[45:04] (2704.64s)
he would come to us and we would say
[45:06] (2706.48s)
okay fine. It doesn't seem like it's
[45:07] (2707.92s)
working. So we're definitely not going
[45:08] (2708.88s)
to ship this. But let's keep doing
[45:10] (2710.80s)
it, let's keep working on it,
[45:13] (2713.04s)
because we believe it's going to get
[45:14] (2714.24s)
better and better. But it was
[45:15] (2715.84s)
failing for a long time, right? We came
[45:18] (2718.00s)
out with a review product
[45:20] (2720.08s)
at the beginning of last year or around
[45:22] (2722.40s)
then, called Forge, for code reviews. We
[45:25] (2725.12s)
thought it was kind of useful
[45:26] (2726.56s)
internally at the company and we thought
[45:27] (2727.76s)
we could continue to improve it. People
[45:29] (2729.36s)
did not find it that useful, right? It
[45:31] (2731.76s)
was not actually that useful. And you
[45:33] (2733.60s)
know, we were going in with the
[45:35] (2735.60s)
assumption that code reviews take a long time:
[45:37] (2737.36s)
what if we could help people? And the
[45:38] (2738.72s)
fact of the matter was the way we
[45:40] (2740.48s)
thought we could help people wasn't
[45:41] (2741.68s)
actually material enough for people to
[45:43] (2743.04s)
want to take on this new tool, right?
[45:46] (2746.16s)
And there's a lot of things that
[45:48] (2748.16s)
obviously we've tried in
[45:50] (2750.24s)
the past that just didn't work the way
[45:51] (2751.76s)
we thought they would. And you
[45:54] (2754.00s)
know, for me, I think I would be totally
[45:55] (2755.92s)
fine if 50% of the bets we make don't
[45:57] (2757.76s)
work. Yeah. A lot of startups
[46:00] (2760.56s)
say that, and then after a while, what I
[46:02] (2762.56s)
notice is, as a company becomes bigger,
[46:04] (2764.64s)
I saw this at Uber, it's actually not
[46:06] (2766.56s)
really the case: on paper
[46:09] (2769.04s)
failure is embraced, but
[46:11] (2771.52s)
actually it's not. So there's
[46:13] (2773.04s)
this tricky thing:
[46:14] (2774.40s)
when it's actually meant, it's
[46:16] (2776.40s)
awesome; otherwise people just start
[46:18] (2778.24s)
to polish things and make things
[46:20] (2780.16s)
look good when they're not, pretend
[46:21] (2781.84s)
that it was not a failure but a success
[46:23] (2783.36s)
and we're just walking away, that kind of
[46:24] (2784.96s)
stuff. So it's nice to see that
[46:26] (2786.80s)
you're doing it. What was the thing
[46:29] (2789.20s)
that turned the agents around which then
[46:31] (2791.36s)
I assume became Cascade? Like, was
[46:34] (2794.08s)
it a breakthrough on your end? Was it
[46:35] (2795.68s)
the models getting better? Or was it a
[46:36] (2796.96s)
mix of something else? Yeah, I think it
[46:39] (2799.12s)
was a handful of things. So, I'll walk
[46:40] (2800.64s)
through it. So, first of all, the models
[46:42] (2802.24s)
got better. 100% the models got better.
[46:44] (2804.16s)
I think even with all the internal
[46:45] (2805.84s)
breakthroughs we had, if the models
[46:46] (2806.96s)
hadn't gotten better, we wouldn't have
[46:48] (2808.00s)
been able to release this. So, I don't
[46:49] (2809.52s)
want to trivialize that matter. It was
[46:51] (2811.60s)
huge. The two other pieces that were
[46:54] (2814.32s)
quite important were that our retrieval stack
[46:56] (2816.00s)
was also getting better and better,
[46:57] (2817.28s)
which I think enabled us to work much
[46:59] (2819.28s)
better at these larger code bases. I
[47:01] (2821.04s)
guess table stakes is that it's quite good
[47:02] (2822.88s)
at zero-to-one programming, but I think
[47:04] (2824.80s)
the thing that was
[47:07] (2827.36s)
groundbreaking to us was our developers
[47:09] (2829.52s)
on a complex codebase were getting a
[47:11] (2831.92s)
lot of value from it, right? And I would
[47:14] (2834.00s)
say something quite interesting which is
[47:15] (2835.84s)
that ChatGPT by itself wasn't incredibly
[47:19] (2839.76s)
groundbreaking to our developers inside
[47:21] (2841.84s)
the company, and that's not because
[47:23] (2843.76s)
ChatGPT is not a very
[47:25] (2845.36s)
useful product; it is a ridiculously useful
[47:27] (2847.68s)
product. It's actually just because you
[47:29] (2849.60s)
need to think about it from the
[47:30] (2850.88s)
perspective of opportunity cost and how
[47:32] (2852.72s)
much more efficient you get. Our
[47:34] (2854.16s)
developers a lot of them have been
[47:35] (2855.68s)
developers in the past. I
[47:37] (2857.92s)
think we do have an exceptional
[47:39] (2859.12s)
engineering team. They were used to how
[47:40] (2860.80s)
to use Stack Overflow and all these
[47:42] (2862.48s)
other tools to get what they wanted. So,
[47:44] (2864.88s)
but suddenly when the model had the
[47:47] (2867.60s)
capability to not only understand your
[47:49] (2869.36s)
codebase and start to make larger and
[47:50] (2870.96s)
larger changes, it changed the behavior
[47:52] (2872.96s)
of the people inside the company.
[47:55] (2875.60s)
And not only make changes; we
[47:57] (2877.36s)
built systems to very quickly edit the
[47:59] (2879.12s)
code, right? The ability to
[48:01] (2881.12s)
edit code. We built the kind of
[48:03] (2883.52s)
models to take a high-level plan and make an
[48:05] (2885.68s)
edit to a piece of code very fast. So
[48:07] (2887.84s)
all of these together made it so that
[48:09] (2889.28s)
this was a workflow that our developers
[48:11] (2891.28s)
wanted to use, right? We had the speed
[48:13] (2893.76s)
covered. We had the fact that
[48:15] (2895.60s)
it understood the codebase well and
[48:17] (2897.52s)
then we also had massive model
[48:19] (2899.12s)
improvements to actually be able to call
[48:20] (2900.40s)
these tools and make these iterative
[48:21] (2901.84s)
changes, right? That's like, you know, I
[48:23] (2903.36s)
don't want to diminish that. You have
[48:24] (2904.80s)
all of these and suddenly now you have a
[48:26] (2906.48s)
real product.
[48:28] (2908.48s)
I've been meaning to ask you this, but
[48:30] (2910.08s)
how is the team
[48:30] (2910.08s)
using Windsurf to develop Windsurf?
[48:32] (2912.56s)
Because you're doing it, right? Like you
[48:35] (2915.20s)
just told me how you're doing it.
[48:36] (2916.96s)
From two perspectives.
[48:38] (2918.88s)
One, technical
[48:41] (2921.28s)
feasibility: I'm assuming
[48:43] (2923.36s)
you're not going to
[48:45] (2925.04s)
work on the exact same codebase, or you
[48:46] (2926.32s)
have a fork or something like that, or a
[48:48] (2928.80s)
build or something like that. And
[48:50] (2930.40s)
on the other hand, do
[48:52] (2932.00s)
you force people to dogfood?
[48:54] (2934.32s)
Do people just do it? Do people get
[48:55] (2935.68s)
stuck on certain versions? Do they turn
[48:58] (2938.88s)
on features for themselves? etc. So the
[49:01] (2941.92s)
way we do it is we do have an
[49:03] (2943.36s)
insiders developer mode. So this enables
[49:05] (2945.84s)
us to test new features. I guess anyone
[49:07] (2947.92s)
at the company should be able to create
[49:09] (2949.20s)
a feature and then deploy to everyone
[49:10] (2950.80s)
internally. And now we have a large
[49:12] (2952.48s)
number of developers. We'll get
[49:13] (2953.52s)
feedback. We have an ability for our own
[49:15] (2955.68s)
developers to dogfood releases. We can
[49:18] (2958.24s)
have our own developers say, "I hate
[49:19] (2959.84s)
this thing. Please don't ever do this."
[49:21] (2961.52s)
And it's nice because then we don't need
[49:22] (2962.88s)
to give it to anyone
[49:24] (2964.84s)
beyond our own developers. So I think we have
[49:27] (2967.68s)
this tiered system at the company. We
[49:29] (2969.28s)
have our own sort of release. We have
[49:31] (2971.20s)
'next', which is future-looking products
[49:33] (2973.04s)
that we are releasing, that
[49:35] (2975.60s)
are a little bit more raw, and then
[49:37] (2977.52s)
we have the actual release that we
[49:38] (2978.96s)
give to developers, where we're willing
[49:40] (2980.64s)
to AB test things but we're not willing
[49:42] (2982.16s)
to AB test things in such a way where we
[49:43] (2983.84s)
give people a comically bad experience
[49:45] (2985.52s)
just to AB test something, right? Like
[49:47] (2987.36s)
it's bad because people are using this
[49:48] (2988.80s)
for their real work. So if you're using
[49:50] (2990.48s)
it for your real work, we don't want to
[49:51] (2991.68s)
be hurting you, right? So I think one of
[49:54] (2994.32s)
the things that's quite valuable to us
[49:55] (2995.92s)
is probably, you would
[49:58] (2998.48s)
think, a failure mode for our
[50:00] (3000.08s)
company, which is that we use Windsurf
[50:01] (3001.76s)
largely speaking to modify large code
[50:03] (3003.60s)
bases, right? For obvious reasons
[50:05] (3005.28s)
because I think our developers aren't
[50:06] (3006.56s)
building these toy apps over and over
[50:08] (3008.00s)
again. But crazily enough, one of our
[50:10] (3010.48s)
biggest power users inside our company
[50:12] (3012.16s)
is actually a non-developer. He leads
[50:14] (3014.24s)
partnerships. He's never written
[50:15] (3015.84s)
software before. And he routinely builds
[50:18] (3018.48s)
apps with Windsurf, right? And he's one of
[50:21] (3021.52s)
our biggest users inside the company and
[50:23] (3023.28s)
we've used this to actually replace
[50:24] (3024.64s)
buying other SaaS tools, and he's actually
[50:26] (3026.96s)
even deployed some of these tools inside
[50:28] (3028.40s)
the company. What function is this person
[50:30] (3030.64s)
in? It's partnerships. So I'll give you an
[50:32] (3032.96s)
example. Some of
[50:34] (3034.80s)
these tools are not complex pieces
[50:36] (3036.48s)
of software but you would be surprised
[50:37] (3037.68s)
at how much they actually cost: they're
[50:38] (3038.96s)
six figures in cost, because it's
[50:41] (3041.12s)
bespoke software, right? I'll give you an
[50:42] (3042.72s)
example: you have a quoting tool. The
[50:44] (3044.56s)
idea of a quoting tool is you have a
[50:46] (3046.32s)
customer, the customer has this size,
[50:47] (3047.60s)
they're in this vertical, you know, they
[50:49] (3049.36s)
want this kind of deal, here's the
[50:51] (3051.68s)
way it would look, here's the amount of
[50:53] (3053.20s)
discount we're willing to give them as a
[50:54] (3054.96s)
customer. Yeah. And usually these
[50:57] (3057.04s)
systems are really systems that you
[50:59] (3059.76s)
would need to pay a lot of money for.
[51:01] (3061.20s)
And the reason is because there's no
[51:03] (3063.28s)
reason for
[51:05] (3065.60s)
our developers to go out and build this
[51:07] (3067.04s)
internally, right? It's a big
[51:08] (3068.24s)
distraction from going out and building
[51:09] (3069.92s)
our product. But now on the other hand,
[51:12] (3072.00s)
you have a domain expert in the person
[51:14] (3074.56s)
that actually runs partnerships. He
[51:16] (3076.24s)
doesn't know software, but he knows this
[51:17] (3077.84s)
really well, right? And because of that,
[51:20] (3080.40s)
he's able to create these apps really
[51:21] (3081.92s)
quickly. Now, granted, we do have a
[51:23] (3083.44s)
person inside the company that looks at
[51:24] (3084.80s)
the app, makes sure that it
[51:26] (3086.64s)
logistically makes sense, that it's
[51:28] (3088.80s)
secure, and can be deployed inside the
[51:30] (3090.32s)
company. But these are more ephemeral
[51:32] (3092.16s)
apps, right? They're quite stateless. If
[51:34] (3094.08s)
you were to look at the input output of
[51:35] (3095.44s)
this app, it is not as complex as
[51:38] (3098.08s)
let's say the Windsurf project,
[51:40] (3100.32s)
right? But we now have this
[51:43] (3103.28s)
growing set of people inside the company
[51:44] (3104.80s)
that are not developers that are getting
[51:46] (3106.16s)
value from it, which we found a little
[51:47] (3107.68s)
surprising too.
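For scale, the quoting tool described above can be little more than a pure function over customer attributes; this sketch invents its tiers, prices, and fields purely for illustration:

```python
# Minimal sketch of an internal quoting tool: essentially stateless
# logic mapping customer attributes to a discount. Numbers are made up.

from dataclasses import dataclass

@dataclass
class Customer:
    name: str
    seats: int
    vertical: str

def quote(customer: Customer, monthly_price_per_seat: float = 40.0) -> dict:
    # Hypothetical volume-discount tiers.
    if customer.seats >= 1000:
        discount = 0.30
    elif customer.seats >= 100:
        discount = 0.15
    else:
        discount = 0.0
    monthly = customer.seats * monthly_price_per_seat * (1 - discount)
    return {"customer": customer.name,
            "discount": f"{discount:.0%}",
            "annual_total": round(monthly * 12, 2)}

print(quote(Customer("Acme", seats=250, vertical="fintech")))
```

Yeah. And can you also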
[51:50] (3110.56s)
give maybe some other examples
[51:53] (3113.36s)
of what you think it might
[51:54] (3114.56s)
replace? The reason being, I'm
[51:56] (3116.08s)
actually really interested in this
[51:57] (3117.04s)
because I do hear a lot of people, either
[51:59] (3119.04s)
on social media or CEOs, saying that this
[52:01] (3121.44s)
could be the end of SaaS apps, and I've
[52:03] (3123.04s)
always been skeptical, for the reason
[52:03] (3123.04s)
that there's two types of SaaS
[52:04] (3124.40s)
apps. Most of the SaaS apps I see,
[52:05] (3125.76s)
for example Workday, which is an HR platform,
[52:07] (3127.76s)
they will have hosting, they will
[52:10] (3130.40s)
have business rules, they will
[52:12] (3132.24s)
update to some extent with regulations
[52:13] (3133.68s)
and all that stuff. So they do a lot of
[52:15] (3135.20s)
stuff beyond the UI. I know we can
[52:16] (3136.64s)
trivialize it, but it's a lot more than
[52:19] (3139.12s)
that. And then there are a few of these
[52:23] (3143.20s)
simpler ones. I don't want to put
[52:25] (3145.36s)
names, but there's like a polling app
[52:27] (3147.36s)
where, internally inside the company, you
[52:28] (3148.88s)
can poll; it has state, but it's
[52:30] (3150.72s)
relatively simple, you can see behind it.
[52:32] (3152.48s)
I could build it,
[52:34] (3154.24s)
but I just don't want to
[52:35] (3155.92s)
deal with authentication to host
[52:38] (3158.80s)
it inside the company, but it's
[52:40] (3160.64s)
already there. And then there's the ones you
[52:42] (3162.40s)
mentioned that are stateless. So
[52:43] (3163.76s)
what kinds of SaaS tools do you see
[52:46] (3166.32s)
that you're replacing, and you might see
[52:46] (3166.32s)
other companies potentially using
[52:48] (3168.32s)
tools like this, maybe
[52:50] (3170.32s)
with one dedicated
[52:53] (3173.20s)
developer to build it internally and
[52:56] (3176.44s)
bring it in-house? Yeah, you know, I think
[53:01] (3181.52s)
it's hubris to believe that products
[53:03] (3183.36s)
like Workday and Salesforce are going to
[53:05] (3185.12s)
get replaced by this. I think you're
[53:06] (3186.40s)
totally right. These products have a
[53:08] (3188.72s)
lot of state, they encapsulate business
[53:10] (3190.56s)
workflows, and for a product
[53:12] (3192.32s)
like Workday there's probably compliance that
[53:14] (3194.24s)
you need to do because of how business
[53:15] (3195.76s)
critical the system is. So this isn't the
[53:17] (3197.92s)
kind of system that this would replace.
[53:19] (3199.60s)
it probably falls in the latter two
[53:21] (3201.44s)
categories and probably even just the
[53:22] (3202.96s)
last one, which is to say these
[53:24] (3204.48s)
stateless systems that don't do writes
[53:27] (3207.44s)
to the most business critical parts of
[53:29] (3209.44s)
your databases. It's probably
[53:32] (3212.00s)
actually those kinds of systems that
[53:33] (3213.76s)
very quickly can get replaced. And I
[53:35] (3215.28s)
would say there's a new category:
[53:36] (3216.88s)
think about the amount of software that
[53:38] (3218.48s)
would benefit a business that just isn't
[53:40] (3220.32s)
getting created that now could get
[53:41] (3221.84s)
created, right? And the reason
[53:43] (3223.68s)
why that software couldn't get created
[53:44] (3224.96s)
is a company couldn't be created that
[53:46] (3226.96s)
would be able to sustain itself that
[53:48] (3228.64s)
would have a business model
[53:50] (3230.32s)
that would justify it existing. But now
[53:52] (3232.40s)
since the software is very easy to
[53:53] (3233.84s)
create, these pieces of software are
[53:55] (3235.44s)
going to proliferate, right? And one of
[53:57] (3237.68s)
the things that I like to talk about for
[53:59] (3239.84s)
software is: the cost
[54:01] (3241.68s)
of building software used to be a lot
[54:03] (3243.60s)
higher, right? Of simple software,
[54:05] (3245.36s)
it was a lot higher.
[54:07] (3247.52s)
Right now, for a front
[54:08] (3248.88s)
end, we have to admit it's gotten a lot
[54:10] (3250.80s)
cheaper to build a basic front-end
[54:12] (3252.48s)
system, right? Radically
[54:15] (3255.28s)
cheaper. So I think the way I would sort
[54:17] (3257.92s)
of look at it is, for these kinds of
[54:21] (3261.08s)
systems, what are you really paying for
[54:23] (3263.36s)
when you pay a SaaS vendor? You're not
[54:25] (3265.20s)
only paying for the product, you're
[54:26] (3266.64s)
paying for the maintenance. You're
[54:28] (3268.32s)
paying for the fact that, you
[54:30] (3270.32s)
know, this company is building
[54:32] (3272.32s)
a bunch of other features that you don't
[54:33] (3273.84s)
need. And the reason why is because they
[54:36] (3276.40s)
need to support a bunch of customers,
[54:37] (3277.76s)
but you're still paying for that R&D,
[54:39] (3279.92s)
right? You're paying for their sales and
[54:41] (3281.52s)
marketing, a bunch of other stuff there.
[54:43] (3283.04s)
So my viewpoint is if you can build
[54:45] (3285.92s)
custom software for yourself that is not
[54:48] (3288.24s)
very complex but helps you in your own
[54:50] (3290.00s)
business process, I think that might
[54:51] (3291.72s)
proliferate inside companies and that
[54:54] (3294.08s)
might actually cause a whole host of
[54:56] (3296.16s)
kind of companies that fall into that
[54:57] (3297.92s)
category, that is, simple business
[54:59] (3299.84s)
software that feels largely stateless,
[55:02] (3302.24s)
to have trouble unless they
[55:04] (3304.40s)
reinvent themselves. Yeah. And I
[55:06] (3306.72s)
guess one obvious reinventing
[55:08] (3308.64s)
that could happen later is,
[55:10] (3310.32s)
let's just continue this
[55:11] (3311.44s)
thought: companies are building a
[55:13] (3313.44s)
lot of internal software. They might
[55:16] (3316.08s)
start to have some similar problems;
[55:18] (3318.64s)
let's take three to five years
[55:20] (3320.48s)
out: maintenance, storage,
[55:25] (3325.20s)
compliance, just going through,
[55:27] (3327.20s)
if they're working, re-evaluating if it
[55:29] (3329.28s)
makes sense to actually bring it into
[55:30] (3330.80s)
something. So this could create a
[55:32] (3332.40s)
lot of new opportunities for other
[55:34] (3334.40s)
software businesses or software
[55:36] (3336.00s)
developers, or maybe these
[55:37] (3337.84s)
companies, or maybe a new job role in
[55:39] (3339.76s)
software engineering, which is, you know,
[55:41] (3341.52s)
I'm now specialized in this, I've built
[55:43] (3343.60s)
so many of these apps and I can help you
[55:45] (3345.36s)
with them. Who knows? No, I
[55:47] (3347.52s)
think that like a lot of people talk
[55:49] (3349.12s)
about how we're going to have like way
[55:51] (3351.20s)
fewer software engineers um in the near
[55:53] (3353.28s)
future. I think it feels like it feels
[55:56] (3356.48s)
like it's people that hate software
[55:57] (3357.84s)
engineers largely speaking that say
[55:59] (3359.28s)
this. uh it feels like like pessimistic
[56:01] (3361.84s)
not only towards these people but I
[56:03] (3363.60s)
would say just in terms of what the
[56:05] (3365.12s)
ambitions for companies are right I
[56:06] (3366.96s)
think the ambitions for a lot of
[56:08] (3368.08s)
companies is to build a lot better
[56:09] (3369.52s)
product and if you now give the ability
[56:12] (3372.56s)
for companies to now have a better
[56:14] (3374.32s)
return on investment for building
[56:15] (3375.92s)
technology right because the cost of
[56:17] (3377.76s)
building software has gone down what
[56:19] (3379.28s)
should you be doing you should be
[56:20] (3380.48s)
building more because now the ROI for
[56:23] (3383.12s)
software and developers is even higher
[56:25] (3385.12s)
because a singular developer can do more
[56:26] (3386.88s)
for your business right so technology
[56:29] (3389.20s)
actually increases the ceiling of your
[56:30] (3390.80s)
company much faster. Yeah. And I'm going
[56:32] (3392.96s)
to double click on that, because
[56:34] (3394.48s)
you have been
[56:36] (3396.32s)
building Windsurf and you've been
[56:37] (3397.84s)
building these tools, but you've also
[56:39] (3399.60s)
worked with the team, in fact with the
[56:40] (3400.96s)
same team, even before these tools. Take
[56:44] (3404.16s)
one of your solid engineers today who
[56:46] (3406.48s)
was a solid engineer four years ago. How
[56:49] (3409.60s)
has their work changed now that they
[56:51] (3411.60s)
have access to Windsurf, the agentic
[56:54] (3414.08s)
Cascade, all these other tools,
[56:57] (3417.12s)
including ChatGPT, etc.?
[56:59] (3419.60s)
What's changed, not just in
[57:01] (3421.20s)
your engineering, but for the team
[57:02] (3422.56s)
that you had four years ago
[57:04] (3424.48s)
that was doing the work? How has their work
[57:07] (3427.76s)
changed? I don't want to
[57:09] (3429.92s)
point you in any direction, but I'm just
[57:11] (3431.44s)
interested in what you would say:
[57:12] (3432.72s)
how does it seem different in what
[57:14] (3434.08s)
they do, how they do it, or how much they do?
[57:16] (3436.60s)
Yeah, I think there's maybe a
[57:19] (3439.20s)
couple things. So, first of all, the
[57:20] (3440.72s)
amount of code that we have in the
[57:22] (3442.00s)
company is quite high. It now dominates
[57:23] (3443.84s)
what a single person knows at the
[57:25] (3445.76s)
moment. So, in the beginning of the
[57:27] (3447.44s)
company, that was not the case. So,
[57:28] (3448.56s)
actually, this is something that I can't
[57:29] (3449.84s)
point to because the company was quite
[57:31] (3451.92s)
small. Right now, I would say there's
[57:34] (3454.88s)
more
[57:37] (3457.04s)
fearlessness to jump into a new part of
[57:39] (3459.20s)
the codebase and start making changes.
[57:41] (3461.20s)
Right? I would say in the past you would
[57:43] (3463.04s)
be you would you would more say hey this
[57:45] (3465.36s)
person has way more familiarity with
[57:46] (3466.96s)
this with this part of the code that is
[57:48] (3468.56s)
still the case right when you say
[57:50] (3470.16s)
familiarity now it is it's like
[57:52] (3472.56s)
understanding the code but this person
[57:53] (3473.84s)
also knows where the dead bodies are
[57:55] (3475.60s)
which is to say um hey when that where
[57:58] (3478.08s)
all you know you did X and you got Y
[58:01] (3481.04s)
that happened and that means you always
[58:02] (3482.56s)
should do Z right and and there are
[58:05] (3485.60s)
still people like that at the company
[58:06] (3486.80s)
and I'm not saying that is not
[58:07] (3487.92s)
valuable but I think now engineers feel
[58:10] (3490.56s)
more empowered to go out and
[58:12] (3492.24s)
make changes throughout the codebase
[58:14] (3494.08s)
which is actually awesome. And the
[58:15] (3495.76s)
second key piece is our developers now
[58:17] (3497.92s)
go to the AI first to see what value it
[58:20] (3500.40s)
would generate for them before making a
[58:22] (3502.24s)
change, which is something new. I would
[58:24] (3504.16s)
say in the autocomplete days you would
[58:25] (3505.68s)
go out and type it and you would get a
[58:27] (3507.04s)
lot of advantage from autocomplete and
[58:28] (3508.56s)
the passive AI but now the active AI is
[58:31] (3511.20s)
something that developers more and more
[58:32] (3512.80s)
reach towards to actually go out and
[58:34] (3514.24s)
make changes at the very beginning.
[58:36] (3516.40s)
Right. Yeah, I'm interested in how
[58:38] (3518.88s)
this will change software engineering as
[58:40] (3520.96s)
a whole, because I also noticed
[58:42] (3522.72s)
both things on myself. I still
[58:45] (3525.44s)
code and I do my side projects, but
[58:47] (3527.28s)
I always drag my feet on getting back
[58:49] (3529.04s)
into the context of the code that I
[58:50] (3530.88s)
wrote, which I kind of
[58:52] (3532.96s)
forgot part of, and getting back into the
[58:54] (3534.64s)
language, because I use multiple
[58:55] (3535.84s)
languages between side
[58:57] (3537.12s)
projects. And AI does help me
[58:59] (3539.44s)
just jump into it; I no longer
[59:01] (3541.60s)
have that friction. And sometimes
[59:03] (3543.52s)
I just prompt the AI saying, what would you
[59:05] (3545.36s)
do? I just want to know, and then if it
[59:07] (3547.36s)
looks good, I do it. If not, I just
[59:09] (3549.52s)
scrap it; maybe I re-prompt it, or sometimes
[59:11] (3551.52s)
I just think, nah, I'm just going to do it,
[59:13] (3553.36s)
because I didn't give it the right
[59:15] (3555.12s)
instructions. You know, there's this
[59:17] (3557.44s)
thing, especially when you're working
[59:18] (3558.64s)
on stuff where you know the codebase, you've
[59:21] (3561.36s)
onboarded, you know what you want to do.
[59:23] (3563.44s)
But I think it really helps me,
[59:25] (3565.76s)
at least, with the
[59:29] (3569.76s)
things that wouldn't
[59:31] (3571.12s)
take much creativity but
[59:34] (3574.00s)
would just be a time drag: figuring out
[59:36] (3576.56s)
the right things, finding the
[59:39] (3579.36s)
right dependency, changing those things,
[59:41] (3581.04s)
that kind of stuff. I think you're
[59:43] (3583.20s)
exactly right. I think this
[59:44] (3584.88s)
reducing friction piece is something
[59:47] (3587.52s)
that is kind of hard to
[59:49] (3589.68s)
quantify the value because it makes you
[59:51] (3591.68s)
more excited to do more, right? You
[59:54] (3594.32s)
know, I think software
[59:56] (3596.32s)
development is a very weird profession
[59:57] (3597.84s)
and I'll give you an example of why it's
[60:00] (3600.24s)
weird. And a lot of people would
[60:01] (3601.92s)
think, oh, this is a very easy job and I
[60:03] (3603.92s)
actually think it's quite hard on
[60:05] (3605.84s)
you mentally, and I'll walk you
[60:08] (3608.00s)
through what I mean by that. It's, you
[60:09] (3609.68s)
know, you're doing a hard project, you
[60:11] (3611.20s)
sometimes go home with,
[60:13] (3613.44s)
you know, an incomplete idea. The
[60:15] (3615.04s)
code didn't pass a bunch of tests and it
[60:17] (3617.92s)
just bothers you when you sleep
[60:19] (3619.84s)
and you need to go back and kind of fix
[60:21] (3621.92s)
it. And this could be for days, right?
[60:24] (3624.56s)
And for other jobs, I don't think you
[60:26] (3626.88s)
feel that, right? It's
[60:29] (3629.28s)
a lot more procedural potentially for
[60:31] (3631.20s)
other types of jobs. I'm not saying for
[60:32] (3632.88s)
every job. There are obviously jobs
[60:34] (3634.16s)
where there's a massive problem-
[60:35] (3635.60s)
solving component, but that just means
[60:37] (3637.84s)
that you do get a
[60:40] (3640.64s)
fatigue. If at some point
[60:43] (3643.52s)
even the easy things, just being forced to
[60:45] (3645.20s)
do new easy things, it adds some amount
[60:47] (3647.28s)
of mental fatigue. And I think you now
[60:50] (3650.24s)
have a very powerful system that you now
[60:52] (3652.56s)
trust, that should ideally reduce
[60:55] (3655.52s)
this fatigue and be able to do a lot
[60:57] (3657.52s)
of the things that in the past were high
[60:59] (3659.92s)
activation energy, and do them really fast
[61:01] (3661.92s)
for you. Yeah, this is really
[61:03] (3663.68s)
interesting, because I was just talking
[61:05] (3665.12s)
with a former colleague of mine who had
[61:07] (3667.44s)
a few months where he just wasn't
[61:09] (3669.44s)
producing much code. Really good
[61:11] (3671.28s)
engineer, really solid and at the time
[61:14] (3674.32s)
I didn't know why and he didn't tell
[61:16] (3676.80s)
me, and then he kind of snapped
[61:18] (3678.64s)
out of it. But we were just talking and he said
[61:20] (3680.48s)
that actually he was at a
[61:22] (3682.80s)
really bad time in his life: lots of
[61:24] (3684.72s)
stress in a relationship, at home with
[61:28] (3688.40s)
family, all these things. And he said that
[61:30] (3690.64s)
he just realized how much of a mental game
[61:33] (3693.04s)
software engineering is: at work he
[61:35] (3695.60s)
just couldn't get himself to
[61:37] (3697.28s)
get into the zone. We know how it is,
[61:39] (3699.12s)
especially before AI tools. And from what you
[61:42] (3702.32s)
said, I'm starting to get a bit of an
[61:43] (3703.84s)
appreciation. I
[61:45] (3705.76s)
remember it being stressful, and I
[61:47] (3707.04s)
couldn't turn off: you go home,
[61:48] (3708.80s)
you're having dinner, you're still
[61:50] (3710.00s)
thinking about how you would change that
[61:52] (3712.88s)
or why it's not working.
[61:54] (3714.68s)
I don't think we'll be
[61:57] (3717.84s)
able to go on about this forever, but I
[61:59] (3719.68s)
think for listeners it's worth
[62:01] (3721.28s)
thinking about how weird it is.
[62:03] (3723.52s)
I think it's good to reflect on it
[62:05] (3725.12s)
because it is unique: for
[62:07] (3727.60s)
so many jobs you can actually
[62:09] (3729.60s)
just put down your work and leave
[62:11] (3731.36s)
the office and you cannot continue,
[62:13] (3733.04s)
and that's it, you cannot even think
[62:14] (3734.72s)
about it because all your work is there.
[62:16] (3736.40s)
And also how these tools might just
[62:18] (3738.48s)
change it for the better in many ways,
[62:20] (3740.48s)
and maybe in weird ways that
[62:22] (3742.48s)
we don't expect in others. No, I
[62:24] (3744.88s)
think you're totally right. I
[62:27] (3747.28s)
think this is why finding amazing
[62:28] (3748.96s)
software engineers is rare.
[62:31] (3751.12s)
It's rare because these people are
[62:33] (3753.44s)
people that have gone through
[62:35] (3755.04s)
this and are willing to put themselves
[62:36] (3756.80s)
through it: hey, all of
[62:38] (3758.56s)
the learnings that I had, from the
[62:40] (3760.08s)
lowest level to the highest level, and
[62:42] (3762.08s)
then being willing to go down into the
[62:44] (3764.00s)
weeds to make sure you
[62:46] (3766.64s)
solve the problem. It's a rare skill.
[62:48] (3768.64s)
You would imagine,
[62:50] (3770.24s)
hey, this is something that everyone
[62:51] (3771.60s)
would be able to do, but it
[62:53] (3773.36s)
takes a lot of dedication, and as
[62:55] (3775.36s)
you pointed out, it's like this for an
[62:57] (3777.36s)
activity that is not a very
[62:58] (3778.88s)
normal activity.
[63:00] (3780.48s)
Yeah. Well, going back to engineering
[63:02] (3782.72s)
challenges and decisions, one super
[63:05] (3785.44s)
interesting thing that I've been dying
[63:06] (3786.96s)
to ask you is you did mention in the
[63:10] (3790.24s)
beginning that,
[63:12] (3792.48s)
when you started Windsurf, you
[63:14] (3794.96s)
realized Visual Studio Code is just
[63:17] (3797.36s)
not where it should be.
[63:19] (3799.04s)
However, you started by forking Visual
[63:21] (3801.52s)
Studio Code, right? Do I know that
[63:22] (3802.96s)
right? That's exactly right. Can you
[63:24] (3804.80s)
tell me the pros and cons of doing
[63:27] (3807.04s)
this as opposed to building your
[63:28] (3808.64s)
own editor? And I'm aware that there
[63:30] (3810.80s)
are some downsides of doing it;
[63:32] (3812.16s)
there's some licensing things. So that's
[63:34] (3814.08s)
one part of the question. The second
[63:35] (3815.20s)
part of the question is, why did you
[63:36] (3816.72s)
think that forking is the right move to
[63:39] (3819.36s)
build a much better, much more capable
[63:41] (3821.92s)
thing than whatever
[63:44] (3824.32s)
VS Code was back in the day. Yeah. So
[63:47] (3827.36s)
just maybe some clarifications on
[63:49] (3829.68s)
terminology. VS Code is a
[63:53] (3833.36s)
product that is built on top
[63:55] (3835.76s)
of Code OSS, which is
[63:58] (3838.16s)
basically the open source
[64:00] (3840.48s)
project. I did not know that. Yeah. So
[64:02] (3842.80s)
because VS Code has proprietary pieces
[64:04] (3844.96s)
on top of the
[64:07] (3847.04s)
open source. I do know that, and a lot
[64:09] (3849.28s)
of people don't know that actually.
[64:10] (3850.72s)
Yeah. Exactly. So I guess one of the
[64:13] (3853.36s)
things that we actually did was we
[64:14] (3854.88s)
wanted to make sure we did this right.
[64:16] (3856.96s)
And what I mean by that is when we
[64:18] (3858.72s)
actually built our product, we did fork
[64:21] (3861.12s)
Code OSS, but we did not support any of
[64:24] (3864.08s)
the proprietary pieces that Microsoft
[64:26] (3866.48s)
had. And we never actually provided
[64:28] (3868.32s)
support for those, not through a
[64:30] (3870.88s)
marketplace or anything. We actually use
[64:32] (3872.56s)
an open marketplace, which is
[64:34] (3874.16s)
completely fine. And this, by the way,
[64:35] (3875.44s)
this forced us to actually have to build
[64:37] (3877.36s)
out a lot of extensions that people
[64:39] (3879.36s)
needed and bake them into the product.
[64:40] (3880.80s)
I'll give you an example. For Python
[64:42] (3882.64s)
language servers, we now
[64:44] (3884.32s)
have our own version, right? For remote
[64:46] (3886.24s)
SSH, we have our own version. For dev
[64:48] (3888.08s)
containers, we have our own version. So,
[64:49] (3889.92s)
this actually forced us to get a lot
[64:51] (3891.76s)
tighter on what we need to do. And we
[64:53] (3893.84s)
never took, I guess, a shortcut of, hey,
[64:56] (3896.16s)
let's go out and do something that we
[64:58] (3898.00s)
shouldn't be doing. Because, hey, we
[65:00] (3900.40s)
work with real companies. We work with
[65:01] (3901.84s)
real developers and why should we be
[65:04] (3904.08s)
putting them in that position, right? I
[65:06] (3906.00s)
guess we kind of took that position,
[65:07] (3907.76s)
and so that was the
[65:11] (3911.28s)
positioning we
[65:12] (3912.88s)
had. Obviously there were some
[65:14] (3914.08s)
complexities, but this just caused
[65:15] (3915.76s)
us more engineering effort before we
[65:17] (3917.52s)
launched the product, right? We did launch
[65:19] (3919.20s)
the product with the ability to connect
[65:20] (3920.96s)
to remote SSH and do all this other
[65:22] (3922.72s)
stuff, and we did have internal
[65:24] (3924.24s)
engineering effort to actually go out
[65:25] (3925.76s)
and do that. Now the question
[65:28] (3928.08s)
might be, why even fork VS Code, or Code
[65:32] (3932.64s)
OSS, in the first place? I think it's because
[65:34] (3934.40s)
it's a very well-known IDE
[65:38] (3938.32s)
where people have workflows. There
[65:40] (3940.64s)
are also many extensions there that
[65:43] (3943.36s)
people rely on that are extremely
[65:45] (3945.92s)
popular, right? An IDE is not just
[65:48] (3948.40s)
I guess the place where
[65:50] (3950.64s)
you write software. It's also the place
[65:52] (3952.08s)
where you attach a debugger and do all
[65:54] (3954.08s)
these other operations. And we didn't
[65:56] (3956.48s)
want to reinvent the wheel on that. We
[65:58] (3958.16s)
didn't think we were better than I guess
[65:59] (3959.60s)
the entire open source community on
[66:01] (3961.68s)
that, right? In terms of all the ways
[66:03] (3963.76s)
you could use the product and I'll give
[66:04] (3964.96s)
you an example of maybe how we're
[66:07] (3967.76s)
trying to be pragmatic here. We didn't
[66:09] (3969.52s)
go out and try to replace JetBrains with
[66:11] (3971.52s)
this product. We actually put all the
[66:13] (3973.20s)
capabilities of Windsurf into JetBrains
[66:15] (3975.12s)
in what's called a Windsurf plug-in.
[66:17] (3977.52s)
And this is where our goal is to meet
[66:19] (3979.68s)
developers where they are. And meeting
[66:22] (3982.40s)
VS Code developers where they are means
[66:24] (3984.00s)
we should give them a familiar
[66:25] (3985.12s)
experience. Meeting JetBrains developers
[66:27] (3987.12s)
means we should give them a familiar
[66:28] (3988.64s)
experience, which is actually to use
[66:29] (3989.76s)
JetBrains. Now a question might be why
[66:31] (3991.52s)
didn't we fork JetBrains? And the answer
[66:32] (3992.80s)
is two reasons. First of all, we can't:
[66:34] (3994.72s)
it's closed source. Second of
[66:37] (3997.28s)
all, the answer is actually because
[66:38] (3998.56s)
JetBrains is actually a fantastic IDE
[66:40] (4000.64s)
for Java developers, and in a
[66:44] (4004.00s)
lot of cases C++ and Python developers,
[66:46] (4006.00s)
and insofar as PHP as well, PHPStorm if
[66:48] (4008.80s)
you ever... That's exactly right. They have
[66:52] (4012.00s)
one for almost every single language.
[66:54] (4014.08s)
For every single language. And the
[66:55] (4015.44s)
reason is because they have great
[66:56] (4016.88s)
debuggers, great language servers that I
[66:58] (4018.80s)
actually think are not even present on
[67:00] (4020.32s)
VS Code right now. If you are a great
[67:02] (4022.08s)
Java developer, most of them,
[67:04] (4024.00s)
probably 80-plus percent right now, use
[67:06] (4026.08s)
IntelliJ in the market. So the
[67:09] (4029.20s)
question there is, I think, as a
[67:10] (4030.56s)
company our goal is not to be dogmatic.
[67:12] (4032.40s)
Our goal is to build the best technology,
[67:14] (4034.40s)
democratize it, and
[67:16] (4036.24s)
provide it to as many developers as
[67:17] (4037.44s)
possible. No, I love it. And this is
[67:19] (4039.92s)
actually, I was talking with one of
[67:21] (4041.44s)
your software engineers, who did
[67:23] (4043.36s)
mention an interesting challenge because
[67:25] (4045.04s)
of just this, the fact that you have a
[67:27] (4047.12s)
JetBrains plugin and then you have the
[67:28] (4048.72s)
IDE, and apparently you're
[67:30] (4050.64s)
sharing some binaries between the two.
[67:32] (4052.56s)
Can you talk a little bit about that
[67:34] (4054.24s)
engineering? Yeah, so this was actually
[67:36] (4056.56s)
an engineering decision we needed to
[67:38] (4058.00s)
make a couple months into starting
[67:39] (4059.60s)
working on Codeium, which is that, hey,
[67:41] (4061.60s)
we're going to go out and build a VS
[67:43] (4063.28s)
Code extension. That's what we started
[67:44] (4064.56s)
out with. But very quickly the next
[67:46] (4066.56s)
step is, let's go implement it in
[67:48] (4068.16s)
JetBrains. The problem is, if we need to
[67:50] (4070.24s)
duplicate all the code it's going to be
[67:52] (4072.08s)
really, really annoying for us to support
[67:53] (4073.60s)
all this. So what we decided to do is
[67:55] (4075.44s)
actually go out and build almost the
[67:57] (4077.28s)
shared binary between both that we call
[67:59] (4079.04s)
a language server that actually does the
[68:00] (4080.80s)
heavy lifting. So the goal there is
[68:03] (4083.20s)
hopefully we're not just duplicating the
[68:05] (4085.28s)
work in a bunch of places, and this
[68:06] (4086.88s)
enables us to support many, many IDEs
[68:09] (4089.28s)
from an architecture standpoint. And
[68:11] (4091.44s)
that's why we were able to support not
[68:12] (4092.88s)
just JetBrains, Eclipse, you know, Vim,
[68:15] (4095.20s)
all of these other IDEs, you know,
[68:16] (4096.96s)
that are
[68:18] (4098.80s)
popular, without much lift.
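To make the shared-binary idea concrete, here is a minimal TypeScript sketch of the pattern described above, assuming a newline-delimited JSON-RPC protocol over stdio; the method name and handler are hypothetical stand-ins, not Windsurf's actual protocol.

```typescript
// One long-lived "language server" process owns the editor-agnostic heavy
// lifting; each editor (VS Code fork, JetBrains plugin, Vim, Eclipse) ships
// only a thin client that spawns this binary and speaks JSON-RPC to it.
import * as readline from "node:readline";

type Request = { id: number; method: string; params: unknown };
type Response = { id: number; result?: unknown; error?: string };

// Hypothetical handler: build context, call the model, rank suggestions.
async function handleAutocomplete(params: unknown): Promise<unknown> {
  return { suggestions: ["// TODO: real suggestions"] };
}

const handlers: Record<string, (p: unknown) => Promise<unknown>> = {
  "server/autocomplete": handleAutocomplete,
};

// Read one JSON request per line from stdin; write one response per line.
const rl = readline.createInterface({ input: process.stdin });
rl.on("line", async (line) => {
  const req = JSON.parse(line) as Request;
  const handler = handlers[req.method];
  const res: Response = handler
    ? { id: req.id, result: await handler(req.params) }
    : { id: req.id, error: `unknown method: ${req.method}` };
  process.stdout.write(JSON.stringify(res) + "\n");
});
```

Because the protocol is editor-neutral, supporting a new IDE means writing only the thin client, which matches the "without much lift" point above.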
[68:22] (4102.56s)
Okay, I need to ask you about MCP. You
[68:26] (4106.32s)
have started to support it, which is
[68:27] (4107.60s)
really cool. I play around with it, and
[68:29] (4109.36s)
I think it's a good first step. What is
[68:32] (4112.16s)
your take on MCP, especially with
[68:34] (4114.16s)
the security worries, and
[68:36] (4116.32s)
also where do you see MCP going right
[68:38] (4118.08s)
now? Now I think it's a bit of an open
[68:39] (4119.28s)
book, but you are probably a bit
[68:41] (4121.44s)
more exposed to this than most listeners
[68:43] (4123.04s)
will be. You know, I think
[68:45] (4125.20s)
it's very exciting. I have
[68:48] (4128.08s)
maybe one concern, but let me start with
[68:50] (4130.16s)
the exciting part. I think the exciting
[68:51] (4131.68s)
part is now it's democratizing
[68:54] (4134.24s)
access to everything inside a company or
[68:56] (4136.24s)
everything a user would want within
[68:58] (4138.96s)
their coding environment for our product
[69:01] (4141.12s)
in particular. Obviously there are other
[69:02] (4142.40s)
products; maybe it can help you buy
[69:04] (4144.32s)
goods and groceries and stuff like that.
[69:06] (4146.32s)
Obviously, we're not that interested in
[69:07] (4147.68s)
that case, but that
[69:10] (4150.32s)
is nice. One of the other things that
[69:12] (4152.48s)
it lets companies do is they can
[69:14] (4154.00s)
implement their own MCP servers with
[69:15] (4155.76s)
security guarantees, which is to say
[69:17] (4157.76s)
they can implement a battle-tested
[69:21] (4161.04s)
MCP server that talks to an internal
[69:23] (4163.04s)
service that actually does auth and all
[69:25] (4165.28s)
these other things for the end user,
[69:27] (4167.12s)
and they can own the
[69:28] (4168.40s)
implementation of that. So there's a way
[69:29] (4169.92s)
for companies now to enable us
[69:33] (4173.44s)
to interact with their internal
[69:35] (4175.12s)
services in a secure way. But
[69:37] (4177.60s)
you're totally right, there could be
[69:39] (4179.20s)
a slippery slope where this
[69:40] (4180.72s)
causes everyone to have immediate access
[69:42] (4182.56s)
to everything in a write-based fashion,
[69:44] (4184.64s)
which could have negative consequences.
[69:46] (4186.72s)
But the thing I'm particularly
[69:48] (4188.88s)
maybe a little bit, you know, worried
[69:51] (4191.44s)
about, and it's not worry, it's more so
[69:53] (4193.04s)
the paradigm itself: is MCP the
[69:56] (4196.16s)
right way of
[69:58] (4198.48s)
encapsulating talking to other systems,
[70:00] (4200.48s)
or is it actual workflows of
[70:02] (4202.72s)
developers like going and interacting
[70:05] (4205.04s)
with these systems and I'll give you an
[70:06] (4206.24s)
example of that. One of the problems
[70:07] (4207.44s)
with MCP is it forces you to hit a
[70:10] (4210.00s)
particular spec, and you know,
[70:12] (4212.72s)
actually the best spec is flexibility.
[70:14] (4214.88s)
Yeah, right. It's flexibility, and
[70:17] (4217.04s)
you know, if you ask these systems now to
[70:19] (4219.12s)
integrate with another system, like you ask an
[70:21] (4221.04s)
LLM like GPT-4.1 or a Sonnet, hey, you know,
[70:24] (4224.48s)
build an integration from this system to
[70:26] (4226.80s)
Notion, it will do it zero-shot now. Yep.
[70:29] (4229.92s)
So you could build an MCP server that is
[70:32] (4232.40s)
particular that only lets you have
[70:34] (4234.08s)
access to two things in Notion, or the
[70:36] (4236.24s)
models themselves are capable of doing a
[70:38] (4238.00s)
lot and it's like how much do you want
[70:39] (4239.76s)
to constrain versus have freedom and
[70:41] (4241.84s)
then also there is the corresponding
[70:43] (4243.12s)
security issue too.
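A hedged sketch of that constrain-versus-freedom trade-off, in TypeScript: instead of handing the model a general client, a tool server can register only two narrow capabilities, so everything else is simply not callable. The tool names and internal endpoints are illustrative assumptions, not the real MCP SDK or Notion's actual API surface.

```typescript
// Only the tools registered here are visible to the model; a "delete page"
// capability does not exist in this server, so it cannot be invoked.
type ToolHandler = (args: Record<string, string>) => Promise<unknown>;

const allowedTools: Record<string, ToolHandler> = {
  // Read-only search over one workspace (hypothetical internal proxy URL).
  notion_search_pages: async ({ query }) =>
    fetch(`https://internal.example/notion/search?q=${encodeURIComponent(query)}`)
      .then((r) => r.json()),
  // Appending a comment is allowed; editing or deleting pages is not.
  notion_add_comment: async ({ pageId, body }) =>
    fetch(`https://internal.example/notion/pages/${pageId}/comments`, {
      method: "POST",
      body: JSON.stringify({ body }),
    }).then((r) => r.json()),
};

export async function callTool(name: string, args: Record<string, string>) {
  const tool = allowedTools[name];
  if (!tool) throw new Error(`tool not exposed to the model: ${name}`);
  return tool(args);
}
```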
[70:44] (4244.72s)
Like look, it's awesome that we have
[70:46] (4246.32s)
access to it. Is this the final version?
[70:48] (4248.16s)
I don't know if this is the final
[70:49] (4249.52s)
version. Yeah. I'm going to
[70:52] (4252.16s)
rephrase it, and let me know if you
[70:55] (4255.36s)
think I'm off. But when you set up, for
[70:57] (4257.60s)
example, you know, I'm building a
[70:59] (4259.20s)
web project. I'm using Node, and I
[71:01] (4261.44s)
have my package.json that
[71:03] (4263.52s)
specifies what packages I'm going to use.
[71:05] (4265.12s)
Now on my machine I will have a lot of
[71:06] (4266.80s)
packages installed but for each specific
[71:08] (4268.96s)
project I'm going to be very clear about
[71:11] (4271.12s)
what packages I want to use, maybe a
[71:14] (4274.40s)
subset of them. And right
[71:17] (4277.52s)
now it feels to me that the current
[71:18] (4278.72s)
version of MCP just lets me connect
[71:20] (4280.96s)
everything, and I can't really, you know,
[71:24] (4284.00s)
say that, for example, on this project,
[71:25] (4285.68s)
I actually want you to only talk to
[71:27] (4287.76s)
this table in my database. I don't want
[71:29] (4289.76s)
you to access all the other stuff, because
[71:32] (4292.16s)
it's a prod database and
[71:34] (4294.00s)
I have a test table there. That
[71:35] (4295.44s)
kind of stuff, right? Like are
[71:37] (4297.92s)
we talking about this granularity
[71:39] (4299.36s)
and figuring out what would actually
[71:41] (4301.76s)
help me as an engineer be productive?
[71:46] (4306.48s)
No, it's an interesting point. So,
[71:48] (4308.24s)
you're totally right. You want these
[71:49] (4309.60s)
systems to have access to a lot of
[71:51] (4311.60s)
things so that you can be
[71:52] (4312.88s)
productive. All the while you want to be
[71:54] (4314.32s)
imperative and very instructive
[71:57] (4317.60s)
about what systems they should
[72:00] (4320.80s)
have access to internally. But the
[72:02] (4322.72s)
problem is people are very I'm not going
[72:04] (4324.40s)
to say lazy but it is annoying if you
[72:06] (4326.24s)
have 50 services and you need to tell it
[72:08] (4328.08s)
you need to do this, you need to do
[72:09] (4329.12s)
that, you need to do this. And what can
[72:11] (4331.28s)
very quickly happen is people don't and
[72:12] (4332.88s)
they get like mixed results or it has
[72:14] (4334.56s)
like negative consequences. So look, I
[72:16] (4336.88s)
think we're figuring this out. I think
[72:18] (4338.56s)
the whole industry is kind of figuring
[72:20] (4340.00s)
out what the right model is, and
[72:22] (4342.64s)
maybe it actually is a lot of
[72:24] (4344.40s)
engineering that needs to get done after
[72:26] (4346.48s)
the MCP server, which is to say the MCP
[72:28] (4348.88s)
server provides a very free-flowing
[72:30] (4350.80s)
interface, but there's a lot of
[72:32] (4352.80s)
understanding on the server of who
[72:34] (4354.96s)
the user is, what service they're trying
[72:36] (4356.64s)
to touch, what codebase they're in, and
[72:38] (4358.40s)
there are proper access controls that
[72:40] (4360.64s)
are implemented, you know, afterwards, that
[72:42] (4362.80s)
help you do that.
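One way to picture that "engineering after the MCP server" idea is a policy layer that checks every tool call against who the user is, which codebase the agent is working in, and which service the call would touch. The rule shape below is a hypothetical sketch, not a published spec.

```typescript
// The server stays free-flowing; this check runs before a tool call is
// forwarded to the real service.
interface ToolRequest {
  user: string;      // who is asking
  codebase: string;  // which repo the agent is working in
  service: string;   // which backend the tool would touch
  action: "read" | "write";
}

interface PolicyRule {
  codebase: string;
  service: string;
  allow: Array<"read" | "write">;
}

// e.g. agents in the storefront repo may only read the prod database,
// while a test database gets a broader rule.
const policies: PolicyRule[] = [
  { codebase: "storefront", service: "prod-db", allow: ["read"] },
  { codebase: "storefront", service: "test-db", allow: ["read", "write"] },
];

export function isAllowed(req: ToolRequest): boolean {
  return policies.some(
    (p) =>
      p.codebase === req.codebase &&
      p.service === req.service &&
      p.allow.includes(req.action)
  );
}

// isAllowed({ user: "dev1", codebase: "storefront",
//             service: "prod-db", action: "write" }) === false
```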
[72:44] (4364.96s)
I'm thinking these languages are not
[72:46] (4366.72s)
really popular anymore, but when I started
[72:48] (4368.64s)
programming, I used C#, and in C# for
[72:51] (4371.44s)
classes, you had keywords. You know, you
[72:53] (4373.12s)
have classes, but you couldn't just
[72:54] (4374.16s)
access them. You had public
[72:55] (4375.84s)
classes which everyone can access. You
[72:57] (4377.60s)
had protected classes. You actually had
[72:59] (4379.04s)
internal classes that were inside the
[73:00] (4380.64s)
module. You had private classes which
[73:02] (4382.88s)
were not accessible unless you were a
[73:04] (4384.32s)
child class. And these were just
[73:05] (4385.44s)
keywords for what module can access
[73:08] (4388.40s)
what parts of your code inside the
[73:10] (4390.56s)
codebase. And back then, this was
[73:12] (4392.64s)
like the 2000s, we took a lot of care
[73:15] (4395.44s)
deciding who can access what and how.
[73:17] (4397.92s)
Even though technically
[73:19] (4399.28s)
everyone could have talked with
[73:20] (4400.56s)
everyone, we decided, through, you
[73:22] (4402.24s)
know, an evolution of a few decades,
[73:23] (4403.60s)
that it wasn't a good idea. So, I'm
[73:25] (4405.84s)
wondering if we're going to get there.
[73:27] (4407.52s)
For example, with MCP, we might amend
[73:29] (4409.76s)
some parts of it, because that stuff
[73:31] (4411.68s)
didn't come about because, you know,
[73:33] (4413.36s)
someone just licked
[73:36] (4416.56s)
their finger and guessed. It was because we needed
[73:38] (4418.00s)
it to organize large amounts of code
[73:40] (4420.56s)
back then, when we didn't have the tools
[73:42] (4422.00s)
that we have today. No, I think you're
[73:44] (4424.16s)
right. I think some primitives are
[73:45] (4425.68s)
missing right now, for sure. It's too
[73:47] (4427.52s)
free-form right now. It's going to
[73:49] (4429.60s)
be super exciting though, because we are
[73:51] (4431.12s)
seeing that it is going somewhere.
[73:53] (4433.04s)
Maybe MCP, maybe not. And we're in the
[73:54] (4434.72s)
middle of it, you know. Who knows? Some
[73:56] (4436.48s)
people listening to this might
[73:57] (4437.84s)
actually influence the direction of this
[74:00] (4440.00s)
new thing that we're going to use in
[74:01] (4441.20s)
like five years from now. It's awesome.
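For readers who never used C#, the access keywords described above have close TypeScript analogues, shown here at the member level; module visibility (exported or not) roughly plays the role of C#'s internal. A minimal sketch:

```typescript
class Account {
  public id: string;           // readable by anyone
  protected balance: number;   // visible to Account and its subclasses
  private auditLog: string[];  // visible only inside Account

  constructor(id: string) {
    this.id = id;
    this.balance = 0;
    this.auditLog = [];
  }

  public deposit(amount: number): void {
    this.balance += amount;           // fine: inside the class
    this.auditLog.push(`+${amount}`);
  }
}

class SavingsAccount extends Account {
  public addInterest(rate: number): void {
    this.balance *= 1 + rate;         // fine: `protected` is visible here
    // this.auditLog.push("interest"); // compile error: `private` member
  }
}

// Not exported, so invisible outside this module: roughly C#'s `internal`.
class InternalLedger {}

const acct = new SavingsAccount("acct-1");
acct.deposit(100);
// acct.balance; // compile error: `protected` not accessible from outside
```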
[74:03] (4443.32s)
Yeah. What is your take on this
[74:06] (4446.96s)
70/30 mental model for AI tools? This
[74:09] (4449.20s)
is something that comes up every now and
[74:10] (4450.48s)
then, especially with folks who are
[74:12] (4452.08s)
less technical: that today they can,
[74:14] (4454.88s)
you know, prompt AI tools, from Windsurf to
[74:18] (4458.08s)
Lovable and others, like, hey, generate
[74:20] (4460.00s)
this idea that I have, and they do a good
[74:21] (4461.84s)
job of the, you know, the one-shot or
[74:23] (4463.84s)
the tweaking, and then on the
[74:26] (4466.16s)
last 30%, especially when they're not
[74:28] (4468.48s)
experienced software engineers, they just
[74:29] (4469.92s)
get a little stuck or hopelessly stuck.
[74:33] (4473.28s)
Do you observe this with
[74:36] (4476.56s)
Windsurf users, or is this not really a
[74:38] (4478.24s)
thing when people are pretty
[74:40] (4480.00s)
technical and developers? Yeah, I
[74:42] (4482.72s)
think we do have non-developers that use
[74:44] (4484.56s)
the product, and I do think the level of
[74:46] (4486.64s)
frustration for them, and by the way, my
[74:48] (4488.32s)
viewpoint on this is not just let
[74:50] (4490.16s)
them be frustrated, I would love to
[74:51] (4491.68s)
help them, but the level of frustration
[74:53] (4493.20s)
when they have a
[74:54] (4494.56s)
problem is much higher. And the reason
[74:56] (4496.32s)
is because for you and me, when we go
[74:59] (4499.04s)
and use this and it gets into this
[75:01] (4501.28s)
degenerate state where it goes out and
[75:02] (4502.80s)
it tries to make a change and it does a
[75:04] (4504.32s)
series of changes that don't make
[75:05] (4505.36s)
sense, our first instinct is, don't just
[75:07] (4507.44s)
do it 10 more times when five times
[75:09] (4509.92s)
it didn't work. It's probably like look
[75:11] (4511.36s)
at the code and see what step didn't
[75:13] (4513.04s)
work and revert back to the step that
[75:14] (4514.48s)
works. Right? Like debugging principles.
[75:16] (4516.16s)
But, by the way, the reason
[75:17] (4517.60s)
why we do that is we understand the
[75:19] (4519.28s)
code. Yeah. We can like go back into the
[75:21] (4521.04s)
code and kind of understand it. But
[75:22] (4522.56s)
you're right that for those
[75:24] (4524.24s)
who can't, they're kind of in a
[75:26] (4526.56s)
state of helplessness. And I
[75:29] (4529.28s)
deeply empathize with that. And it's
[75:31] (4531.84s)
our job to figure out ways
[75:33] (4533.84s)
that we can make that a lot better. Now
[75:35] (4535.92s)
granted, right, does that mean we make
[75:38] (4538.40s)
our product completely catered to
[75:39] (4539.84s)
non-developers? No, that's actually not
[75:41] (4541.60s)
what we do. Are there principles from
[75:43] (4543.68s)
that we can take that help
[75:45] (4545.52s)
both groups? Right? Because I think for
[75:47] (4547.36s)
us, we do want to get to a state where
[75:48] (4548.64s)
these systems can be more and more
[75:49] (4549.84s)
autonomous, right? A real developer
[75:52] (4552.32s)
needs to go out and needs to fix these
[75:54] (4554.00s)
issues all the time when they prompt it.
[75:55] (4555.68s)
It also just means we're
[75:58] (4558.16s)
becoming more autonomous as well.
[76:02] (4562.00s)
But I do think, as an industry,
[76:03] (4563.76s)
and this is, you know, there are
[76:06] (4566.24s)
engineers, the coders, and then
[76:08] (4568.08s)
the non-coders, there is a question that
[76:10] (4570.32s)
needs to be asked: do we eventually
[76:13] (4573.36s)
need to understand what the code does?
[76:14] (4574.80s)
Do you need to be able to read the code?
[76:16] (4576.08s)
Because for example when I was at
[76:17] (4577.28s)
university, we studied assembly. Now I
[76:19] (4579.52s)
never really programmed assembly beyond
[76:21] (4581.20s)
the class, but I have since come
[76:23] (4583.92s)
across assembly code and I'm not afraid
[76:26] (4586.72s)
to look at it. Now again, I'm not saying
[76:28] (4588.72s)
I'm the expert, but you can go all the
[76:30] (4590.64s)
way down to the stack. And I think there
[76:32] (4592.08s)
is something to be said that, you know,
[76:33] (4593.36s)
we're now adding a new level of abstraction
[76:35] (4595.60s)
that as a professional, it will always
[76:38] (4598.40s)
be helpful to be able to look through
[76:40] (4600.00s)
the stack, you know, sometimes all
[76:42] (4602.16s)
the way to networking logs or the
[76:44] (4604.56s)
packet, not often, but just knowing
[76:46] (4606.48s)
where to look and eventually
[76:48] (4608.08s)
where to go. So, this might be more of a
[76:50] (4610.16s)
philosophical question because I think a
[76:51] (4611.52s)
lot of people just
[76:52] (4612.80s)
think, okay, we can just use English for
[76:54] (4614.32s)
everything. But it does
[76:55] (4615.76s)
translate into a level, which is
[76:57] (4617.36s)
programming languages, which translate into
[76:59] (4619.36s)
the next level, and so on. I think
[77:02] (4622.00s)
you're right. So here's my take on it.
[77:04] (4624.08s)
We are going to have a proliferation of
[77:05] (4625.96s)
software. Some of the software will be
[77:08] (4628.08s)
built by people that don't know code,
[77:09] (4629.92s)
right? I think it feels simplistic to
[77:12] (4632.96s)
say that that is not going to happen,
[77:14] (4634.24s)
right? And we're already seeing it play
[77:15] (4635.52s)
out in real time. But here's the thing.
[77:17] (4637.52s)
It's almost like when you think about
[77:19] (4639.20s)
the best developer that you know, even
[77:22] (4642.00s)
if they're a full-stack developer, they
[77:23] (4643.76s)
probably understand when the product is
[77:25] (4645.60s)
slow, it's because there's some issue
[77:27] (4647.28s)
with the way that this interacts with
[77:28] (4648.40s)
the operating system. If there's some
[77:30] (4650.00s)
issue with the way that this interacts
[77:31] (4651.44s)
with the networking stack, it's the
[77:32] (4652.96s)
ability for this person to kind of peel
[77:34] (4654.88s)
back layers of abstraction to get to
[77:36] (4656.64s)
ground truth. That is what makes a great
[77:38] (4658.56s)
developer a great developer. And these
[77:40] (4660.48s)
people are more powerful. They're more
[77:42] (4662.48s)
powerful in any organization. You know
[77:44] (4664.00s)
that you can take these people and put
[77:45] (4665.52s)
them on any project and it's just going
[77:47] (4667.28s)
to be a lot more successful with them.
[77:48] (4668.80s)
Yeah. And I think the same thing is
[77:50] (4670.08s)
going to happen, which is that for some set
[77:51] (4671.84s)
of projects it is going to be fine if
[77:53] (4673.92s)
the level of abstraction you deal with
[77:55] (4675.68s)
is the final application plus English
[77:58] (4678.00s)
and a spec, right? For some other set of
[78:00] (4680.72s)
applications, you know, actually a
[78:02] (4682.80s)
developer will go in, but there's going
[78:04] (4684.40s)
to be some gnarly parts, right? It's
[78:05] (4685.84s)
going to interact with the database, it's
[78:07] (4687.12s)
going to have
[78:08] (4688.72s)
performance-related issues, and
[78:10] (4690.96s)
you're going to have an expectation that
[78:12] (4692.16s)
the AI and the human can go down the
[78:14] (4694.00s)
stack and the human can reason about
[78:15] (4695.44s)
this. Yep. And I think these people are
[78:18] (4698.64s)
always going to be really valuable,
[78:20] (4700.32s)
similar to how I think our
[78:22] (4702.16s)
best engineers can, if I ask them to go
[78:24] (4704.64s)
and look at the object dump of
[78:27] (4707.36s)
a C++ program, actually
[78:29] (4709.92s)
understand, hey, you know, here's
[78:32] (4712.08s)
a function,
[78:34] (4714.00s)
here's a place where we're seeing a
[78:35] (4715.36s)
massive amount of contention, and
[78:37] (4717.68s)
we need to go out and fix this, right? And
[78:40] (4720.24s)
if the developer didn't understand
[78:42] (4722.88s)
the fundamentals, they
[78:45] (4725.36s)
would be much worse at our company
[78:46] (4726.80s)
because of that. Yeah. I wonder if an
[78:49] (4729.20s)
analogy might be that a car mechanic,
[78:50] (4730.88s)
you know, car mechanics evolve over
[78:52] (4732.24s)
time. Like my dad, we used to
[78:54] (4734.16s)
have these old school cars where he
[78:55] (4735.68s)
would take apart the engine. He
[78:57] (4737.84s)
would take the whole thing apart and
[78:59] (4739.44s)
then put it back together over a weekend,
[79:01] (4741.76s)
all the parts laid out. I remember. And
[79:04] (4744.08s)
of course by the time you know I got to
[79:05] (4745.60s)
owning a car, I could change the
[79:07] (4747.60s)
oil, and now I have an electric car,
[79:10] (4750.00s)
which is, you know, like there are not
[79:12] (4752.00s)
as many moving parts. However, someone
[79:13] (4753.92s)
who understands how cars work, how
[79:16] (4756.40s)
they're built, how they evolved, they
[79:18] (4758.40s)
will always be more in demand for
[79:20] (4760.24s)
special cases. For example, I just had
[79:22] (4762.32s)
my 12-volt battery die in my electric car.
[79:24] (4764.32s)
I had no idea there was a 12-volt battery,
[79:25] (4765.92s)
but apparently I talked with someone
[79:27] (4767.12s)
who, you know, is in this and like,
[79:28] (4768.24s)
yeah, it's from the gas cars and this is
[79:29] (4769.84s)
why and this is the reason and this is
[79:31] (4771.04s)
how the new version will evolve. So,
[79:34] (4774.32s)
clearly, the majority
[79:37] (4777.04s)
of people might not need it eventually,
[79:38] (4778.64s)
but there is that expertise. Plus, these
[79:40] (4780.48s)
are the people who understand everything
[79:41] (4781.92s)
who will often take innovation forward
[79:43] (4783.84s)
because they understand what came before
[79:46] (4786.24s)
and they understand what needs to come.
[79:48] (4788.64s)
You're totally right. Well, maybe one
[79:50] (4790.24s)
other thing that I would want to add
[79:51] (4791.92s)
to what you basically said is,
[79:54] (4794.08s)
when you look at what great computer
[79:56] (4796.88s)
scientists and software engineers do, I
[79:59] (4799.04s)
think they are great problem solvers:
[80:01] (4801.12s)
given an understanding of a high-level
[80:04] (4804.24s)
business case, or what the
[80:06] (4806.72s)
company really wants to do,
[80:08] (4808.32s)
they are people that can distill it down. And
[80:10] (4810.08s)
I think that skill is actually what it
[80:12] (4812.32s)
boils down to when you meet great
[80:13] (4813.92s)
engineers. It's not just like you tell
[80:15] (4815.52s)
them about a feature. You tell them
[80:17] (4817.12s)
about an issue, a desired outcome, and
[80:20] (4820.00s)
they will go out and find any way
[80:22] (4822.40s)
possible to get to that.
[80:24] (4824.24s)
I think that's what great engineers are.
[80:25] (4825.68s)
They're problem solvers. I think that's
[80:27] (4827.68s)
always going to be in demand. Now, take
[80:29] (4829.68s)
the person that builds the most
[80:31] (4831.60s)
boilerplate website and that is the only
[80:34] (4834.64s)
thing they are excited to do in the
[80:36] (4836.00s)
future, that person's skill set is going
[80:37] (4837.92s)
to be depreciating with time. I think
[80:40] (4840.16s)
that's a simplistic way of
[80:41] (4841.76s)
looking at it because you know if they
[80:43] (4843.44s)
were a software engineer they should
[80:45] (4845.20s)
know how to reason about systems. They
[80:47] (4847.04s)
should be good problem solvers. I think
[80:48] (4848.64s)
that's the hallmark of software
[80:50] (4850.64s)
engineering as a whole, and they will
[80:52] (4852.64s)
always have a position out there, in
[80:54] (4854.80s)
my opinion. Now, since you started to
[80:56] (4856.80s)
build Windsurf, or even Codeium, how has
[80:59] (4859.04s)
your view changed on the future of
[81:01] (4861.04s)
software engineering? We've
[81:03] (4863.12s)
touched on a few things, but
[81:05] (4865.04s)
have there been some things, like before
[81:06] (4866.72s)
and after, where now you're thinking about
[81:08] (4868.40s)
things differently?
[81:10] (4870.48s)
You know, I think that timelines for a
[81:12] (4872.32s)
lot of things, I'm less scared of
[81:14] (4874.16s)
them, even though I think a lot of
[81:15] (4875.76s)
them are supposed to come out
[81:17] (4877.68s)
as very scary numbers, you know.
[81:20] (4880.00s)
I think recently Dario from Anthropic
[81:21] (4881.76s)
said 90% of all committed code is going
[81:23] (4883.84s)
to be AI generated. I think the answer to
[81:25] (4885.12s)
that is going to be yes. And my question
[81:26] (4886.64s)
after that is, so what? Like, so what
[81:29] (4889.20s)
if that's the case? Developers don't
[81:30] (4890.56s)
only spend time writing code, right? I
[81:32] (4892.40s)
think there's this fear that
[81:34] (4894.80s)
comes from all this stuff. I think I
[81:37] (4897.36s)
think AI systems are going to get
[81:38] (4898.64s)
smarter and smarter very quickly. But
[81:41] (4901.60s)
look, when I think about what
[81:43] (4903.44s)
engineers love doing, I think they love
[81:45] (4905.52s)
solving problems, right? They love
[81:47] (4907.04s)
collaborating with their peers to find
[81:48] (4908.88s)
out how to make solutions that work. And
[81:51] (4911.68s)
I think when I look at the future, it's
[81:53] (4913.68s)
more like things are going to improve
[81:54] (4914.80s)
very quickly, but I think people are
[81:56] (4916.16s)
going to be able to focus on the things
[81:57] (4917.20s)
that they really want to do when they're
[81:58] (4918.64s)
developers, not like the nitty-gritty
[82:00] (4920.56s)
details that, as you said, you go home
[82:02] (4922.32s)
and you're like, I don't know why this
[82:04] (4924.56s)
doesn't compile. I think a lot
[82:07] (4927.12s)
of those small details, for most people,
[82:09] (4929.76s)
are going to be a relic of the past.
[82:11] (4931.44s)
Well, I'll tell you. I'll give the
[82:12] (4932.96s)
other side of why people are
[82:14] (4934.88s)
stressed, you know, because
[82:16] (4936.24s)
they're going to say, you know, some
[82:17] (4937.28s)
listeners will say, like, well, you're in
[82:18] (4938.80s)
an easy position because you're in the
[82:20] (4940.00s)
middle of an AI company. You're building
[82:21] (4941.28s)
all these tools, which is the future,
[82:22] (4942.80s)
right? And you're going to be fine
[82:24] (4944.32s)
for the next few years. And they're
[82:25] (4945.84s)
thinking, I'm sitting at a B2B SaaS
[82:28] (4948.24s)
company, where I'm building
[82:30] (4950.32s)
this software, and my employer is
[82:32] (4952.72s)
thinking that these things make us 20%
[82:35] (4955.28s)
or 25% more efficient, and they're going
[82:36] (4956.96s)
to cut a quarter of the team. And I'm
[82:39] (4959.04s)
worried (a) if it's going to be me, (b)
[82:41] (4961.04s)
the job market is not that great. And I
[82:42] (4962.80s)
get it that I can be more
[82:44] (4964.40s)
productive with these things, but I
[82:46] (4966.00s)
still need to find a job. And, you
[82:47] (4967.68s)
know, not everyone will
[82:49] (4969.68s)
verbalize this, but this is the thing
[82:51] (4971.04s)
that gets people: when
[82:53] (4973.20s)
they're hearing Dario talk about the 90%,
[82:55] (4975.04s)
they're thinking, oh damn, my employer
[82:56] (4976.48s)
will say, okay Joe, we don't need you
[82:58] (4978.16s)
anymore. Yeah. The problem is, I don't
[83:00] (4980.80s)
know, maybe this is, I
[83:02] (4982.96s)
don't know if this is a really good
[83:04] (4984.64s)
answer, but that feels like the employer
[83:06] (4986.56s)
is being irrational. Because, okay,
[83:08] (4988.40s)
let me provide the
[83:10] (4990.08s)
take here: if the B2B SaaS company that is
[83:12] (4992.96s)
not doing well needs to compete with
[83:14] (4994.32s)
other B2B SaaS companies, and if
[83:16] (4996.80s)
they reduce the number of engineers that
[83:18] (4998.16s)
they have, they're basically saying their
[83:19] (4999.36s)
product is not going to improve that
[83:20] (5000.56s)
quickly compared to a competitor that is
[83:22] (5002.96s)
willing to hire engineers and improve
[83:24] (5004.56s)
their software much more quickly. I do
[83:26] (5006.32s)
think consumers and businesses are
[83:28] (5008.80s)
going to have much higher expectations for
[83:30] (5010.80s)
software. So the demand for software
[83:32] (5012.88s)
that I buy is way higher. I don't
[83:35] (5015.12s)
know if you've noticed this: I feel bad
[83:36] (5016.72s)
when I buy a piece of software that
[83:38] (5018.40s)
looks like it did, you know, a couple of
[83:40] (5020.64s)
years ago, like this ugly
[83:41] (5021.84s)
procurement software. Yeah. And these
[83:43] (5023.36s)
days you don't have to. I hear you. I
[83:45] (5025.84s)
think I see the short
[83:48] (5028.32s)
term: are there employers
[83:50] (5030.72s)
that look at this and they're like this
[83:52] (5032.24s)
is an opportunity to cut. I think these
[83:53] (5033.76s)
employers are being really, really
[83:55] (5035.12s)
shortsighted. Yeah. And I think I'm
[83:57] (5037.20s)
getting a little bit of hope from
[83:58] (5038.40s)
even other industries. There was a
[84:00] (5040.08s)
time where writers were being
[84:01] (5041.92s)
fired left and right. I'm not
[84:03] (5043.04s)
talking software writers, but just
[84:04] (5044.48s)
old traditional writers. And now
[84:06] (5046.48s)
there's a big hiring spree from all
[84:07] (5047.92s)
sorts of companies hiring writers,
[84:09] (5049.76s)
because it turns out the AI is, you
[84:12] (5052.16s)
know, a bit bland, and a great writer
[84:14] (5054.72s)
with AI is way better than without. I
[84:17] (5057.12s)
think it's the same for software engineers.
[84:18] (5058.56s)
So that's also a bit of my message for
[84:20] (5060.24s)
anyone listening. But just good to hear
[84:21] (5061.68s)
from you. Exactly. When you have a
[84:23] (5063.52s)
competitive market and you add a lot of
[84:26] (5066.24s)
automation, automation is great, but
[84:27] (5067.84s)
what you actually need to compare is
[84:29] (5069.60s)
automation with a human and if that's
[84:31] (5071.76s)
way more leveraged, then you actually
[84:34] (5074.08s)
should compete with that. That's like
[84:35] (5075.36s)
the game-theoretically optimal thing to
[84:36] (5076.96s)
do. And actually that's the tool
[84:38] (5078.64s)
that you're building right now, which
[84:40] (5080.00s)
I think is one of the reasons
[84:41] (5081.52s)
I like to use it.
[84:43] (5083.60s)
It doesn't feel like it's trying to
[84:44] (5084.88s)
do anything instead of me; it does it with
[84:46] (5086.48s)
me and makes me way more efficient as
[84:48] (5088.16s)
an engineer. So to wrap up, I just
[84:50] (5090.96s)
have some rapid questions. I'm just
[84:52] (5092.48s)
going to ask them and then you can shoot
[84:54] (5094.08s)
the answer. So, I've heard that you are
[84:57] (5097.04s)
really into endurance sports,
[84:58] (5098.32s)
long-distance running, cycling, and you
[85:00] (5100.16s)
do just a lot of it. Now, a lot of
[85:03] (5103.68s)
people are thinking, well, I'm
[85:05] (5105.28s)
pretty busy with my job, with
[85:06] (5106.56s)
coding, etc. I don't have as much time
[85:08] (5108.08s)
for sports. How do you make time for
[85:09] (5109.44s)
sports? And what would your advice be
[85:10] (5110.96s)
for someone who actually wants to
[85:12] (5112.80s)
get in a lot better shape while being a
[85:14] (5114.24s)
software engineer and busy with
[85:15] (5115.60s)
their work? So, I will say this:
[85:17] (5117.60s)
since starting the company, that has gone down
[85:19] (5119.20s)
drastically. But at my previous company,
[85:21] (5121.60s)
I still worked a ton. I worked at
[85:23] (5123.76s)
an autonomous vehicle company. I would
[85:25] (5125.44s)
bike over 150 miles a week,
[85:28] (5128.96s)
rigorously, probably close to 160,
[85:31] (5131.04s)
170. Interestingly,
[85:33] (5133.68s)
for an activity like this, I
[85:36] (5136.48s)
actually got Zwift, this way to
[85:38] (5138.72s)
bike indoors. And I would just be
[85:41] (5141.20s)
able to knock out 20 to 25 miles in
[85:43] (5143.68s)
an hour at home. And the
[85:46] (5146.40s)
benefit there is like now I can come
[85:48] (5148.24s)
back from work very quickly, do a ride,
[85:50] (5150.80s)
and then, you know, on the weekends on a
[85:52] (5152.72s)
Saturday, I would just dedicate being
[85:54] (5154.40s)
able to do potentially like 70 a 70mi
[85:57] (5157.04s)
loop uh somewhere. One of the lucky
[85:58] (5158.64s)
things for me is I'm in the Bay
[86:00] (5160.16s)
Area, so there are a lot of amazing
[86:01] (5161.92s)
places to ride a bike that have
[86:04] (5164.40s)
hills and stuff like that. So, I think
[86:06] (5166.40s)
it's easy to carve out this time, but
[86:08] (5168.08s)
you kind of, you know, you need to make
[86:09] (5169.60s)
the friction for yourself a lot lower,
[86:11] (5171.76s)
right? I think if I needed to, I would
[86:14] (5174.16s)
never go to a gym rigorously. I
[86:16] (5176.48s)
think I'm not the type of person
[86:17] (5177.68s)
that would, you know, I
[86:19] (5179.28s)
would just find a way to not do it. But
[86:20] (5180.88s)
if it's literally at home right next to
[86:23] (5183.04s)
where I sleep, I'm going to find a
[86:24] (5184.96s)
way to do it. Sounds like: just make
[86:26] (5186.64s)
it work for you. Nice. And what's a book
[86:28] (5188.96s)
that you would recommend and why? You
[86:31] (5191.76s)
know, there was a book that I read a
[86:33] (5193.44s)
long time ago that I really enjoyed.
[86:35] (5195.20s)
It's called The Idea Factory. It's
[86:37] (5197.28s)
basically about how Bell Labs
[86:39] (5199.68s)
innovated so much while being a
[86:41] (5201.84s)
very commercial entity, and it was very
[86:44] (5204.08s)
interesting to see some of the
[86:45] (5205.60s)
great scientists of our time working at
[86:47] (5207.36s)
this company, providing so much value.
[86:49] (5209.12s)
Information theory: Claude Shannon
[86:50] (5210.72s)
worked there, right? The
[86:52] (5212.88s)
invention of the transistor happened there;
[86:54] (5214.80s)
Shockley and all these
[86:56] (5216.88s)
people were there too. And
[86:58] (5218.72s)
just hearing how a company is able to
[87:00] (5220.96s)
straddle the line between both was
[87:02] (5222.96s)
really exciting. Yeah, and I hear
[87:04] (5224.80s)
that, you know, OpenAI got inspired by
[87:07] (5227.12s)
Bell Labs a lot. Their titles are
[87:08] (5228.96s)
coming back, and I
[87:10] (5230.88s)
personally want to read more about that.
[87:12] (5232.08s)
So, thanks for the recommendation. Well,
[87:14] (5234.08s)
well, thank you. This was
[87:15] (5235.12s)
great. This was super interesting, and I
[87:16] (5236.80s)
just love all the insights that you
[87:19] (5239.36s)
shared. Yeah, thanks a lot for having
[87:21] (5241.12s)
me. I hope you enjoyed this conversation
[87:22] (5242.96s)
with Varun and the challenges that the
[87:24] (5244.48s)
Windsurf team is solving for. One of the
[87:26] (5246.32s)
things I enjoyed discussing was when
[87:27] (5247.84s)
Varun shared how they have a bunch of
[87:29] (5249.44s)
features that just didn't work out, like
[87:31] (5251.04s)
their review tool, and then they
[87:32] (5252.48s)
celebrate failure and just move on. I
[87:34] (5254.56s)
also found it fun to learn how any
[87:36] (5256.08s)
developer can roll out any feature they
[87:37] (5257.76s)
build to the whole company and get
[87:39] (5259.36s)
immediate feedback whether it's good or
[87:41] (5261.12s)
bad. For more deep dives on AI coding
[87:43] (5263.52s)
tools, check out the Pragmatic Engineer
[87:44] (5264.96s)
Deep Dives link in the show notes below.
[87:46] (5266.96s)
If you enjoy this podcast, please
[87:48] (5268.48s)
consider leaving a rating. This helps
[87:50] (5270.16s)
more listeners discover the podcast.
[87:52] (5272.08s)
Thanks and see you in the next one.
[87:54] (5274.17s)
[Music]