[00:00] (0.24s)
This is something of a nice change. I've
[00:02] (2.40s)
given a lot of scientific talks and no
[00:04] (4.32s)
one claps and cheers when I come on. Not
[00:06] (6.96s)
normally even when I come on.
[00:13] (13.76s)
It's really exciting. It's really
[00:15] (15.12s)
wonderful to be here. I guess I should
[00:17] (17.76s)
start off assuming that not everyone in
[00:19] (19.76s)
this cavernous hall knows who I am. Who
[00:22] (22.80s)
am I? I'm someone who has done some
[00:25] (25.68s)
work in AI for science who really
[00:28] (28.24s)
believes that we can use the AI systems,
[00:31] (31.52s)
these technologies, these ideas to
[00:34] (34.80s)
change the world in a very specific way
[00:37] (37.04s)
to make science go faster to enable new
[00:39] (39.68s)
discoveries. I think it's really really
[00:42] (42.00s)
wonderful. We have the opportunity to
[00:44] (44.40s)
take these tools, these ideas
[00:47] (47.68s)
and aim them toward the question of how
[00:49] (49.92s)
can we build the right AI systems so
[00:52] (52.72s)
that sick people can become healthy and
[00:54] (54.96s)
go home from the hospital. And it's been
[00:57] (57.84s)
kind of a really wonderful and winding
[01:00] (60.16s)
journey for me to end up here. I was
[01:02] (62.48s)
originally trained as a physicist. I
[01:04] (64.56s)
thought I was going to be a laws of the
[01:06] (66.00s)
universe physicist. If I was very very
[01:08] (68.72s)
lucky, I could do something that would
[01:10] (70.64s)
end up one sentence in a textbook.
[01:13] (73.60s)
And I did physics and I went to actually
[01:16] (76.48s)
do a PhD in physics. And then kind of
[01:19] (79.68s)
what I was working on didn't really grab
[01:22] (82.24s)
me. It just didn't feel like what I
[01:24] (84.08s)
wanted to do. So I dropped out. I didn't
[01:26] (86.64s)
start a startup. That would have been
[01:28] (88.08s)
very on point for this event, but I uh
[01:31] (91.04s)
dropped out and I ended up working at a
[01:33] (93.68s)
company that was doing computational
[01:35] (95.76s)
biology. How do we get computers to say
[01:38] (98.08s)
something smart about biology? And I
[01:40] (100.72s)
loved it. I loved it not just because it
[01:43] (103.04s)
was fun, but it was something that would
[01:44] (104.64s)
let me do what I thought I was good at.
[01:47] (107.28s)
Write code, manipulate equations, think
[01:50] (110.08s)
hard thoughts about the nature of the
[01:51] (111.76s)
world and use it toward this very
[01:54] (114.40s)
applied purpose that at the end we want
[01:57] (117.12s)
to make medicines or we
[01:59] (119.12s)
want to enable others to make medicines.
[02:01] (121.60s)
Then I really kind of became a biologist
[02:04] (124.56s)
and a machine learner. Actually a
[02:06] (126.16s)
machine learner because I left that job
[02:07] (127.60s)
and I went back to grad school in
[02:09] (129.20s)
biophysics and chemistry and uh I no
[02:13] (133.04s)
longer had access to this incredible
[02:14] (134.96s)
computer hardware that I had when I was
[02:17] (137.60s)
working at my previous job and in fact
[02:19] (139.28s)
they had custom ASICs for simulating how
[02:22] (142.08s)
proteins, this part of your body that
[02:23] (143.60s)
I'll talk about, move. And since I didn't
[02:25] (145.76s)
have that anymore but I still wanted to
[02:28] (148.00s)
work on the same problems. Well, I
[02:29] (149.52s)
didn't want to just do the same thing
[02:30] (150.72s)
with less compute. And so I started to
[02:33] (153.92s)
learn and I was getting very interested
[02:35] (155.68s)
in statistics, in machine learning. We
[02:38] (158.64s)
didn't call it AI back then. In fact, we
[02:40] (160.80s)
didn't even call it machine learning.
[02:42] (162.08s)
That was a bit disreputable. I said, I'm
[02:43] (163.68s)
working in statistical physics. But you
[02:46] (166.64s)
know, how are we going to develop
[02:48] (168.40s)
algorithms? How are we going to learn
[02:50] (170.00s)
from data and do that instead of very
[02:52] (172.08s)
large compute? And I guess it turns out
[02:53] (173.60s)
in terms of AI in addition to very large
[02:56] (176.08s)
compute to answer new problems. And
[02:59] (179.84s)
after this I joined uh Google DeepMind
[03:03] (183.36s)
and really joining a company that wanted
[03:06] (186.80s)
to say how are we going to take these
[03:09] (189.12s)
powerful technologies and all kind of
[03:11] (191.68s)
these ideas and they were becoming
[03:13] (193.60s)
very very readily apparent how powerful
[03:15] (195.68s)
these technologies were with
[03:17] (197.20s)
applications
[03:18] (198.96s)
uh to especially games but also to
[03:22] (202.16s)
things like data centers and others. How
[03:23] (203.84s)
are we going to take these technologies
[03:25] (205.04s)
and use them to advance science and
[03:27] (207.04s)
really push forward the scientific frontier?
[03:30] (210.08s)
And how can we do this in an industrial
[03:32] (212.24s)
setting with an incredibly fast pace
[03:35] (215.04s)
working with some really smart people
[03:36] (216.72s)
working with great computer resources
[03:38] (218.64s)
and with all that you darn well better
[03:40] (220.64s)
make some progress and it's been really
[03:42] (222.96s)
really fun and the fact that I'm on this
[03:45] (225.20s)
stage indicates that we made some
[03:46] (226.80s)
progress and I think it really the
[03:49] (229.52s)
guiding principle for me has been that when
[03:52] (232.16s)
we do this work that ultimately we are
[03:56] (236.64s)
building tools that will enable
[03:58] (238.16s)
scientists to make discoveries.
[04:00] (240.24s)
And what I think is really heartening
[04:02] (242.40s)
about the work we've done and the part
[04:04] (244.16s)
that really I think still just resonates
[04:07] (247.44s)
with me at my core is there are about, I
[04:10] (250.00s)
think, 35,000 citations of AlphaFold. But
[04:13] (253.20s)
within that is there are tens of
[04:16] (256.40s)
thousands of examples of people using
[04:18] (258.72s)
our tools to do science that I couldn't
[04:21] (261.36s)
do on my own but are using it to make
[04:24] (264.48s)
discoveries, be it vaccines, be it drug
[04:27] (267.60s)
development, be it how the body works.
[04:30] (270.08s)
And I think that's really really
[04:31] (271.20s)
exciting. And the part I want to talk to
[04:33] (273.84s)
you about today and the story I want to
[04:35] (275.52s)
tell you is a bit about the problem, a
[04:38] (278.88s)
bit about how we did it. And I think
[04:40] (280.32s)
especially the role of research and
[04:43] (283.20s)
machine learning research and the fact
[04:44] (284.48s)
that it isn't just off-the-shelf machine
[04:46] (286.24s)
learning and then I want to tell you a
[04:48] (288.24s)
little bit about what happens when you
[04:50] (290.00s)
make something great and how people use
[04:52] (292.32s)
it and what it does for the world. So,
[04:54] (294.96s)
I'll start with the world's shortest
[04:56] (296.56s)
biology lesson. The cell is complex.
[05:00] (300.64s)
Um, for people who have only studied
[05:04] (304.08s)
biology in high school or in college,
[05:06] (306.48s)
you might have this idea that the cell
[05:08] (308.24s)
is a couple parts that have labels
[05:10] (310.24s)
attached to them. And it's kind of
[05:12] (312.40s)
simple, but really it looks much more
[05:14] (314.00s)
like what you see on the screen. It's
[05:16] (316.16s)
dense. It's complex. Uh, in terms of
[05:19] (319.12s)
crowding, it's like the swimming pool on
[05:20] (320.96s)
the 4th of July and it's full of
[05:24] (324.48s)
enormous complexity. Humans have about
[05:27] (327.36s)
20,000 different types of proteins.
[05:30] (330.00s)
Those are some of the blobs you see on
[05:31] (331.60s)
the screen. They come together to do
[05:33] (333.52s)
practically every function in your cell.
[05:36] (336.16s)
You can see that uh kind of green tail
[05:38] (338.88s)
is the flagellum of uh an E. coli. That's
[05:42] (342.80s)
how it moves around. And you can see in
[05:44] (344.96s)
fact how it moves around. And you can
[05:46] (346.64s)
see that thing that looks like it turns
[05:48] (348.08s)
and in fact it turns and drives this
[05:50] (350.72s)
motor. All of this is made of proteins.
[05:52] (352.56s)
When people say that DNA is the
[05:55] (355.52s)
instruction manual for life, well, this
[05:57] (357.28s)
is what it's telling you how to do. It's
[05:59] (359.76s)
telling you how to build these tiny
[06:01] (361.76s)
machines. And biology has evolved an
[06:04] (364.48s)
incredible mechanism to build the
[06:07] (367.20s)
machines it needs, literal nano
[06:09] (369.12s)
machines, and build them out of atoms.
[06:11] (371.44s)
And so your DNA gives you instructions
[06:13] (373.44s)
that say build a protein. Now you might
[06:16] (376.16s)
say your DNA is a line and so are
[06:18] (378.80s)
proteins in a certain sense. It's
[06:20] (380.32s)
instructions on how to attach one bead
[06:22] (382.32s)
after another where each bead is a
[06:24] (384.64s)
specific kind of molecular arrangement
[06:26] (386.32s)
of atoms. And you should wonder, if my
[06:30] (390.08s)
DNA is a line and I am very much not
[06:32] (392.48s)
one-dimensional,
[06:34] (394.24s)
what happens in between? And the answer
[06:35] (395.92s)
is after you make this protein and
[06:38] (398.72s)
assemble it one piece at a time, it will
[06:41] (401.60s)
fold up spontaneously
[06:43] (403.68s)
into a shape like you've opened your
[06:46] (406.48s)
IKEA bookshelf and instead of having to
[06:48] (408.48s)
do the hard work, it simply builds
[06:50] (410.00s)
itself and you get this quite complex
[06:52] (412.72s)
structure. You can see quite typical
[06:55] (415.04s)
protein, a kinase for those of you who
[06:56] (416.80s)
are biologists in the audience over
[06:58] (418.88s)
there. And you can see this very complex
[07:00] (420.72s)
arrangement of atoms and that
[07:02] (422.72s)
arrangement is functional and the
[07:06] (426.24s)
majority, not every one, of the proteins uh
[07:08] (428.80s)
in your body undergo this transformation
[07:11] (431.28s)
and that is what functions and that is
[07:13] (433.44s)
incredibly small.
[07:15] (435.60s)
So light itself is a few hundred
[07:19] (439.28s)
nanometers in size and that's a few
[07:22] (442.16s)
nanometers in size. So it's smaller than
[07:24] (444.16s)
you can see in a microscope. And for a
[07:26] (446.80s)
long time scientists have wanted to
[07:28] (448.72s)
understand this structure because they
[07:31] (451.12s)
use it to predict how changes in that
[07:34] (454.16s)
protein might affect disease. How does
[07:37] (457.44s)
that work? How does biology work? Often
[07:39] (459.44s)
if you make a drug it is to interrupt
[07:40] (460.96s)
the function of a certain protein like
[07:42] (462.56s)
this one.
[07:44] (464.64s)
Now scientists have through an
[07:47] (467.36s)
incredible amount of cleverness figured
[07:49] (469.52s)
out the structure of lots of proteins
[07:51] (471.92s)
and it remains to this day exceptionally
[07:54] (474.64s)
difficult. Right? You shouldn't imagine
[07:56] (476.88s)
this as I want to determine the
[07:59] (479.84s)
structure of a protein. So I shall open
[08:01] (481.92s)
the lab protocol for protein structure
[08:03] (483.92s)
determination. I shall follow the steps.
[08:06] (486.88s)
It consists of cleverness of ideas of
[08:10] (490.24s)
finding many ways. In this case, I'm
[08:12] (492.08s)
describing one type of protein structure
[08:14] (494.32s)
prediction, or protein structure,
[08:16] (496.48s)
sorry, determination, experimental
[08:17] (497.92s)
measurement, where you convince that big
[08:19] (499.92s)
ugly molecule I just showed you to form
[08:22] (502.24s)
a regular crystal kind of like table
[08:24] (504.00s)
salt. No one has an easy recipe for
[08:26] (506.88s)
this. So, they try many things. They
[08:28] (508.64s)
have ideas and it's exceptionally
[08:32] (512.08s)
difficult and filled with failure like
[08:34] (514.40s)
many things in science.
[08:36] (516.80s)
And you're really looking at
[08:40] (520.32s)
kind of one way to get an idea of how
[08:42] (522.24s)
difficult this is. Just one kind of
[08:43] (523.92s)
ordinary paper that we were using. I
[08:45] (525.68s)
flipped to the back and it said, you
[08:48] (528.08s)
know, in their protocol, after more than
[08:49] (529.68s)
a year, crystals began to form. Right?
[08:52] (532.64s)
So, not only did they do all these hard
[08:54] (534.40s)
experiments, but they had to wait about
[08:56] (536.24s)
a year to find out if it worked. And
[08:58] (538.08s)
probably that year wasn't spent waiting.
[08:59] (539.76s)
It was trying a thousand other things
[09:01] (541.44s)
that didn't work as well.
[09:03] (543.92s)
Once you do that, you can take this to a
[09:06] (546.72s)
uh synchrotron, a modest thing. You can
[09:09] (549.44s)
see the cars ringing the outside of this
[09:11] (551.44s)
instrument so that you can shine
[09:13] (553.28s)
incredibly bright X-rays on it and get
[09:15] (555.92s)
what is called a diffraction pattern and
[09:18] (558.08s)
you can solve that and you can deposit
[09:20] (560.96s)
it in what's called the PDB or the
[09:22] (562.96s)
Protein Data Bank. And one of the
[09:24] (564.96s)
things that enabled the work we did is
[09:27] (567.52s)
that scientists 50 years ago had the
[09:29] (569.84s)
foresight to say these are important,
[09:32] (572.40s)
these are hard. We should collect them
[09:35] (575.04s)
all in one place. So there's a data set
[09:37] (577.68s)
that represents essentially all the
[09:40] (580.32s)
academic output of protein structures in
[09:43] (583.20s)
the community and available to everyone.
[09:46] (586.08s)
So our work was on very public data.
[09:48] (588.96s)
About 200,000 protein structures are
[09:51] (591.20s)
known. They pretty regularly increase at
[09:53] (593.84s)
about 12,000 a year.
[09:57] (597.12s)
But this is much much smaller than the
[10:01] (601.20s)
Getting the kind of input information,
[10:03] (603.76s)
the DNA that tells you about a protein
[10:06] (606.24s)
is much much much much easier. So
[10:09] (609.28s)
billions of protein sequences are being
[10:12] (612.40s)
discovered. About 3,000 times faster are
[10:14] (614.96s)
we learning about protein sequence than
[10:16] (616.64s)
protein structure.
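
To make those rates concrete, here is a rough back-of-the-envelope sketch that uses only the round numbers quoted above (about 200,000 known structures, about 12,000 new ones a year, sequences arriving about 3,000 times faster). The variable names are illustrative, and the implied sequence rate is an extrapolation from those figures, not a measured value.

```python
# Round figures quoted in the talk; treat all of them as approximate.
known_structures = 200_000         # experimentally determined structures
new_structures_per_year = 12_000   # approximate yearly growth
sequence_speedup = 3_000           # sequences arrive ~3,000x faster than structures

# Implied yearly rate of new sequences (an extrapolation, not a measurement).
new_sequences_per_year = new_structures_per_year * sequence_speedup

# Years of experimental work, at the current pace, to double the known structures.
years_to_double = known_structures / new_structures_per_year

print(f"implied new sequences per year: {new_sequences_per_year:,}")
print(f"years to double known structures: {years_to_double:.1f}")
```

On these numbers the gap between what can be sequenced and what can be solved experimentally keeps widening, which is the motivation for predicting structure from sequence.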
[10:18] (618.80s)
Okay, that's all scientific content, but
[10:21] (621.76s)
I should talk to you about the little
[10:24] (624.32s)
thing we did which has this kind of
[10:26] (626.40s)
schematic diagram.
[10:28] (628.64s)
We wanted to build an AI system. In
[10:31] (631.28s)
fact, we didn't even care if it was an
[10:32] (632.64s)
AI system. That's one of the nice things
[10:35] (635.20s)
about uh working in AI for science is
[10:37] (637.92s)
you don't care how you solve it. If it
[10:39] (639.52s)
ended up being a computer program, if it
[10:41] (641.04s)
ended up being anything else, we want to
[10:43] (643.04s)
find some way to get from the left where
[10:46] (646.00s)
each of those letters represents a
[10:47] (647.68s)
specific building block of the protein
[10:49] (649.68s)
considered in order. We want to put
[10:51] (651.68s)
something in the middle, AlphaFold,
[10:53] (653.76s)
and we want to end up with
[10:55] (655.60s)
something on the right. And you'll see
[10:57] (657.76s)
uh two structures there if you look
[10:59] (659.36s)
closely where the blue is our prediction
[11:02] (662.56s)
and the green is the experimental
[11:04] (664.40s)
structure that took someone a year or
[11:06] (666.08s)
two of effort. If you want to put an
[11:07] (667.92s)
economic value on it on the order of
[11:10] (670.80s)
$100,000
[11:13] (673.04s)
and you can see we were able to do this
[11:16] (676.40s)
and I want to tell you how
[11:19] (679.12s)
and there were really three components
[11:21] (681.84s)
to doing this or to do any machine
[11:23] (683.68s)
learning problem and you can say you
[11:25] (685.92s)
have data and you have compute and you
[11:27] (687.60s)
have research
[11:29] (689.68s)
and I feel like we tell too many stories
[11:32] (692.72s)
about the first two and not enough about
[11:34] (694.72s)
the third. In data, we had 200,000
[11:37] (697.92s)
protein structures. Everyone has the
[11:40] (700.08s)
same data.
[11:41] (701.92s)
In terms of compute, this isn't LLM
[11:44] (704.80s)
scale. The final model itself was
[11:48] (708.64s)
128 TPU v3 cores, roughly equivalent to
[11:52] (712.16s)
a GPU per core for two weeks. This is
[11:55] (715.52s)
again within the scope of say academic
[11:58] (718.56s)
resources but it's worth saying really
[12:01] (721.68s)
most of your compute when you think
[12:03] (723.12s)
about how much compute you need don't
[12:04] (724.56s)
get distracted by the number for the
[12:06] (726.16s)
final model the real cost of compute is
[12:08] (728.80s)
the cost of ideas that didn't work all
[12:12] (732.00s)
the things you had to do to get there.
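
As a rough illustration of that point, the quoted figures for the final run can be turned into core-hours. The multiplier for failed experiments below is purely hypothetical, since the talk gives no number for it; it is there only to show how quickly exploration can dominate the bill.

```python
# Figures quoted for the final training run; approximate.
tpu_cores = 128
training_days = 14  # "two weeks"

core_days = tpu_cores * training_days
core_hours = core_days * 24

# HYPOTHETICAL: the talk does not say how much compute went into ideas
# that did not work. This multiplier is an assumption for illustration.
assumed_exploration_multiplier = 10
assumed_total_core_hours = core_hours * (1 + assumed_exploration_multiplier)

print(f"final model: {core_days:,} core-days ({core_hours:,} core-hours)")
print(f"with assumed exploration overhead: {assumed_total_core_hours:,} core-hours")
```

Changing the assumed multiplier is the whole exercise: the final-model number stays fixed while the total moves with how many ideas had to be tried.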
[12:14] (734.08s)
And then finally research, and I would
[12:15] (735.84s)
say this is all but about two of the people
[12:19] (739.20s)
that worked on this it's a small group
[12:21] (741.28s)
of people that end up doing this So
[12:24] (744.72s)
really when you look at these machine
[12:26] (746.16s)
learning breakthroughs they're probably
[12:28] (748.40s)
fewer people than you imagine and really
[12:31] (751.28s)
this is where our work was
[12:33] (753.36s)
differentiated. We came up with a new
[12:35] (755.12s)
set of ideas on how do we bring machine
[12:39] (759.04s)
learning to this problem and I can say
[12:41] (761.68s)
earlier systems largely based on
[12:44] (764.40s)
convolutional neural networks did okay.
[12:46] (766.72s)
They certainly made progress. If you
[12:48] (768.64s)
replace that with a transformer you're
[12:50] (770.24s)
honestly about the same. If you take the
[12:52] (772.72s)
ideas of a transformer and much
[12:54] (774.56s)
experimentation and many more ideas,
[12:57] (777.28s)
then that's when you start to get real
[12:59] (779.76s)
change. And in almost all the AI systems
[13:03] (783.44s)
you can see today, a tremendous amount
[13:05] (785.92s)
of research and ideas and what I would
[13:07] (787.76s)
call midscale ideas are involved. It
[13:10] (790.64s)
isn't just about the headlines where
[13:12] (792.88s)
people will say transformers,
[13:15] (795.68s)
you know, scaling, test time inference.
[13:18] (798.08s)
These are all important but they're one
[13:20] (800.08s)
of many ingredients in a really powerful
[13:22] (802.88s)
system and in fact we can measure how
[13:26] (806.00s)
much our research was worth. So someone
[13:29] (809.36s)
AlphaFold 2 is the system that is quite
[13:31] (811.36s)
famous, the one that uh was quite a large
[13:33] (813.68s)
improvement. AlphaFold 1 was the best
[13:35] (815.36s)
in the world, but someone did, uh, the
[13:37] (817.76s)
AlQuraishi lab did a very uh careful
[13:40] (820.08s)
experiment where they took AlphaFold 2, the
[13:43] (823.44s)
architecture and they trained it on 1%
[13:46] (826.40s)
of the available data and they could
[13:48] (828.80s)
show that AlphaFold 2 trained on 1% of
[13:51] (831.36s)
the data was as accurate as, or more
[13:54] (834.08s)
accurate than, AlphaFold 1, which was the
[13:56] (836.16s)
state-of-the-art system previously. So
[13:58] (838.56s)
there's a very clean thing that says
[14:00] (840.40s)
that the third uh the third of these
[14:03] (843.60s)
ingredients research was worth a
[14:06] (846.16s)
hundredfold of the first of these
[14:08] (848.00s)
ingredients data. And I think this is
[14:10] (850.64s)
generally really really important that
[14:13] (853.52s)
as you're all thinking, as
[14:16] (856.16s)
you're all in startups or thinking about
[14:18] (858.08s)
startups think about the amount to which
[14:21] (861.76s)
ideas research discoveries amplify data
[14:26] (866.64s)
amplify compute they work together with
[14:28] (868.64s)
it we wouldn't want to use less data
[14:30] (870.48s)
than we have we wouldn't want to use
[14:31] (871.92s)
less compute than we have available but
[14:35] (875.36s)
ideas are a core component when you're
[14:37] (877.68s)
doing machine learning research and they
[14:39] (879.28s)
really helped to transform the world.
[14:54] (894.32s)
We can even go back
[14:56] (896.64s)
and we can do ablations and we can say
[14:58] (898.40s)
what parts matter. And don't focus too
[15:00] (900.24s)
much on the details. We pulled this from
[15:01] (901.92s)
our paper. You can see here this is the
[15:04] (904.56s)
difference compared to the baseline. And
[15:06] (906.40s)
you take either of those and you can see
[15:08] (908.80s)
that each of the ideas that you might
[15:10] (910.64s)
remove from our final system kind of
[15:12] (912.48s)
discrete identifiable ideas, some of
[15:15] (915.04s)
which were incredibly popular research
[15:18] (918.56s)
areas within the field like this work
[15:20] (920.88s)
came out and a part of it was
[15:22] (922.40s)
equivariant and people said equivariance
[15:25] (925.52s)
that is the answer. AlphaFold is an
[15:27] (927.52s)
equivariant system and it's great we
[15:29] (929.60s)
must do more research on equivariance to
[15:31] (931.52s)
get even more great systems well I was
[15:34] (934.48s)
very confused by this because the sixth
[15:37] (937.92s)
uh row there no IPA invariant point
[15:40] (940.80s)
attention that removes all the
[15:42] (942.56s)
equivariance in AlphaFold and it hurts
[15:45] (945.60s)
a bit but only a bit. AlphaFold itself
[15:48] (948.80s)
on this GDT scale that you can see on
[15:51] (951.12s)
the left graph. AlphaFold 2 was about 30
[15:54] (954.08s)
GDT better than AlphaFold 1 and
[15:57] (957.44s)
equivariance explains two or three of
[15:59] (959.68s)
this. It isn't about one idea. It's
[16:02] (962.48s)
about many midscale ideas that add up to
[16:05] (965.04s)
a transformative system. And it's very
[16:07] (967.68s)
very important when you're building
[16:08] (968.88s)
these systems to think about what we
[16:11] (971.20s)
would call in this context biological
[16:13] (973.04s)
relevance. We would have ideas that were
[16:15] (975.28s)
better. We kind of got our system
[16:18] (978.24s)
grinding 1% at a time. But what really
[16:21] (981.68s)
mattered was when we crossed the
[16:23] (983.52s)
accuracy at which it mattered to an
[16:25] (985.68s)
experimental biologist who didn't care
[16:27] (987.36s)
about machine learning. And you have to
[16:29] (989.76s)
get there through a lot of work and a
[16:31] (991.76s)
lot of effort. And when you do, it is
[16:33] (993.84s)
incredibly transformative. And we can
[16:36] (996.48s)
measure against uh this axis where the
[16:38] (998.72s)
dark blue is the other systems
[16:40] (1000.88s)
available at the time. And this was
[16:42] (1002.32s)
assessed. Protein structure prediction
[16:45] (1005.12s)
is in some ways far ahead of uh LLMs or
[16:49] (1009.20s)
the general machine learning space in
[16:50] (1010.72s)
having blind assessment. Since 1994,
[16:53] (1013.76s)
every two years, everyone interested in
[16:55] (1015.44s)
predicting the structure of proteins
[16:57] (1017.44s)
gets together and predicts the structure
[16:58] (1018.96s)
of a hundred proteins whose answer isn't
[17:00] (1020.88s)
known to anyone except the research
[17:02] (1022.32s)
group that just solved it, right?
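
The GDT scale mentioned in this talk is, in its common GDT_TS form, the average over four distance cutoffs (1, 2, 4 and 8 angstroms) of the percentage of residues whose predicted position falls within that cutoff of the experimental one. A minimal sketch follows, under the loud assumption that the two structures are already superimposed; the official score also searches over superpositions, which is skipped here. The function name and the toy coordinates are illustrative only.

```python
import math


def gdt_ts(predicted, experimental, cutoffs=(1.0, 2.0, 4.0, 8.0)):
    """Simplified GDT_TS on a 0-100 scale.

    ASSUMPTION: both inputs are equal-length lists of (x, y, z) C-alpha
    coordinates in angstroms that are already superimposed. The real
    score also optimizes the superposition, which this sketch does not.
    """
    if len(predicted) != len(experimental) or not predicted:
        raise ValueError("need two non-empty coordinate lists of equal length")
    distances = [math.dist(p, e) for p, e in zip(predicted, experimental)]
    fractions = [
        sum(d <= cutoff for d in distances) / len(distances)
        for cutoff in cutoffs
    ]
    return 100.0 * sum(fractions) / len(cutoffs)


# Toy example with made-up coordinates (not real protein data).
experimental = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
predicted = [(0.5, 0.0, 0.0), (3.8, 1.5, 0.0), (7.6, 0.0, 3.0), (11.4, 9.0, 0.0)]
score = gdt_ts(predicted, experimental)
print(f"toy GDT_TS: {score:.2f}")
```

Because the target structures in a blind assessment are unpublished when predictions are submitted, a score like this cannot be tuned against the answers.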
[17:04] (1024.40s)
Unpublished. And so, you really do know
[17:06] (1026.24s)
what works. And we had about a third of
[17:08] (1028.40s)
the error of any other group on this
[17:10] (1030.88s)
assessment. But it matters because once
[17:13] (1033.60s)
you are working on problems in which you
[17:15] (1035.20s)
don't know the answer, you get to really
[17:16] (1036.72s)
measure how good things are. And you can
[17:19] (1039.36s)
really find that a lot of systems don't
[17:21] (1041.60s)
live up to what people believe over the
[17:24] (1044.40s)
course of their research. And because
[17:26] (1046.32s)
even if you have a benchmark, we all
[17:28] (1048.32s)
overfit our ideas to the benchmark,
[17:31] (1051.44s)
right? Unless you have a held-out set. And in
[17:33] (1053.84s)
fact, the problems you have in the real
[17:36] (1056.32s)
world are almost always harder than the
[17:38] (1058.16s)
problems you train on, right? Because
[17:40] (1060.16s)
you have to learn from much data and you
[17:41] (1061.84s)
apply it to very important singular
[17:44] (1064.16s)
problems. So it is very very important
[17:46] (1066.48s)
that you measure well both as you're
[17:48] (1068.48s)
developing and when people are trying to
[17:50] (1070.64s)
decide whether they should use your
[17:52] (1072.32s)
system. External benchmarks are
[17:54] (1074.64s)
absolutely critical to figuring out what
[17:57] (1077.76s)
works and that's what really helps drive
[18:00] (1080.00s)
the world forward. So just some
[18:02] (1082.24s)
wonderful examples of this is typical
[18:04] (1084.00s)
performance for us. These are blind
[18:05] (1085.92s)
predictions. You can see they're pretty
[18:07] (1087.84s)
darn good. Also important: we made it
[18:10] (1090.40s)
available and we thought it was and we
[18:12] (1092.08s)
did a lot of assessment but we decided
[18:13] (1093.60s)
that it was very important to make it
[18:15] (1095.44s)
available in two ways. One is that we
[18:17] (1097.12s)
open sourced the code and we actually
[18:18] (1098.48s)
open sourced the code about a week
[18:19] (1099.84s)
before we released a database of
[18:22] (1102.80s)
predictions starting originally at
[18:24] (1104.40s)
300,000 predictions and later going to
[18:26] (1106.48s)
200 million essentially every protein um
[18:29] (1109.76s)
from an organism whose genome has been
[18:31] (1111.76s)
sequenced. And this made an enormous
[18:34] (1114.00s)
difference. And one of the most
[18:35] (1115.04s)
interesting kind of sociological things
[18:36] (1116.72s)
is this huge difference between when we
[18:39] (1119.20s)
released a piece of code that
[18:40] (1120.48s)
specialists could use and we got some
[18:43] (1123.20s)
information and then when we made it
[18:44] (1124.80s)
available to the world in this database
[18:48] (1128.72s)
form. It was really interesting kind of
[18:51] (1131.52s)
you know you release something and every
[18:52] (1132.80s)
day you check Twitter to find out or
[18:54] (1134.72s)
check X to find out what's going on. And
[18:58] (1138.08s)
what we would really see is even after
[19:01] (1141.44s)
that CASP assessment, I would say that
[19:03] (1143.92s)
the structure predictors were convinced
[19:05] (1145.76s)
this obviously was this enormous advance
[19:08] (1148.96s)
solved the problem. But general
[19:10] (1150.96s)
biologists, the people we wanted to use,
[19:12] (1152.48s)
the people who didn't care about
[19:13] (1153.36s)
structure prediction, they cared about
[19:14] (1154.64s)
proteins to do their experiments, they
[19:16] (1156.96s)
weren't as sure. They said, "Well, maybe
[19:18] (1158.32s)
CASP was easy. I don't know." And then
[19:21] (1161.04s)
this database came out and people got
[19:23] (1163.36s)
curious and they clicked in and the
[19:26] (1166.64s)
amount to which the proof was social was
[19:28] (1168.80s)
extraordinary that people would look and
[19:31] (1171.52s)
say how did DeepMind get access to my
[19:34] (1174.08s)
unpublished structure. you know, this
[19:36] (1176.72s)
moment at which they really believed it
[19:38] (1178.16s)
that everyone either had
[19:41] (1181.60s)
a protein that they hadn't solved or had
[19:43] (1183.44s)
a friend who had a protein that was
[19:45] (1185.20s)
unpublished and they could compare and
[19:47] (1187.28s)
that's what really made the difference.
[19:49] (1189.04s)
And having this database, this
[19:50] (1190.56s)
accessibility, this ease led everyone to
[19:53] (1193.84s)
try it and figure out how it worked.
[19:56] (1196.56s)
Word of mouth is really how this trust
[19:59] (1199.12s)
is built. And you can kind of see some
[20:00] (1200.88s)
of these testimonials, right? I wrestled
[20:03] (1203.52s)
for three to four months trying to do
[20:06] (1206.00s)
this uh scientific task. You know, this
[20:09] (1209.44s)
morning I got an AlphaFold prediction
[20:11] (1211.60s)
and now it's much better. I want my time
[20:14] (1214.72s)
back, right? You know, you really
[20:17] (1217.84s)
appreciate AlphaFold when you run it on
[20:19] (1219.76s)
a protein that for a year refused to get
[20:22] (1222.40s)
expressed and purified. Meaning for
[20:24] (1224.16s)
a year they couldn't even get the
[20:25] (1225.28s)
material to start experiments. These are
[20:27] (1227.76s)
really important. When you build the
[20:29] (1229.28s)
right tool, when you solve the right
[20:30] (1230.96s)
problem, it matters and it changes the
[20:34] (1234.24s)
lives of people who are doing things not
[20:37] (1237.20s)
that you would do but building on top of
[20:39] (1239.76s)
your work. And I think it's just
[20:41] (1241.68s)
extraordinary to see these and the
[20:43] (1243.52s)
number of people I talked to. The time
[20:45] (1245.92s)
that I really knew this tool mattered.
[20:47] (1247.92s)
In fact, there was a special issue of
[20:49] (1249.36s)
Science on the nuclear pore complex a
[20:51] (1251.44s)
few months after the tool came out. And
[20:54] (1254.96s)
the special issue was all about this
[20:56] (1256.96s)
particular very large kind of several
[20:59] (1259.36s)
hundred protein system. And three out of
[21:02] (1262.48s)
the four uh papers in Science about this
[21:05] (1265.76s)
made extensive use of AlphaFold. I
[21:07] (1267.44s)
think I counted over a hundred mentions
[21:08] (1268.88s)
of the word AlphaFold in Science and we
[21:11] (1271.36s)
had nothing to do with it. We didn't
[21:12] (1272.64s)
know it was happening. We weren't
[21:14] (1274.16s)
collaborating. It was just people doing
[21:16] (1276.56s)
new science on top of the tools we had
[21:18] (1278.56s)
built and that is the greatest feeling
[21:19] (1279.84s)
in the world. And in fact, users do the
[21:22] (1282.32s)
darndest things. They will use tools in
[21:25] (1285.04s)
ways you didn't know were possible. The
[21:28] (1288.56s)
tweet on the left from Yoshitaka Moriwaki
[21:31] (1291.68s)
came out two days after our code was
[21:33] (1293.68s)
available. We had predicted the
[21:35] (1295.92s)
structure of individual proteins, but we
[21:37] (1297.68s)
we were working on building a
[21:39] (1299.44s)
system that would predict how proteins
[21:40] (1300.88s)
came together. But uh this researcher
[21:43] (1303.52s)
said, "Well, I have AlphaFold. Why don't
[21:45] (1305.52s)
I just put two proteins together and
[21:47] (1307.20s)
I'll put something in between?" You
[21:49] (1309.04s)
could think of this as prompt
[21:50] (1310.16s)
engineering but for proteins. And
[21:52] (1312.56s)
suddenly they find out this is the best
[21:54] (1314.16s)
protein interaction prediction in the
[21:56] (1316.00s)
world, right? That when you train on
[21:58] (1318.32s)
these a really really powerful system,
[22:00] (1320.72s)
it will have additional in some sense
[22:03] (1323.12s)
emergent skills as long as they're
[22:05] (1325.20s)
aligned. People started to find all
[22:07] (1327.36s)
sorts of problems that AlphaFold would
[22:11] (1331.12s)
work on that we hadn't anticipated. It
[22:13] (1333.68s)
was so interesting to see the field of
[22:16] (1336.72s)
science in real time reacting to the
[22:19] (1339.04s)
existence of these tools, finding their
[22:20] (1340.72s)
limitations, finding their possibilities
[22:24] (1344.08s)
and this continues and people do all
[22:26] (1346.96s)
sorts of exciting work, be it in protein
[22:28] (1348.96s)
design or in other areas, on top of
[22:31] (1351.84s)
the ideas and often the systems we have
[22:34] (1354.00s)
built. One application that I
[22:39] (1359.04s)
thought was really important is that
[22:41] (1361.28s)
people have started to learn how to use
[22:43] (1363.12s)
it to engineer big proteins or to use it
[22:46] (1366.40s)
as part of that work. I want to tell this story
[22:48] (1368.40s)
for two reasons. One is I think it's a
[22:50] (1370.00s)
really cool application but the second
[22:52] (1372.00s)
is how it really changes the work of
[22:54] (1374.08s)
science. Often people will say
[22:57] (1377.28s)
science is all about experiments and
[22:59] (1379.36s)
validation. So it's great that you have
[23:01] (1381.60s)
all these AlphaFold predictions. Now
[23:03] (1383.68s)
all we have to do is solve all the
[23:05] (1385.52s)
proteins the classic way so that we can
[23:08] (1388.88s)
tell whether your predictions are right
[23:10] (1390.72s)
or wrong. And they're right about one
[23:13] (1393.76s)
thing. Science is about experiments.
[23:15] (1395.84s)
Science is about doing these
[23:17] (1397.20s)
experiments.
[23:19] (1399.04s)
But they're wrong about another thing.
[23:21] (1401.68s)
Science is about making hypotheses
[23:24] (1404.40s)
and testing them, not about the structure
[23:27] (1407.60s)
of a particular protein. In this case,
[23:29] (1409.28s)
the question was they took this protein
[23:32] (1412.16s)
on the left called the contractile
[23:34] (1414.64s)
injection system, but that's a
[23:36] (1416.56s)
mouthful. They like to call it the
[23:37] (1417.92s)
molecular syringe. And what it does is
[23:40] (1420.48s)
it attaches to a cell and injects a
[23:43] (1423.52s)
protein into it. And the scientists at
[23:45] (1425.84s)
the Zhang Lab at MIT were saying,
[23:49] (1429.68s)
well, can we use this protein
[23:53] (1433.36s)
to do targeted drug delivery? Can we use
[23:55] (1435.36s)
it to get gene editors like Cas9 into
[23:58] (1438.32s)
the cell? They tried over a hundred
[24:01] (1441.04s)
methods to figure out how to take this
[24:03] (1443.12s)
protein, which they didn't have a
[24:04] (1444.48s)
structure of. This is just kind of a
[24:05] (1445.92s)
rendition after the fact, and say, how
[24:08] (1448.40s)
can we change what it recognizes? I
[24:10] (1450.64s)
think it's originally involved in plant
[24:12] (1452.00s)
defense or something like that, and they
[24:14] (1454.00s)
didn't know how to do it. And they ran
[24:15] (1455.28s)
an AlphaFold prediction. You can see
[24:16] (1456.88s)
the one on the left. I wouldn't even say
[24:18] (1458.00s)
it's a great AlphaFold prediction, but
[24:20] (1460.16s)
almost immediately they looked at that
[24:21] (1461.76s)
and said, "Wait a minute. Those legs at
[24:23] (1463.92s)
the bottom are how it must recognize and
[24:26] (1466.00s)
attach to cells. Why don't we just
[24:28] (1468.56s)
replace those with a designed protein?"
[24:31] (1471.28s)
And so almost immediately as soon as
[24:32] (1472.96s)
they got the AlphaFold prediction, they
[24:34] (1474.48s)
re-engineered it to add this designed protein
[24:36] (1476.96s)
that you see in red to target a new
[24:40] (1480.88s)
type of cell. And they take this system
[24:45] (1485.04s)
and then they show in fact that they can
[24:47] (1487.60s)
choose cells within a mouse and they can
[24:50] (1490.32s)
inject proteins, in this case fluorescent
[24:52] (1492.48s)
proteins. So there you'll see the color
[24:54] (1494.32s)
and they can target the cells they want
[24:56] (1496.08s)
within a mouse brain. And so they are
[24:58] (1498.48s)
using this to develop a new type of
[25:00] (1500.64s)
system
[25:02] (1502.24s)
of targeted drug delivery. And we see
[25:05] (1505.12s)
many more examples. We see some in which
[25:07] (1507.36s)
scientists are using this tool to try
[25:10] (1510.08s)
thousands and thousands of interactions
[25:11] (1511.84s)
to figure out which ones are likely to
[25:14] (1514.16s)
be the case. In fact, they discovered a new
[25:16] (1516.32s)
component of how eggs and sperm come
[25:18] (1518.96s)
together in fertilization. Many many of
[25:21] (1521.44s)
these discoveries that are built on top
[25:23] (1523.28s)
of this. And I like to think that our
[25:26] (1526.80s)
work made the whole field of what's
[25:29] (1529.28s)
called structural biology, biology that
[25:31] (1531.04s)
deals with structures, you know, five or
[25:33] (1533.76s)
10% faster. But the amount to which that
[25:37] (1537.04s)
matters for the world is enormous and we
[25:39] (1539.92s)
will have more of these discoveries. And
[25:43] (1543.28s)
I think ultimately structure prediction
[25:45] (1545.44s)
and, more broadly, AI for science should be
[25:47] (1547.36s)
thought of as an incredible capability
[25:49] (1549.20s)
to be an amplifier for the work of
[25:51] (1551.36s)
experimentalists. We start from
[25:53] (1553.76s)
these scattered observations, these
[25:55] (1555.52s)
natural data. This is our equivalent of
[25:58] (1558.32s)
all the words on the internet. And then
[26:00] (1560.48s)
we train a general model that
[26:02] (1562.08s)
understands the rules underneath it and
[26:04] (1564.32s)
can fill in the rest of the picture. And
[26:06] (1566.64s)
I think that we will continue to see
[26:08] (1568.56s)
this pattern and it will get more
[26:10] (1570.40s)
general that we will find the right
[26:11] (1571.92s)
foundational data sources in order to do
[26:15] (1575.04s)
this. And I think the other thing that
[26:17] (1577.28s)
has really been a property is that you
[26:20] (1580.72s)
start where you have data but then you
[26:22] (1582.88s)
find what problems it can be applied to.
[26:25] (1585.84s)
And so we find enormous advance,
[26:28] (1588.64s)
enormous capability to understand
[26:30] (1590.88s)
interactions in the cell or others that
[26:33] (1593.04s)
are downstream of extracting the
[26:35] (1595.60s)
scientific content of these predictions
[26:39] (1599.12s)
and then the rules they use can be
[26:41] (1601.12s)
adapted to new purposes. And I think
[26:42] (1602.96s)
this is really where we see the
[26:45] (1605.20s)
foundational model aspect of AlphaFold
[26:47] (1607.92s)
or other narrow systems. And in fact, I
[26:50] (1610.08s)
think we will start to see this on more
[26:51] (1611.60s)
general systems, be they LLMs or others,
[26:54] (1614.24s)
that we will find more and more
[26:55] (1615.76s)
scientific knowledge within them and
[26:58] (1618.32s)
we'll use them for important
[27:00] (1620.32s)
purposes. And I think this is really
[27:03] (1623.04s)
where this is going. And I think the
[27:04] (1624.96s)
most exciting question in AI for science
[27:08] (1628.08s)
is how general it will be. Will we find
[27:10] (1630.64s)
a couple of narrow places where we have
[27:12] (1632.80s)
transformative impact or will we have
[27:15] (1635.28s)
very very broad systems? And I expect it
[27:17] (1637.28s)
will ultimately be the latter as we
[27:19] (1639.36s)
figure it out. Thank you.
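
The linker trick described earlier in the talk (putting two proteins together with something in between and predicting them as one chain) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the researcher's actual code: the function name `join_with_linker`, the `GGGGS` linker unit, the repeat count, and the example sequences are all hypothetical, and the talk does not say which linker was used. The joined sequence would be passed to a single-chain structure predictor, and the returned index spans let you pick the two original proteins back out of the predicted structure.

```python
# Sketch of the "two proteins plus a linker" input trick.
# Everything here is illustrative: the linker choice and the helper name
# are assumptions, not details given in the talk.

VALID_RESIDUES = set("ACDEFGHIKLMNPQRSTVWY")  # the 20 standard amino acids


def join_with_linker(seq_a, seq_b, linker_unit="GGGGS", repeats=6):
    """Concatenate two protein sequences with a flexible linker in between.

    Returns the joined sequence plus half-open (start, end) index spans
    for each original protein inside the joined sequence.
    """
    seq_a, seq_b = seq_a.upper(), seq_b.upper()
    for name, seq in (("seq_a", seq_a), ("seq_b", seq_b)):
        if not seq:
            raise ValueError(f"{name} is empty")
        unknown = set(seq) - VALID_RESIDUES
        if unknown:
            raise ValueError(f"{name} has non-standard residues: {sorted(unknown)}")

    linker = linker_unit * repeats
    joined = seq_a + linker + seq_b

    a_span = (0, len(seq_a))
    b_start = len(seq_a) + len(linker)
    b_span = (b_start, b_start + len(seq_b))
    return joined, a_span, b_span


# Hypothetical toy sequences, far shorter than real proteins.
joined, a_span, b_span = join_with_linker("MKTAYIAKQR", "GSHMLEDPVD")
print(len(joined), a_span, b_span)
```

The spans matter because the linker residues are an artifact of the input, not part of either protein; anything downstream that inspects the predicted interface should ignore the positions between `a_span` and `b_span`.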