[00:00] (0.16s)
There's a 9-week release cycle, cuz every 9
[00:02] (2.72s)
weeks there's a new release going out,
[00:04] (4.48s)
right? So Linus does a release at
[00:06] (6.40s)
this point in time and then the merge
[00:08] (8.40s)
window is considered open and then for 2
[00:10] (10.32s)
weeks all the maintainers send Linus all
[00:12] (12.72s)
the stuff they've had pending from the
[00:14] (14.56s)
last release. We have two weeks to add
[00:16] (16.40s)
all new features and then he does
[00:17] (17.84s)
release candidate one. From there on
[00:19] (19.68s)
it's bug fixes only for the next 7
[00:21] (21.60s)
weeks. So it's bug fixes only, bug fixes
[00:23] (23.92s)
only, bug fixes, it's regression fixes,
[00:25] (25.68s)
we'll revert things, no new features. Do
[00:27] (27.68s)
I understand correctly that in the case
[00:29] (29.04s)
of Linux, this is a thing where every 9
[00:31] (31.12s)
weeks there will be a release? It's time
[00:32] (32.88s)
based. So we have that two week window
[00:34] (34.72s)
of merging all the new features to
[00:36] (36.00s)
Linus that have been in our tree and
[00:37] (37.36s)
accepted already and proven to work. And
[00:39] (39.28s)
the window is short: 9 weeks. We used to
[00:40] (40.96s)
have three-year-long development cycles.
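The time-based cycle described above (a two-week merge window, then seven weeks of fixes only) can be sketched as simple date arithmetic. This is only an illustration: the function name and the start date are hypothetical, and real release dates may drift from a strict 9-week grid.

```python
from datetime import date, timedelta

# Durations as stated in the conversation.
MERGE_WINDOW_WEEKS = 2    # new features are merged
STABILIZATION_WEEKS = 7   # bug fixes, regression fixes, reverts only


def release_schedule(previous_release: date) -> dict:
    """Return the key dates of one cycle, starting at the previous release."""
    rc1 = previous_release + timedelta(weeks=MERGE_WINDOW_WEEKS)
    next_release = rc1 + timedelta(weeks=STABILIZATION_WEEKS)
    return {
        "merge_window_opens": previous_release,
        "rc1": rc1,
        "next_release": next_release,
    }


# Hypothetical release date, chosen only for illustration.
schedule = release_schedule(date(2025, 1, 5))
for name, day in schedule.items():
    print(f"{name}: {day.isoformat()}")
```

A feature that misses one merge window waits roughly nine weeks for the next one, which is the point made about taking pressure off maintainers.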
[00:42] (42.24s)
And the problem there is even if you
[00:43] (43.92s)
have six-month development cycles,
[00:45] (45.36s)
there's that fear of you have a feature,
[00:47] (47.28s)
I want to take your feature, but it's
[00:48] (48.56s)
not quite ready. Do I want to wait and
[00:50] (50.64s)
things like that? But if you know that
[00:51] (51.92s)
you can get your feature in in 9 weeks
[00:53] (53.68s)
from now and it's just not ready, it's
[00:55] (55.52s)
not ready. The pressure is off me as a
[00:57] (57.12s)
maintainer to take your new feature
[00:58] (58.64s)
until it's ready. Linux is the world's
[01:01] (61.04s)
most widely used operating system thanks
[01:02] (62.88s)
to powering most Android devices,
[01:04] (64.72s)
servers, smart TVs, and embedded
[01:06] (66.56s)
systems. But how is it actually built?
[01:09] (69.76s)
Today, we sat down with Greg
[01:11] (71.28s)
Kroah-Hartman, a Linux kernel maintainer for
[01:13] (73.44s)
13 years, who is one of the three Linux
[01:15] (75.52s)
Foundation fellows. In today's
[01:17] (77.68s)
conversation, we cover details on how
[01:20] (80.32s)
widespread Linux is and why mobile
[01:22] (82.32s)
versions of Linux have three times the
[01:23] (83.84s)
lines of code as the server versions. What
[01:26] (86.56s)
exactly it takes to get a change accepted
[01:28] (88.24s)
into the Linux kernel and merged by Linus
[01:30] (90.40s)
Torvalds himself. How Linux manages to have
[01:32] (92.96s)
4,000 contributors per year yet have no
[01:35] (95.28s)
product managers or project managers and
[01:37] (97.68s)
many more details. If you're a software
[01:40] (100.16s)
engineer, you will use Linux directly or
[01:42] (102.00s)
indirectly. And this episode will help
[01:43] (103.92s)
you understand why it's so widespread
[01:45] (105.68s)
and how it's a lot easier to contribute
[01:47] (107.52s)
than most people would assume. If you
[01:49] (109.52s)
enjoy the show, please subscribe to the
[01:50] (110.88s)
podcast on any podcast platform and on
[01:52] (112.72s)
YouTube. Thank you.
[01:55] (115.76s)
So, Greg, it's really just nice to
[01:58] (118.24s)
have you here cuz you're one of the most
[02:00] (120.64s)
well-known Linux contributors, one of
[02:03] (123.12s)
the few, and one of the longest-standing ones
[02:04] (124.88s)
as well. So, just welcome to the
[02:06] (126.64s)
podcast. Thanks. Thanks for having me. I
[02:08] (128.00s)
think as software engineers, we know
[02:09] (129.44s)
Linux is important in the sense that
[02:11] (131.28s)
it's running on most web servers that
[02:13] (133.60s)
we use and run. It's a desktop
[02:17] (137.84s)
OS that some people use and, of
[02:20] (140.40s)
course, you know, a fork of it
[02:22] (142.24s)
is powering Android. But what is there
[02:24] (144.88s)
to know about Linux? How big is this
[02:26] (146.88s)
thing? How complex is this thing? Well,
[02:28] (148.48s)
yeah, it's an operating
[02:29] (149.76s)
system. So, it's a kernel. Um, we took
[02:31] (151.60s)
over the world without anybody noticing.
[02:33] (153.52s)
Um, I joke, it's Android devices: there are 4
[02:36] (156.64s)
billion Android Linux users out there
[02:39] (159.44s)
and they don't realize it. Um,
[02:41] (161.52s)
everything else is a rounding error
[02:42] (162.96s)
which doesn't make the server people
[02:44] (164.24s)
happy with me but it's true. It's in
[02:45] (165.68s)
everything. Um, it's in all the
[02:47] (167.68s)
embedded devices. It's in the air
[02:48] (168.96s)
conditioning units, the electric car
[02:51] (171.12s)
charging ports, uh, satellites; it runs the
[02:53] (173.68s)
International Space Station. Um, really?
[02:56] (176.32s)
Yeah. Yeah. Air traffic control for
[02:58] (178.48s)
Europe and probably the US. All the
[03:00] (180.24s)
financial markets. Um, I don't think
[03:02] (182.96s)
it's in the cameras that we're using.
[03:06] (186.48s)
Um, so, um, it's, yeah, I don't know of
[03:09] (189.12s)
any place that it hasn't taken over. The
[03:12] (192.32s)
top five selling laptops for
[03:14] (194.80s)
the past 10, 15 years: Chromebooks.
[03:17] (197.44s)
Those are all Linux based. Not Apple,
[03:19] (199.44s)
but the Chromebooks are. Um, yeah. Oh,
[03:22] (202.08s)
iPhones. So, every 5G modem out there is
[03:24] (204.56s)
running a copy of Linux. Really? Yeah.
[03:26] (206.80s)
Wow. Wow. So now with Apple doing their
[03:28] (208.40s)
new chip, I don't know if it's the new
[03:29] (209.52s)
one, but um Qualcomm, all the 5G modems,
[03:32] (212.56s)
probably the 4G, I'm not sure, but I
[03:34] (214.00s)
know all the 5G modems have Linux inside
[03:35] (215.68s)
it. This episode is brought to you by
[03:37] (217.68s)
WorkOS. If you're building a SaaS app,
[03:40] (220.08s)
at some point your customers will start
[03:41] (221.52s)
asking for enterprise features like SAML
[03:43] (223.20s)
authentication, SCIM provisioning, and
[03:45] (225.20s)
fine-grained authorization. That's where
[03:47] (227.60s)
WorkOS comes in, making it fast and
[03:49] (229.52s)
painless to add enterprise features to
[03:51] (231.04s)
your app. Their APIs are easy to
[03:53] (233.20s)
understand, and you can ship quickly and
[03:54] (234.88s)
get back to building other features.
[03:57] (237.28s)
WorkOS also provides a free user
[03:59] (239.04s)
management solution called AuthKit for up
[04:00] (240.80s)
to 1 million monthly active users. It's
[04:03] (243.12s)
a drop-in replacement for Auth0 and
[04:04] (244.96s)
comes standard with useful features like
[04:06] (246.48s)
domain verification, role-based access
[04:08] (248.72s)
control, bot protection, and MFA. It's
[04:11] (251.52s)
powered by Radix components, which means
[04:13] (253.36s)
zero compromises in design. You get
[04:15] (255.20s)
limitless customizations as well as
[04:16] (256.80s)
modular templates designed for quick
[04:18] (258.40s)
integrations. Today, hundreds of fast-growing
[04:20] (260.96s)
startups are powered by WorkOS,
[04:22] (262.96s)
including ones you probably know like
[04:24] (264.32s)
Cursor, Vercel, and Perplexity. Check
[04:27] (267.04s)
it out at workos.com to learn more.
[04:29] (269.68s)
That is workos.com.
[04:32] (272.88s)
Well, first of all, I'm just kind of
[04:35] (275.28s)
reflecting on why I never kind of, you
[04:38] (278.32s)
know, like thought about it like this
[04:40] (280.16s)
cuz in my mind it was always like, you
[04:42] (282.56s)
know, Debian, Red Hat, it's
[04:44] (284.88s)
on the server side. Maybe that's because
[04:46] (286.80s)
that's where I actually see where it is.
[04:49] (289.52s)
Of course, you know, right now
[04:51] (291.28s)
I'm a Mac user and
[04:53] (293.76s)
there's the Unix influence, which
[04:55] (295.20s)
is an influence. You know, it
[04:56] (296.72s)
gets pretty close. I think it's a good
[04:58] (298.32s)
time to reflect on how many things
[05:00] (300.64s)
it actually runs. In terms of the
[05:04] (304.32s)
kernel itself, how large is it? And
[05:07] (307.04s)
I know, you know, for different
[05:09] (309.12s)
devices it'll be split differently: for
[05:11] (311.44s)
server-side Linux, for an
[05:14] (314.24s)
Android device, it'll use different parts
[05:16] (316.24s)
of the kernel. How big is this in
[05:18] (318.00s)
terms of contributors, lines of code? I
[05:20] (320.16s)
know lines of code is not a great
[05:21] (321.28s)
measure. But lines of code is
[05:22] (322.88s)
interesting. So we have just under 40
[05:24] (324.64s)
million lines of code right now. Um,
[05:26] (326.48s)
that's a lot. That's all the kernel?
[05:28] (328.72s)
That's the kernel. The core part is like 5%
[05:30] (330.80s)
of that, which everybody runs, and then
[05:32] (332.48s)
the rest of it is hardware
[05:34] (334.64s)
support: different drivers, different
[05:36] (336.24s)
devices, different architectures,
[05:37] (337.60s)
different chips. Um, so your laptop runs
[05:40] (340.24s)
about two, two and a half million lines
[05:42] (342.08s)
of code. Your server runs about one and a
[05:44] (344.56s)
half million. Servers are really easy;
[05:46] (346.64s)
those are very simple things. Your phone
[05:49] (349.04s)
runs about 4 million. Phone SoCs are
[05:51] (351.44s)
the most complex pieces of CPU and
[05:54] (354.08s)
interaction out there. They're just crazy
[05:56] (356.56s)
complex. Why is that? Like, can we just
[05:59] (359.68s)
pause for a second? So, like, again,
[06:01] (361.84s)
lines of code, we know, is not a perfect
[06:03] (363.52s)
measure of complexity, but in this
[06:05] (365.28s)
sense, comparing it between the two of
[06:07] (367.04s)
them with the same code base is somewhat...
[06:08] (368.56s)
So you said, roughly, give or take, a
[06:10] (370.88s)
server is one and a half million, a phone
[06:12] (372.96s)
is 4 million, like three times the lines
[06:14] (374.72s)
of code for a phone. Why the difference,
[06:17] (377.20s)
even though I would think that
[06:18] (378.72s)
the server, you know, does all this
[06:20] (380.16s)
mission-critical stuff? A server is
[06:21] (381.60s)
really simple: a CPU and a network card and
[06:25] (385.20s)
storage. That's it. An SoC
[06:29] (389.20s)
on a phone has power control,
[06:31] (391.68s)
you have clocks, you have five different
[06:33] (393.76s)
buses on there talking to different
[06:35] (395.04s)
types of devices. You have battery
[06:36] (396.48s)
control, you have talking to your modem,
[06:38] (398.72s)
you have another version of Linux in the
[06:40] (400.16s)
modem. Um, you got USB out the back. You
[06:42] (402.80s)
got USB bypass to talk to the audio
[06:44] (404.96s)
side. You have audio drivers. You have a
[06:47] (407.92s)
zillion different clocks and PHYs and
[06:49] (409.60s)
all sorts of stuff in there, in the SoCs. And
[06:51] (411.92s)
it's an eight-core machine. There are
[06:53] (413.92s)
eight processors. Those are
[06:55] (415.36s)
not trivial things. And sometimes those
[06:57] (417.04s)
processors are different sizes. So you
[06:58] (418.64s)
have big and little sizes, which add
[07:01] (421.04s)
complexity just for some control for
[07:03] (423.28s)
some power management but they all run
[07:05] (425.36s)
the same core of Linux, but it's the
[07:07] (427.76s)
drivers and the devices and things like
[07:09] (429.44s)
that. So your Pixel phone, I look at the
[07:11] (431.52s)
Pixel phone: Google ships a core kernel
[07:13] (433.84s)
that all Android devices pick up, not
[07:16] (436.24s)
hardware specific, it just says ARM64.
[07:18] (438.40s)
Pixel has 300 other drivers they add to
[07:21] (441.52s)
get the Pixel phone working. I mean some
[07:23] (443.68s)
of these are tiny. This is for this tiny
[07:25] (445.20s)
chip, this, this. But your phone is
[07:27] (447.84s)
really one of the most complex beasts
[07:29] (449.44s)
out there for software. Is it safe to
[07:31] (451.68s)
say that, you know, the complexity and, you
[07:34] (454.16s)
know, the lines of code will to some
[07:35] (455.76s)
extent scale, and that has to do with
[07:38] (458.00s)
the hardware, the capabilities, and,
[07:40] (460.80s)
you know, not with, you know, how
[07:42] (462.56s)
mission critical it is? Because of course
[07:43] (463.84s)
the phone needs to be stable, the
[07:46] (466.16s)
server needs to be stable, my TV needs to
[07:47] (467.84s)
be stable, so, you know, that's just kind
[07:48] (468.96s)
of a given, right? Yeah. Oh, and all TVs for
[07:51] (471.04s)
the past 15 years are all running Linux.
[07:52] (472.64s)
So that's... Oh, so my Samsung TV is running
[07:54] (474.32s)
Linux? Oh yeah, oh yeah. Your
[07:55] (475.76s)
Samsung, my Samsung washer and dryer are
[07:57] (477.28s)
running Linux.
[07:59] (479.60s)
So, um, your Samsung watch is running
[08:01] (481.36s)
Linux. So, um, Samsung has their own
[08:03] (483.92s)
distro. It all works really nice. Um, yeah,
[08:07] (487.04s)
it's all down to the complexity of
[08:08] (488.64s)
the hardware. So the kernel controls the
[08:11] (491.04s)
hardware. The job of Linux is to make
[08:14] (494.16s)
all the hardware look agnostic to
[08:16] (496.40s)
programs. So you can write the same user
[08:18] (498.24s)
space program and run it on
[08:21] (501.04s)
different hardware and it just
[08:22] (502.40s)
works. A kernel's job is to manage
[08:25] (505.52s)
memory and devices in a common way and
[08:28] (508.72s)
provide that to user space. It's...
[08:30] (510.80s)
We joke from the kernel side that user space
[08:32] (512.72s)
is just a test load, but I mean it's a
[08:35] (515.28s)
tool there for you to actually solve
[08:36] (516.72s)
your problems. So when you're running
[08:37] (517.92s)
servers, you're wanting to put messages
[08:40] (520.00s)
through, and network and storage and
[08:42] (522.00s)
stuff. That's your load and
[08:43] (523.68s)
that's what they're there for. For a
[08:44] (524.72s)
phone, you want to control a display,
[08:46] (526.80s)
you want to talk out the modem, you want
[08:48] (528.08s)
to talk on the thing, you want to listen
[08:49] (529.44s)
to audio. Yeah. Lots of different things
[08:51] (531.36s)
there. And I just want to touch back
[08:54] (534.40s)
on the kernel because, like, I'm not
[08:56] (536.72s)
a Linux developer. Like, I know, you
[08:59] (539.20s)
know, I've heard of the kernel. In
[09:01] (541.60s)
my assumption it is the critical
[09:03] (543.28s)
part, as you said, the, you know, the
[09:05] (545.92s)
thing that runs immediately, and then,
[09:08] (548.00s)
you know, the user space will run on
[09:10] (550.24s)
top of it. But what is the difference?
[09:12] (552.80s)
What makes a kernel? And you said
[09:14] (554.96s)
it's about 5% of all of these things.
[09:17] (557.20s)
How do you split this? Or, like, is
[09:19] (559.36s)
there a definition? Again, you're a
[09:21] (561.52s)
kernel developer, so I'm trying
[09:24] (564.08s)
to get a sense. Like, someone who,
[09:26] (566.96s)
you know, let's say I'd like to
[09:28] (568.80s)
contribute to Linux and understand
[09:30] (570.40s)
how it is: eventually I'm going to figure
[09:31] (571.92s)
out what this kernel is. But
[09:33] (573.52s)
what is it? What makes kernel and
[09:35] (575.52s)
non-kernel? So, kernel versus user
[09:38] (578.32s)
space. So there's an idea: um, chips have a
[09:41] (581.52s)
protected mode and a not-protected mode,
[09:43] (583.60s)
in a very simplified way. There are
[09:45] (585.20s)
different levels of protection. So the
[09:46] (586.80s)
protected mode is where the operating
[09:48] (588.64s)
system runs, the kernel. Yeah. And that
[09:50] (590.40s)
is where we share all the resources.
[09:52] (592.08s)
It's one flat address space. Got it. And
[09:54] (594.48s)
we are not isolating processes. Got it.
[09:56] (596.64s)
So a user space process then runs on top
[09:58] (598.48s)
of that and we isolate them and
[10:00] (600.08s)
they all individually think they have
[10:01] (601.52s)
the whole machine but they don't. Yeah.
[10:02] (602.96s)
So it's multitasking. You can run
[10:04] (604.56s)
multiple programs at the same time. And
[10:06] (606.48s)
the kernel is there to give you memory,
[10:08] (608.48s)
to give access to storage
[10:10] (610.80s)
in a common way, to give access to the
[10:12] (612.40s)
network in a common way, or to
[10:14] (614.88s)
provide the pipes to go around the
[10:16] (616.80s)
network stack in user space (um, some
[10:19] (619.20s)
people don't like using Linux's network
[10:20] (620.72s)
stack; they have their own), to provide a
[10:22] (622.48s)
way for all your different mice to show
[10:24] (624.00s)
up to user space in a common way. We
[10:25] (625.92s)
know all the different mice, um, USB
[10:27] (627.84s)
storage devices, your graphics
[10:29] (629.52s)
controller. We provide a way to make it
[10:32] (632.24s)
so that user space can talk to the
[10:34] (634.96s)
kernel in an agnostic way and
[10:37] (637.28s)
their stuff will just work because
[10:38] (638.88s)
all the graphics work through the same
[10:40] (640.08s)
interface. We talk to keyboards all the
[10:42] (642.08s)
same way things like that. So it's a
[10:43] (643.60s)
commonality, of providing a shim layer
[10:46] (646.16s)
above the hardware. And then, for example,
[10:47] (647.92s)
drivers, do they always live in user
[10:50] (650.96s)
space? So, no, all our drivers live
[10:52] (652.32s)
in the kernel. So the kernel and drivers
[10:54] (654.32s)
are all... Linux is not a microkernel
[10:57] (657.12s)
architecture, it's a monolithic one. So the
[10:59] (659.28s)
the code is all in the same address
[11:00] (660.80s)
space. So a bug in any one of them has a
[11:02] (662.56s)
chance to take any part of the kernel
[11:03] (663.92s)
down. Mhm. So Linux ships all the
[11:07] (667.04s)
drivers for all the architectures in one
[11:09] (669.12s)
big tarball. That's the 40 million lines
[11:11] (671.68s)
of code. Other operating systems try and
[11:13] (673.84s)
go out there and, um, split things
[11:15] (675.76s)
off. So the core of Windows is their
[11:17] (677.60s)
kernel and then you can put drivers
[11:19] (679.12s)
additionally on top. Um, we tie
[11:22] (682.40s)
everything together in one big giant
[11:24] (684.16s)
blob. Theirs is still monolithic: any
[11:26] (686.72s)
driver in theirs can crash the
[11:28] (688.16s)
kernel, within reason. Um, in that way we
[11:31] (691.60s)
can refactor the way the interfaces
[11:33] (693.76s)
between drivers and the kernel are. Uh,
[11:36] (696.00s)
Linux drivers are on average one-third
[11:38] (698.16s)
smaller than other operating systems'
[11:39] (699.60s)
drivers because we can see the
[11:41] (701.20s)
commonalities. If you send, oh, three
[11:43] (703.12s)
different drivers for three kinds of the same
[11:44] (704.64s)
hardware, well, let's combine them all,
[11:46] (706.40s)
make it smaller and refactor things and
[11:48] (708.72s)
make it easier and, oh, let's change this
[11:50] (710.16s)
API. And this has to do with the open
[11:52] (712.16s)
source approach, right? That you see it?
[11:54] (714.32s)
Like, so, we see all this common
[11:56] (716.00s)
code and we can refactor it and we can
[11:57] (717.52s)
make it better and cleaner and we're not
[11:58] (718.96s)
tied to any fixed interface. Our fixed
[12:02] (722.16s)
interface is between the user space and
[12:04] (724.00s)
the kernel. We will not break that.
[12:05] (725.92s)
That's our guarantee. We've guaranteed
[12:07] (727.68s)
it for a long long time. And so we
[12:10] (730.08s)
always want you to be able to upgrade
[12:11] (731.28s)
your kernel and not feel worried that
[12:13] (733.28s)
your old programs are going to crash. So
[12:15] (735.12s)
you should always be able to upgrade.
[12:16] (736.24s)
That's our guarantee to you. If it does
[12:18] (738.16s)
break then it's our fault and we'll
[12:19] (739.76s)
regress. There are some exceptions.
[12:21] (741.44s)
There's some gray areas. There's some
[12:22] (742.88s)
really low-level parts between the user
[12:24] (744.88s)
space and kernel that we kind of work
[12:26] (746.48s)
around and we argue about these all the
[12:28] (748.16s)
time, but we never try and break user
[12:29] (749.52s)
space on purpose. Yeah, a lot of times
[12:31] (751.36s)
we do accidentally, and we'll fix it up.
[12:32] (752.96s)
That's our number one... that's our only
[12:34] (754.00s)
real rule of kernel development:
[12:35] (755.68s)
Don't break user space on purpose. And
[12:38] (758.00s)
so when we're talking about the, you
[12:39] (759.68s)
know, the 1.5 million lines per
[12:41] (761.68s)
server, we're talking about the kernel
[12:43] (763.04s)
plus drivers. Kernel plus drivers.
[12:45] (765.52s)
Well, because it is part of the kernel.
[12:47] (767.28s)
And then, you know, you have this 40
[12:48] (768.88s)
million line tarball. And
[12:50] (770.80s)
then every platform will kind of take
[12:53] (773.12s)
their parts of it. They'll take,
[12:54] (774.88s)
you know, what is relevant for their
[12:56] (776.40s)
use case, capabilities, drivers, you
[12:58] (778.56s)
know, other parts. And then this is why
[13:00] (780.24s)
I guess Raspberry Pi, you're going
[13:02] (782.16s)
to say, it's going to run... it's running on
[13:03] (783.52s)
Linux, right? Of course. Yeah.
[13:04] (784.88s)
Oh, yeah. Yeah. Raspberry Pis.
[13:06] (786.48s)
Yeah. Huge. Those things are everywhere.
[13:08] (788.48s)
That's what's in all the electric car
[13:09] (789.92s)
charging stations. Those are Raspberry
[13:11] (791.28s)
Pis. Really? Yeah. Where you plug your
[13:13] (793.68s)
car in, Raspberry Pis? Yeah. Those are
[13:15] (795.36s)
all... Wow. Cuz it's a really cheap, um,
[13:17] (797.76s)
industrial thing. Oh, lots of signage
[13:20] (800.24s)
now, those are all running
[13:21] (801.76s)
Raspberry Pis. I guess it was safe
[13:24] (804.08s)
to assume they're not running
[13:25] (805.44s)
Windows to be fair. So, no. Yeah. So,
[13:27] (807.20s)
the Dutch um signage for the trains,
[13:29] (809.84s)
those are all running Linux. Sometimes
[13:30] (810.88s)
you'll see a crashed Linux machine up
[13:32] (812.16s)
there. Can we look at a specific example
[13:34] (814.08s)
of how development actually flows
[13:36] (816.64s)
through uh with a specific patch? Before
[13:39] (819.28s)
I show a specific example: so, say, we
[13:41] (821.36s)
had 4,000 developers last year. So, they
[13:43] (823.28s)
make a change. So those 4,000 developers
[13:45] (825.12s)
will send an email to a maintainer and a
[13:47] (827.28s)
maintainer maintains a subset of the
[13:49] (829.12s)
kernel. Every part of the kernel is
[13:51] (831.04s)
owned by somebody. And you are one
[13:52] (832.96s)
of these maintainers? Yeah, I
[13:54] (834.96s)
maintain some drivers and things like
[13:56] (836.56s)
that but then those maintainers send
[13:57] (837.84s)
things off up the tree to a subsystem
[14:00] (840.40s)
maintainer. So like USB serial then will
[14:02] (842.72s)
get sent to USB and then USB will go to
[14:04] (844.96s)
Linus. So it's kind of a pyramid
[14:07] (847.28s)
scheme that way. So we
[14:09] (849.68s)
have like 800 maintainers, and in
[14:12] (852.00s)
the middle section we maybe have 200
[14:14] (854.08s)
different trees there. And then in our
[14:16] (856.32s)
testing environment all those trees are
[14:17] (857.76s)
tested every day; they're all merged
[14:19] (859.44s)
together and whatnot.
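The route a change takes up the maintainer hierarchy can be modeled as a walk through a parent map. This is a rough sketch only: the tree names below are illustrative labels taken from the USB serial example in the conversation, not actual repository names, and the real hierarchy has hundreds of trees.

```python
# Each tree forwards accepted changes to the tree above it.
# Names are illustrative, not real repository names.
UPSTREAM = {
    "usb-serial": "usb",   # driver-level maintainer -> subsystem maintainer
    "usb": "mainline",     # subsystem maintainer -> Linus's tree
}


def path_to_mainline(tree: str) -> list[str]:
    """List every tree a change passes through, starting at `tree`."""
    path = [tree]
    while path[-1] in UPSTREAM:
        path.append(UPSTREAM[path[-1]])
    return path


print(" -> ".join(path_to_mainline("usb-serial")))
```

Each hop is a maintainer who reviews the change and takes responsibility for it before passing it on.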
[14:22] (862.40s)
Um so we have this kind of hierarchy of
[14:24] (864.96s)
developers and maintainers that way and
[14:27] (867.20s)
part of the hierarchy is the human
[14:28] (868.80s)
aspect. So if I take code from you as
[14:31] (871.44s)
a maintainer um I'm now responsible for
[14:34] (874.00s)
it because my name's on it. So, if it's
[14:36] (876.56s)
a simple one-off or it's a simple driver
[14:38] (878.08s)
that nobody cares about except you,
[14:40] (880.16s)
great. I know you're the only one that's
[14:41] (881.84s)
going to be affected by it, it's fine.
[14:43] (883.60s)
But if it's the core part of the kernel
[14:45] (885.44s)
and I take changes from you, now I'm
[14:47] (887.28s)
responsible for it if you disappear. So,
[14:49] (889.52s)
I have to trust that either you're going
[14:51] (891.28s)
to be here or that I understand it well
[14:52] (892.80s)
enough that I can maintain it. So, part
[14:55] (895.44s)
of the Linux development
[14:58] (898.00s)
model is trust. And it's trust in human
[15:00] (900.08s)
interaction. Like, I will take stuff from
[15:01] (901.92s)
people, whatever they send me, cuz
[15:04] (904.48s)
I trust not that they got it right but
[15:06] (906.00s)
they'll be there to fix it when they get
[15:07] (907.28s)
it wrong because we all get it wrong and
[15:09] (909.84s)
that's that's the part so that's that's
[15:11] (911.76s)
the trust model we have. And we've
[15:13] (913.92s)
been burned in the past: some major
[15:15] (915.84s)
features were landed in the networking
[15:18] (918.08s)
core subsystem a long time ago, and then
[15:20] (920.16s)
once they landed and were merged and
[15:22] (922.48s)
taken, the email address behind them
[15:24] (924.32s)
disappeared and the network developers
[15:25] (925.76s)
took six months to unwind the
[15:27] (927.68s)
mess. Um, so it's hard to change a core
[15:30] (930.64s)
part of the kernel for good reason
[15:32] (932.64s)
because it affects everybody and also
[15:34] (934.16s)
good reason in that we want to make sure
[15:35] (935.60s)
that you are going to be there to fix it
[15:37] (937.20s)
if it breaks. Yeah. But for drivers and
[15:39] (939.76s)
things like that, we'll take anything.
[15:41] (941.04s)
Drive-bys, we'll take; it's really simple,
[15:42] (942.96s)
a very simple thing that way. But
[15:45] (945.12s)
um, that's the hierarchy. So a change
[15:46] (946.72s)
flows up the tree that way. Yeah. So I
[15:49] (949.28s)
can show that. All right. So what what
[15:52] (952.16s)
are we going to see? So here is a
[15:53] (953.76s)
change. So this was uh written by
[15:56] (956.96s)
somebody named Chester. um he made a
[16:00] (960.24s)
change to the USB serial driver. Uh it's
[16:03] (963.76s)
"option". The chip is called Option. Um,
[16:05] (965.68s)
these are USB-to-serial devices. Um,
[16:08] (968.00s)
they're in modems. They're in lots of
[16:10] (970.16s)
different things. There's a ton of
[16:11] (971.28s)
different ones and there's no standard
[16:12] (972.80s)
for these types of devices. So you have
[16:14] (974.32s)
to add a custom device ID for every
[16:16] (976.40s)
single one that you want to use. It's
[16:18] (978.32s)
just the way they work. Um so here's a
[16:20] (980.40s)
patch and here's the description of it.
[16:22] (982.64s)
Um this is just an email. The
[16:24] (984.48s)
description here is there's the subject
[16:26] (986.00s)
line. Yeah. "USB: serial: option:", adding
[16:28] (988.96s)
whatever, adding that device. And then
[16:30] (990.80s)
here's... So the, uh, hardest
[16:34] (994.16s)
part about writing a kernel change is
[16:35] (995.76s)
the description of what's going on.
[16:38] (998.00s)
Really? Yeah. I mean, the code is easy;
[16:39] (999.84s)
it's the description, explaining it, that is
[16:41] (1001.28s)
hard, right? You don't usually explain
[16:43] (1003.44s)
what it's doing; you have to explain why
[16:45] (1005.04s)
you're doing it. For something as simple
[16:46] (1006.80s)
as this, it's really easy. So
[16:48] (1008.96s)
this person says, uh, this driver is part
[16:51] (1011.28s)
of a Cat 6 modem, uh, the product ID is
[16:54] (1014.16s)
shared across multiple modems. It gives
[16:55] (1015.84s)
a little dump of what it looks
[16:57] (1017.52s)
like in the device. And then there's
[16:59] (1019.76s)
some more information. There's a
[17:01] (1021.52s)
Signed-off-by line. And Signed-off-by is what
[17:04] (1024.08s)
we created a long time ago. Um, it
[17:05] (1025.76s)
shows that I have the proper authorship
[17:08] (1028.96s)
of this and ownership and I give it to
[17:10] (1030.56s)
this project under the license by which
[17:12] (1032.88s)
the project is run. So it's saying I've
[17:15] (1035.12s)
licensed this thing under the GPL. Um,
[17:18] (1038.00s)
and then way down below is probably the
[17:21] (1041.12s)
one-line patch. This is all just... So this
[17:23] (1043.60s)
is all the description. This person is
[17:25] (1045.60s)
giving context, like, here's what you need
[17:27] (1047.36s)
to know about the model, the different, you
[17:29] (1049.44s)
know, specifications or whatnot. And here's
[17:30] (1050.80s)
the change. So somebody changed this,
[17:32] (1052.64s)
removed that line. The red is removed, the
[17:35] (1055.04s)
green is added. Yeah. So somebody added
[17:37] (1057.60s)
and had to reformat the lines based on
[17:39] (1059.68s)
some new ones they added. Um, and that's
[17:42] (1062.08s)
it. So they add a few new device IDs. And
[17:44] (1064.16s)
then there's a device ID and then we see
[17:46] (1066.08s)
a few hex numbers. Those are like
[17:48] (1068.80s)
some IDs here? So for USB, um, USB devices
[17:53] (1073.12s)
have a, uh, vendor and a
[17:55] (1075.36s)
product ID. That's how they're identified: vendor,
[17:57] (1077.76s)
and then they have products, and then
[17:59] (1079.36s)
there are some subdevice and subproduct IDs.
[18:02] (1082.32s)
Got it. Got it. So that's what this is
[18:03] (1083.44s)
for. Okay. So they're saying we're just
[18:05] (1085.20s)
adding support for... the driver
[18:06] (1086.80s)
already works for this chip,
[18:09] (1089.12s)
but we just have a new ID because a new
[18:10] (1090.88s)
vendor came along and they wanted to put
[18:12] (1092.32s)
their own vendor ID on it. Very common.
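The kind of change described here, adding one more vendor/product pair to a driver's list of known devices, can be modeled with a small lookup table. The real driver keeps its table in C inside the kernel; the Python below is only a model, and every ID and name in it is made up for illustration.

```python
from typing import NamedTuple


class UsbId(NamedTuple):
    vendor: int    # identifies the company
    product: int   # identifies the device within that vendor


# Hypothetical table of devices the driver already knows about.
SUPPORTED_IDS = frozenset({
    UsbId(0x1234, 0x0001),
    UsbId(0x1234, 0x0002),
})


def driver_binds(device: UsbId, table: frozenset) -> bool:
    """The driver claims a device only if its ID pair is in the table."""
    return device in table


# Same chip, rebranded by a new (made-up) vendor with its own vendor ID.
new_device = UsbId(0xABCD, 0x0001)
before = driver_binds(new_device, SUPPORTED_IDS)

# The "patch": one extra entry, no change to the driver logic itself.
patched_ids = SUPPORTED_IDS | {new_device}
after = driver_binds(new_device, patched_ids)

print(f"before patch: {before}, after patch: {after}")
```

This is why such patches are only a few lines long while the description, including the hardware dump used to verify the IDs, carries most of the review value.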
[18:15] (1095.28s)
So it sounds like this change is as
[18:16] (1096.96s)
simple as it gets in terms of the code
[18:18] (1098.40s)
changed, but still the description was
[18:21] (1101.28s)
very extensive, right? So, very
[18:24] (1104.64s)
extensive. Um, part of the description was
[18:26] (1106.56s)
also just here is a dump of the
[18:28] (1108.48s)
description of the hardware just so that
[18:30] (1110.40s)
we can verify, yeah, that it is going to
[18:32] (1112.32s)
match with this. Got it. So, we
[18:35] (1115.28s)
have tools that create those things,
[18:37] (1117.04s)
but yeah, it's a lot of work for a
[18:39] (1119.92s)
four-line change. Well, but this is, like, if,
[18:42] (1122.08s)
you know, if we talk GitHub language,
[18:44] (1124.56s)
although I'm not sure, this
[18:46] (1126.48s)
will be a PR, right? This should
[18:49] (1129.20s)
be in the patch itself, not the PR. Ah,
[18:52] (1132.64s)
so a PR would be so say you have 10
[18:55] (1135.04s)
changes you want to make. So a PR would
[18:56] (1136.72s)
be the patch zero out of 10. Got it. Got
[18:58] (1138.88s)
it. Yeah. This is the commit. This
[19:00] (1140.24s)
is the commit itself. Which is a big
[19:01] (1141.60s)
problem of why I don't like the GitHub
[19:03] (1143.12s)
model is because people don't put the
[19:04] (1144.48s)
changes in the Git commit.
[19:07] (1147.04s)
No, because the Git repo... Well, no. So
[19:09] (1149.20s)
there's a problem:
[19:10] (1150.72s)
when you're looking at the repo later
[19:12] (1152.00s)
and you look at the change, you don't
[19:13] (1153.44s)
see the pull... you can't see the pull
[19:14] (1154.88s)
request information. Yeah. And it's
[19:16] (1156.48s)
gone. And that's a big problem I feel
[19:18] (1158.80s)
with the GitHub model. Well, I I feel
[19:20] (1160.24s)
this goes back to, you know, like you
[19:22] (1162.32s)
built the tool or you know, Linux group
[19:24] (1164.56s)
built a tool for your use case and
[19:26] (1166.24s)
you're using it the way you intend to
[19:27] (1167.92s)
use it. Whereas GitHub built the pull
[19:29] (1169.92s)
request workflow is built on top of this
[19:31] (1171.76s)
and it is not part of Git for whatever
[19:33] (1173.84s)
reason. You know, maybe GitHub could
[19:35] (1175.04s)
have made it part of Git or whatnot, but
[19:37] (1177.76s)
it's not right. Well, no. So, we have
[19:41] (1181.12s)
pull requests. We created pull requests
[19:43] (1183.92s)
for in Linux. We have we email there's
[19:46] (1186.16s)
git create pull request. Oh, okay. That
[19:48] (1188.08s)
was that was a good command. It makes an
[19:49] (1189.52s)
email. Is that part of Git? Yeah, it
[19:51] (1191.92s)
makes an email that says pull from this
[19:54] (1194.00s)
repo and here's everything that's in
[19:55] (1195.44s)
this repo. And when we do a merge of
[19:58] (1198.16s)
that, we that merge commit has all that
[20:01] (1201.60s)
information in it. Okay. And then so
[20:04] (1204.16s)
you'll if you look at the Linux kernel
[20:05] (1205.60s)
and you see when you merge when Linus
[20:07] (1207.36s)
merges in the USB tree he sees my little
[20:10] (1210.24s)
message at the top saying here's
[20:11] (1211.52s)
everything that's going to be in this
[20:12] (1212.48s)
pull request. You got it. And because
[20:15] (1215.36s)
Git is the source and that's where all
[20:17] (1217.36s)
the data is, right? And so you can see
[20:19] (1219.44s)
that we don't have pull requests. It's
[20:20] (1220.80s)
not external. GitHub could change the
[20:22] (1222.32s)
model and put that in the merge request,
[20:23] (1223.76s)
but it doesn't. I I I was about to say
[20:26] (1226.00s)
that like because you did it, they they
[20:28] (1228.00s)
could do, but it's it's a matter of I
[20:29] (1229.44s)
guess preferences and Yeah, that's fine.
[20:31] (1231.52s)
Anyway, but the good thing about this is
[20:33] (1233.44s)
you can track every single line of code
[20:35] (1235.52s)
back to who made it and what they did
[20:38] (1238.72s)
and what the change was, what the
[20:40] (1240.48s)
changelog was, what the reasoning behind
[20:42] (1242.16s)
this was, which is great. Okay. So, so this
[20:45] (1245.20s)
comes into this person sends it to the
[20:47] (1247.68s)
the module. So, he sent it to the Yeah.
[20:49] (1249.84s)
So, the owner to this is Johan. Uh
[20:52] (1252.08s)
there's a script we have that says take
[20:53] (1253.76s)
any patch and give me the people who are
[20:56] (1256.56s)
responsible for this and the mailing
[20:57] (1257.84s)
list. So Johan and me, um, it picked
[21:01] (1261.20s)
us, and he sends it to the USB list
[21:02] (1262.88s)
and copies a bunch of other people that
[21:04] (1264.48s)
I guess they worked with and that have
[21:06] (1266.00s)
changed this driver in the past and and
[21:08] (1268.00s)
then the the copy is also done with the
[21:09] (1269.92s)
tool as you said. It kind of looks to
[21:11] (1271.60s)
who who touched this code or or who
[21:15] (1275.52s)
all automated. Yep. All automated. We do
[21:17] (1277.44s)
that soon. So that was great. Um he sent
[21:20] (1280.00s)
it and this the mailing list has two
[21:21] (1281.92s)
copies of it. That's just because it
[21:23] (1283.52s)
went to two different mailing lists. But
[21:24] (1284.96s)
then they said, um, oops, I messed up.
[21:29] (1289.68s)
There's an email from the person
[21:31] (1291.12s)
instantly after he sent it off. Oh, and
[21:33] (1293.20s)
said, I I messed up. There's an
[21:34] (1294.40s)
interface. I need to maybe it would be a
[21:35] (1295.84s)
good idea to change this comment. So,
[21:38] (1298.48s)
they go and change the comment and then
[21:40] (1300.64s)
they resend it and you just send a new
[21:42] (1302.80s)
version. And then um and then in this
[21:46] (1306.00s)
case, they send a new patch or or do
[21:48] (1308.96s)
they do they do they just add one more?
[21:51] (1311.12s)
No, you want to have a clean commit,
[21:52] (1312.88s)
right? Yeah, we don't do So here they
[21:54] (1314.80s)
sent a version two patch. If you can see
[21:57] (1317.68s)
that it says version two, right? Oh,
[22:00] (1320.96s)
there. Version two. Got it. And then
[22:03] (1323.84s)
here's the same information. And then
[22:05] (1325.36s)
there should be some comments about what
[22:07] (1327.36s)
changed between the two versions.
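
The resend convention described here can be sketched in a few lines of Python. This is only an illustration, assuming the usual mailing-list habit of putting the version in the subject prefix and listing the changes below a `---` line; the function name, the example subject, and the link are hypothetical and are not taken from the patch being discussed.

```python
def format_patch_revision(subject, version, changes, previous_link):
    """Build the subject line and changelog note for a (re)sent patch.

    Hypothetical helper for illustration only; real patches are normally
    produced with Git tooling rather than by hand.
    """
    # The first posting has no version marker; resends are marked v2, v3, ...
    prefix = "[PATCH]" if version == 1 else f"[PATCH v{version}]"
    lines = [f"{prefix} {subject}"]
    if version > 1:
        # Notes below the "---" line are for reviewers and do not become
        # part of the commit message.
        lines.append("---")
        lines.append(f"Changes in v{version}:")
        lines.extend(f" - {change}" for change in changes)
        lines.append(f"Link to v{version - 1}: {previous_link}")
    return "\n".join(lines)


print(format_patch_revision(
    "example: add new device ID",          # hypothetical subject
    2,
    ["reworded the comment"],
    "https://example.invalid/v1",          # placeholder link
))
```

The point of the note is the one made in the conversation: a reviewer who has already moved on can see at a glance what changed since the previous version.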
[22:08] (1328.96s)
Hopefully.
[22:10] (1330.64s)
Yes. Changes in version two from the
[22:13] (1333.20s)
previous one. And there's a link back to
[22:14] (1334.80s)
the first one. Mhm. Nice. Very nice. So
[22:17] (1337.28s)
that's what we do. So we want to see the
[22:18] (1338.72s)
changes because I mean I get a thousand
[22:20] (1340.24s)
emails a day. Yeah. And I when I review
[22:22] (1342.48s)
patches and stuff, I'll review them and
[22:24] (1344.24s)
then they're gone because I'm reviewing
[22:25] (1345.36s)
the next one. Yeah. But if I I want when
[22:28] (1348.00s)
you send a next version, I want to
[22:29] (1349.36s)
remember what see what changed from the
[22:30] (1350.80s)
previous one because I don't want to go
[22:31] (1351.68s)
back and dig through all stuff. Okay. So
[22:33] (1353.68s)
they added some information to it.
[22:35] (1355.68s)
Wonderful. And then what happened?
[22:40] (1360.88s)
Johan, who is the maintainer of this
[22:43] (1363.12s)
subsystem? Yep. Wrote said, "Hey, thanks
[22:45] (1365.52s)
for the patch and for documenting
[22:47] (1367.20s)
it." Oh, he did something else. I got
[22:49] (1369.44s)
the order.
[22:51] (1371.68s)
First they said um oh he Chester wrote
[22:55] (1375.28s)
hey please please please apply this
[22:57] (1377.52s)
after two weeks after a week he said
[22:59] (1379.44s)
because after a week or after two weeks
[23:01] (1381.84s)
it's nice hey what happened to this
[23:03] (1383.20s)
what's going on? Um, Johan said, you submitted
[23:06] (1386.16s)
this patch during the merge window um
[23:08] (1388.24s)
I'll talk about how we do our
[23:09] (1389.76s)
development model, but there's a two-week
[23:11] (1391.20s)
merge window for when we do releases
[23:12] (1392.88s)
where Linus takes all the changes from
[23:14] (1394.48s)
all the maintainers that have been in
[23:16] (1396.00s)
their development trees we can't add new
[23:17] (1397.92s)
changes at that point in time so there's
[23:19] (1399.76s)
a two-week kind of blackout for new
[23:21] (1401.44s)
development, but this is where all the
[23:23] (1403.04s)
stuff is flowing into Linus for the next
[23:24] (1404.64s)
release. Um, so during that time, if you
[23:27] (1407.44s)
send me a patch, I can't really do
[23:28] (1408.72s)
anything with it, but it'll stick in my
[23:30] (1410.00s)
mailbox until then. So, this happened to
[23:32] (1412.64s)
hit that little window of time. Just to
[23:35] (1415.28s)
understand, there's a nine-week release.
[23:38] (1418.16s)
Every nine weeks there's a new release
[23:39] (1419.92s)
going out, right? And then there's a
[23:42] (1422.32s)
window where patches are gathered. So,
[23:44] (1424.64s)
yeah, here let's talk about that. So,
[23:46] (1426.00s)
there's Linus does a release. Yeah.
[23:47] (1427.92s)
Yeah, this point in time. Um, and then
[23:50] (1430.88s)
the merge window is considered open. And
[23:53] (1433.28s)
then for two weeks, all the maintainers
[23:55] (1435.84s)
send Linus all the stuff they've had
[23:57] (1437.76s)
pending from the last release. Yep. We
[24:00] (1440.56s)
have two weeks to add all new features.
[24:02] (1442.56s)
Yep. And then he does release candidate
[24:04] (1444.48s)
one and then from there on it's bug
[24:07] (1447.04s)
fixes only for the next seven weeks.
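
As a rough sketch of the cycle just described (two weeks of merge window, then seven weeks of fixes only), the phases can be written down in Python. The constant and function names are made up for this illustration, and the week counts are simply the ones quoted in the conversation.

```python
MERGE_WINDOW_WEEKS = 2   # new features flow to Linus
BUGFIX_WEEKS = 7         # after release candidate one: fixes only
CYCLE_WEEKS = MERGE_WINDOW_WEEKS + BUGFIX_WEEKS


def phase_for_week(week_in_cycle):
    """Return the phase for a 1-based week number within one release cycle."""
    if not 1 <= week_in_cycle <= CYCLE_WEEKS:
        raise ValueError(f"week must be between 1 and {CYCLE_WEEKS}")
    if week_in_cycle <= MERGE_WINDOW_WEEKS:
        return "merge window"
    return "bug fixes only"


for week in range(1, CYCLE_WEEKS + 1):
    print(week, phase_for_week(week))
```

Because the schedule is time based, a feature that misses one merge window only waits for the next cycle rather than for an open-ended release date.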
[24:08] (1448.80s)
Mhm. So it's bug fixes only, bug fixes
[24:11] (1451.04s)
only, bug fixes, it's regression fixes,
[24:12] (1452.80s)
we'll revert things and so it's no new
[24:14] (1454.56s)
features. But during that seven weeks,
[24:16] (1456.88s)
people were sending me new features. So
[24:18] (1458.80s)
I have a separate tree which is my next
[24:22] (1462.32s)
you're now batching it for the when the
[24:24] (1464.16s)
window will open. Yeah. So we call it
[24:25] (1465.52s)
next Linux next. So we have a next tree
[24:27] (1467.92s)
where all these are merged together on a
[24:29] (1469.84s)
daily basis to see to make sure they
[24:31] (1471.68s)
work. Yeah. To be prepared for
[24:33] (1473.36s)
Linus's next one. And then when he does a
[24:36] (1476.32s)
release after everything's good, we all
[24:38] (1478.40s)
throw things at him again in another two
[24:40] (1480.32s)
weeks. Now Linus doesn't pull
[24:42] (1482.80s)
automatically from all those merge
[24:44] (1484.08s)
trees. We have to explicitly ask him.
[24:46] (1486.16s)
Yeah. Because sometimes our trees aren't
[24:47] (1487.84s)
good. Yeah. So sometimes like I
[24:49] (1489.52s)
maintain the TTY and serial. One time
[24:51] (1491.68s)
famously it was a mess. Our tree it just
[24:53] (1493.92s)
wasn't working. There were new features
[24:55] (1495.04s)
added. So I'm like I'm skipping this
[24:56] (1496.96s)
release cycle. I'm going to pull out
[24:58] (1498.24s)
some of these bug fixes and send it to
[24:59] (1499.52s)
you off the side and then go. But if it
[25:01] (1501.44s)
was like automatically being merged in,
[25:03] (1503.12s)
we'd have to deal with that mess. It's
[25:04] (1504.72s)
it's just interesting because most
[25:07] (1507.04s)
companies just you know reflecting on
[25:08] (1508.80s)
you know the companies that use git
[25:10] (1510.72s)
large tech companies they often have
[25:12] (1512.40s)
let's say let's talk about native mobile
[25:14] (1514.00s)
development where there is a concept of
[25:16] (1516.64s)
releasing every week or or or two weeks
[25:19] (1519.36s)
because of the app store review process
[25:20] (1520.88s)
or same with like desktop apps such you
[25:23] (1523.12s)
can't really just continuously release.
[25:24] (1524.64s)
There's usually a an aim for something
[25:27] (1527.52s)
but it's not as strict. So every now
[25:30] (1530.08s)
and then it would also happen that you
[25:31] (1531.68s)
know it's it's just not stable enough.
[25:33] (1533.60s)
will push it back. But there is not this
[25:35] (1535.84s)
rigid like clockwork like I I think you
[25:38] (1538.32s)
know most companies that I've seen they
[25:40] (1540.64s)
just treat it a bit more flexible
[25:42] (1542.08s)
because again you know they come up with
[25:43] (1543.68s)
uh thing they're they're in charge your
[25:46] (1546.08s)
feature you want to have added right
[25:47] (1547.36s)
yeah and then as as we know when you
[25:48] (1548.64s)
have a milestone you know like features
[25:51] (1551.20s)
might be cut deadline might be moved you
[25:53] (1553.36s)
know like companies totally do I
[25:55] (1555.60s)
understand correctly that in the case of
[25:57] (1557.44s)
Linux like is this a thing where every
[25:59] (1559.52s)
nine weeks there will be a release we
[26:01] (1561.60s)
it's time based so we have that two week
[26:03] (1563.68s)
window of merging all the new features
[26:05] (1565.36s)
to Linus that have been in our tree and
[26:06] (1566.88s)
accepted already and proven to work. Um
[26:09] (1569.52s)
and the window is short eight nine
[26:11] (1571.12s)
weeks. Yeah. And that's good because we
[26:12] (1572.96s)
had, we used to have two-year-long
[26:14] (1574.40s)
development cycles, three-year-long
[26:15] (1575.60s)
development cycles. And the problem
[26:16] (1576.96s)
there is if you have even if you have
[26:18] (1578.64s)
six month development cycles, there's
[26:20] (1580.32s)
that fear of you have a feature. I want
[26:22] (1582.64s)
to take your feature, but it's not quite
[26:24] (1584.00s)
ready. Do I want to wait? I know and
[26:26] (1586.24s)
things like that. But if if I if you
[26:27] (1587.84s)
know that you can get your feature in in
[26:29] (1589.44s)
nine weeks from now and it's it's just
[26:31] (1591.44s)
not ready, it's not ready. It's it's
[26:33] (1593.76s)
much more like okay the pressure is off
[26:35] (1595.76s)
me as a maintainer to take your new
[26:37] (1597.36s)
feature until it's ready. You you you
[26:39] (1599.84s)
can say like look if if it'll make it
[26:41] (1601.76s)
into the next one or you know let's make
[26:43] (1603.44s)
sure it's going to work properly if it's
[26:45] (1605.12s)
more complex. Yeah, we have lots of
[26:46] (1606.32s)
features. I mean famously there's a USB
[26:47] (1607.92s)
feature that's on patch version number
[26:50] (1610.16s)
35. It's a 25-patch series. It's on the
[26:53] (1613.28s)
35th version and it's just not ready.
[26:55] (1615.60s)
And I just got an email today saying well
[26:57] (1617.12s)
maybe we need to change this to this
[26:58] (1618.56s)
other way. I mean, so I feel feel so bad
[27:00] (1620.64s)
for that developer, but he's been
[27:02] (1622.32s)
working hard and it's a it's a
[27:03] (1623.68s)
complicated feature and it's taken him a
[27:06] (1626.48s)
year and a half to get there. I have
[27:07] (1627.84s)
other patches that are in version three,
[27:09] (1629.20s)
but that's version three and it's been
[27:10] (1630.72s)
two years because the developer just
[27:12] (1632.40s)
took a lot of time in between. Okay. So,
[27:15] (1635.04s)
so so in this case, this is a good
[27:16] (1636.32s)
example that, you know, the the the
[27:18] (1638.72s)
person the contributor uh said like,
[27:21] (1641.28s)
"Hey, a reminder, I'd like this patch
[27:22] (1642.96s)
applied." And and then uh Johan replied
[27:26] (1646.40s)
uh reminding of the of the timeline on
[27:28] (1648.00s)
how it works, right? Yeah. Exactly. And
[27:29] (1649.68s)
Chester wrote back. And then really
[27:31] (1651.44s)
friendly. It'll be in the next one.
[27:32] (1652.96s)
Don't worry. Yeah. Which is nice. Very
[27:34] (1654.64s)
positive. Yeah. This is We're not mean
[27:36] (1656.80s)
people. We want Yeah. And in reminding,
[27:38] (1658.96s)
don't ever feel bad about reminding me
[27:40] (1660.80s)
that I haven't reviewed your patch in
[27:42] (1662.56s)
two weeks. Now, if I haven't reviewed it
[27:44] (1664.24s)
in two days, yeah, I'll be a little
[27:46] (1666.00s)
testy, but two weeks is a good idea. And
[27:48] (1668.56s)
then Chester wrote, thanks a lot for
[27:49] (1669.92s)
keeping an eye on it. Keep up the good
[27:51] (1671.12s)
work. And that was it. So then Johan has
[27:53] (1673.20s)
it. Yeah. Johan applied it to his tree
[27:55] (1675.28s)
because he then wrote saying, "Hey," and
[27:57] (1677.12s)
Johan is very nice here. He said, "You
[27:58] (1678.64s)
kind of didn't do the comments in the
[28:00] (1680.48s)
proper format. I fixed it up for you."
[28:02] (1682.88s)
Oh, nice. So for drive-by changes like
[28:04] (1684.72s)
that, we want to make it really easy and
[28:06] (1686.48s)
make it We're not mean people. I I mean
[28:08] (1688.16s)
I mean I mean clearly this this feels
[28:10] (1690.00s)
like it's a person who is unlikely to
[28:12] (1692.56s)
become a regular contributor. They're
[28:14] (1694.72s)
getting their work done, right? They're
[28:16] (1696.16s)
they're adding they have a device that
[28:18] (1698.00s)
they have to ship. Yeah, pretty much.
[28:20] (1700.00s)
But we want to be friendly and open and
[28:22] (1702.32s)
easy to everybody because everybody
[28:24] (1704.16s)
submits their first patch at one time,
[28:25] (1705.52s)
right? Famously, I when I did my very
[28:27] (1707.68s)
first patch, I wrote an email saying,
[28:28] (1708.64s)
"How do I make a patch?" Because we
[28:29] (1709.76s)
didn't have good documentation. Somebody
[28:31] (1711.20s)
wrote back and said, "Hey, here's how
[28:32] (1712.48s)
you do it." Um, he became my boss eight
[28:35] (1715.12s)
years later. It was like I worked for I
[28:36] (1716.48s)
ended up working for just funny. It's
[28:38] (1718.24s)
just like a small world and whatnot. But
[28:40] (1720.00s)
um but yeah, and we want to make it
[28:41] (1721.44s)
easy. So Johan takes this and he's got
[28:43] (1723.76s)
the patch and it's in his tree now.
[28:45] (1725.28s)
Yeah. Which is great. So, but that's in
[28:46] (1726.72s)
his local little tree. Um then he has to
[28:49] (1729.44s)
get it off somewhere else. Johan then
[28:51] (1731.36s)
makes a pull request to me. Mhm. So this
[28:54] (1734.48s)
is an output of the git make pull
[28:57] (1737.12s)
request. I don't know what the actual
[28:58] (1738.72s)
command and it this is what a pull
[29:00] (1740.40s)
request from Git is, and this is because
[29:01] (1741.92s)
Johan is a subsystem maintainer and he
[29:05] (1745.12s)
maintains the USB to serial drivers.
[29:06] (1746.80s)
There's a bunch of drivers for these
[29:08] (1748.00s)
types of things. And then he sends it
[29:09] (1749.68s)
off to me the USB maintainer. Got it.
[29:12] (1752.00s)
and he says take this patch or pull from
[29:15] (1755.84s)
this tree at this tag and it's a signed
[29:18] (1758.00s)
tag. So it's signed with his GPG key so
[29:20] (1760.08s)
I can verify that it's really him. Yeah.
[29:21] (1761.68s)
When I pull from it and it says take
[29:23] (1763.44s)
these patches and here's the
[29:24] (1764.40s)
information. It's going to be some USB
[29:25] (1765.84s)
device IDs and they have all been in
[29:27] (1767.60s)
Linux next with no reported issues. So
[29:29] (1769.44s)
they've been tested in our integrated
[29:31] (1771.60s)
testing. We test all this stuff every
[29:33] (1773.04s)
day. And what does testing mean? Is is
[29:35] (1775.92s)
it automated testing? Is it pushing it
[29:38] (1778.32s)
out on on devices in device labs? Is it
[29:40] (1780.80s)
a mix? Yes, it's all of that. So, we
[29:42] (1782.80s)
have one. So, Linux next gets merged
[29:44] (1784.80s)
every day. There's a developer in Australia. He
[29:46] (1786.96s)
merges all the trees together and builds
[29:48] (1788.64s)
them and boots them. Yep. In virtual
[29:50] (1790.72s)
machines. Yep. Um that's a non-trivial
[29:53] (1793.52s)
thing for a kernel to do just to boot.
[29:55] (1795.44s)
If it can boot, it's usually, hey,
[29:56] (1796.56s)
things are going well. Um it isn't
[29:58] (1798.72s)
testing on real devices. Now there are
[30:00] (1800.72s)
other labs out there with KernelCI
[30:02] (1802.64s)
which is our CI infrastructure that can
[30:04] (1804.64s)
run on all individual labs and we do
[30:06] (1806.24s)
push things out there and there are people
[30:07] (1807.68s)
testing Linux next on their real
[30:09] (1809.12s)
hardware sending us reports back in an
[30:11] (1811.44s)
automatic fashion. Um, those are more
[30:13] (1813.84s)
rare. Linus's tree gets tested more on
[30:16] (1816.08s)
that. The stable trees I can talk about
[30:17] (1817.68s)
stable trees in a little bit, get, I mean,
[30:19] (1819.52s)
tested on the real hardware more.
[30:21] (1821.52s)
Linux next gets build and boot tested
[30:23] (1823.68s)
pretty well. Yeah. Um I don't run Linux
[30:26] (1826.00s)
next. I run my development trees on
[30:28] (1828.16s)
mine. So I don't run the whole mix of
[30:29] (1829.92s)
them all. Sometimes they interact
[30:31] (1831.60s)
because we don't have any fiefdoms. Um, if
[30:34] (1834.40s)
I have a USB change that needs to
[30:36] (1836.96s)
actually go through the networking
[30:38] (1838.24s)
stuff, I can change the networking code
[30:39] (1839.76s)
and whatnot like that and they can say
[30:41] (1841.76s)
hey maybe you shouldn't do that and we
[30:42] (1842.96s)
try and get approval. You review my
[30:45] (1845.04s)
patches, but, you know, anybody
[30:47] (1847.44s)
can touch any part of the kernel
[30:49] (1849.20s)
in a way. But he sent me a pull request
[30:50] (1850.80s)
and a pull request means that I don't
[30:53] (1853.20s)
actually review the changes in it. I'm
[30:54] (1854.72s)
not reviewing each individual patch
[30:56] (1856.00s)
through email. I'm trusting that he sent
[30:57] (1857.76s)
me four patches here and that they're
[31:00] (1860.00s)
good. Yeah. And I have known Johan and I
[31:03] (1863.60s)
know that he will be there if something
[31:04] (1864.80s)
goes wrong. Yeah. And and like you will
[31:07] (1867.52s)
you read the kind of the the description
[31:10] (1870.08s)
and then every now and then you might
[31:11] (1871.44s)
decide for example to like deep dive
[31:13] (1873.04s)
into a a change. Totally. I mean for USB
[31:15] (1875.84s)
device IDs it's like okay yeah well
[31:17] (1877.44s)
they're all attached to the same driver.
[31:19] (1879.04s)
Yeah. These are common. They're nothing
[31:20] (1880.56s)
simple. Sometimes they're a little more
[31:22] (1882.08s)
complex. Um, I don't pull from a lot of
[31:24] (1884.40s)
different trees, but I pull from some
[31:26] (1886.32s)
that I trust. Some subsystems that I
[31:28] (1888.88s)
don't necessarily trust as well. Um, I
[31:30] (1890.88s)
will make you send them an email and
[31:32] (1892.96s)
I'll actually review them and then I'll
[31:34] (1894.32s)
review them. And then when I review
[31:35] (1895.44s)
them, I add my Signed-off-by to it and I
[31:37] (1897.76s)
I guess part of trust will be here. I'm
[31:39] (1899.92s)
just going to assume that since you and
[31:41] (1901.92s)
Johan know each other well and you've worked
[31:44] (1904.08s)
for a while, Johan will probably also
[31:46] (1906.00s)
every now and then give a comment
[31:47] (1907.28s)
saying, "Hey, Greg, there's this change.
[31:49] (1909.52s)
Can you take an extra look on on this
[31:51] (1911.28s)
thing?" etc. Yeah. So sometimes Johan
[31:53] (1913.68s)
makes changes to the code himself or I
[31:56] (1916.56s)
make changes to the code myself. I put
[31:58] (1918.24s)
it out for review and I have other
[32:00] (1920.16s)
people review my changes. So this is
[32:02] (1922.16s)
just just fascinating for me to
[32:05] (1925.28s)
hear you explain how trust between
[32:09] (1929.60s)
people maintainers is so important for
[32:12] (1932.88s)
efficient development. Yeah, it's all
[32:15] (1935.36s)
it's Yeah, it's And then also the trust
[32:18] (1938.40s)
is somebody once told me that Linux
[32:20] (1940.48s)
development was the scariest thing they
[32:21] (1941.84s)
ever did, not because it was like
[32:23] (1943.84s)
difficult or what not. It's because my
[32:25] (1945.36s)
name is on this change and it's public.
[32:27] (1947.60s)
That makes you as an engineer do
[32:29] (1949.68s)
really really good work. I mean so much
[32:31] (1951.92s)
so that this person who submitted
[32:33] (1953.44s)
this patch went back and looked at it
[32:35] (1955.04s)
instantly and said, "Oh, wait. The
[32:36] (1956.72s)
comment could be made a little bit
[32:38] (1958.24s)
better." And they're like, "Oh, yeah."
[32:39] (1959.52s)
So I mean that's not a normal
[32:40] (1960.96s)
development process in a company that I
[32:42] (1962.56s)
commit to go. It it it makes me you know
[32:45] (1965.68s)
wonder about a few things that I kind of
[32:48] (1968.64s)
took for granted. For example, you know,
[32:50] (1970.40s)
like, could this mean that, you know,
[32:52] (1972.16s)
closed source software where the outside
[32:54] (1974.88s)
world does not know how it was done?
[32:56] (1976.32s)
maybe there's just a bit less incentive
[32:58] (1978.08s)
to do, you know, like such great work.
[33:00] (1980.08s)
And actually, it's just a reflection
[33:01] (1981.20s)
like I do remember when when I worked at
[33:03] (1983.52s)
a company and when we actually my team,
[33:05] (1985.92s)
we open sourced a component that we
[33:08] (1988.08s)
built and I just remember how I put in
[33:11] (1991.28s)
way more work into that to make it look
[33:14] (1994.24s)
good to have the document and not just
[33:15] (1995.76s)
look good but but make it clean. We we
[33:18] (1998.08s)
cleaned up actual tech debt before we
[33:20] (2000.48s)
published it. And we didn't do that with
[33:22] (2002.48s)
our stuff. It it was so open source
[33:25] (2005.76s)
development by virtue of just human
[33:29] (2009.04s)
pressure makes a better engineering
[33:31] (2011.36s)
product. It's a better engineering and
[33:33] (2013.44s)
then and we've kind of shown that
[33:34] (2014.96s)
through the years that this development
[33:37] (2017.04s)
model creates better software. I'm I'm
[33:40] (2020.08s)
kind of revisiting some of my like not
[33:42] (2022.40s)
assumptions but I I never thought of it
[33:44] (2024.40s)
like this but it's it's just it's
[33:45] (2025.84s)
awesome to to see this. So So then what
[33:47] (2027.76s)
what what what happens next after after
[33:49] (2029.28s)
Johan sends it to you? So Johan sends it to
[33:50] (2030.88s)
me and then I take it and I put it in my
[33:52] (2032.88s)
tree. I think I send do I send him
[33:55] (2035.12s)
saying I took it? And then you
[33:57] (2037.44s)
take responsibility, right? I pulled it and
[34:00] (2040.32s)
pushed it out. Yes. And there's my email
[34:02] (2042.40s)
that says that. So now I'm responsible.
[34:04] (2044.32s)
It's in my tree. Yeah. So now the um
[34:06] (2046.64s)
since this is a device ID, these can go
[34:09] (2049.04s)
to Lenus at any time. We can add bug
[34:10] (2050.64s)
fixes or new device IDs. These are
[34:12] (2052.72s)
trivial. So then a few days later, I send
[34:14] (2054.80s)
this change off to Linus. So I send
[34:16] (2056.56s)
Linus. I said, "Hey, Linus,
[34:19] (2059.20s)
take all these following changes, these
[34:21] (2061.04s)
changes, and here's a whole bunch of USB
[34:22] (2062.88s)
fixes." So, here's some small driver
[34:25] (2065.04s)
fixes, some new device IDs, and then So,
[34:27] (2067.60s)
I summarize it all. I say, "These are
[34:29] (2069.36s)
all the things in here." Yeah. And these
[34:31] (2071.12s)
are going to be like a few dozen of of
[34:33] (2073.92s)
patches, something like that. Yeah,
[34:35] (2075.44s)
there's a whole bunch. And but here's
[34:36] (2076.48s)
the list of the patches down below. And
[34:38] (2078.16s)
here's the diff of them
[34:41] (2081.04s)
to make sure that this diff matches what
[34:42] (2082.80s)
he pulls from. This is signed with
[34:44] (2084.80s)
my key. Mhm. Um, I do say almost all these
[34:47] (2087.76s)
have been in Linux next. I guess some of
[34:49] (2089.52s)
them slipped in, but we also have
[34:51] (2091.04s)
another testing when you send patches to
[34:52] (2092.80s)
the mailing list. We have a we call it a
[34:54] (2094.48s)
zero day bot. We'll go through and start
[34:56] (2096.16s)
applying them and build testing them.
[34:57] (2097.76s)
Mhm. And that's run, and then our
[35:00] (2100.00s)
own trees that we create also do
[35:01] (2101.60s)
verification that they did build and
[35:03] (2103.12s)
boot. Yeah. And it will run some
[35:04] (2104.80s)
benchmarks for drivers. It doesn't
[35:06] (2106.80s)
really run benchmarks. Um and so then
[35:09] (2109.04s)
Linus takes this and he puts it in
[35:10] (2110.88s)
his tree. So then it got picked up. So
[35:13] (2113.28s)
it got picked up another day later. And
[35:16] (2116.88s)
let's talk about how we do our model. So
[35:18] (2118.40s)
Linus does a release every nine weeks.
[35:20] (2120.88s)
Yep. Bug fixes come in during those nine
[35:22] (2122.72s)
weeks for the last release. You're
[35:24] (2124.00s)
running the last release, right? You
[35:25] (2125.76s)
want those bug fixes. You had a device
[35:28] (2128.40s)
that's running those bug fixes. A long
[35:29] (2129.84s)
time ago, we realized that people don't
[35:31] (2131.76s)
want to wait 8 weeks. So let's create a
[35:33] (2133.60s)
model of we have a development tree and
[35:35] (2135.28s)
we have a stable tree. So when Linus
[35:37] (2137.60s)
does a release, I fork off Linus's branch
[35:39] (2139.84s)
and I say this is a stable branch. So if
[35:42] (2142.00s)
it's 6.4, I do 6.4.1, .2, .3, .4, .5.
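
A small Python sketch of that numbering, assuming only what is said here: stable releases append a counter to the base version, and the numbers carry no meaning beyond their order. The helper names are invented for the example. Comparing the parts as integers matters, because a plain string sort would put 6.4.10 before 6.4.9.

```python
def stable_versions(base, count):
    """List the first `count` stable releases forked from a base release."""
    return [f"{base}.{n}" for n in range(1, count + 1)]


def version_key(version):
    """Turn '6.4.10' into (6, 4, 10) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


print(stable_versions("6.4", 5))

# Numeric ordering versus naive string ordering.
releases = ["6.4.10", "6.4.9", "6.4.2"]
print(sorted(releases, key=version_key))
print(sorted(releases))
```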
[35:46] (2146.72s)
And our release numbers are just
[35:47] (2147.92s)
numbers. They mean nothing. They're not
[35:49] (2149.36s)
semantic versioning. We were around way
[35:51] (2151.52s)
before that happened. They're just
[35:53] (2153.04s)
meaning this number is later than that
[35:54] (2154.96s)
number. Yep. That's all. When we switch
[35:56] (2156.80s)
from 4.x to 5.x, it's just because the x
[36:00] (2160.24s)
got too big. Yeah. And in your brain,
[36:02] (2162.64s)
when you see a number between like 14
[36:04] (2164.64s)
and 18, it looks smaller than 4 to 8.
[36:07] (2167.92s)
Yeah. So, and Yeah. So, we just bump it
[36:10] (2170.24s)
up every couple years. So then we so we
[36:12] (2172.56s)
take stable we have stable releases. I
[36:14] (2174.48s)
do a release every week and what I do is
[36:16] (2176.48s)
the patches have to be in Linus's tree
[36:18] (2178.48s)
first. We can't diverge. So if it's in
[36:20] (2180.64s)
Linus's tree first and it's a bug fix and it
[36:22] (2182.64s)
meets the criteria, I put it in the stable
[36:24] (2184.64s)
tree and I do a release. And so we do
[36:26] (2186.32s)
new releases every week for that. So
[36:28] (2188.56s)
during those nine weeks, I'll take new
[36:30] (2190.96s)
device IDs. I'll take bug fixes and
[36:32] (2192.80s)
whatnot. And then you can tag the fixes
[36:35] (2195.28s)
that are going into the tree with a
[36:36] (2196.72s)
special way that I'll automatically take
[36:38] (2198.32s)
them. I know to look at them. The other
[36:39] (2199.76s)
stable tree maintainer with me, Sasha,
[36:41] (2201.68s)
he runs through them and runs a whole
[36:43] (2203.28s)
bunch of fuzzing. He's been doing AI
[36:45] (2205.20s)
before it was ever called AI. Um, it's
[36:47] (2207.44s)
just pattern matching. I mean, and we
[36:49] (2209.36s)
have a whole body of here's a whole
[36:50] (2210.48s)
bunch of bug fixes. Here's a whole bunch
[36:52] (2212.08s)
of changes. Do any of these kind of
[36:54] (2214.56s)
match? Oh yeah, these people because
[36:56] (2216.16s)
some people don't realize that, oh, this
[36:57] (2217.44s)
was a bug fix. It should go into the
[36:58] (2218.80s)
stable tree. They've written academic
[37:01] (2221.04s)
papers on it for years. It's fun stuff.
[37:03] (2223.28s)
Um, so just pattern matching, right? So,
[37:05] (2225.68s)
um, then we'll pick up a whole bunch of
[37:06] (2226.88s)
stuff that, hey, maybe you forgot about
[37:08] (2228.56s)
that and you'll give me a chance to
[37:10] (2230.00s)
respond before it goes into the
[37:11] (2231.52s)
stable tree. And we do those releases.
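
The selection rules described above can be summarized in a short Python sketch. It is a simplification under stated assumptions: the `Change` fields and the `triage` function are hypothetical names, and the real process involves human review and tooling that this does not model.

```python
from dataclasses import dataclass


@dataclass
class Change:
    summary: str
    in_linus_tree: bool            # must land in Linus's tree first
    is_bug_fix: bool = False
    is_new_device_id: bool = False
    tagged_for_stable: bool = False


def triage(change):
    """Classify a change for the weekly stable release (illustrative only)."""
    if not change.in_linus_tree:
        return "wait"      # stable cannot diverge from Linus's tree
    if change.tagged_for_stable:
        return "take"      # the author explicitly marked it for stable
    if change.is_bug_fix or change.is_new_device_id:
        return "review"    # found later; the author gets a chance to respond
    return "skip"          # new features never go into stable


changes = [
    Change("fix crash on unplug", in_linus_tree=True, is_bug_fix=True,
           tagged_for_stable=True),
    Change("add device ID", in_linus_tree=True, is_new_device_id=True),
    Change("new feature", in_linus_tree=True),
    Change("pending fix", in_linus_tree=False, is_bug_fix=True),
]
for change in changes:
    print(change.summary, "->", triage(change))
```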
[37:13] (2233.20s)
When Linus does a new release, then I
[37:14] (2234.96s)
throw that stable tree away and I make a
[37:16] (2236.56s)
new stable tree. That's great for things
[37:19] (2239.60s)
that can update more often. People
[37:21] (2241.52s)
want to make a device. You want to make
[37:23] (2243.04s)
it something that's going to last a long
[37:24] (2244.24s)
time. So what we came up with is the idea
[37:25] (2245.52s)
of long-term stable trees. And there I
[37:28] (2248.08s)
pick one kernel a year and I maintain it
[37:30] (2250.24s)
for to start with two years, sometimes
[37:32] (2252.88s)
six years. So your Android phone is
[37:35] (2255.28s)
running off a kernel that's five years
[37:36] (2256.72s)
old, but it's still getting bug fixes
[37:38] (2258.40s)
back to it. So I maintain like four
[37:40] (2260.88s)
long-term stable trees at the same time
[37:43] (2263.36s)
and we backport all these fixes to all
[37:44] (2264.80s)
the different branches and then we pick
[37:46] (2266.16s)
one a year and we maintain these. So
[37:48] (2268.16s)
there's six of them going at a time and
[37:50] (2270.08s)
and in this case like is it is it you
[37:52] (2272.40s)
like there's one maintainer for each of
[37:54] (2274.96s)
these long term? No, it's just me.
[37:58] (2278.32s)
Oh, you Wow. Okay. Yeah, it's the two of
[38:00] (2280.40s)
us. Um, the interesting
[38:02] (2282.64s)
thing is the older the code is the
[38:04] (2284.24s)
harder it is to maintain and the
[38:05] (2285.92s)
companies like oh I'll put a junior
[38:07] (2287.20s)
developer to maintain old code. That's
[38:08] (2288.96s)
harder because it's more diverged from
[38:10] (2290.56s)
what the latest developers are using.
[38:12] (2292.16s)
Can you tell me a little bit more about
[38:14] (2294.48s)
this cuz you know the the older the code
[38:16] (2296.96s)
the harder it is to maintain, like, I think
[38:19] (2299.76s)
it feels true but but why why is this
[38:23] (2303.68s)
the the case? Is it just lost context?
[38:25] (2305.68s)
Is it it's so development moves on goes
[38:28] (2308.32s)
forward right? So say a change I make
[38:30] (2310.56s)
today to the codebase that fixes the bug
[38:32] (2312.24s)
that's going to that affects the code
[38:33] (2313.92s)
and I look back it's affected the code
[38:35] (2315.52s)
for the past 10 years. Yeah. All right.
[38:37] (2317.36s)
If I try to start backporting this change
[38:39] (2319.20s)
to code that's 10 years old. Code has
[38:41] (2321.52s)
evolved in that time. Yeah. And making
[38:43] (2323.68s)
that change to older code is harder. And
[38:45] (2325.84s)
the more I have to change it, the more
[38:47] (2327.28s)
it diverges from the original fix. So
[38:49] (2329.76s)
the more context and skill you have to
[38:51] (2331.92s)
have to make the change to the older
[38:53] (2333.92s)
codebase than even the developer who
[38:56] (2336.00s)
made the first change. It's it's not
[38:58] (2338.24s)
intuitive. Uh companies make this
[39:00] (2340.00s)
mistake all the time thinking, "Oh, I'll
[39:01] (2341.44s)
just maintain this old codebase for a
[39:02] (2342.80s)
long time." We have major security bugs
[39:05] (2345.28s)
like Spectre and Meltdown with chips. Yeah.
[39:08] (2348.24s)
Some of those Spectre fixes have not
[39:09] (2349.92s)
been backported to some of the long-term
[39:11] (2351.60s)
kernels that are still being supported
[39:13] (2353.52s)
because it was just too hairy of a fix.
[39:15] (2355.12s)
Anybody who cared moved to a new kernel.
[39:16] (2356.88s)
Yeah. So, I look at a lot of these older
[39:18] (2358.72s)
kernels as, again, if you're using
[39:20] (2360.72s)
it, you will provide the resources to
[39:22] (2362.48s)
maintain it. Google, I'll call out, and
[39:24] (2364.48s)
Linaro, um, another group, do a lot
[39:27] (2367.44s)
of work in testing these old kernels
[39:28] (2368.96s)
because Google cares a lot about these
[39:30] (2370.16s)
kernels. So, they provide testing
[39:31] (2371.36s)
infrastructure and merges and
[39:34] (2374.00s)
reproducibility and and running on real
[39:36] (2376.16s)
devices to make sure that these kernels
[39:37] (2377.68s)
still work on them and they work well.
[39:39] (2379.52s)
And that way, I know that if I make a
[39:41] (2381.20s)
change back there, it'll still work. If
[39:42] (2382.88s)
I didn't have those resources from them
[39:44] (2384.56s)
doing that work, I wouldn't be able to
[39:45] (2385.76s)
maintain these old kernels. Yeah. And
[39:48] (2388.00s)
and then going back to the bug
[39:49] (2389.76s)
fix. So like every week there's a a new
[39:52] (2392.08s)
stable branch release and then when does
[39:54] (2394.80s)
the the big release come the the
[39:56] (2396.80s)
nine-week release come? That's after
[39:58] (2398.40s)
this has been kind of baking right for
[40:00] (2400.24s)
the stable branch has been so stable's
[40:02] (2402.08s)
independent of Linus's tree. Oh, stable's
[40:03] (2403.60s)
independent. So the only tie is it has to
[40:06] (2406.16s)
be in Linus's tree first. We do not want
[40:08] (2408.24s)
divergence. We don't want you to make a
[40:09] (2409.76s)
fix to a stable tree only and not in Linus's
[40:11] (2411.68s)
tree. Got it. Sometimes I will have bugs
[40:14] (2414.00s)
in the stable tree due to other changes
[40:16] (2416.08s)
I've taken. I mean, fixes need fixes, and I'm
[40:18] (2418.96s)
like I can't take the fix for this until
[40:20] (2420.88s)
you get the fix in Linus's tree, and
[40:22] (2422.48s)
it's kind of a forcing function on a
[40:24] (2424.16s)
developer to get a fix to Linus before
[40:26] (2426.88s)
I'll take it from the stable tree.
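[Editor's sketch, not from the conversation: the "it has to be in Linus's tree first" rule written as a small check. `StablePatch`, `accept_for_stable`, and the commit IDs in the usage note are hypothetical names invented for illustration; the real process is a maintainer's judgment call, not this function.]

```python
from dataclasses import dataclass
from typing import Optional, Set


@dataclass(frozen=True)
class StablePatch:
    subject: str
    # ID of the same fix in Linus's tree, or None for a stable-only change.
    upstream_commit: Optional[str]


def accept_for_stable(patch: StablePatch, mainline_commits: Set[str]) -> bool:
    """Accept a backport only if the fix already exists in mainline.

    A patch that names no upstream commit, or names one that mainline
    does not contain, is refused. That refusal is the forcing function:
    the developer has to land the fix in Linus's tree first.
    """
    if patch.upstream_commit is None:
        return False
    return patch.upstream_commit in mainline_commits
```

[For example, with `mainline_commits = {"abc123"}`, a patch carrying `upstream_commit="abc123"` would be accepted, while a stable-only patch with `upstream_commit=None` would be refused until it lands upstream.]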
[40:28] (2428.56s)
Sometimes I'll revert the change in the
[40:30] (2430.24s)
stable tree. And do I understand the the
[40:32] (2432.16s)
way to get a fix into Linux is a well of
[40:34] (2434.72s)
course you need to get a a fix into
[40:36] (2436.32s)
Linus's tree, which means you need to go
[40:38] (2438.40s)
through one of the maintainers who uh is
[40:42] (2442.48s)
is in you know who who maintains one of
[40:45] (2445.28s)
the the subsystems. Yeah. So say and you
[40:47] (2447.92s)
just need to go up the tree, up
[40:49] (2449.12s)
the pyramid. Right. So uh famously
[40:51] (2451.04s)
Bluetooth always breaks every other
[40:53] (2453.04s)
release. Bluetooth is crazy complex. The
[40:55] (2455.28s)
hardware is horrible. And if you need to
[40:56] (2456.80s)
get a fix in there, it has to go to
[40:58] (2458.00s)
the Bluetooth tree, and then that gets sucked
[41:00] (2460.00s)
into a networking tree and then that
[41:01] (2461.28s)
network tree goes to Linus. So it's like
[41:03] (2463.20s)
a two-stage process sometimes. And then
[41:05] (2465.52s)
we have somebody tracking regressions.
[41:07] (2467.04s)
Regressions are really important. We
[41:08] (2468.16s)
don't want anything to regress.
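[Editor's sketch, not from the conversation: the route a fix takes "up the pyramid", using the Bluetooth example above. The `UPSTREAM_OF` map and the `merge_path` function are hypothetical; the tree names are shorthand for the Bluetooth, networking, and Linus's trees mentioned in the discussion, not real repository names.]

```python
from typing import Dict, List

# Hypothetical map: each subsystem tree -> the tree it gets merged into.
UPSTREAM_OF: Dict[str, str] = {
    "bluetooth": "networking",
    "networking": "linus",
}


def merge_path(tree: str, upstream_of: Dict[str, str] = UPSTREAM_OF) -> List[str]:
    """Return the chain of trees a fix passes through, starting at `tree`.

    The walk stops at the first tree that has no upstream (Linus's tree
    in this map). A cycle in the map raises ValueError instead of looping.
    """
    path = [tree]
    seen = {tree}
    while path[-1] in upstream_of:
        nxt = upstream_of[path[-1]]
        if nxt in seen:
            raise ValueError(f"cycle detected at {nxt!r}")
        path.append(nxt)
        seen.add(nxt)
    return path
```

[With this map, `merge_path("bluetooth")` gives three trees and therefore two merges, which is the "two-stage process" described above; a fix sent straight to the networking tree would need only one.]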
[41:09] (2469.28s)
Sometimes Linus will say, I'll just take
[41:10] (2470.64s)
these bug fixes or regressions. I'll
[41:12] (2472.24s)
just take them now. Boom. I'll just take
[41:14] (2474.24s)
them. So um depends on what they are. If
[41:16] (2476.40s)
they affect hardware that's really
[41:17] (2477.76s)
common, we prioritize that over hardware
[41:20] (2480.48s)
that isn't as common just by virtue of,
[41:23] (2483.36s)
hey, this broke my laptop, right? I want
[41:25] (2485.20s)
to keep working. So yeah, it's a little
[41:27] (2487.20s)
thing that way. So we have two branches
[41:28] (2488.40s)
going at once. Development and then
[41:29] (2489.92s)
stable releases happening. So then
[41:31] (2491.68s)
this went into Linus's tree. Um, I picked
[41:34] (2494.32s)
this out as part of the stable trees and
[41:36] (2496.32s)
then they ended up in the stable tree
[41:38] (2498.56s)
somewhere as well. Um and then I can
[41:43] (2503.52s)
give you dates for all this stuff. All
[41:45] (2505.12s)
this whole process took about a week and
[41:47] (2507.60s)
a half. Mhm. And that was it. Okay.
[41:51] (2511.68s)
And then here is it ended up in the
[41:54] (2514.00s)
6.13.4 kernel as well. Yeah. And then
[41:56] (2516.96s)
and other ones as well. Back to trust
[41:59] (2519.92s)
isn't just earned, it's demanded.
[42:02] (2522.48s)
Whether you're a startup founder
[42:03] (2523.68s)
navigating your first audit or a seasoned
[42:05] (2525.60s)
security professional scaling your
[42:07] (2527.04s)
governance risk and compliance program,
[42:09] (2529.20s)
proving your commitment to security has
[42:10] (2530.88s)
never been more critical or more
[42:12] (2532.48s)
complex. That's where Vanta comes in.
[42:15] (2535.60s)
Vanta can help you start or scale your
[42:17] (2537.52s)
security program by connecting with
[42:19] (2539.20s)
auditors and experts to conduct your
[42:21] (2541.04s)
audit and set up your security program
[42:22] (2542.72s)
quickly. Plus, with automation and AI
[42:25] (2545.20s)
throughout the platform, Vanta gives
[42:26] (2546.88s)
you time back so you can focus on
[42:28] (2548.48s)
building your company. Businesses use
[42:30] (2550.56s)
Vanta to establish trust by automating
[42:32] (2552.48s)
compliance needs across over 35
[42:34] (2554.16s)
frameworks like SOC 2 and ISO 27001.
[42:37] (2557.92s)
With Vanta, they centralize security
[42:39] (2559.76s)
workflows, complete questionnaires up to
[42:41] (2561.60s)
five times faster, and proactively
[42:43] (2563.60s)
manage vendor risk. Join over 9,000
[42:46] (2566.08s)
global companies to manage risk and
[42:47] (2567.68s)
prove security in real time. For a
[42:50] (2570.08s)
limited time, my listeners get $1,000
[42:52] (2572.24s)
off Vanta at vanta.com/pragmatic.
[42:55] (2575.20s)
That is vanta.com/pragmatic
[42:58] (2578.88s)
for $1,000 off. So, we saw what it takes
[43:02] (2582.32s)
to get a fix into into Linux, and it
[43:05] (2585.60s)
actually wasn't that complicated. No, it
[43:07] (2587.84s)
really isn't. I mean, it's just you email a
[43:09] (2589.36s)
change off and you you email, you use
[43:11] (2591.28s)
the Git workflow. So if if you're
[43:12] (2592.56s)
familiar with Git, it's pretty
[43:13] (2593.84s)
simple. Obviously, I guess, I mean, obviously
[43:16] (2596.40s)
you need to be able to build Linux uh
[43:18] (2598.56s)
test it yourself, validate
[43:20] (2600.88s)
locally that it works the the basic
[43:22] (2602.96s)
things and then straightforward. The fun
[43:25] (2605.12s)
thing is so I can take a change like
[43:26] (2606.56s)
that without really testing it because
[43:28] (2608.16s)
it built it obviously works for your
[43:29] (2609.92s)
hardware. I can't I didn't test it but
[43:31] (2611.44s)
it works and I assume that it goes.
[43:33] (2613.28s)
Yeah. And yeah it's um very fast
[43:36] (2616.88s)
workflow as far as getting a project. So
[43:39] (2619.12s)
it was like a two-week window from
[43:40] (2620.72s)
sending the first change that was the
[43:42] (2622.56s)
merge window to getting it out into
[43:43] (2623.84s)
stable kernels to the world. That was
[43:45] (2625.68s)
pretty fast. Yeah. For overall for a
[43:48] (2628.32s)
worldwide project that is everywhere. So
[43:51] (2631.52s)
I think I understand what it's like to
[43:54] (2634.72s)
you know be someone who contributes to
[43:56] (2636.16s)
to Linux every now and then. But over
[43:58] (2638.16s)
time some people start to contribute
[43:59] (2639.76s)
more. They become more regular
[44:00] (2640.88s)
contributors and eventually you're one
[44:02] (2642.80s)
of the few people or one of the few or
[44:05] (2645.60s)
many people who works on on Linux
[44:07] (2647.52s)
full-time. Are there many people working
[44:09] (2649.04s)
on it full-time? So, Linux has almost
[44:11] (2651.76s)
always been paid to be worked on. So, I
[44:14] (2654.96s)
started keeping the numbers back in what
[44:16] (2656.88s)
2006 or something. And at that point in
[44:19] (2659.04s)
time, 80% of the people that contributed
[44:21] (2661.28s)
were being paid to do it full-time for
[44:23] (2663.68s)
their employer. And their employers want
[44:25] (2665.84s)
people who know how to do Linux because
[44:27] (2667.60s)
they want to solve their problems. They
[44:29] (2669.20s)
want Linux to... It's much cheaper to pay a
[44:31] (2671.20s)
few engineers to add a few new features
[44:33] (2673.04s)
than it is to write your own operating
[44:34] (2674.32s)
system. That's the beauty of Linux.
[44:35] (2675.60s)
That's why IBM put a bunch of money into
[44:37] (2677.12s)
it. That's why everybody uses it. It's a
[44:39] (2679.04s)
tool for people to get their work done,
[44:40] (2680.80s)
right? You want to you want to run your
[44:42] (2682.24s)
battery. You want to run your car
[44:43] (2683.92s)
charger. You add a little driver for the
[44:46] (2686.08s)
one device you had. You had an engineer
[44:47] (2687.92s)
do that and it's good to go and it'll be
[44:49] (2689.44s)
maintained for forever because we
[44:50] (2690.72s)
maintain it in the community. It's all
[44:52] (2692.32s)
good. So, it's cheaper. So, we've been
[44:54] (2694.08s)
doing it. And the joke used to be you
[44:55] (2695.44s)
get three changes into the kernel, you
[44:56] (2696.96s)
get a job. It's not really a joke. Um,
[45:00] (2700.08s)
as long as they aren't spelling fixes,
[45:01] (2701.76s)
but um some people do spelling fixes,
[45:03] (2703.76s)
which is great. We have people that do
[45:05] (2705.44s)
janitorial work to the kernel. They
[45:06] (2706.88s)
sweep the tree for common problems and
[45:08] (2708.56s)
they just clean stuff up and keep code
[45:10] (2710.80s)
alive and make sure it's fresh,
[45:12] (2712.56s)
proper coding style. We have coding
[45:14] (2714.00s)
style issues. We have people just fixing
[45:15] (2715.84s)
spelling mistakes in comments, which is
[45:17] (2717.36s)
great because you got to start
[45:18] (2718.32s)
somewhere. In fact, spelling mistakes
[45:19] (2719.76s)
in comments is a great place to start
[45:22] (2722.00s)
because it it makes you get the workflow
[45:24] (2724.16s)
down. You figure out how to make a
[45:25] (2725.20s)
patch. You figure out how to send an
[45:26] (2726.64s)
email. Fix your email client to not send
[45:28] (2728.32s)
HTML and things like that. Yeah. And you
[45:30] (2730.48s)
can't use a web client that doesn't web
[45:32] (2732.24s)
email client to send an email. It just
[45:34] (2734.08s)
doesn't work. Um, good email. There's
[45:36] (2736.00s)
lots of really good email tools out
[45:37] (2737.12s)
there. Use use them. But you're you're
[45:40] (2740.08s)
now a full-time kernel
[45:43] (2743.84s)
maintainer. What does your kind of
[45:45] (2745.92s)
day-to-day or week-to-week look like? Cuz
[45:48] (2748.08s)
I'm I'm going to assume it's it's going
[45:49] (2749.68s)
to be a little bit different than most
[45:51] (2751.04s)
developers who, you know, like write
[45:52] (2752.48s)
code, review code, do those kind of
[45:54] (2754.48s)
things. So, yeah. I mean, I've been working
[45:56] (2756.64s)
for Linux Foundation for what, 13 years
[45:58] (2758.24s)
now. Before that, I used to work at
[45:59] (2759.76s)
Novell and SUSE. Before that, IBM and
[46:02] (2762.16s)
then a little startup all doing Linux
[46:03] (2763.60s)
stuff all the time. And then before
[46:04] (2764.80s)
that, I did embedded work. When I worked
[46:06] (2766.32s)
for a company, you end up working on
[46:08] (2768.64s)
features that your company wants or
[46:10] (2770.56s)
reviewing code from other developers of
[46:12] (2772.56s)
your company, then sending off changes.
[46:13] (2773.92s)
Or if you're a maintainer, a maintainer
[46:15] (2775.84s)
is, and the networking maintainer said
[46:17] (2777.60s)
this the best. Um, we're like editors.
[46:19] (2779.68s)
We used to be a writer. All we do is
[46:21] (2781.52s)
critique other people's stuff now. But
[46:23] (2783.44s)
because we're a writer, we have a little
[46:24] (2784.64s)
side project. So, we do have little
[46:26] (2786.00s)
things that we do dabble in stuff. So,
[46:27] (2787.44s)
like I looked I did 80 changes, only 80
[46:29] (2789.36s)
changes last year because I have a few
[46:30] (2790.64s)
little things I want to do. Um, that was
[46:32] (2792.72s)
low. But, um, working for Linux
[46:34] (2794.64s)
Foundation as a full-time maintainer,
[46:36] (2796.72s)
that's rare. I think there's only maybe
[46:38] (2798.32s)
five people, maybe maybe a handful of
[46:41] (2801.44s)
people that just work on whatever they
[46:43] (2803.04s)
want to do. So, Linux Foundation rule is
[46:45] (2805.12s)
they can't tell me what to do and I
[46:46] (2806.24s)
can't tell them what to do. Works out
[46:47] (2807.60s)
great. Um, me and Linus and Shuah Khan,
[46:50] (2810.16s)
we're all fellows there and we work on
[46:53] (2813.04s)
improving Linux for however we feel like
[46:54] (2814.96s)
it. Me and Linus do a lot of
[46:56] (2816.64s)
review, a lot of other stuff. Linus still
[46:58] (2818.64s)
contributes. He famously rewrote
[47:01] (2821.52s)
the core locking primitives in Linux a
[47:03] (2823.76s)
couple years ago. I had a Microsoft
[47:05] (2825.44s)
developer say there's no way any of us
[47:07] (2827.36s)
would be even allowed to do that on
[47:08] (2828.64s)
Windows. You know, Linus
[47:10] (2830.16s)
changed core bits and pieces for one of
[47:12] (2832.64s)
the security features in one of these
[47:14] (2834.08s)
stable releases. Linus had to
[47:16] (2836.16s)
rewrite the the call path from how a
[47:18] (2838.56s)
user space calls into the kernel, the
[47:20] (2840.64s)
core syscall path. Nobody really
[47:22] (2842.64s)
noticed that it got rewritten, but it
[47:24] (2844.08s)
did. and he did it and in a stable way
[47:26] (2846.40s)
and then it worked like that. So, um
[47:28] (2848.00s)
we're also part of the kernel
[47:29] (2849.60s)
security team. We get security bug fixes
[47:31] (2851.36s)
all the time and if they're easy, we'll
[47:33] (2853.84s)
just fix them ourselves and send out the
[47:35] (2855.28s)
fixes. So, we do security fixes a lot as
[47:38] (2858.72s)
far as that goes. So, my day-to-day is I
[47:40] (2860.88s)
read other people's stuff. Like I said,
[47:42] (2862.08s)
I get a thousand emails a day to do
[47:43] (2863.84s)
something. You're not exaggerating? No.
[47:46] (2866.72s)
Yeah. You you get a thousand emails a
[47:48] (2868.40s)
day. It's... Wow. So a lot of
[47:50] (2870.80s)
it I just file off, and it's like,
[47:52] (2872.88s)
oh this this like I subscribe to a
[47:54] (2874.56s)
number of kernel subsystem mailing lists
[47:56] (2876.72s)
to see what's going on. Yeah. And I
[47:58] (2878.72s)
don't have to do something with all of
[47:59] (2879.76s)
those. Yeah. But some of them you need
[48:02] (2882.16s)
to do something. Yeah. Some of these I
[48:03] (2883.28s)
do need to do something with and some
[48:04] (2884.40s)
like, so, say USB is one of
[48:06] (2886.56s)
the subsystems I maintain. I shove them
[48:08] (2888.08s)
all off to a mailbox and then once a
[48:09] (2889.44s)
week I'll go through them all and say
[48:10] (2890.56s)
okay let's review all these. And so I'll
[48:13] (2893.04s)
look at my inbox, I'll have 200 USB
[48:15] (2895.44s)
emails, patches, to go through, and
[48:17] (2897.52s)
other people review them and other stuff
[48:19] (2899.28s)
like that and okay, this maintainer said
[48:20] (2900.72s)
this was good, not good, whatnot, and I
[48:23] (2903.12s)
apply them to my trees, see if they
[48:24] (2904.32s)
build, if they failed, I'll report those
[48:26] (2906.24s)
and whatnot. You know, what you're doing
[48:28] (2908.16s)
reminds me a little bit of of when when
[48:29] (2909.92s)
I used to work at Uber, we had this
[48:31] (2911.76s)
concept of RFC's which I think got
[48:34] (2914.32s)
inspired by the RFC process. So
[48:37] (2917.68s)
people would just just send off a
[48:39] (2919.20s)
document of here's what I'm planning to
[48:40] (2920.96s)
to do and and there would be mailing
[48:42] (2922.40s)
lists for, like, backend, mobile, different
[48:45] (2925.04s)
parts and I noticed after a while that
[48:47] (2927.52s)
the more tenured engineers and the more
[48:49] (2929.52s)
experienced engineers would spend
[48:51] (2931.04s)
increasingly more of their time reading
[48:53] (2933.12s)
through these things critiquing giving
[48:55] (2935.04s)
feedback giving pointers connecting the
[48:57] (2937.04s)
dots. Like it it just hit me when one of
[49:00] (2940.00s)
the one of the first mobile engineers at
[49:01] (2941.52s)
Uber was telling me that that he has one
[49:03] (2943.44s)
day blocked out per week just to go
[49:05] (2945.92s)
through all of these things which again
[49:07] (2947.52s)
it wasn't kind of part of his role but
[49:09] (2949.68s)
he felt responsible. He had all the
[49:11] (2951.20s)
context. He actually helped so many
[49:12] (2952.72s)
people avoid certain things just by
[49:14] (2954.64s)
pointing it out. It's it's the same the
[49:16] (2956.64s)
same thing or something something
[49:17] (2957.60s)
similar happened here. It's the same but
[49:19] (2959.04s)
we also the the different part of this
[49:20] (2960.96s)
and I'll call this out. Um we don't have
[49:23] (2963.92s)
grand proposals sent to the kernel
[49:25] (2965.60s)
list. We don't say, "Hey, wouldn't it be
[49:27] (2967.04s)
great if you did this?" I don't want to
[49:28] (2968.72s)
see that. I want to see code that works.
[49:31] (2971.28s)
Mhm. And I love it as proof. The reason code
[49:34] (2974.00s)
that works matters is because um you've
[49:36] (2976.72s)
taken the time, you've proved that this
[49:38] (2978.56s)
can be done. Yeah. Now, not necessarily
[49:40] (2980.32s)
that it's done right or done the best
[49:42] (2982.32s)
way, but it could be done. Yeah. And
[49:44] (2984.32s)
that's now you have the skin in the
[49:45] (2985.60s)
game, and now I'm willing to work with
[49:46] (2986.80s)
you and let's go on that. People do send
[49:48] (2988.56s)
off RFC's of patches. If it's an area I
[49:51] (2991.28s)
care about, I'll look at it. Sometimes
[49:53] (2993.76s)
you can get away with this. This is a a
[49:55] (2995.36s)
fun trick with maintainers. If you send
[49:57] (2997.60s)
me a patch set that solves your problem
[49:59] (2999.44s)
in such a way that it's horrible that I
[50:01] (3001.52s)
don't I hate it so much that I'll
[50:03] (3003.04s)
rewrite it myself, because I'll be
[50:04] (3004.64s)
like I can't say no because it solved
[50:06] (3006.24s)
the problem and I want to solve your
[50:07] (3007.84s)
problem but if I don't say no then I
[50:09] (3009.76s)
have to take that. So you get you can do
[50:11] (3011.44s)
that like once a year to a maintainer. I
[50:13] (3013.20s)
I I I sense that you're eliminating busy
[50:15] (3015.28s)
work because I I've seen at different
[50:17] (3017.04s)
companies when you have the proposal
[50:18] (3018.88s)
process again a lot of companies for
[50:20] (3020.24s)
good for you know it's it sounds logical
[50:22] (3022.72s)
instead of starting the work instead of
[50:24] (3024.24s)
investing time maybe we would all save
[50:26] (3026.80s)
time by doing a little planning up front
[50:28] (3028.56s)
right but but then every now and then
[50:30] (3030.00s)
what happens is you get into this
[50:31] (3031.36s)
never-ending planning nothing happens
[50:33] (3033.52s)
until either the project is abandoned or
[50:35] (3035.36s)
someone just sits down writes some code
[50:38] (3038.00s)
and kind of you know just cuts all the
[50:40] (3040.24s)
discussions. They are done cuz now it
[50:41] (3041.92s)
works. Yeah. Well, you have to prove
[50:43] (3043.52s)
that it can work. And so inside
[50:45] (3045.36s)
companies, I'll say we do have we did
[50:47] (3047.20s)
like when I worked for companies, IBM
[50:48] (3048.64s)
was like we had planning. Okay, we need
[50:49] (3049.84s)
to implement this feature to match this
[50:51] (3051.36s)
parity with this old version of Unix.
[50:53] (3053.04s)
How are we going to do that? Let's
[50:54] (3054.08s)
figure out how to do this. Is this going
[50:55] (3055.44s)
to work? Yada yada yada. And we have
[50:56] (3056.64s)
planning and things like that. One of
[50:57] (3057.84s)
the fun things is um when you're dealing
[50:59] (3059.76s)
with open source and this happened at
[51:02] (3062.16s)
IBM: an engineer over here was tasked with
[51:05] (3065.44s)
fixing this problem. Great. He came up with
[51:07] (3067.60s)
the solution, submitted all the changes
[51:08] (3068.96s)
upstream, lots and lots of discussion.
[51:10] (3070.96s)
Turns out his solution was not very
[51:13] (3073.12s)
good. Somebody else saw that it was a
[51:15] (3075.36s)
problem, rewrote it, submitted it, and
[51:17] (3077.92s)
got it accepted, but it wasn't the
[51:19] (3079.60s)
original engineer's work. And so the end
[51:22] (3082.48s)
of the year came, and it was like, how is this
[51:23] (3083.84s)
person going to be reviewed? And we're
[51:25] (3085.28s)
like, he caused the feature to get done.
[51:28] (3088.64s)
It wasn't that his code made it in,
[51:30] (3090.88s)
but he influenced the community and made
[51:33] (3093.04s)
the goal was you wanted to see Linux
[51:34] (3094.56s)
support this, right? Linux supports this
[51:36] (3096.32s)
now. And it was a change in
[51:37] (3097.68s)
mentality of how management had to treat
[51:39] (3099.92s)
engineers and also the same thing with
[51:42] (3102.56s)
with who owns the code. We had people
[51:44] (3104.72s)
come in and end up becoming
[51:46] (3106.48s)
maintainers of certain subsystems and
[51:48] (3108.64s)
that's great and they were maintaining
[51:49] (3109.92s)
this part of the kernel and then they
[51:51] (3111.52s)
were reassigned to do something else
[51:52] (3112.88s)
within the company. It's like oh that's
[51:54] (3114.40s)
great but you're going to still have to
[51:55] (3115.44s)
give him time to do that other thing.
[51:56] (3116.40s)
It's like no no we'll reassign it to
[51:57] (3117.68s)
somebody else. It's like no no no the
[51:59] (3119.52s)
community gave that to him. It follows
[52:01] (3121.52s)
him. If he goes to a different company
[52:03] (3123.04s)
it follows him. If he goes to a different
[52:04] (3124.96s)
part of the company, it follows him. And
[52:06] (3126.32s)
that's actually why Linux is so good. I
[52:09] (3129.04s)
think when you work at a big company,
[52:10] (3130.40s)
you're forced to work on new things
[52:11] (3131.52s)
every couple years, right? And that's
[52:13] (3133.44s)
part of moving up in a company. You get
[52:14] (3134.88s)
different tasks and whatnot. Famously,
[52:16] (3136.72s)
Windows has had like eight, no, five
[52:18] (3138.96s)
different teams work on their USB stack.
[52:20] (3140.72s)
Linux has had one team work on their USB
[52:22] (3142.64s)
stack for 20 years. And then we know
[52:25] (3145.36s)
this stuff and we have this development
[52:26] (3146.96s)
and depth there. We just keep coming
[52:28] (3148.96s)
back to like I was kind of expecting a
[52:30] (3150.80s)
little bit of a discussion about I came
[52:33] (3153.60s)
in here just you know using Linux or or
[52:35] (3155.92s)
indirectly using Linux a lot but not
[52:37] (3157.68s)
knowing of the depths and I I kind of
[52:39] (3159.52s)
thought that we would talk a lot about
[52:40] (3160.96s)
the the tech the processes and every
[52:43] (3163.84s)
time we come back to the people the
[52:46] (3166.88s)
trust I I wanted to ask why why you
[52:49] (3169.84s)
think, you know, Linux has won
[52:52] (3172.80s)
so big that it's everywhere but I'm
[52:54] (3174.72s)
starting to get the answer
[52:57] (3177.12s)
to this like like you know cuz I was
[52:58] (3178.64s)
thinking why Linux why not a why not a
[53:02] (3182.08s)
commercial if I naively uh you know ask
[53:05] (3185.36s)
myself before this conversation like we
[53:07] (3187.76s)
have two teams one is commercially
[53:09] (3189.20s)
funded they're selling their software
[53:10] (3190.96s)
they're paying the developers really
[53:12] (3192.24s)
well and then the other one is giving it
[53:14] (3194.16s)
away for free you know they figured out
[53:16] (3196.16s)
a model where people are still paid but
[53:17] (3197.92s)
but you know it's open source anyone can
[53:20] (3200.00s)
use it anyone can contribute which one
[53:21] (3201.92s)
would win the long term I naively would
[53:24] (3204.56s)
have said maybe the commercial one
[53:26] (3206.32s)
because they're incentivized, they're
[53:27] (3207.76s)
going to you know create all these
[53:29] (3209.44s)
professional things where here it's more
[53:31] (3211.60s)
intrinsic value but but now it is
[53:33] (3213.60s)
interesting but so Linux has been
[53:35] (3215.84s)
contributed to by companies in their own
[53:38] (3218.08s)
interest so it turns out everybody
[53:40] (3220.24s)
contributes in a selfish way: I want to
[53:42] (3222.08s)
solve my selfish problem but it turns
[53:43] (3223.92s)
out everybody has the same problems so
[53:46] (3226.16s)
your problem being solved is the same
[53:47] (3227.76s)
problem as their problem we had this
[53:50] (3230.24s)
when it came down to embedded so
[53:51] (3231.92s)
embedded happened they came up saying we
[53:54] (3234.00s)
need to change Linux to make it work
[53:55] (3235.44s)
better on batteries you know Power is
[53:57] (3237.20s)
really important. Power is very, very
[53:58] (3238.80s)
important. A lot more efficient. So, we
[54:00] (3240.72s)
wanted to This was when Linux was first
[54:02] (3242.08s)
getting into embedded and like we need
[54:03] (3243.36s)
to make this very efficient. And we're
[54:04] (3244.48s)
like, great, that's a wonderful little
[54:06] (3246.16s)
solution. Make it work for everybody.
[54:08] (3248.24s)
Like, no, no, no, we just care about
[54:09] (3249.52s)
embedded over here. That's the only
[54:10] (3250.48s)
person that's going to care about power.
[54:11] (3251.52s)
It's like, no, you really just make it
[54:12] (3252.96s)
generic. It'll all be good. Turns out
[54:15] (3255.68s)
data centers save billions of dollars in
[54:17] (3257.68s)
money because of power management. And
[54:19] (3259.92s)
it turns out everybody, so the mainframes...
[54:21] (3261.20s)
If it's more efficient on a
[54:23] (3263.20s)
mobile phone suddenly it's a good
[54:24] (3264.96s)
candidate for it to be a mobile OS. Yes,
[54:26] (3266.80s)
it works for every it works for
[54:27] (3267.92s)
everything. Um same thing for um
[54:29] (3269.76s)
multiprocessors. Multiprocessor came
[54:31] (3271.36s)
out: we have two processors
[54:32] (3272.88s)
in a big data center. Who's going to care
[54:34] (3274.32s)
about that? In your pocket now you have
[54:35] (3275.84s)
16 processors. It just works for
[54:37] (3277.28s)
everybody now, right? Data centers shrink,
[54:40] (3280.96s)
go different places but because we
[54:42] (3282.48s)
solved it in a generic way. We
[54:45] (3285.28s)
forced you to solve it in a generic way,
[54:47] (3287.20s)
but you contributed in a selfish manner.
[54:49] (3289.28s)
And that's it's the it's a good way that
[54:51] (3291.52s)
IBM knew they could put money into it,
[54:53] (3293.52s)
hire developers, and get the money back.
[54:55] (3295.84s)
Yeah. So, it was cheaper for them in the
[54:57] (3297.36s)
long run to do that. And they make money
[54:59] (3299.84s)
selling support and selling hardware.
[55:01] (3301.44s)
Red Hat makes money selling support. And
[55:03] (3303.68s)
that's like that. Intel makes money
[55:05] (3305.28s)
selling chips. And that's how that's who
[55:07] (3307.76s)
contributes to Linux is the people who
[55:09] (3309.52s)
want to sell a different product.
[55:11] (3311.36s)
Now, one other thing that's interesting
[55:13] (3313.28s)
about efficiency
[55:14] (3314.88s)
We have 4,000 developers contributing.
[55:17] (3317.28s)
Some of them only contribute one change.
[55:18] (3318.80s)
Some of them contribute a bunch. Three
[55:21] (3321.04s)
to 500 companies per year. We're talking
[55:23] (3323.04s)
about per year. If you told me like this
[55:25] (3325.44s)
is inside a a kind of commercial
[55:27] (3327.68s)
company, a tech company, you know, I I
[55:29] (3329.92s)
would assume that in order to make this
[55:31] (3331.60s)
work, oh, for 4,000 developers, you
[55:33] (3333.60s)
know, we probably need to hire 400 PMs.
[55:36] (3336.08s)
We'll have about, for
[55:38] (3338.16s)
every 50 developers, we'll have about 80
[55:40] (3340.32s)
TPMs. This is how it would run. Like
[55:42] (3342.32s)
you're you're laughing. I know I've been
[55:44] (3344.48s)
there. In fact, you've been there, but I
[55:45] (3345.92s)
I only come from here. Now, one thing
[55:47] (3347.76s)
you told me cuz I I was asking you how
[55:50] (3350.08s)
many, how many project or, you know, very
[55:52] (3352.00s)
technical project managers you have and
[55:53] (3353.84s)
you said zero. How? Well, so in a way the
[55:57] (3357.92s)
project management already happened on the
[55:59] (3359.60s)
back end before the patches got to us.
[56:01] (3361.52s)
So at a company say IBM I want to solve
[56:03] (3363.84s)
this problem. They've said how do we
[56:05] (3365.68s)
solve this problem? Let's put this task.
[56:07] (3367.36s)
Let's figure this out. And then the
[56:08] (3368.48s)
patches come out to us. So we don't see
[56:10] (3370.80s)
that. So we just see the feature when it
[56:12] (3372.80s)
lands on us. That's fair. So they're
[56:14] (3374.64s)
there working for the individual
[56:16] (3376.40s)
companies to get their thing. Sometimes
[56:18] (3378.40s)
sometimes they're not. Sometimes they're
[56:19] (3379.76s)
just developers spitting things out
[56:21] (3381.28s)
and like this person who needed to get a
[56:22] (3382.96s)
new device ID. It saves a company time and
[56:24] (3384.64s)
money if they contribute their changes
[56:25] (3385.84s)
upstream rather than keep them as a fork
[56:27] (3387.44s)
because they have to keep maintaining
[56:28] (3388.40s)
that fork. So wise companies have
[56:30] (3390.64s)
realized let our developers work
[56:32] (3392.08s)
upstream, do what they need to do there
[56:34] (3394.72s)
with limited project management and it
[56:36] (3396.72s)
it just works out better. And again,
[56:38] (3398.80s)
we're only taking things when they're
[56:40] (3400.80s)
ready, right? We're not having to track.
[56:43] (3403.60s)
We do have tools. We said everything's
[56:45] (3405.44s)
through email. We have tools like the
[56:46] (3406.88s)
networking subsystem has a web page you
[56:48] (3408.48s)
can go to and see the status of your
[56:49] (3409.76s)
patches: if it's passed all the CI, if
[56:52] (3412.08s)
it's been reviewed by the maintainer and
[56:53] (3413.60s)
things like that. So, we have a bunch of
[56:54] (3414.88s)
automatic tools based on top of email
[56:56] (3416.96s)
that'll help you out. And those project
[56:59] (3419.04s)
managers can go look at those if they
[57:00] (3420.40s)
wonder what the status of their
[57:01] (3421.84s)
employees' patches is, things like
[57:03] (3423.68s)
that. But yeah, it's just it's a
[57:05] (3425.04s)
different model, but it's not like
[57:06] (3426.40s)
they're not there. They're hidden behind the
[57:08] (3428.80s)
solution for that company. It's
[57:10] (3430.96s)
fair, and I think it's also good to
[57:12] (3432.64s)
remind that that's the case. But I
[57:14] (3434.88s)
feel Linux still figured out a way to
[57:17] (3437.20s)
focus on just ruthless efficiency
[57:19] (3439.04s)
with automation, with focusing on
[57:22] (3442.08s)
the work when it's done. So as you said
[57:24] (3444.16s)
all these things do happen but they
[57:26] (3446.00s)
happen before and then you can you know
[57:27] (3447.76s)
like this this part of the process will
[57:29] (3449.20s)
just be more efficient by design. Yeah.
[57:31] (3451.28s)
And also, once a year we get
[57:32] (3452.96s)
together the core maintainers and we
[57:34] (3454.96s)
talk about not technical things because
[57:36] (3456.48s)
we can't have enough technical people in
[57:37] (3457.68s)
the room for a topic. We talk about
[57:39] (3459.20s)
process. Is our process working? Is it
[57:41] (3461.36s)
not working? And we refine it and say,
[57:42] (3462.80s)
"Oh, maybe we need to do this a little
[57:44] (3464.16s)
bit better. Oh, wouldn't it be nicer to
[57:45] (3465.44s)
do this? Hey, we need more testing over
[57:47] (3467.04s)
here. Hey, can we do this type of
[57:48] (3468.32s)
stuff?" So, we do we talk about our
[57:49] (3469.52s)
process all the time. Famously, the
[57:51] (3471.52s)
lead-up to that meeting is
[57:53] (3473.68s)
another public mailing list where we all
[57:55] (3475.12s)
talk about process, and
[57:57] (3477.20s)
that once-a-year bikeshedding of
[57:59] (3479.20s)
our process in public helps shake
[58:01] (3481.92s)
out a lot of things and work them out. And
[58:04] (3484.40s)
there are problems. I'm not saying this
[58:05] (3485.76s)
development model is perfect. It works
[58:07] (3487.52s)
really well. One thing that's odd about
[58:09] (3489.20s)
Linux is that we keep going as fast as
[58:12] (3492.08s)
we are. We're running at 9 to 10 changes
[58:13] (3493.84s)
an hour. In the stable kernels, we're
[58:15] (3495.84s)
running 30 changes a day. 30 to 40
[58:17] (3497.76s)
changes a day. Mhm. Um, 10 CVEs a day. A
[58:21] (3501.12s)
bug at our level is a CVE almost. Yeah.
[58:23] (3503.52s)
So CVEs are the critical... Yeah. It's a
[58:26] (3506.16s)
security bug. It's a vulnerability. Um
[58:28] (3508.16s)
they could be as stupid as um a memory
[58:30] (3510.48s)
leak somewhere or um I rebooted the
[58:33] (3513.36s)
machine or I took over and got
[58:34] (3514.88s)
permission. I don't know. When I
[58:37] (3517.20s)
create a CVE, I don't know how
[58:38] (3518.80s)
you use Linux, so I can't tell the
[58:40] (3520.16s)
severity of it, but I can just say
[58:41] (3521.68s)
here's a bug. You should you should look
[58:43] (3523.52s)
at this. So we're responsible for that.
[58:45] (3525.44s)
So we're running at a huge rate of
[58:46] (3526.88s)
change. Most large software projects
[58:48] (3528.56s)
have a huge ramp up and then they
[58:50] (3530.16s)
plateau with developers and rate of
[58:51] (3531.84s)
change and whatnot because they've
[58:52] (3532.72s)
solved the problem. Linux has never
[58:54] (3534.64s)
solved the problem. And I used to have I
[58:56] (3536.56s)
had a manager at IBM every year come to
[58:58] (3538.08s)
me and said, "Hey, is Linux done yet?" I
[58:59] (3539.84s)
was like, "No." It took me 10 years to
[59:01] (3541.68s)
finally come up with the answer of um
[59:03] (3543.12s)
it'll be done when you stop making new
[59:04] (3544.48s)
hardware. And when they stop making new
[59:07] (3547.04s)
hardware or having different work
[59:08] (3548.24s)
classes, then we'll stop. But we're one
[59:09] (3549.84s)
of the few projects that keep having to
[59:11] (3551.84s)
add new features because of new
[59:13] (3553.92s)
hardware. We're not doing it just
[59:15] (3555.20s)
because, I mean, Linux has been working for
[59:16] (3556.72s)
all of us for 20 years. We're doing it
[59:18] (3558.88s)
to support new hardware to support new
[59:20] (3560.56s)
use models to support things. We don't
[59:22] (3562.56s)
add things for fun generally. We add it
[59:24] (3564.32s)
to solve a problem that somebody had.
[59:26] (3566.08s)
Most of Linux is written using C or
[59:29] (3569.04s)
C++, right? No C++, just C. And I
[59:33] (3573.20s)
guess for some hardware drivers, is
[59:34] (3574.88s)
there assembly ever involved or no? No.
[59:37] (3577.20s)
Um, we'll drop down into assembly in the
[59:38] (3578.88s)
early boot of a processor, and then some
[59:41] (3581.68s)
core functionality, like locking, that
[59:43] (3583.84s)
drivers or other code will call
[59:46] (3586.72s)
will go down; string functions
[59:48] (3588.64s)
and whatnot will go down to good
[59:50] (3590.16s)
assembly language that's tuned for the
[59:51] (3591.52s)
different processors. Also, when you boot,
[59:53] (3593.84s)
Linux looks at the processor you're
[59:55] (3595.60s)
running on and patches itself to pick
[59:57] (3597.68s)
the best versions of those
[60:00] (3600.24s)
assembly functions, and then it
[60:01] (3601.68s)
continues on, which is crazy. It
[60:03] (3603.68s)
patches itself at boot time. So, so hold
[60:05] (3605.36s)
on, but some of that is assembly,
[60:07] (3607.84s)
or is that some of that's assembly in
[60:09] (3609.12s)
the very beginning and some of those
[60:10] (3610.16s)
low-level functions but drivers don't
[60:11] (3611.60s)
ever touch assembly. Okay. So so
[60:13] (3613.12s)
basically like from from a Linux contrib
[60:16] (3616.00s)
now you know one one thing that actually
[60:18] (3618.56s)
the way we started talking is uh there
[60:20] (3620.56s)
is a proposal to introduce Rust
[60:22] (3622.88s)
because it's just more memory safe.
[60:24] (3624.56s)
It's also a language growing in
[60:26] (3626.48s)
popularity and some people would like to
[60:28] (3628.88s)
do more Rust development. What is your
[60:30] (3630.88s)
take on this? Do you think Linux
[60:33] (3633.92s)
at some point might support Rust?
[60:36] (3636.48s)
And, you know,
[60:38] (3638.48s)
what is your thinking on doing
[60:40] (3640.08s)
things outside of C? So we have 25,000
[60:43] (3643.12s)
lines of Rust in the kernel already. Oh,
[60:45] (3645.60s)
we do. Okay, awesome. Yeah. Um so most
[60:48] (3648.32s)
of that is just bindings. There's no
[60:50] (3650.08s)
real functionality. Um in the latest
[60:51] (3651.92s)
release, um if the kernel crashes, it'll
[60:54] (3654.00s)
put up a QR code. You can take a picture
[60:55] (3655.92s)
of it to get the crash dump. That code
[60:57] (3657.60s)
was written in Rust. Oh, nice. Um, that's
[60:59] (3659.68s)
in Rust. Um, so the Rust for Linux
[61:02] (3662.80s)
developers have been working for a long
[61:03] (3663.92s)
time. A couple years ago, they came to
[61:05] (3665.36s)
us and said, "We think we're ready to do
[61:06] (3666.80s)
this. Do you want it?" And we said,
[61:08] (3668.80s)
"Yeah, let's try this experiment. You're
[61:10] (3670.80s)
willing to do the work? Who am I to say
[61:13] (3673.36s)
no?" Um, I mean, it's Linux. Yeah. I
[61:17] (3677.04s)
mean, now the problem
[61:19] (3679.52s)
with Linux and Rust is it would be
[61:22] (3682.32s)
easier to write a core piece of Linux
[61:23] (3683.92s)
in Rust than it would be to write a
[61:25] (3685.20s)
driver. A driver consumes from
[61:27] (3687.12s)
everywhere in the kernel. Mhm. So you
[61:28] (3688.88s)
want to talk locking, you want to talk
[61:31] (3691.12s)
input and output, you want to talk
[61:32] (3692.88s)
to the driver model, talk to the USB
[61:34] (3694.32s)
port, all this stuff. Drivers
[61:36] (3696.00s)
can be really tiny because they take
[61:37] (3697.92s)
resources from the rest of the kernel.
[61:39] (3699.76s)
In Rust, you need to have a binding
[61:41] (3701.68s)
between the C code and the Rust code.
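
A minimal sketch of what such a binding layer does, under stated assumptions: `c_checksum` is a hypothetical stand-in for a C function (written in Rust here only so the example compiles on its own; a real binding would declare it in an `extern "C"` block), and `checksum` is the safe wrapper a Rust driver would call. This is an illustration of the idea, not code from the kernel.

```rust
use std::os::raw::c_int;

/// Hypothetical stand-in for a C function that takes a raw pointer and a
/// length. In a real binding this would be implemented in C and only
/// declared on the Rust side.
unsafe extern "C" fn c_checksum(buf: *const u8, len: usize) -> c_int {
    let mut sum: c_int = 0;
    for i in 0..len {
        // Raw pointer read: the caller must guarantee `buf` covers `len` bytes.
        let byte = unsafe { *buf.add(i) };
        sum = sum.wrapping_add(c_int::from(byte));
    }
    sum
}

/// The binding: a safe function whose slice argument carries pointer and
/// length together, so callers cannot pass a mismatched pair.
pub fn checksum(data: &[u8]) -> c_int {
    // SAFETY: `data` is a valid slice, so its pointer is readable for
    // exactly `data.len()` bytes.
    unsafe { c_checksum(data.as_ptr(), data.len()) }
}
```

A driver written against this wrapper would call `checksum(&bytes)` and never touch the raw pointer itself; the `unsafe` reasoning lives in one place, inside the binding.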
[61:43] (3703.52s)
There's an intermediate layer. The
[61:46] (3706.48s)
kernel in C has these very opinionated
[61:48] (3708.56s)
ideas of how it handles objects
[61:50] (3710.64s)
and how it does memory; it has
[61:52] (3712.64s)
its memory model. Rust has its own very
[61:54] (3714.40s)
opinionated model of how it does this
[61:56] (3716.16s)
type of thing. Same idea. This meshing is tough.
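
To make Rust's side of that model concrete, here is a small user-space sketch (an assumption for illustration, using the standard library's `Mutex` rather than any kernel lock type). The lock guard is an owned value, so the lock is released whenever the guard goes out of scope, including on an early error return. The function name `bump` is hypothetical.

```rust
use std::sync::Mutex;

/// Increments the counter unless `fail` is set. The lock is released on
/// every return path because `guard` is dropped when it leaves scope.
fn bump(counter: &Mutex<u32>, fail: bool) -> Result<u32, &'static str> {
    let mut guard = counter.lock().unwrap();
    if fail {
        // Early exit: no explicit unlock is needed, and none can be forgotten.
        return Err("early exit");
    }
    *guard += 1;
    Ok(*guard)
}
```

In C the equivalent error path needs an explicit unlock before returning; here the compiler inserts the cleanup, which is the kind of scope-based rule the bindings have to reconcile with the kernel's own conventions.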
[61:59] (3719.92s)
This meshing is also the most crazy
[62:01] (3721.60s)
complex Rust code you've ever seen. So
[62:04] (3724.32s)
for a new Rust developer like me, I can
[62:06] (3726.96s)
barely read the bindings, but I trust
[62:08] (3728.64s)
other people are doing it. So yes, so
[62:10] (3730.88s)
the trick is we now need to write a
[62:13] (3733.04s)
binding for every different part of the
[62:15] (3735.12s)
kernel in order to write Rust code,
[62:16] (3736.32s)
a Rust driver. If you want to do
[62:18] (3738.40s)
the QR generator, that's simple. That
[62:20] (3740.16s)
was this one function. Yeah. So over the
[62:23] (3743.44s)
year, the past couple years, people have
[62:24] (3744.96s)
been trying to write bindings to
[62:26] (3746.24s)
try and do things. We've had a bunch of
[62:28] (3748.16s)
example drivers, like a new disk driver,
[62:30] (3750.32s)
to compare writing a driver in C versus Rust. It
[62:32] (3752.56s)
turns out there are still some
[62:33] (3753.76s)
performance issues with Rust code versus C
[62:35] (3755.68s)
code because we can do some tricks in C
[62:37] (3757.36s)
that they can't do yet in Rust. Yeah.
[62:38] (3758.88s)
And the tooling, the Rust
[62:40] (3760.16s)
developers are working on it. The core Rust
[62:41] (3761.92s)
developers, of the language, some of
[62:43] (3763.28s)
them are Linux kernel developers.
[62:44] (3764.72s)
They've always wanted Rust to be working
[62:46] (3766.08s)
for Linux. Um, the Rust model is good.
[62:49] (3769.04s)
Memory safety at our level does not mean
[62:51] (3771.68s)
that you can't crash the kernel. Uh you
[62:53] (3773.60s)
can still overwrite things. Memory
[62:55] (3775.44s)
safety in Rust just means that for the memory
[62:57] (3777.60s)
that you pass around, you know whether you have
[62:59] (3779.68s)
ownership of it or don't have ownership of it,
[63:02] (3782.24s)
and when things go out of scope,
[63:04] (3784.72s)
they'll get cleaned up properly. So I've
[63:06] (3786.56s)
seen every single kernel bug for the
[63:07] (3787.92s)
past 18 years. Half of them will be
[63:09] (3789.84s)
fixed with Rust. It's just it's just
[63:12] (3792.32s)
going to be fixed with Rust. It's the
[63:14] (3794.40s)
stupid off-by-one bugs. It's the "Oops, I
[63:17] (3797.20s)
overwrote an array by one and I didn't realize
[63:18] (3798.80s)
it. Oops, I forgot to clean up
[63:21] (3801.52s)
this error path. Yeah, I forgot to
[63:23] (3803.36s)
unlock this lock." It's stupid
[63:25] (3805.92s)
little things like that. There's logic
[63:27] (3807.52s)
bugs. Of course, you can write logic
[63:29] (3809.20s)
bugs in Rust. You'll always have those,
[63:31] (3811.20s)
right? But famously, with the QR
[63:33] (3813.44s)
code in Rust: to make the QR code, C
[63:36] (3816.40s)
passed into the Rust code a pointer to a
[63:38] (3818.64s)
buffer and the buffer size. The Rust
[63:40] (3820.48s)
code forgot to look at how big the
[63:42] (3822.24s)
buffer was and it scribbled right over
[63:43] (3823.60s)
memory. So you can write memory unsafe
[63:45] (3825.68s)
code in Rust just fine and you can crash
[63:47] (3827.92s)
things in Rust. So memory safety here
[63:50] (3830.40s)
means it's the safety of object life
[63:53] (3833.12s)
cycles and things like that. It doesn't
[63:54] (3834.96s)
mean it's going to remove all bugs. It's
[63:56] (3836.96s)
not a golden bullet or anything like
[63:58] (3838.64s)
silver bullet. But I think yes I think
[64:01] (3841.84s)
Rust needs to come in because it should
[64:05] (3845.04s)
be easier to write drivers in this
[64:06] (3846.64s)
stuff. We have a lot of issues with
[64:08] (3848.64s)
lifetime rules of when you yank out a
[64:11] (3851.92s)
device. Devices are dynamic and dealing
[64:14] (3854.08s)
with the reference counting and things
[64:15] (3855.76s)
like that is very tricky to get right.
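
As a rough user-space analogy for the device lifetime problem (an assumption for illustration; the kernel's own reference counting is a different mechanism), the sketch below uses the standard library's `Arc` and `Weak`. The `Device`, `Handle`, and `plug` names are hypothetical. A handle holds only a weak reference, so once the device is "yanked out" the handle sees `None` instead of touching freed memory.

```rust
use std::sync::{Arc, Weak};

/// Hypothetical device object owned by whoever plugged it in.
struct Device {
    name: String,
}

/// A long-lived user of the device that must survive its removal.
struct Handle {
    dev: Weak<Device>,
}

impl Handle {
    /// Returns the device name while the device still exists, else `None`.
    fn describe(&self) -> Option<String> {
        self.dev.upgrade().map(|d| d.name.clone())
    }
}

/// Creates a device plus a handle that does not keep it alive.
fn plug(name: &str) -> (Arc<Device>, Handle) {
    let dev = Arc::new(Device { name: name.to_string() });
    let handle = Handle { dev: Arc::downgrade(&dev) };
    (dev, handle)
}
```

Dropping the `Arc<Device>` plays the role of unplugging: the memory is released when the last strong reference goes away, and every later `describe()` call has to handle the `None` case explicitly.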
[64:17] (3857.52s)
There's parts in the kernel we still do
[64:18] (3858.96s)
not have it right and we know we don't
[64:20] (3860.24s)
have it right. Rust is forcing us to
[64:23] (3863.04s)
actually document our C code better and
[64:26] (3866.40s)
it's cleaning up. So if Rust disappeared
[64:28] (3868.00s)
tomorrow, I've had to clean up code in
[64:29] (3869.52s)
the driver core that's like, oh yeah, I
[64:31] (3871.36s)
guess we can do things better and safer
[64:33] (3873.20s)
in the C code in order to make Rust
[64:34] (3874.96s)
easier. Mhm. And we have and so it's
[64:37] (3877.12s)
making us rethink how we do a lot of our
[64:39] (3879.12s)
existing code in the kernel. To be fair,
[64:41] (3881.84s)
a lot of core kernel people are very
[64:44] (3884.40s)
resistant to that. They don't like
[64:45] (3885.84s)
change, don't like different languages.
[64:47] (3887.76s)
Um, one core kernel developer said, "I
[64:49] (3889.52s)
don't like working with a project that
[64:50] (3890.96s)
has, um, multiple languages in it," just
[64:53] (3893.12s)
because it's tricky, and they are free to
[64:55] (3895.84s)
do that. They're not stepping on
[64:56] (3896.88s)
anybody's toes. Um, a lot of it's
[64:58] (3898.40s)
miscommunication and a lot of it comes
[64:59] (3899.84s)
down to people. Again, famously, in this
[65:03] (3903.52s)
binding I wrote the driver core many
[65:05] (3905.60s)
many years ago of how drivers work in
[65:07] (3907.04s)
the system in the kernel. There had to
[65:08] (3908.96s)
be a binding for that in Rust. I saw this
[65:12] (3912.32s)
code and I said this is horrible. This
[65:14] (3914.40s)
isn't going to work at all. It's
[65:15] (3915.36s)
miserable. I went and actually met with
[65:17] (3917.36s)
the developers; there's a Rust
[65:19] (3919.12s)
Linux conference. We sat down. I think
[65:20] (3920.72s)
they gave a whole presentation just for
[65:22] (3922.24s)
me. Um turns out I was wrong and they
[65:25] (3925.12s)
were wrong. We both were wrong and they
[65:27] (3927.04s)
were doing crazy things like they had a
[65:28] (3928.64s)
thousand lines of Rust code that
[65:30] (3930.80s)
I do in two lines of C code. I'm like
[65:32] (3932.72s)
well why? They're like well we didn't
[65:33] (3933.68s)
want to change the C code. I'm like we
[65:35] (3935.28s)
can change the C code because I just did
[65:37] (3937.28s)
that because it was easy in C but if I
[65:38] (3938.64s)
change that you get rid of a thousand
[65:39] (3939.84s)
lines of Rust. Let's do that. And again
[65:41] (3941.76s)
it comes down to okay understanding what
[65:43] (3943.44s)
your problems are understanding what my
[65:45] (3945.12s)
problems are and let's work together.
[65:46] (3946.56s)
And now we have bindings in the kernel
[65:48] (3948.16s)
that you can actually write some drivers
[65:49] (3949.52s)
with. And the Red Hat developers are
[65:51] (3951.60s)
starting to write the new Nvidia GPU
[65:53] (3953.36s)
drivers in Rust and they're starting to
[65:55] (3955.60s)
put the proposals out there. The Apple
[65:57] (3957.60s)
GPU drivers for the Apple MacBooks
[65:59] (3959.76s)
are written in Rust. Those patches are
[66:01] (3961.52s)
not merged, but they're written in Rust
[66:03] (3963.20s)
and proven on a fork. Um, that works
[66:05] (3965.60s)
great. Um, there's a whole bunch of
[66:07] (3967.52s)
crazy object life cycle issues with
[66:09] (3969.36s)
graphics drivers and Rust makes it a lot
[66:11] (3971.28s)
easier for them to do. Um, I think
[66:12] (3972.96s)
you'll see a lot more of the
[66:15] (3975.04s)
simple stupid drivers for hardware
[66:17] (3977.20s)
devices being written in Rust because
[66:18] (3978.88s)
all they want to do is read and write to
[66:20] (3980.24s)
some random memory bits and it's really
[66:22] (3982.24s)
easy to do that in Rust and you can do
[66:23] (3983.20s)
it in actually less code than you can do
[66:24] (3984.40s)
it in C code. Yeah. And I think that's
[66:26] (3986.72s)
we now have the infrastructure in there.
[66:28] (3988.08s)
So I think we've hit the tipping point
[66:29] (3989.84s)
where you'll start seeing new stuff in
[66:31] (3991.04s)
there and we need to do that. I mean
[66:32] (3992.24s)
there's mandates from governments that
[66:33] (3993.84s)
you can't use memory unsafe languages
[66:36] (3996.24s)
like C in products. Yeah. And if I want
[66:38] (3998.32s)
to see Linux succeed, which I do,
[66:40] (4000.88s)
we're going to have to change. And I can
[66:42] (4002.40s)
say going forward, if you want to write
[66:44] (4004.00s)
in Rust, you can write in Rust. Now,
[66:45] (4005.52s)
that being said, we still have 40
[66:46] (4006.64s)
million lines of C code. Yeah. So, we
[66:48] (4008.08s)
have some very, very good developers out
[66:49] (4009.52s)
there working on mitigating the problems
[66:51] (4011.12s)
we have in C. We now have bounds checking
[66:53] (4013.28s)
for our stuff. We now have other we call
[66:55] (4015.28s)
them seat belts and airbags that protect
[66:58] (4018.08s)
your C code from doing stupid things.
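
The "seat belt" idea can be sketched for the pointer-plus-size case described earlier with the QR code. The example is in Rust rather than C only to keep one language for these sketches, and it is not the kernel's actual hardening mechanism; `write_checked` is a hypothetical helper. The point is simply that the size that arrives with the pointer gets checked before anything is written.

```rust
/// Copies `src` into a caller-provided buffer described by a raw pointer
/// and a size, refusing the copy when it would not fit.
///
/// # Safety
/// `dst` must be valid for writes of `dst_len` bytes.
unsafe fn write_checked(
    dst: *mut u8,
    dst_len: usize,
    src: &[u8],
) -> Result<usize, &'static str> {
    if dst.is_null() {
        return Err("null buffer");
    }
    // The check that was missing in the story: compare against the size
    // that came with the pointer before touching memory.
    if src.len() > dst_len {
        return Err("source larger than buffer");
    }
    // Turn the raw pair into a slice so the copy itself is bounds checked.
    let out = unsafe { std::slice::from_raw_parts_mut(dst, dst_len) };
    out[..src.len()].copy_from_slice(src);
    Ok(src.len())
}
```

Called with a 4-byte buffer and a 3-byte source it copies and returns `Ok(3)`; with an 8-byte source it returns an error and leaves the buffer untouched.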
[67:00] (4020.16s)
And we're working with the compiler authors
[67:01] (4021.76s)
to add new extensions to C and make
[67:04] (4024.32s)
things safer for the C code because we
[67:06] (4026.24s)
want to protect the code that we have
[67:07] (4027.84s)
today because you're not going to
[67:08] (4028.72s)
rewrite code in Rust. Don't worry about
[67:10] (4030.40s)
that. Google famously published
[67:11] (4031.84s)
something recently saying over the past
[67:13] (4033.36s)
couple years we've written our new code
[67:14] (4034.72s)
in Rust and we got uh overwhelmingly
[67:17] (4037.60s)
more secure because we didn't touch the
[67:19] (4039.68s)
old code and bugs degrade over time.
[67:22] (4042.40s)
There's still going to be bugs in the
[67:23] (4043.36s)
older stuff, but most bugs happen in
[67:24] (4044.88s)
your new code, not in your old code.
[67:26] (4046.88s)
That's awesome. I'm sensing
[67:28] (4048.40s)
you're excited about Rust, and
[67:30] (4050.16s)
it's also just nice to see the
[67:31] (4051.60s)
evolution. Yeah, it's evolution and see
[67:33] (4053.60s)
what happens and if it fails tomorrow,
[67:35] (4055.12s)
we can rip it out, but we have
[67:36] (4056.80s)
developers willing to do this work for
[67:38] (4058.24s)
us. It's not intruding on other people's
[67:40] (4060.24s)
stuff. Well, and I think it does go
[67:42] (4062.24s)
back to what you said earlier.
[67:44] (4064.72s)
I understand that a big part of
[67:46] (4066.96s)
Linux is like, show the work,
[67:49] (4069.20s)
if it works. And same thing, you
[67:50] (4070.80s)
know, it sounds like that's how Rust
[67:52] (4072.08s)
started and it's also how it's
[67:53] (4073.44s)
progressing. People are showing that it
[67:54] (4074.72s)
works. they're proving that it works, it
[67:57] (4077.20s)
solves their problem, it maybe even
[67:59] (4079.68s)
works better for them. And then, you
[68:00] (4080.72s)
know, step by step. Yeah. Like people
[68:01] (4081.92s)
are like, "Well, why not Zig or Hare?
[68:03] (4083.44s)
Those are other good languages." I'm
[68:04] (4084.72s)
like, "That's great, but nobody's
[68:05] (4085.68s)
proposed." Yeah. So, yeah, they want to
[68:08] (4088.16s)
do that. And to be fair, I think those
[68:09] (4089.84s)
developers who work on those languages
[68:10] (4090.88s)
don't care about Linux, which is fine.
[68:12] (4092.08s)
They don't have to. So, so looking ahead
[68:14] (4094.24s)
uh outside of uh Rust, what are other
[68:18] (4098.16s)
things that you're kind of excited about
[68:20] (4100.72s)
uh that's that's coming in Linux? uh e
[68:23] (4103.28s)
either projects changes I don't know if
[68:26] (4106.48s)
uh we we haven't said LLMs except for
[68:29] (4109.20s)
once here I don't know if that for
[68:31] (4111.28s)
example like like will LLMs have any
[68:34] (4114.08s)
impact on on how development is done?
[68:36] (4116.24s)
No, no, not there's not. I mean, they're
[68:38] (4118.72s)
all trained on Linux kernel code. So,
[68:40] (4120.24s)
you write out another driver, but LLMs
[68:42] (4122.40s)
are great for writing um boilerplate
[68:44] (4124.16s)
code and things like that. In Linux
[68:46] (4126.00s)
drivers, you don't have much boilerplate
[68:47] (4127.36s)
code because we've slimmed that down
[68:49] (4129.28s)
into the core and made that work better.
[68:51] (4131.92s)
Um, LLMs are used to find bugs and find
[68:55] (4135.44s)
the matching bug fixes that we should
[68:58] (4138.08s)
be taking. But again, we have
[69:00] (4140.40s)
published papers on that for eight
[69:01] (4141.92s)
years. Um there's been lots of research
[69:04] (4144.00s)
on that. Um, so we we've been using that
[69:06] (4146.56s)
for a while. I mean, an LLM is just applied
[69:08] (4148.40s)
statistics, right? So it's just pattern matching,
[69:10] (4150.16s)
pretty much. So code for us at this
[69:12] (4152.40s)
level, it doesn't matter that much. So
[69:14] (4154.80s)
no. And then as far as I don't know
[69:16] (4156.80s)
what's coming tomorrow because I just
[69:18] (4158.40s)
see what people send to me. So we don't
[69:20] (4160.00s)
have a plan. I mean, we always joke, you
[69:22] (4162.16s)
know, Linux is evolution, not
[69:23] (4163.92s)
intelligent design. Um, it's just
[69:26] (4166.24s)
whatever shows up, right? Because you're
[69:27] (4167.92s)
solving your problem and we'll figure
[69:28] (4168.96s)
out how to fit it in there with
[69:30] (4170.24s)
everybody else's stuff and um, make sure
[69:32] (4172.00s)
it does work out. People are working
[69:33] (4173.84s)
on new features. I mean, Linux is people
[69:36] (4176.00s)
are like, "Oh, it's an old model. It's
[69:37] (4177.36s)
the old Unix model." It's like, yeah, we
[69:39] (4179.04s)
can run code from 20 and 30 and 40 years
[69:41] (4181.04s)
ago, but we can also run new stuff. We
[69:43] (4183.20s)
have new features. We have new IO paths
[69:45] (4185.12s)
that are even better. We have new types
[69:47] (4187.04s)
of functionality. We have new security
[69:48] (4188.64s)
models. We have new capabilities. We
[69:50] (4190.32s)
have new types of stuff for the new
[69:51] (4191.60s)
stuff, but we didn't break the old
[69:52] (4192.72s)
stuff. So, we can do both stuff. You can
[69:54] (4194.40s)
rewrite your code. But I know the
[69:56] (4196.16s)
databases are rewriting to use
[69:58] (4198.72s)
io_uring, which is a new way to do I/O which
[70:00] (4200.88s)
gets the user space to kernel boundary
[70:02] (4202.96s)
out of the way and takes a faster
[70:04] (4204.88s)
path. So they're speeding up the
[70:06] (4206.24s)
databases by porting to new Linux
[70:08] (4208.48s)
features but their old databases still
[70:10] (4210.40s)
run just fine. And so it's like people
[70:12] (4212.72s)
look at it like oh nothing's changed
[70:15] (4215.04s)
because the old stuff still works. This
[70:17] (4217.36s)
whole that was the goal. The old stuff
[70:18] (4218.80s)
still works. So I don't know but just
[70:20] (4220.96s)
see what new I mean new hardware
[70:22] (4222.40s)
features. I see the new hardware coming
[70:24] (4224.40s)
all the time. We get told by the CPU
[70:26] (4226.56s)
vendors like look at this new chip. It's
[70:27] (4227.92s)
like great. But so that's always fun.
[70:30] (4230.96s)
And then in terms of contributing to
[70:32] (4232.48s)
Linux, so we we just went through this
[70:34] (4234.32s)
example and it's it seems pretty easy to
[70:37] (4237.12s)
contribute honestly. like you know I I
[70:38] (4238.96s)
wanted to ask on advice to contribute
[70:41] (4241.28s)
but my sense is just do it like it's not
[70:44] (4244.00s)
that difficult. But from a professional
[70:47] (4247.20s)
point of view, what do you
[70:48] (4248.64s)
think a developer who, you know, is
[70:51] (4251.60s)
building other stuff at at a company
[70:53] (4253.60s)
what would they get professionally out
[70:55] (4255.60s)
of contributing even one change or or a
[70:58] (4258.24s)
few changes to to Linux like how how
[71:00] (4260.48s)
could their you know outlook change or
[71:02] (4262.08s)
or what could they learn well that's the
[71:04] (4264.72s)
best thing: it's your resume. So
[71:07] (4267.28s)
I talk to college students. I
[71:08] (4268.64s)
talk to college students at VU and other
[71:10] (4270.16s)
universities all the time. Say, "Hey,
[71:11] (4271.84s)
contribute to the kernel while you have
[71:13] (4273.04s)
time." And then when you go to get
[71:14] (4274.48s)
hired, somebody can look at you say,
[71:16] (4276.16s)
"Oh, yeah, look, you do play well with
[71:17] (4277.52s)
others and you did contribute
[71:18] (4278.88s)
upstream." Because when you come into a
[71:20] (4280.32s)
company, you're not writing code from
[71:22] (4282.48s)
scratch. You're working with other
[71:23] (4283.44s)
people. You're working with existing
[71:24] (4284.48s)
code bases. If you contribute to Linux,
[71:26] (4286.40s)
you or any open source project, you show
[71:28] (4288.48s)
that you can work with others. You can
[71:30] (4290.00s)
work with existing codebase. So, it
[71:31] (4291.60s)
shows a great skill set. When I hired
[71:33] (4293.28s)
people when I was at IBM, if you
[71:34] (4294.88s)
contributed versus not, it's like, oh,
[71:36] (4296.40s)
that's an easy sell. I'd rather take
[71:38] (4298.48s)
that. So, from a personal point of view,
[71:40] (4300.96s)
contributing, you can get a job easier,
[71:42] (4302.72s)
get the next job. From another point of
[71:44] (4304.16s)
view is from an engineer, you get to
[71:45] (4305.76s)
learn new things. I wrote my first
[71:47] (4307.12s)
driver and sent it out. So, oh, here it
[71:48] (4308.88s)
is. It's all perfect. What? Everybody's
[71:50] (4310.40s)
like, "No, this is wrong. This is wrong.
[71:51] (4311.60s)
This is wrong." And you ever heard of
[71:52] (4312.64s)
multiprocessors? I'm like, "What? What
[71:54] (4314.32s)
is all this?" And that's great from an
[71:55] (4315.92s)
engineering point of view. I want to
[71:56] (4316.96s)
know better. I mean the Linux kernel
[71:58] (4318.56s)
developers, you can never have all the
[72:00] (4320.40s)
best developers in the world at the same
[72:01] (4321.68s)
company. But when in open source, we can
[72:03] (4323.76s)
all work the best operating system
[72:05] (4325.28s)
people can all work on the operating
[72:06] (4326.72s)
system together. So the depth and talent
[72:09] (4329.28s)
of the people that are working on Linux
[72:10] (4330.72s)
is just amazing. Take advantage of that.
[72:13] (4333.28s)
I'll say the Rust developers that are
[72:14] (4334.96s)
working on Rust for Linux are core Rust
[72:17] (4337.60s)
developers. These people are really,
[72:19] (4339.44s)
really, really good. They maintain core
[72:20] (4340.96s)
parts of Rust infrastructure. Take
[72:22] (4342.72s)
advantage of them. I mean, I'm learning
[72:24] (4344.40s)
so much from them. So from an
[72:26] (4346.16s)
engineering point of view, there's these
[72:28] (4348.56s)
people that are really out there and
[72:30] (4350.08s)
willing to help you and grow and as an
[72:32] (4352.64s)
engineer and learn different processes
[72:34] (4354.64s)
and learn different skills much better.
[72:36] (4356.48s)
I mean, I learned so much more working
[72:38] (4358.08s)
in the community than I ever did working
[72:40] (4360.00s)
at companies because you have better
[72:41] (4361.44s)
review process. You have more exposure
[72:43] (4363.68s)
to crazy corner cases that you hadn't
[72:46] (4366.00s)
thought of that. Oh yeah, in the real
[72:47] (4367.52s)
world, yes, that would have been one in
[72:48] (4368.80s)
a million, but we do have to take that
[72:50] (4370.16s)
because we have a million boxes, two
[72:51] (4371.60s)
four billion machines out there. Plus,
[72:54] (4374.08s)
plus plus I guess that the more curious
[72:55] (4375.68s)
you are, everything is open in Linux. So
[72:58] (4378.00s)
I remember when I when I joined Uber, I
[73:00] (4380.00s)
was just amazed by the RFC process and
[73:02] (4382.00s)
internally I could read all the RFCs and
[73:03] (4383.76s)
I spent like a week or two just kind of
[73:05] (4385.20s)
breathing and you know trying to take it
[73:06] (4386.56s)
all in in Linux is here like like any
[73:09] (4389.04s)
anyone obviously it's overwhelming if if
[73:11] (4391.12s)
you just start at once but but you can
[73:13] (4393.28s)
like target something and so so you can
[73:15] (4395.12s)
just even even if you contribute little
[73:17] (4397.28s)
or even before you contribute you could
[73:18] (4398.96s)
just learn you can see how the changes
[73:20] (4400.80s)
are made. You can try to understand
[73:22] (4402.00s)
these things. Yeah, it's I will say it's
[73:24] (4404.00s)
not the best learning operating system.
[73:25] (4405.76s)
There's really good learning operating
[73:26] (4406.80s)
systems out there. We're not this. That
[73:28] (4408.24s)
being said, I mean people still write
[73:29] (4409.84s)
academic papers about it and all this
[73:31] (4411.68s)
stuff. We want to rewrite the
[73:32] (4412.64s)
scheduler, do all this fun stuff with
[73:34] (4414.00s)
it because it is a realworld tool. I
[73:37] (4417.12s)
mean, I learned from Minix and Linus
[73:38] (4418.88s)
learned from Minix, which was a learning
[73:40] (4420.32s)
operating system. And then, um, we took
[73:42] (4422.88s)
those ideas and we made Linux
[73:44] (4424.96s)
with it. I mean, Linus did it way
[73:46] (4426.40s)
before me, but um learning operating
[73:49] (4429.12s)
systems are great, but working on a real
[73:51] (4431.04s)
world system is a little bit different.
[73:52] (4432.88s)
That being said, there's really there's
[73:54] (4434.64s)
parts of the kernel that are very easy
[73:55] (4435.92s)
to get into for newbies. We have a whole
[73:57] (4437.60s)
section of code with really bad crummy
[74:00] (4440.32s)
drivers that are the wrong coding style.
[74:02] (4442.72s)
They have, um, the wrong formatting. They
[74:04] (4444.96s)
just have a lot of dead code
[74:07] (4447.12s)
that's there for beginners to take up
[74:09] (4449.12s)
and take your first patch, fix the
[74:12] (4452.24s)
spelling mistakes, fix the coding style,
[74:13] (4453.68s)
learn how to do this stuff. And there's
[74:14] (4454.72s)
a whole website, kernelnewbies.org, a
[74:16] (4456.80s)
wiki that has a whole bunch of stuff on
[74:18] (4458.48s)
it. How to write your first kernel
[74:19] (4459.68s)
patch, how to get involved. Um I've
[74:21] (4461.68s)
given old YouTube talks, if you search for
[74:23] (4463.28s)
how to write a kernel patch. I need to
[74:24] (4464.72s)
do a newer one. It's fun. I've gone to
[74:26] (4466.72s)
universities and said here I gave
[74:28] (4468.40s)
everybody a file that you're going to
[74:29] (4469.52s)
write a kernel patch for this file.
[74:30] (4470.72s)
It's like what? Okay. And they do it and
[74:32] (4472.64s)
by the end of the class, end of two
[74:34] (4474.32s)
hours, they send a patch off and they
[74:35] (4475.52s)
got accepted. You know, it's very
[74:37] (4477.92s)
simple. Um it's not a difficult thing to
[74:40] (4480.48s)
do. Um, and we want new people to get
[74:42] (4482.56s)
involved because we don't know
[74:44] (4484.88s)
who's out there or what they can
[74:46] (4486.40s)
contribute if they just want to do
[74:47] (4487.60s)
something for fun or do something for
[74:49] (4489.28s)
real. It's great. Awesome. Well, this
[74:52] (4492.24s)
has been really interesting and I'd just
[74:54] (4494.08s)
like to close off with some rapid
[74:55] (4495.28s)
questions where I just ask and then
[74:57] (4497.92s)
you know you do what comes to mind.
[74:59] (4499.68s)
What's the most memorable patch that
[75:01] (4501.28s)
you've contributed to in Linux? So, this
[75:04] (4504.40s)
is going to be about people again.
[75:06] (4506.80s)
In the early 2000s, um,
[75:09] (4509.36s)
Microsoft was saying Linux is a cancer.
[75:11] (4511.20s)
We're all worried about Linux. Oh, I
[75:12] (4512.72s)
remember that. Yes, I remember that
[75:13] (4513.84s)
stuff. Um we started getting some really
[75:15] (4515.60s)
really good patches for some hardware
[75:17] (4517.52s)
that we really didn't know that well
[75:19] (4519.20s)
that was showing some really good stuff
[75:21] (4521.60s)
and it was like this is really good and
[75:24] (4524.24s)
we're like where did you get this
[75:25] (4525.52s)
information? How did you know this
[75:26] (4526.80s)
stuff? Is this, like, somebody trying
[75:28] (4528.24s)
to sneak this in? And the person wrote
[75:30] (4530.48s)
back and said here's how I found this.
[75:31] (4531.76s)
Here's how I tested it, how they did
[75:33] (4533.04s)
this. Like, okay, all right, we took this
[75:34] (4534.72s)
and over time we took all these patches
[75:36] (4536.40s)
over time and then we have this
[75:37] (4537.84s)
conference once a year for all the
[75:39] (4539.52s)
maintainers and you get invited to it
[75:41] (4541.44s)
and we're like oh let's give them an
[75:42] (4542.80s)
invite because it was really good and it
[75:44] (4544.96s)
was in Canada every year for a number of
[75:47] (4547.28s)
years for some reason um and they came
[75:49] (4549.44s)
and they showed up, and he showed up
[75:51] (4551.44s)
and he's like, "Sorry, I had to bring my
[75:52] (4552.96s)
mom," because he was in high school. He was
[75:55] (4555.36s)
17 years old. None of us knew.
[75:58] (4558.32s)
and he contributed and it was like okay
[76:00] (4560.48s)
great and it turns out um he later went
[76:02] (4562.16s)
on to MIT And now he's a professor at
[76:03] (4563.84s)
Stanford. Wow. Yeah. All you see is an
[76:06] (4566.72s)
email address. Yeah. Say Adamo. It's
[76:09] (4569.60s)
like okay. Yeah. It's like that. I
[76:11] (4571.44s)
mean it's things like that. That's good.
[76:13] (4573.12s)
Um another one I'm really happy about is
[76:15] (4575.28s)
there's lots of drivers that have
[76:17] (4577.20s)
been sitting outside the kernel tree
[76:18] (4578.48s)
for many many years just cuz people
[76:20] (4580.32s)
never got them upstream or whatnot. Um
[76:22] (4582.32s)
one of them is the subsystem that handles
[76:24] (4584.80s)
Braille keyboards. So Braille displays.
[76:27] (4587.60s)
Yeah. Just feel that those are outside
[76:29] (4589.12s)
the tree. I and a couple other
[76:30] (4590.56s)
developers worked with those people and
[76:32] (4592.48s)
got them in the tree and got them
[76:33] (4593.68s)
working and now they're shipping with
[76:35] (4595.04s)
all devices. So we made sure that these
[76:37] (4597.44s)
people are not always having to
[76:39] (4599.68s)
patch in this out-of-tree stuff, because
[76:41] (4601.92s)
these devices are only needed
[76:43] (4603.76s)
by a very tiny subset, but now it's
[76:45] (4605.68s)
available for everybody. I'm very happy
[76:46] (4606.96s)
to see that happen. Wow. And I guess
[76:48] (4608.80s)
this goes back to Linux, to what you
[76:50] (4610.64s)
said why companies will contribute a few
[76:53] (4613.12s)
developers per year because now when you
[76:54] (4614.88s)
take Linux you get an OS for example
[76:56] (4616.88s)
that also has Braille support. Like, you know,
[76:59] (4619.04s)
that itself, like adding it to an
[77:01] (4621.12s)
existing product, or if you built an
[77:03] (4623.04s)
OS like that itself would be a massive
[77:05] (4625.20s)
undertaking. Yeah. So now it supports
[77:07] (4627.04s)
all the devices out there. So awesome.
[77:08] (4628.96s)
What's your favorite programming
[77:10] (4630.00s)
language? It's still C. Still C. I mean
[77:12] (4632.64s)
I've been doing C for what 30 years
[77:14] (4634.32s)
every day. So yeah C. Yeah. Um, I've
[77:17] (4637.04s)
been doing a lot of Rust lately. Rust I
[77:19] (4639.28s)
feel like I'm able to write really
[77:21] (4641.20s)
sloppy code and it
[77:23] (4643.76s)
works. I don't feel like I have
[77:25] (4645.76s)
to be as precise as in C, which, I
[77:28] (4648.00s)
don't know if that's good or bad. And
[77:30] (4650.40s)
what's a book or two that
[77:32] (4652.32s)
you would recommend reading? The old
[77:34] (4654.56s)
Code Complete book was a really good
[77:36] (4656.64s)
one. That was a really good one. It
[77:38] (4658.00s)
taught me that um coding style matters.
[77:40] (4660.64s)
It doesn't matter what the coding style
[77:42] (4662.24s)
is. It's just that having a set
[77:45] (4665.84s)
coding style matters, because our brains
[77:47] (4667.28s)
work on patterns, and as programmers
[77:49] (4669.04s)
we're reading patterns, and when the
[77:50] (4670.88s)
pattern's the same, the metadata goes away
[77:53] (4673.20s)
and we can see the logic easier. Code
[77:54] (4674.88s)
Complete has aged a little bit weirdly.
[77:56] (4676.32s)
If you look at the first book it has a
[77:57] (4677.76s)
lot more C examples and whatnot but it
[77:59] (4679.60s)
it talks about the basics behind
[78:01] (4681.60s)
programming all that stuff and that was
[78:03] (4683.28s)
a really really good book. Um, on the
[78:05] (4685.36s)
flip side, another really fun one, um,
[78:07] (4687.12s)
that's really tiny, Programming Pearls,
[78:09] (4689.20s)
and like bit fiddling and, um, cute
[78:12] (4692.32s)
little algorithms and neat stuff like
[78:14] (4694.08s)
that, which surprisingly we still do
[78:16] (4696.48s)
today. We're talking about adding parity
[78:18] (4698.40s)
functions in a common way and
[78:20] (4700.08s)
everybody's like, "No, if you do it this
[78:21] (4701.20s)
way, you do it this way, it'll be
[78:22] (4702.32s)
faster." And this, so we're still
[78:23] (4703.84s)
messing with these things that people
[78:25] (4705.44s)
have messed with for 40, 50, 60 years.
[78:27] (4707.68s)
And these things still matter, and they
[78:29] (4709.20s)
matter to people because cycles matter
[78:30] (4710.64s)
and power matters and things like that.
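
As an illustration of the kind of bit fiddling being described here, this is a minimal sketch in C of two ways to compute the parity of a 32-bit word (1 if an odd number of bits are set, 0 otherwise). The function names are hypothetical and this is not the kernel's actual code; it only shows why people argue about which variant is faster.

```c
#include <assert.h>
#include <stdint.h>

/* Reference version: XOR the bits together one at a time.
 * Simple, but loops once per bit position up to the highest set bit. */
static unsigned parity_loop(uint32_t x)
{
    unsigned p = 0;
    while (x) {
        p ^= x & 1u;
        x >>= 1;
    }
    return p;
}

/* Folding version: XOR the upper half into the lower half until only
 * 4 bits matter, then use 0x6996 as a 16-entry lookup table whose
 * bit i holds the parity of the 4-bit value i. No loop, no branches. */
static unsigned parity_fold(uint32_t x)
{
    x ^= x >> 16;
    x ^= x >> 8;
    x ^= x >> 4;
    return (0x6996u >> (x & 0xfu)) & 1u;
}

/* Hypothetical self-check: both versions must agree. */
static void parity_selftest(void)
{
    for (uint32_t x = 0; x < 100000u; x++)
        assert(parity_loop(x) == parity_fold(x));
    assert(parity_fold(0xffffffffu) == 0);
}
```

Both functions return the same answer; the second just uses fewer operations, which is the sort of difference that matters when cycles and power matter.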
[78:32] (4712.08s)
So between those two, those are my
[78:34] (4714.16s)
favorite ones. Well, this is
[78:35] (4715.84s)
awesome. This has been such an
[78:37] (4717.28s)
interesting and like for me just really
[78:39] (4719.52s)
educational and eye-opening chat. So
[78:42] (4722.24s)
I'm glad we did it. Well, thanks for
[78:43] (4723.44s)
having me. I found this episode to be a
[78:45] (4725.04s)
really interesting one about Linux. I'm
[78:47] (4727.12s)
still amazed that an open source project
[78:48] (4728.64s)
managed to become the most widespread
[78:50] (4730.40s)
operating system in the world despite
[78:52] (4732.16s)
not being a commercial business. It's
[78:54] (4734.00s)
such an interesting and inspiring
[78:55] (4735.68s)
project. You can find Greg on social
[78:57] (4737.76s)
media as linked in the show notes below.
[78:59] (4739.76s)
And if you'd like to try your hand at
[79:01] (4741.20s)
contributing to Linux, visit
[79:02] (4742.64s)
kernelnewbies.org.
[79:04] (4744.40s)
For more deep dives related to backend
[79:06] (4746.08s)
engineering, check out The Pragmatic
[79:07] (4747.52s)
Engineer articles linked in the show
[79:09] (4749.12s)
notes below. If you enjoyed this
[79:10] (4750.80s)
podcast, please do subscribe on your
[79:12] (4752.40s)
favorite podcast platform and on
[79:13] (4753.92s)
YouTube. This helps more people discover
[79:15] (4755.92s)
the podcast and a special thank you if
[79:17] (4757.84s)
you leave a rating. Thanks and see you
[79:20] (4760.16s)
in the next