00:05 The internet today has been described as this big nervous system for humanity, right? So much of what we do today is carried through these digital pipes. So much of our economy is just run entirely on the internet. And we interface with the internet and with each other through a series of applications, and these run mostly on the web.
00:29 The web is sort of like the transport protocol for how these applications move around. It's a way of being able to deploy different media, to link people from one set of things to another, to put people in applications, and to give people functionality that they didn't previously have. The page just loads, you retrieve some new functionality you didn't have, and you can now access new things. This was a huge breakthrough, of course, but it has some problems today.
01:03 Now, the crazy thing about the internet and the web is that it's just a collection of protocols, right? It's a bunch of really good ideas of how to do things, things that have worked and things that haven't, that get standardized, put into browsers and other tools and into computers, and then this orchestra of programs runs and gives us this amazing range of capabilities, right?
01:26 So everything that we as humans can do today, being able to learn remotely, to think together, to talk to a person around the world, happens thanks to these great protocols, written by people who really were just aiming to augment their own experiences and through that ended up augmenting what the rest of humanity can do. This is kind of mind-blowing, right? Because it means that if you find a problem with the internet, or you have some new ideas about what humanity should be able to do, you can just write a protocol and implement it, and if you're right and it works, you tell the world, it gets deployed, a lot of people use it, and the world is a better place. So it's a great place to be.
02:13 Now, we have found some problems that we're now telling you about, and they mostly have to do with how we move applications through the web. And it's not just applications but any kind of media: documents, images, pictures, video, and so on. So IPFS is mostly related to HTTP. It sort of enhances HTTP and should be used alongside it, or maybe as a shim, and potentially in some cases you actually just transition over to using IPFS instead.
02:45 Now, the big problem behind this change, what we're trying to solve here, is the problem of using location addressing.
02:56 So if you've looked at a URL, you have this first part of the URL that's the domain, right? Here it's example.com. The domain is resolved to an IP address, and an IP address is the set of numbers that you need to dial to get connected to another computer across the network. So what does that picture look like? Say that you're here, highlighted in blue, and you're trying to access this picture through the internet. You have an address, and this address has these numbers and then a path for the actual file. That means you're going to find a specific other computer at that address and fetch the image from that computer.
03:45 Now suppose that all of the other computers pictured here have the exact same file locally, including one that's very close to you in the network. Maybe it's in the same room. But it doesn't matter: as far as HTTP is concerned, that is not the same file, and you have to go across the world to 10.20.30.40 to find it. Actually, that probably won't resolve, because 10.x is a private, local address range. But anyway, the issue here is that we are addressing content by location. We're telling you where to find something instead of what it is. A lot of people have talked about this as a problem and come up with solutions, but it hasn't quite sunk in.
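To make the "what it is" side of that contrast concrete, here is a minimal sketch of content addressing. It is an illustration only, not the IPFS format or API: the `Peer` class, the `fetch` helper, and the use of a plain SHA-256 hex digest as the address are all assumptions made for the example.

```python
import hashlib


def content_address(data: bytes) -> str:
    """Derive an address from the bytes themselves (SHA-256 hex digest)."""
    return hashlib.sha256(data).hexdigest()


class Peer:
    """A hypothetical machine that stores blobs keyed by content address."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.store: dict[str, bytes] = {}

    def add(self, data: bytes) -> str:
        address = content_address(data)
        self.store[address] = data
        return address


def fetch(address: str, peers: list[Peer]) -> tuple[str, bytes]:
    """Ask peers in order (nearest first) and verify what comes back."""
    for peer in peers:
        data = peer.store.get(address)
        # Re-hashing proves the bytes match the address, whoever served them.
        if data is not None and content_address(data) == address:
            return peer.name, data
    raise KeyError(f"no peer has {address}")


image = b"the same picture bytes"
far_server = Peer("far-server")
laptop_next_to_you = Peer("laptop-next-to-you")

address = far_server.add(image)
laptop_next_to_you.add(image)  # same bytes, therefore the same address

# Peers are listed nearest first, so the nearby copy is the one used.
source, data = fetch(address, [laptop_next_to_you, far_server])
print(source)
```

Because the address is derived from the content, it no longer matters which machine answers: any copy that hashes to the same address is, by construction, the same file.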
04:30 And so, to drive the point home and show you why this is actually a huge problem today, picture a big room filled with people with a bunch of computers. Nowadays we carry a laptop and maybe a mobile phone, and soon a watch, and we'll have a bunch of devices. Say that I upload a picture to Facebook and I give everyone the link, and now a whole bunch of people are going to go to Facebook and fetch that image from Facebook, all the way back. Not that big of a deal, right?
05:03 Well, let's look at how it looks in the backbone. I make a request to Facebook and I send it up. Say it's just a 1 megabyte image, times 8 here because, say, there are eight links. These are really rough calculations, just to give you a picture. So say there are 8 megabytes of bandwidth used by my picture going all the way up to Facebook. And when 30 people show up, say they're all in the same room, and they all talk to Facebook and pull down the image, that's 240 megabytes of bandwidth wasted. You might say, well, that's unavoidable because we have to encrypt everything. And that's true, we have encrypted everything. We have to ship the image 30 times across the wire. It doesn't matter if it's wasted; you just have to send it.
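The rough arithmetic above can be written out as a short sketch. The function name `backbone_traffic_mb` and the simplifying model (every download crosses every link once) are assumptions for illustration; the numbers are the ones quoted in the talk.

```python
def backbone_traffic_mb(file_size_mb: float, links: int, downloads: int) -> float:
    """Total megabytes carried when every download crosses every link."""
    return file_size_mb * links * downloads


# One 1 MB picture crossing eight links on its way up.
one_upload = backbone_traffic_mb(1, links=8, downloads=1)

# Thirty people in the same room each pulling the picture back down.
whole_room = backbone_traffic_mb(1, links=8, downloads=30)

print(one_upload, whole_room)
```

The cost scales linearly with both the file size and the number of people in the room, which is why the same pattern becomes much more expensive for larger files.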
05:54 Maybe that's true. Maybe that isn't true. And, you know, maybe 240 megabytes is not a big deal, right? But what if we're looking at video instead? Imagine that you go to YouTube and you start watching a video, and it's in high def, so say it's about 200 megabytes, and you send it to everybody else, and then these 30 people start downloading a 200 megabyte video across these eight links. Now we're talking about 48 GB of bandwidth. That's starting to be a lot. When you look at bigger files, or longer videos and so on, we're looking at potentially terabytes of bandwidth needed just to move these files around, from the place that they're needed to the backbone and back.
06:38 And the crazy thing is that those same files, or the same data that represents those files, might be lying around in the same local area network that you're in. A computer right next to yours could be serving you that file. Now, there are a lot of reasons why we haven't done this historically, but it may start to be a good idea.
07:00 That was kind of intense. So while we're talking about bandwidth, let's look at something else, and at why this is a problem. This is data from Akamai, and it's a graph from 2007 to 2012.
07:11 We could update it, but this is kind of an older graph. In that period of 5 years, bandwidth only improved about one or two megabits per second in the G7, and those are seven of the largest economies. That is not a very significant improvement when you compare it to the way that our storage is increasing, right? We now have 10 terabyte hard drives. We had one terabyte a few years ago. We're going to have 100 terabytes in a few more. The cost is halving every 11 months, and we're now looking at big numbers. So that means we're saturating these pipes, and the bandwidth from the local area networks to the backbone is really, really small compared to how much we want to use them.
08:04 Actually, the problem is worse, because the speed of the average connection is improving at a slower rate than storage. That means that people are getting larger capacity drives faster than they are getting better bandwidth, which gives you the impression that things are getting slower, because in a sense you want to use more of the network.
08:32 There's also another problem, which is latency. We've all known for a long time that we can't get around the fact that the speed of light remains constant. So the only way to make things faster is by moving them closer to you.
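The speed-of-light point can be turned into a lower bound on round-trip time. This is a rough sketch under stated assumptions: the two-thirds factor for light in optical fiber is an approximation, the distances are made-up examples, and real round trips are slower because of routing, queuing, and processing.

```python
SPEED_OF_LIGHT_KM_S = 299_792.458  # speed of light in vacuum
FIBER_FACTOR = 2 / 3  # assumption: light in fiber travels at roughly 2/3 of c


def min_round_trip_ms(distance_km: float) -> float:
    """Best-case round-trip time over fiber, ignoring every other delay."""
    one_way_seconds = distance_km / (SPEED_OF_LIGHT_KM_S * FIBER_FACTOR)
    return 2 * one_way_seconds * 1000


# Hypothetical distances: a far-away data center versus a device in the room.
far_data_center_ms = min_round_trip_ms(5_000)
same_room_ms = min_round_trip_ms(0.01)

print(f"{far_data_center_ms:.1f} ms vs {same_room_ms:.6f} ms")
```

No amount of engineering removes the first number; only shortening the distance does.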
08:44 This is why Amazon and Google offer these cloud services that you can hire to store a whole bunch of stuff right next to where it's needed the most. And this works pretty well, except that sometimes people don't have things deployed in those locations, or even that latency, out through these slow pipes to that data center, is too much. Maybe the file that you need is literally in another device in the same room, and instead you're piping the data through the backbone: making a request out to the network, grabbing the file, and then pulling it down, instead of just talking to the other device that you have locally. It's kind of absurd that we do this.
09:37 Kind of a funny anecdote: I was giving a talk recently about IPFS, and the slides that I wanted to present were stuck in my computer. My computer couldn't talk to the projector because we didn't have the right adapter, and because nobody had a USB key, getting my slides from my computer to another person's computer required sending the slides all the way up to the backbone and then back down. Well, most people would have had to do that. We actually just dropped into the terminal and connected to the specific computer directly, and so on.
10:09 But this is the kind of stuff that we're dealing with. The average person doesn't know how to do any of that.
10:16 There's another big issue here, and that's the dichotomy between online and offline operation. We as engineers are sort of misusing the web, because we program behind this model of saying: we have this data center, and if the user can talk to the data center then they're online, and if they can't, they're offline. This is actually a pretty bad model. We had this perception that it was going to become more and more true, as in, we're going to connect everyone and everyone's going to be online all the time. But that's actually not accurate, because we have ever more devices that we carry around, and contexts in which we're using these devices that are not close to any kind of network.
11:03 Imagine that you go on a plane and you're traveling, and you have your laptop and your phone, and maybe you have some files in one and not in the other. Moving those around is something most applications that you have are not going to do. In fact, most applications that you use through the web are not even going to load. Most things just cease operation the moment that you step outside of the bounds of the network.
11:27 And to illustrate how this works, imagine again this massive classroom. This classroom is great; it shows so many of these issues. Say that I open a Google Doc or something and I send it out to everyone else. Everyone opens this Google Doc and now we're all collaborating, but all of these updates are getting shipped out to the backbone, to some Google server, and then shipped all the way back. That's how we're collaborating. By the way, we're seeing a lot of latency there, because every single time you type something, it has to go out there and then back instead of going right next to you. You could be literally sitting next to the person and the updates still have to go out there. We've kind of hidden behind the fact that humans don't perceive that much latency, and that at around 300 or 400 milliseconds you're barely going to notice it.
12:18 But say that something bad happens to the rest of the network and this whole room loses connectivity to the backbone. Suddenly the entire application comes crashing down and nobody can do anything. Maybe they can keep editing things locally, but they cannot ship the updates that they're making to the person right next to them. And this is absurd, right? Talk to any person who's using these applications.
12:45 They think it's crazy that the data from one computer can't get to the other computer, which is right next to it. And in reality, these computers are actually talking to each other. They're on the same network. They're probably pinging each other, or they're seeing each other's packets flowing through. We just haven't taught our applications how to have them talk about these files.
13:07 This problem is all over the web. I'm not picking on Google here. Look at the tons of applications that we use day-to-day and that run our lives; they cease operation entirely. In fact, Google's actually one of the best on this. They have this awesome technology called operational transforms that allows these updates to be done concurrently and gives you this amazing impression that it's all working flawlessly. They all get applied in the backbone, and you can be editing the same document and it just works. They could be sending those updates to each other through things like WebRTC and so on, but as far as I know that doesn't happen, and it certainly doesn't happen in most applications that you see on the web.
13:51 And this is not a flaw of the protocols themselves, in that we could be using WebRTC nowadays. It's actually a flaw in how we store data and how we reference data.
14:05 So, it's how HTTP has taught us to store data on the web and reference it. And since the beginning of HTTP, the W3C has actually come up with a whole bunch of new ways to reference data that are better, but people haven't really adopted them on the web.
14:20 Now, to me, this set of problems feels very silly, right? You have this massive set of computers somewhere, and then you take out the mothership and everything grinds to a halt. Most people think this is fine, but in reality we run into terrible bandwidth problems. We run into these issues with being somewhere on the online-offline spectrum. Maybe we're in a very low bandwidth setting, or we have a bunch of interference, or there's congestion, or maybe you're traveling like I mentioned, or your ISP has intermittent outages. This happens to me all the time. And maybe the data center has some problem, right? So many issues could be occurring, and users shouldn't be forced to halt their operation just because there's some discontinuity between them and the backbone. So our applications should learn how to operate in entirely distributed settings. We as engineers need to start doing this, and we need to build better tools for application developers to do this.
15:23 Now, it's actually really important, right? You might think that I'm just talking about these problems that we want to improve, but they really, really matter.
15:33 And they matter so much that if you ask any Egyptian about the time that the internet was shut down, because their government decided they were not going to allow communications, they are going to tell you very seriously that these applications potentially saved either their lives or their families' lives. So these are not only mission critical to businesses; they're mission critical to human lives potentially at risk. These communication applications need to be able to speak to each other when disconnected from the backbone. We need to be able to have our computers talk to each other, and we need to be able to do so securely.
16:15 In the last few years we've seen the great failing of our community in terms of securing everything. We've had major breaches, described by Snowden and so on, and there are all sorts of problems with how we're moving data around on the web. It's really not enough to encrypt the communications. I think it was a couple of years ago, or maybe not that long, that Dropbox had this huge data problem where, for something like four hours, anybody could log into your account and look at your files. Dropbox is full of great engineers, so if they are not getting it perfectly right, imagine how many people are just getting it totally wrong. So we need to be encrypting everything, or really looking carefully at how we're doing security on the client side, and, as much as we possibly can, treating the cloud as just oblivious storage or oblivious routing systems. This of course complicates everything, but certain kinds of applications, or certain kinds of contexts for applications, demand that kind of attention.
17:36 Now, the last one I want to talk about is permanence. This one is really, really deeply important to me, and it might not be as obvious to everyone why it matters, but I'll try my best to illustrate it. So, think about book burning for a moment. Think about the most important piece of media that you can think of, like the most important book that you have ever read, or think about all of Wikipedia, and imagine someone coming along and burning it, really destroying it, making it impossible for anyone to read ever again. Think about all the knowledge that is lost.
18:17 Now, historically, we've treated book burning as this insane, crazy offense to progress and to humanity itself, and we've condemned anyone who's done such a thing. And when you look throughout history at the occurrences of book burning, they drop off significantly with the advent of the printing press.
18:45 This is because the printing press allowed you to make many copies of the same book very cheaply. Say that you print a whole bunch of copies, and book burners show up and try to burn a lot of books. Well, they're probably only going to be able to get to a few of them. Or even if they get most, there are probably copies that are going to survive, and you'll be able to make more copies. So this is great.
19:06 Now, we have some problems today. Bringing this back to the digital world: if you've been around the web, you've probably seen a 404, right? A 404 is an error that tells you that some resource is not found. What that means is that there was a link pointing to another object, and that object is no longer there, either because someone took it down, that is, burned the book, or because someone moved it. We tend to chastise a lot of agencies and so on for taking certain content down, or censoring, and so on. But we forget that there are tiny little book burnings happening constantly. Whenever any web developer moves some content from one location to another, any link that anyone had made to that location is now broken.
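The way a moved file breaks a location link, while a link based on the content itself would still resolve, can be sketched with a toy model. Everything here is hypothetical: the host names, the `hosts` dictionary standing in for the web, and the lookup helpers are assumptions for illustration, with `None` standing in for a 404.

```python
import hashlib


def content_address(data: bytes) -> str:
    """Address derived from the bytes themselves (SHA-256 hex digest)."""
    return hashlib.sha256(data).hexdigest()


# Toy web: host -> path -> bytes.
hosts = {"old.example.com": {"/paper.pdf": b"important paper"}}


def get_by_location(host: str, path: str):
    """Return the bytes at host/path, or None as a stand-in for a 404."""
    return hosts.get(host, {}).get(path)


def get_by_content(address: str):
    """Return any stored copy whose hash matches, wherever it lives."""
    for files in hosts.values():
        for data in files.values():
            if content_address(data) == address:
                return data
    return None


location_link = ("old.example.com", "/paper.pdf")
content_link = content_address(b"important paper")

# A developer moves the file to a new host and path.
moved = hosts["old.example.com"].pop("/paper.pdf")
hosts["new.example.com"] = {"/archive/paper.pdf": moved}

after_move_by_location = get_by_location(*location_link)  # the old link is dead
after_move_by_content = get_by_content(content_link)  # still finds the bytes
print(after_move_by_location, after_move_by_content)
```

The scan over every host is only a stand-in for whatever lookup mechanism a real system would use; the point is that the second kind of link names the content, not the place it used to live.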
19:58 The content is potentially findable through search, but that link is no longer going to work. That's because we have this system where the publisher of content has to host it, or make sure that it's hosted somewhere, and the consumer of the content must either depend on that producer keeping the content up, or copy it, move it somewhere else, and give it a new address. Right? So this is, again, strongly related to the location addressing problem, which is sort of behind this. This is why links break: people end up being careless and moving things around.
20:42 And really, thinking of the web of documents in the abstract kind of ignores the fact that the web is not just a web of documents. It's a web of documents on machines, and the description of a document identifies the set of machines that are responsible for serving it up. This prevents people from replicating, or being able to host, the same content somewhere else. This is kind of like a book burner's paradise, and I'm just describing the accidental book burning that happens.
21:15 Now, fortunately, a group of very smart people saw that this was a pretty big problem early on in the web's history and started trying to archive everything.
21:28 And of course I'm talking about the Internet Archive, and it looks like this. It's beautiful. If you haven't been there, you probably should. They are running this project to try to index the entire web and store every single page that they can possibly find, because they know that stuff will be needed in the future. Ironically, I was trying to find the source code for one of the protocols that we learned a lot from and used to develop IPFS. And funnily enough, the source code is only available through the Internet Archive, because the people who were hosting the code are no longer hosting it, or maybe there was some glitch in the server or whatever. We could only find it through the Wayback Machine, which is this great service that they run.
22:19 This is actually very closely related to another important problem described by Vint Cerf, who's one of the creators of the internet, and that's this notion of digital vellum. Think of old computers. As technology evolves and gets better with time, we stop using the old things, so much so that we stop being able to maintain them, and sometimes they break, and when they break nobody knows how to fix them or nobody has the parts to fix them. These old machines that no longer work may have been readers of important media.
22:55 So, these are things that we stored in some physical material that this computer or this machine was going to read out to us, and now the machine's broken and nobody knows how to fix it. That means that all of that content is completely lost unless we find a way to fix it. The solution to this is really to learn to emulate everything, right? We as a society should be very careful with the types of media that we store and how we store them, to make sure that even if we stop using something, or some major problem happens, we can emulate all of these old computers or machines to read the encodings and read out this media. So we need to be able to simulate or emulate every single computer that we've ever built.
23:47 And this is again related to the book burning problem, because say that we do this, say that we create all these emulators: where do we host them? That location might be taken down, or might lose funding, or whatever. We need to be able to replicate these systems and store them in as many places as we possibly can, and we need to build tools and make the internet capable of doing this very easily.
24:16 So these are the problems that have driven us to think through how the web works, think through how content moves around the internet, and come up with a solution.
24:29 And this isn't designed in the abstract. We thought for a long time, and we thought through many different kinds of attempts and solutions to this problem, to try to synthesize a bunch of good ideas that work well together and provide some new software, both a protocol and a tool set, that people can use to make the web better.