The Dreaded 404 - Computerphile

The Annoying World of 404 Errors: Understanding Why Web Links Break

We've all been there: browsing the web, clicking on a link, and suddenly staring at the dreaded 404 error message. At best it is an annoyance, but it can be a major problem for old websites that aren't continually maintained and updated, because once enough of their links fail they become effectively unusable. The number 404 comes from the protocol underpinning the World Wide Web (HTTP), where it is one of the error codes and simply means that the requested file is missing. But why does this happen so easily, when moving a file on your own PC doesn't suddenly produce error messages?

The Problem with Links on the Web

The word "link" is a metaphor taken from the links of a chain. It implies that two things are actually attached: if you move one, the other comes with it, and the two can't be separated. That isn't how the web behaves. What we actually have on the web is not links but pointers: a reference from one document to another or, more precisely, to where we hope the other document is going to be. If the destination moves or disappears, the pointer is left pointing at nothing.
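The pointer behaviour can be sketched with a toy in-memory "web". This is only an illustration under stated assumptions: the `site` dictionary, the `follow` function, and the paths are hypothetical, and no real server is involved.

```python
from http import HTTPStatus

# Hypothetical in-memory "web": maps a path to the document stored there.
site = {
    "/articles/intro.html": "Welcome to the intro article.",
}

def follow(href):
    """Follow a pointer: look up whatever is at `href` right now."""
    if href in site:
        return HTTPStatus.OK.value, site[href]
    return HTTPStatus.NOT_FOUND.value, HTTPStatus.NOT_FOUND.phrase

# The "link" is only this string, stored in the referring page.
href = "/articles/intro.html"
before = follow(href)

# The owner renames the document; nothing tells the referring page.
site["/articles/introduction.html"] = site.pop("/articles/intro.html")
after = follow(href)

print(before)  # (200, 'Welcome to the intro article.')
print(after)   # (404, 'Not Found')
```

The string held by the referring page never changed; only the thing it pointed at moved.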

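For example, let's take my personal web page, identified by its URL - [insert URL here]. To put in a link to this page, I would write the HTML "a" element, where "a" stands for anchor because this is the anchoring end of the link, and give it an href attribute, where href stands for hypertext reference. The link text, such as "click here", goes between the opening and closing tags. The page being referenced has no knowledge that it is being referenced: if I link to your web page, your page doesn't know about it, and neither do you unless I tell you. As far as the mechanics of the web are concerned, there is only the href, a pointer to where something possibly is. It's up to me to get it right when I write it, and even if it was right at the time, it becomes wrong the moment you rename your document.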
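To make the point that an href is nothing more than a string stored in the referring page, here is a small sketch using Python's standard `html.parser` module. The `AnchorCollector` class name and the `example.com` URL are assumptions for illustration; they stand in for the real page, whose URL is not given above.

```python
from html.parser import HTMLParser

class AnchorCollector(HTMLParser):
    """Collect (href, link text) pairs from <a> elements."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        # Only keep text that appears inside an anchor with an href.
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None

# Hypothetical page source; example.com stands in for the real URL.
page = '<p>My page: <a href="https://example.com/home.html">click here</a></p>'

collector = AnchorCollector()
collector.feed(page)
collector.close()
print(collector.links)  # [('https://example.com/home.html', 'click here')]
```

Everything the referring page knows about the target is that one string. Nothing in the markup contacts or registers with the page being pointed at.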

The Importance of Understanding How Links Work

So, what does this mean for us? It means that we need to understand how links work and why they break. This can be difficult because it requires us to think about the underlying mechanics of the web, rather than just clicking on links without thinking about them. But understanding how links work is essential if we want to create websites that are stable and functional.

A Look at How Web Pages Are Structured

Let's take a look at how web pages are structured. When you view a web page, the browser fetches an HTML document and renders it together with any images, videos, and other media that the HTML refers to. The HTML tells the browser how to structure and display the page, and each of those references, like each link, is a URL naming where a resource is expected to be.

When we create links between web pages, we are recording the address of one page inside another. Because these are pointers rather than physical links, they are fragile: nothing updates them when the target changes. This is why old websites can become unusable, because the pages their links point to have been renamed, moved, or deleted.

The Role of Data in Web Development

Today, much of the web is not just HTML documents but data-driven systems with databases underpinning them. If we're careful enough with the design, such a system can manage its own links internally and so be fairly careful not to break them, although that is quite hard to do.
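One way such a system might manage its own links is to store a stable document ID in each link and generate the href only when the page is rendered. The sketch below is an assumption about how this could look, not a description of any particular system; the `Site` class and its methods are hypothetical, and it skips details such as HTML escaping.

```python
class Site:
    """Toy content store where links refer to document IDs, not paths."""

    def __init__(self):
        self._path_by_id = {}  # document ID -> current path
        self._next_id = 1

    def add(self, path):
        """Register a document and return its stable ID."""
        doc_id = self._next_id
        self._next_id += 1
        self._path_by_id[doc_id] = path
        return doc_id

    def move(self, doc_id, new_path):
        # Only the lookup table changes; stored links are untouched.
        self._path_by_id[doc_id] = new_path

    def render_link(self, doc_id, text):
        # The href is generated at render time from the current path.
        return f'<a href="{self._path_by_id[doc_id]}">{text}</a>'

site = Site()
about = site.add("/about.html")
first = site.render_link(about, "About us")

site.move(about, "/company/about.html")
second = site.render_link(about, "About us")

print(first)   # <a href="/about.html">About us</a>
print(second)  # <a href="/company/about.html">About us</a>
```

Because every internal link is resolved through the same table, renaming a document cannot leave a stale internal pointer behind. This only helps for pages the system itself controls.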

However, even a system that manages its own links has a problem the moment a link goes out to somebody else's web page, whether that is a personal page or something on the BBC News site. As soon as someone outside our control renames or removes the page we point to, our link breaks. This is a fundamental problem with the way the web works, because any solution has to be retrofitted onto the existing design.
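Since external targets cannot be controlled, about the best a site owner can do is detect breakage after the fact. The following is a minimal sketch of a link checker, assuming a HEAD request is enough to learn the status; the function names and the `example.com` URLs are hypothetical, and the demonstration uses a stand-in lookup so that it runs without network access.

```python
import urllib.error
import urllib.request

def http_status(url, timeout=5.0):
    """Return the HTTP status code for `url` using a HEAD request."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        return error.code  # e.g. 404 when the target is gone

def find_broken(urls, fetch_status=http_status):
    """Return the URLs whose status is 404, in the order given."""
    return [url for url in urls if fetch_status(url) == 404]

# Offline demonstration: a dictionary plays the role of the remote servers.
known = {
    "https://example.com/a.html": 200,
    "https://example.com/old.html": 404,
}
broken = find_broken(known, fetch_status=known.get)
print(broken)  # ['https://example.com/old.html']
```

In real use, `find_broken(list_of_urls)` would call `http_status` for each URL. Connection failures raise `urllib.error.URLError`, which this sketch deliberately leaves unhandled, and some servers may not answer HEAD requests the same way they answer GET.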

Creating New Content: A Way Forward

The closing remarks touch on a related idea: people could create new material by quoting original sources, taking the original material and putting it into a new context. The transcription does not say how this would affect broken links.

Ted also envisaged a micro-charging system, which the speaker describes as an interesting part of this idea. The transcription cuts off before it is explained.

Conclusion

In conclusion, 404 errors are very annoying, and because any fix has to be retrofitted onto the way the web works, they are going to be with us for the foreseeable future. Understanding that web links are really pointers explains why they break. Careful design can keep a site's internal links intact, but links to pages outside our control can still fail.

"WEBVTTKind: captionsLanguage: enI'm going to talk about why web links break now I'm sure this is something that everybody is familiar with and has annoyed everybody if you're browsing the web and you're doing your clicky stuff you follow a link follow another link follow another link and then all of a sudden it doesn't work it breaks you get the dreaded 404 error message it's at best an annoyance but can be a major problem because in the case of old websites that aren't continually being maintained and updated uh it can make them completely unusable if I've certainly quite often seen websites that might have useful information on but none of the links work everything gives you a 404 um error the reason it's 404 is that comes from the protocol underpinning the worldwide web it's one of the error messages and it just means there's a missing fun file at one level that might seem quite trivial somebody's deleted the page you're pointing at or whatever but Information Systems don't necessarily have to behave like that if you are browsing through the files on your PC um you're using Windows Explorer or macf finder or whatever and you move a file you don't suddenly start getting error messages because you've moved a file um it whereas you do on the web and this comes down to quite a fundamental way in which the web is designed and really the problem is that links on the web aren't actually Links at all they're misnamed um because if you think about it for a moment a link the word link is a metaphor in design it's taken from the links of a chain and a link implies that two things are actually attached if you move one you can't separate them and that's not the way the web behaves um what you've actually got on the web is not Links at all but pointers you've got a pointer from one document to another or in fact you don't even have that you have a pointer from one document to where you hope another is going to be and that's why things break let's take a um web page I'm going 
to just take my personal web page, and a web page is identified by a URL, so the URL is this. Now, if I want to put a link into that page, I would just put in the following HTML: a, which stands for anchor, because this is the anchoring end of the link, then href equals, like that, and href just stands for hypertext reference. It's referencing this web page. Then I can just put in the text for the link, so 'click here', and just finish that. So that now is the HTML to create a link, but you can see what it is: it's a hypertext reference to this web page. There is no way that this web page has any knowledge of the fact that it's being referenced. If I put a link into your web page, your web page doesn't know about it, and neither do you unless I tell you. As far as the mechanics of the web are concerned, it's purely the href. It's a pointer to where something possibly is, because it's up to me when I put it in to actually get that right. And if it was right at the time, but then you change the name of your document, then all of a sudden it's wrong. This is why the web is very brittle; it's why it can break.

I mean, there are ways around that, but they're not particularly easy. A lot of the web these days isn't just HTML documents. A lot of the web is data driven, and there's a database underpinning it, and if you're careful enough with the design of your system you can have a system that manages its own links internally, and that will be fairly careful not to break. But for one thing, that's quite hard to do. And another thing is that even if you do that, the moment you have a link that goes out to somebody else's web page, you want to link to one of my pages, or you want to link to something on the BBC News site or what have you, then the moment that somebody else outside of your control changes things, it's going to break. And that is a fundamental problem with the way that the web works, because any solution is retrofitted to the way that the web works. And so I'm afraid, although they are
very, very annoying, 404s are going to be with us for the foreseeable future, and they're an annoyance that you have to live with if you use the web.

People could create new material, new content, by quoting original sources and taking the original material and putting it into a new context. But an interesting part of this is Ted also envisaged a micro-charging"