EXTRA BITS - More on Crowd-Sourced Sports Coverage - Computerphile

**Crowdsourcing Video Coverage for the Marathon: A Comprehensive Approach**

The goal of this project is to build a rich video corpus that covers as much of a marathon as possible. The focus is not just on professional athletes or a few selected runners but on every participant, from the front runners to the fun runners in costume. The aim is to be able to tell the story of any given runner, using multiple shots of that runner from vantage points along the course, from the start line to the finish.

One notable challenge is understanding what "coverage" means, since that depends on what the video will ultimately be used for. Professional broadcasts concentrate on the front runners and pick out a few people of interest. A runner in a chicken suit for charity, for example, might appear at only one location in a professional broadcast, whereas this project seeks to tell their story from start to finish.

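To enrich the footage, the team showed the video to a group of volunteers in one-minute fragments and asked them to tag anything interesting. Tags covered quality (is the shot stable, is a finger over the lens, is the camera pointing at the sun), setting (roadside or parkland, people clapping or cheering) and content (fancy dress, whether runners are running or walking). A runner dressed as the Cookie Monster, for instance, can then be found both by runner number and by costume tag. This tagging makes the collection more navigable for future users.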
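As a rough illustration of how tagged fragments might be stored and queried, here is a minimal sketch. The `Fragment` class, its field names, and the tag strings are all hypothetical; the video does not describe the project's actual data model.

```python
from dataclasses import dataclass, field


@dataclass
class Fragment:
    """One tagged one-minute fragment of a clip (hypothetical structure)."""
    clip_id: str
    start_s: int                                   # offset of the fragment within the clip, in seconds
    tags: set = field(default_factory=set)         # free-form tags added by human viewers
    runner_numbers: set = field(default_factory=set)


def find_by_tag(fragments, tag):
    """Return every fragment that carries the given tag."""
    return [f for f in fragments if tag in f.tags]


def find_by_runner(fragments, number):
    """Return every fragment in which the given runner number was tagged."""
    return [f for f in fragments if number in f.runner_numbers]


# Made-up example data, for illustration only.
fragments = [
    Fragment("clip_a", 0, {"stable", "cheering", "fancy_dress", "cookie_monster"}, {123}),
    Fragment("clip_a", 60, {"shaky", "walking"}, {456}),
    Fragment("clip_b", 0, {"stable", "robin_hood", "fancy_dress"}),
]

fancy = find_by_tag(fragments, "fancy_dress")
print([f.clip_id for f in fancy])
```

Keeping costume tags and runner numbers side by side is what lets the same query style serve both "show me runner 123" and "show me every Robin Hood".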

Another key aspect of the project is building an online repository where videos are made available as soon as they are collected. People can look for runners they know, and the system can pull out clips a runner is known to be in as well as clips they might be in, so nobody has to wade through 12 hours of footage. Viewers who recognize someone can add tags of their own, which further enriches the data set.
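One plausible way to guess which clips a runner "might be in" is to interpolate between confirmed sightings and keep the clips filmed near the estimated place and time. The sketch below assumes each sighting is a `(time_s, distance_km)` pair and each clip is a `(clip_id, distance_km, start_s, end_s)` tuple; these shapes, the function names, and the linear-pace assumption are illustrative and not taken from the project.

```python
def estimate_time_at(sightings, distance_km):
    """Linearly interpolate when the runner passes distance_km.

    sightings: list of (time_s, distance_km) pairs from confirmed tags.
    Returns None when the distance lies outside the range of known sightings.
    """
    pts = sorted(sightings, key=lambda s: s[1])
    for (t0, d0), (t1, d1) in zip(pts, pts[1:]):
        if d0 <= distance_km <= d1:
            if d1 == d0:
                return float(t0)
            frac = (distance_km - d0) / (d1 - d0)
            return t0 + frac * (t1 - t0)
    return None


def candidate_clips(sightings, clips, margin_s=120):
    """Return ids of clips whose time window contains the estimated pass time.

    margin_s widens each clip's window to allow for an uneven pace.
    """
    out = []
    for clip_id, dist_km, start_s, end_s in clips:
        t = estimate_time_at(sightings, dist_km)
        if t is not None and start_s - margin_s <= t <= end_s + margin_s:
            out.append(clip_id)
    return out


# Made-up data: runner tagged at the start (0 km, t=0) and at 10 km after one hour.
sightings = [(0, 0.0), (3600, 10.0)]
clips = [
    ("c1", 5.0, 1750, 1810),   # filmed at 5 km around the estimated pass time
    ("c2", 5.0, 3000, 3060),   # same place, far too late
    ("c3", 12.0, 4000, 4060),  # beyond the last known sighting
]
print(candidate_clips(sightings, clips))
```

The candidates are only guesses. As the interview notes, people are much better than software at spotting a particular runner, so a human would still confirm each clip.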

The project also explores applying this approach to other events, such as anything that follows a linear route (the Tour de France, rally racing), music festivals, or team sports. A gig has one point of focus, and clips from it can be synchronized using the shared soundtrack. A marathon with 10,000 runners has 10,000 stories to tell and no common frame of reference other than the clock.

One of the biggest outstanding challenges is getting the video online. Clips currently sit on spectators' phones, and thousands of spectators share only a few mobile phone cells, so the network cannot handle everyone uploading at once. Strategies being explored include uploading only the better of two clips recorded at the same spot, and asking one of two nearby spectators to save battery life when only one of them needs to record.
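The idea of uploading only the better of several clips from the same spot could be sketched as follows. The `stability` score, the dictionary keys, and the bucket sizes are assumptions made for this example; the video does not say how clip quality would be measured.

```python
def pick_uploads(clips, bucket_m=50, window_s=60):
    """Choose one clip to upload per (location bucket, time window).

    clips: list of dicts with hypothetical keys 'id', 'position_m'
    (distance along the course), 'start_s' and 'stability' (higher is steadier).
    Clips recorded within the same bucket_m metres and window_s seconds
    compete, and only the most stable one is selected.
    """
    best = {}
    for clip in clips:
        key = (clip["position_m"] // bucket_m, clip["start_s"] // window_s)
        if key not in best or clip["stability"] > best[key]["stability"]:
            best[key] = clip
    return sorted(c["id"] for c in best.values())


# Made-up example: 'a' and 'b' were filmed side by side, 'c' somewhere else.
clips = [
    {"id": "a", "position_m": 1010, "start_s": 30, "stability": 0.4},
    {"id": "b", "position_m": 1020, "start_s": 45, "stability": 0.9},
    {"id": "c", "position_m": 5000, "start_s": 30, "stability": 0.2},
]
print(pick_uploads(clips))
```

Clip `c` is still selected despite its low score because nothing else covers that spot, which matches the point that raw coverage matters as well as quality.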

The project also considers how to improve the quality of the footage collected, not just the quantity. In the same way that a modern stills camera detects a smile and takes the picture automatically, the app could tell spectators which shots are needed or where there is a hole in coverage. The goal is to guide participants toward better footage while still letting them enjoy the event as spectators.
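One simple way an app could decide where footage is most needed is to count clips per stretch of the course and point spectators at the emptiest stretches. This is only a sketch: the function name, the one-kilometre segments, and the input format are assumptions, not something described in the video.

```python
import math
from collections import Counter


def least_covered_segments(clip_positions_km, course_km, segment_km=1.0, top=3):
    """Return the `top` course segments with the fewest clips.

    clip_positions_km: distance along the course at which each clip was filmed.
    Returns (segment_start_km, clip_count) pairs, emptiest first.
    """
    n = math.ceil(course_km / segment_km)
    # Clamp so a clip filmed exactly at the finish falls in the last segment.
    counts = Counter(min(int(p // segment_km), n - 1) for p in clip_positions_km)
    order = sorted(range(n), key=lambda i: (counts[i], i))
    return [(i * segment_km, counts[i]) for i in order[:top]]


# Made-up example: a 5 km course with clips bunched at the start and finish.
positions = [0.2, 0.5, 1.1, 4.9, 4.8, 4.2]
print(least_covered_segments(positions, course_km=5.0, top=2))
```

A real system would also have to weigh how far it is reasonable to ask someone to walk, since the interview stresses that spectators are there to enjoy the race, not to act as camera operators.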

Looking ahead, the team plans to make their app available through platforms like the App Store and advertise it at major events. The hope is that by crowdsourcing footage, they can create a valuable resource for everyone involved, from runners and spectators to researchers and archivists.

In summary, this project explores a new way to capture and share large-scale events. By combining crowd-sourced video with human tagging, it aims to create a navigable collection of videos that tells individual stories while also giving a broader picture of the event as a whole.

"WEBVTTKind: captionsLanguage: enone of the biggest challenges is to understand coverage um because that's what we're aiming for we're aiming for um a rich Corpus of data that has has as much coverage of the marathon as as is possible um now what coverage actually means kind of depends on what you ultimately might want to do with this video um now again if you look at the kind of the professional um kind of broadcast they tend to focus on the Front Runners kind of the professional athletes who's going to win um and then a few broad shots of of I guess the fundr runners maybe picking out kind of people of Interest are there the guy running in the chicken suit for charity for example um but what what we wanted to do was try and see if we could build Rich enough video that we could tell a story of any individual runner taking part in the marathon so can we get kind of multiple shots of Any Given Runner um so for example uh not just the guy in the chicken suit in kind of one particular location but can we tell his story as they start at the start line from various kind of Vantage points around the course and then film them again as they cross the finish line and so and for any given kind of fun Runner or charity Runner um uh can we begin to do that were there some real notable ones that you got right you thought wow that's brilliant I think we would we were we were surprised just by kind of an arbitrary couple that we took that we took out that about how recognizable a person was when you know that Runner 123 is wearing an orange top and you've seen him in four videos then it's remarkably easy to then find that person in given uh a minute minute of video um so what we're also now trying to do with this video set is also adds additional interesting information that allow us to tell different stories AR just kind of through by Runner number um so what what what we did is we had had an exercise where we basically presented all of our video fragments um to a bunch of 
people, and we fed them pizza as a reward for doing this. We asked them to watch the video in one-minute fragments and tag anything they thought was interesting that somebody else might like to watch. There's all sorts of crazy stuff that happens at the marathon: there's hundreds of people dressed as Robin Hood, there's people wearing costumes, there's people running for charity, there's bands playing by the side of the road. All this stuff might get captured in our video corpus, and again adds interesting color to telling stories of the marathon. So another story we might tell is of all of the Robin Hoods in the marathon, as opposed to just an individual runner. So we got people to tag the video, and we got them to tag it by quality. Is there a finger in front of the lens? Is it pointing at the sun?

Should we have that while you're talking?

Yeah. Is it pointing at the sun? Is it a stable shot? Is it by the side of a road? Is it in parkland? Are people clapping and cheering? And then also thinking about the content: are there people in fancy dress? Are the runners running or are they walking? Are there any Robin Hoods? Any kind of additional information. And then we can add this information to enrich our stories. For example, we have footage of a runner running the race dressed as the Cookie Monster for charity, and we have tags of him by his runner number, but now we also have multiple tags of him as Cookie Monster. So again we can apply the same approach and pull out eight videos of the Cookie Monster at various points around the race.

The next step for this is to build an online repository. The idea is that people collect these videos and they're immediately made available online, so people can look for people they know, and we can automatically pull out videos that we know they're in, but also
videos that they might be in, and get the person who's perhaps interested in seeing the Cookie Monster to help us further enrich the data set. So again this is a human computation task: looking for Cookie Monster automatically in a video is probably difficult, but people are really good at it, and so hopefully they can allow us to pin down more instances of the Cookie Monster.

So you're basically saying that if you're looking for the Cookie Monster, or maybe you are the Cookie Monster and you're looking at your race, you don't have to wade through the 12 hours of footage, because we can work out and give you a good guess at where you're probably going to be, so you're only going to have to look through an hour?

Yeah, exactly. We can give you clips that you're definitely in and those that you might be in, and you tagging people you recognize and further points of interest all helps to make the corpus of data much more interesting, and therefore much more interesting for future people to look at. So we hope. We think this is a process where, throughout, we have people capturing and tagging data and adding points of interest, and that only serves to enrich the data, make it more navigable, and allow us to tell stories better with it.

So do you see this as an archiving thing, or a thing for the people who are in the race, or for research?

I think all of those. We hope it's interesting for people who were in the race to be able to pull out footage for themselves. I think it's interesting for people who perhaps were spectators at the marathon, or perhaps couldn't be there. For example, if somebody you knew was running in the marathon but you couldn't physically be there, then you can perhaps still relive some of that experience or re-engage with it in some way. I think as an archival tool
it's quite useful: we have an interesting, large collection of videos that might be useful for somebody at some point. And I think there's an argument for applying some of these techniques to things other than a marathon. We've chosen the marathon because it's perhaps the most obvious example, but you might think you could do this with anything that goes along a linear geographical route: the Tour de France, rally racing, any kind of sporting event. But I'm also interested in thinking how it could be applicable to music events, gigs and festivals, and citizen journalism. Riots, and reconstructing riots and what happened, has been an interesting thing recently.

Could be a legal tool in that instance?

Yeah, exactly. It raises a lot of interesting connotations as to what it means when the crowd is suddenly able to collect and make sense of a large portion of information by themselves, without having to rely on the big organizations to do it for them. So the interesting thing for research is understanding the ability of a crowd to collect this data. One of the issues we've already seen is camera work: can we train people automatically to hold their camera steady? How do we direct people around the course? How do we orchestrate a crowd as a whole?

Because, I mean, obviously this is me with my camera on, but potentially people could be pointing in the wrong direction, they could be framed wrong. There's the obvious things you just said about shaky cameras, but you know...

Yeah. So research-wise we might think about how we improve our corpus, and improvement doesn't just mean getting more footage, it means getting better footage. So in the same
way, I guess, a modern stills camera will detect when you're smiling and then automatically take the picture, we might have things that instruct you in how to take better footage. For example, maybe when you start up our app it tells you what kind of shots it needs or wants, to best contribute to the corpus.

I suppose if you've got five people in the very near vicinity, you can then ask those five people to get different...

Yeah, or maybe move to different places. I think it's give and take. If we know that you're here to watch your uncle, we might say: if you go to this bit of the course you'll be able to see your uncle, because he's going to turn up in about 10 minutes, because he was tagged 10 minutes ago, and also you're helping to fill a hole in our video coverage.

So you can send people to the industrial estate, is what you're telling me?

Yeah. But one of the big challenges is to sensitively balance people's experience of being a spectator, because they're not there to be camera operators, they're there to spectate and have a good time. So it's about sensitively balancing that while also being able to collect some interesting footage from them.

I'm going to be going to XYZ event next. How can I get involved in this? Can I use your technology?

Yes, absolutely. The biggest challenge with crowdsourcing, and essentially this is what we're doing, is assembling a crowd. You need people to do this; it doesn't work if you don't have enough people. So the first time we did this we explicitly recruited people, we essentially bribed them to take part, to gather our initial corpus. But now what we want to do is make this more available and get people to use it en masse, essentially, and also to begin to understand
how people use it naturally, when they're not explicitly asked to. So what we're planning on doing is trying to be a bit more organized: making our app available through the App Store, that kind of thing. We're hoping to develop it for multiple platforms, and then to advertise when we're deploying it at certain races. So, for example, at the next marathon, wherever that is, to have a concerted effort to make it available and advertise it and get people using it, because the more people who use it, the better our collection of footage is. Even if only one person uses it, we've still got one point of view of the marathon; the more people use it, the more points of view we can get. So that's my aim moving forward: to make it available, encourage people to download it and use it for upcoming events and marathons, and then see what we get from it.

Because, yeah, I can see it being used for all sorts of things, from, I don't know, skiing and downhill mountain bike racing through to bob... you know, you name it. It could be anything that takes place over a... even, I suppose, a team event could potentially, you know, if you're watching a rugby match and you're on one side of the pitch and 20 people are on different parts of the pitch.

Yeah. Professional sport gets really difficult because of broadcasting rights; lawyers tend to get involved. So we like doing things that are a bit less commercial, a bit more interesting for the local community, for example. So perhaps school football or school rugby might be something that's interesting.

But even, I suppose, that comes to, at a marathon, if there's a band playing music...

Oh yeah. Personally I'm really interested in understanding how this might work for music festivals. So people milling around, filming stuff they think is
interesting. What's the navigable online version of Glastonbury that you can visit because you couldn't get a ticket, because it sold out too quickly, for example? But that opens a whole can of worms about broadcasting, filming, bootlegging, that kind of thing. But I think fundamentally the idea of crowdsourcing footage in general, whether or not it's tagged and visualized in exactly the way we've done it, is certainly a growing ability that we all have. There's a bunch of tools popping up now where a bunch of people go to a gig and record footage of the band playing, and it automatically uses the soundtrack to sync up those clips to provide a single video. And there's a bunch of examples of this: the Beastie Boys organized a crowd-videoed video, as did a bunch of Radiohead fans. So it's interesting that a bunch of people are self-organizing in order to collect this stuff. I think what makes the marathon stuff interesting is there's no common frame of reference for stitching this stuff together and understanding it.

You've purely got the clock?

Yeah, you've purely got the clock. There's no global soundtrack that we can use to make sense of all these videos.

It's like, I suppose, if you've got 10,000 runners, that's like 10,000 songs, or 10,000 stories.

Yeah, exactly. So I think it's interesting to think about a spectrum of these events. At a gig there's one point of focus and there's basically one story to be told, whereas at a marathon there's 10,000 stories to be told, and that's where it becomes more interesting. It's not just about fitting video clips on a timeline, it's about providing a navigable collection of videos that enable people to extract their own stories in
some way.

Tell me one or two of the challenges of making an app like this.

Yeah, well, the main outstanding challenge is about what to do with all this video. At the moment people are just recording videos. It's not streamed, it's not broadcast; they're just recording videos that sit on their phones. Ideally we'd want to get those videos online as fast as possible to provide some kind of reasonable online experience. Now, what happens when we have 10,000 people all spectating the marathon? They're only going to be in a few mobile phone cells, and there's going to be many more of them than usual, so none of them are going to have a reasonable mobile phone signal. If we start trying to upload lots and lots of videos from people's mobile phones, the network is just not going to handle it. So one approach we might take is to think about whether that's another factor we can use to organize or orchestrate people. If you and I are stood in the same spot, maybe we can say we'll only upload your video, because yours is better for some reason, rather than trying to upload them both, because that has an effect on the network. If we have a collection of people and we want to instruct them in some way, we can do so on a number of levels. One might be to get people to collect different shots for our video corpus. The other might be with an understanding that actually the network is pretty flaky here, so only one of us should be recording. So either we get us both to record and only pick what we consider to be the best or most appropriate shot, whatever that is, or we might say: well, no, you save your battery life for something else, you wouldn't be able to upload anything you recorded anyway, so I'll record stuff and upload it. So there's a bunch of levels we can think about this on. One is about coverage, like raw coverage, somebody being in a spot filming
something one is the finer details of exactly what the multiple shots are and one is other kind of I guess more infrastructural considerations like who's got the most battery life who will be able to upload stuff um even if if they can record it what's the best use of the network um that that these people are operating with who's got credit who's got creditone of the biggest challenges is to understand coverage um because that's what we're aiming for we're aiming for um a rich Corpus of data that has has as much coverage of the marathon as as is possible um now what coverage actually means kind of depends on what you ultimately might want to do with this video um now again if you look at the kind of the professional um kind of broadcast they tend to focus on the Front Runners kind of the professional athletes who's going to win um and then a few broad shots of of I guess the fundr runners maybe picking out kind of people of Interest are there the guy running in the chicken suit for charity for example um but what what we wanted to do was try and see if we could build Rich enough video that we could tell a story of any individual runner taking part in the marathon so can we get kind of multiple shots of Any Given Runner um so for example uh not just the guy in the chicken suit in kind of one particular location but can we tell his story as they start at the start line from various kind of Vantage points around the course and then film them again as they cross the finish line and so and for any given kind of fun Runner or charity Runner um uh can we begin to do that were there some real notable ones that you got right you thought wow that's brilliant I think we would we were we were surprised just by kind of an arbitrary couple that we took that we took out that about how recognizable a person was when you know that Runner 123 is wearing an orange top and you've seen him in four videos then it's remarkably easy to then find that person in given uh a minute minute of 
video um so what we're also now trying to do with this video set is also adds additional interesting information that allow us to tell different stories AR just kind of through by Runner number um so what what what we did is we had had an exercise where we basically presented all of our video fragments um to a bunch of people and we fed them pizzza in as a reward for doing this um but we asked them to watch the video in kind of one minute fragments and tag anything that they thought was interesting that somebody else might like to watch um and so there's all sorts of crazy stuff that happens at the marathon there's there's uh there's hundreds people dressed as Robin Hood um there's uh people wearing costumes there's people running for charity there's bands playing by the side of the road all this stuff might get captured by in our in our Corpus video that again add interesting um uh color uh to to telling stories of the marathon so another story we might tell is of all of the robin hoods in the marathon as opposed to just like an individual Runner so we got people to tag the video and we got them to tag it by kind of quality um so is there a finger in front of the lens is it pointing at the sun should we have that while you're talking yeah is it pointing at the sun is it a stable shot is it by the side of a road is it in Parkland are people clapping and cheering and then also thinking about the content so um yeah are there people in fancy dress are the runners running or are they walking um uh are there any Robin hoods you know any any kind of additional information and then again we can kind of add this add this information to kind of enrich our stories so for example we have footage of a a runner running the race dressed as the Cookie Monster um for for charity and we have tags of him by his Runner number but now we also have multiple tags of him as Cookie Monster so again we can apply the same approach and pull out eight videos of the Cookie Monster at various 
points around the race the next step for this is to basically build an online repository so the idea is that people collect these videos and they're immediately made available on on on on online um so people can look for people they know and we can automatically pull out videos that we know that they're in but also videos that they might be in and get the person who's perhaps interested in seeing the Cookie Monster to help us further enrich the data set so again this is this human computation task uh looking for Cookie Monster automatically in a video is probably difficult but people are really good at it um and so hopefully they can allow us to kind of pin down more instances of the Cookie Monster so you're basically saying that rather than waiting through if you're looking for the Cookie Monster maybe you are the Cookie Monster and you're looking at your race you don't have to wait through the 12 hours of footage cuz we can work out and give you a good guess at where you're probably going to be so you're only going to have to look through an hour for yeah exactly exactly and so we can give you clips that you're definitely in um and those that you that that you might be in and you maybe kind of tagging people you recognize and further points of interest all helps to kind of making the Corpus of data much more interesting and therefore much more interesting for future people to look at so we hope so so we think this is this is a process where throughout we have people kind of capturing and tagging data and adding points of interest and that only serves to enrich the data and make it more navigable um allow us to tell stories better with it so do you see this as an archiving thing or a or a thing for the people who are in the race or I think for research or I think all of those so uh it's definitely interesting well we hope it's interesting for people who were in The Race to be able to kind of pull out footage for themselves uh I think it's interesting for people 
who perhaps were spectators at the marathon or perhaps uh couldn't be there so for example if you know somebody you knew was running in the marathon but you couldn't physically be there then you can perhaps still relive some of that experience or or kind of re-engage with it in some way um I think as an archival tool it's quite it's quite useful so we have a we have an interesting large well collection of videos that might be useful for somebody at some point um and I think there's an argument for actually applying some of these techniques to things other than a marathon so we've chosen the marathon because it's perhaps the most obvious example um but you might think you could do this with anything that goes along a a a linear geographically kind of Route so you know the tour to France uh kind of rally rally racing kind of any any kind of sporting event um but I think I'm also interested in thinking how it could be applicable to music events gigs and festivals um citizen journalism kind of riots uh and kind of reconstructing riots and and what happened has been an interesting thing recently could be a legal tool in that instance yeah yeah exactly and so so there are lot it raises a lot of kind of interesting connotations as to yeah what it means when the the crowd is suddenly available uh is suddenly able um to kind of collect and make sense of a large uh portion of information kind of by themselves without having to rely on of the big organizations to to do it for them so yeah the interesting things for researcher understanding um the ability of a crowd to collect this data one of the issues we've already seen some so issues of kind of camera work can we train people automatically to hold their camera steady uh how do we kind of direct people around the course how do we kind of orchestrate a crowd as a whole because yeah I mean obviously this is me with my sort of camera on um but potentially the people could be pointing in the wrong direction they could be framed 
wrong I mean there's the obvious things you just said about shaky cameras and things but you know yeah and so we we might we might think about how we so research-wise we might think about how we improve our Corpus and Improvement doesn't just mean getting more footage it means getting better footage um so in the same way I guess a a modern Stills camera will detect when you're smiling and then automatically take the picture we might have things uh that instruct you in how to take U better better footage um so for example maybe yeah maybe when you start up our app it kind of tells you what kind of shots it needs or wants to help the best contribute to the cost I suppose if you've got five people in in the very near V vicinity you can then ask those five people to get different to to yeah yeah or maybe move to different places or maybe it's kind of I think it's give and take we might say um if we know that you're here to watch your uncle we might say if you go to this bit of the course you'll be able to see your uncle because he's going to turn up in about 10 minutes because he's been tagged 10 minutes ago um and also you're helping to fill a hole in our video coverage um so you can send people to the industrial estate is what you're telling me yeah yeah um but one of the real one of the big challenges is to kind of sensitively balance people's experience of being a spectator because when they're not there to be camera operators they're there to spectate and have a good time so it's about sensitively balancing that while also being able to collect some interesting footage from them I'm going to be going to XYZ event next I mean how can I get involved in this can I use your technology um yes absolutely uh the challenge with the biggest challenge with crowd sourcing so you know essentially this is what we're doing is crowd sourcing um the biggest challenge with crowd sourcing is assembling a crowd um you need people to do this it doesn't work if you don't have enough 
people so the first time we did this we um we explicitly recruited people so we essentially bribed them to take part um to gather our our initial Corpus but now what we want to do is kind of uh make this more available and get people to use it kind of on mass essentially um and to also begin to understand kind of uh how people use it naturally so when they're not explicitly asked to ask for it um so what we're planning on doing is is trying to be a bit more organized um um making our app available um kind of through the App Store that kind of thing we we're hoping to develop it for uh for multiple platforms um and then to advertise when we're deploying it at certain races to so to have a concerted effort to kind of uh for example at the next marathon wherever um is to have a concerted effort to make it available and advertise it and get people using it because the more people who use it the the the better our collection of footage is um okay so even if only one person uses it you know we've still got one point of view of the marathon the more people use it the more points of view we can get um so that that's that's my aim moving forward is to definitely make it available um encourage people to download it and use it um for for upcoming events and Marathon and then see what we get from it cuz yeah I can see it being used for all sorts of things from I know ski and downhill mountain bike racing through to Bob you know you name it it it could be anything that takes place over a even a I suppose a team event could potentially you know if you're watching a rugby match and you're on one side of the pitch and 20 people are on different parts of the pitch yeah sport professional sporting gets really difficult because of broadcasting rights uh lawyers tend to get involved um so we like to doing we like doing kind of uh things that a bit a bit a bit less commercial a bit more kind of uh interesting for the local community for example so so perhaps school school football 
might or school rugby might be something that's interesting but even I suppose that comes to if a marathon if there's a band playing a music oh yeah yeah personally I'm really interested in understanding how this might work for festivals music festivals um so uh people Milling around kind of filming stuff they think is interesting what's what's the kind of the uh navigable online version of glastenbury uh that that you can visit because you couldn't get a ticket because that's sold out too quickly for example um but that opens a whole can of worms about kind of yeah broadcasting filming bootlegging that that that kind of thing um but I think this uh I think fundamentally the idea of uh crowdsourcing footage in General whether or not it's tagged uh and and kind of visualized in exactly the way we've done it it's certainly a growing uh ability that we all have there's a bunch of tools popping up now where you record um a bunch of people go to a gig and they record footage of the band playing uh and it automatically uses the uh the soundtrack to sync up those clips to to to provide a uh to provide um uh a single video and there's a bunch of examples of this so like f the Beasty Boys organized a kind of a crowd a crowd videoed uh video as as did some a bunch of radio head fans so it's interesting that a bunch of people are kind of like self-organizing in order to collect this stuff um I think what makes the marathon stuff interesting is there's no common frame of reference for stitching this stuff together and understanding you've purely got the clock yeah you you you've purely got the clock so there's no there's no Global soundtrack that we can use to kind of make sense for these of all these videos so they're they're they're at the it's like I suppose if you got 10,000 Runners that's like 10,000 songs or 10,000 Stories is yeah yeah exactly so I think so I think it's interesting to think about a spectrum of these events whereas at a gig there's one point of focus and 
there's basically one story to be told, whereas at a marathon there's 10,000 stories to be told. And so that's where it becomes more interesting: it's not just about fitting video clips on a timeline, it's about providing a navigable collection of videos that enables people to extract their own stories in some way.

Tell me one or two of the challenges of making an app like this.

Yeah, well, the main outstanding challenge is about what to do with all this video. At the moment people are just recording videos. It's not streamed, it's not broadcast; they're just recording videos that sit on their phones. Ideally we'd want to get those videos online as fast as possible to provide some kind of reasonable online experience. Now, what happens when we have 10,000 people all spectating the marathon? They're only going to be in a few mobile phone cells, and there's going to be many more people than usual, so none of them are going to have a reasonable mobile phone signal. And so if we start trying to upload lots and lots of videos from people's mobile phones, the network is just not going to handle it. So one approach we might take is to think about whether that's another factor we can use to organize or orchestrate people. If you and I are stood in the same spot, maybe we can say we'll only upload your video, because yours is better for some reason, rather than trying to upload them both, because that has an effect on the network. If we have a collection of people and we want to instruct them in some way, we can do so at a number of levels. One might be to get people to collect different shots, for example, for our video corpus. The other might be with an understanding that actually the network is pretty flaky here, so only one of us should be recording. So either we get us both to record and only pick what we consider to be the best or most appropriate shot,
whatever that is, or we might say: well, no, you save your battery life for something else, you wouldn't be able to upload anything you recorded anyway, so I'll record stuff and upload it. So there's a bunch of levels we can think about this on. One is about coverage, raw coverage: somebody being in a spot filming something. One is the finer details of exactly what the multiple shots are. And one is other, I guess more infrastructural, considerations: who's got the most battery life, who will be able to upload stuff even if they can record it, and what's the best use of the network that these people are operating with.

Who's got credit?

Who's got credit.
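
As a rough illustration of the orchestration idea discussed here (upload only one video per spot, and skip phones that could not upload anyway), the following is a minimal Python sketch. It is not the project's actual app logic. The `Recorder` fields, the 0-to-1 scores, the threshold values, and the function names are all assumptions made up for this example.

```python
from dataclasses import dataclass


@dataclass
class Recorder:
    """One spectator's phone. All fields are hypothetical, for illustration."""
    name: str
    spot: str                # assumed label for a location on the course
    shot_quality: float      # assumed 0..1 score, e.g. how stable the shot is
    battery: float           # assumed 0..1 remaining battery
    signal: float            # assumed 0..1 estimate of upload capacity
    has_credit: bool = True  # whether the person has data credit left


def can_upload(r: Recorder, min_battery: float = 0.2, min_signal: float = 0.3) -> bool:
    # Thresholds are arbitrary assumptions; a real system would tune them.
    return r.has_credit and r.battery >= min_battery and r.signal >= min_signal


def pick_uploaders(recorders: list[Recorder]) -> dict[str, str]:
    """Choose at most one uploader per spot: the best shot among those able to upload.

    Spots where nobody can upload are left out of the result.
    """
    best: dict[str, Recorder] = {}
    for r in recorders:
        if not can_upload(r):
            continue
        current = best.get(r.spot)
        if current is None or r.shot_quality > current.shot_quality:
            best[r.spot] = r
    return {spot: r.name for spot, r in best.items()}


crowd = [
    Recorder("you", "mile 5", shot_quality=0.9, battery=0.8, signal=0.6),
    Recorder("me", "mile 5", shot_quality=0.7, battery=0.9, signal=0.7),
    Recorder("sam", "mile 10", shot_quality=0.95, battery=0.1, signal=0.8),
    Recorder("alex", "mile 10", shot_quality=0.5, battery=0.6, signal=0.5, has_credit=False),
]
print(pick_uploaders(crowd))
```

In this toy crowd, only one of the two people at "mile 5" is asked to upload, and nobody at "mile 10" is chosen because one phone is low on battery and the other has no credit. A fuller version might also tell the unchosen recorders to stop filming and save battery, which is the other level of instruction mentioned above.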