DataChats | Episode 6 | An Interview With Jo Hardin

**The Future of Education: A Discussion with a Statistics Professor**

Technology is going to play a big role in shaping the future of education, and as someone who has experience building courses, I can attest to that. Coming to DataCamp was an exciting opportunity for me to see firsthand how technology can be used to teach statistics and data analysis. However, I also recognize that there's more to education than technology: one-on-one interactions, research projects, and class discussions are essential components of a well-rounded educational experience.

**Addressing Beginners**

For those who are just getting started with R and statistics, my advice would be to do a lot of statistics and data analysis. Find some data, analyze it, come up with some cool graphics, and put it on a blog or GitHub site. If you're not ready to analyze data yet, start by taking classes, reading blogs like Simply Statistics, and engaging with the online community. The key is to keep trying new things and learning from your mistakes. I think that's the best way to figure out whether this field is right for you.

**The Future of Education**

As we move forward, it's clear that education in 20 years will not look the way it does today. Neither of us has a crystal ball, but we're trying to figure this out every day. One thing I believe is essential for a successful educational system or process is the liberal arts model. This approach emphasizes one-on-one interactions, research projects, and class discussions, where students receive personalized feedback from instructors. While this model doesn't scale easily, the feedback it provides is invaluable.

**Scaling Education**

To integrate this model into online platforms like DataCamp, we need to find ways to scale while still maintaining the personal touch. Peer-to-peer grading and peer-to-peer feedback can be a starting point. Platforms like Stack Exchange let people get good feedback quickly and often, even if it's not always perfect, and other readers can benefit from a response that was directed at someone else's question.

**A Combination of Approaches**

Ultimately, I believe that education will require a combination of human interaction and technology. While technology has its place in supplementing traditional teaching methods, getting toe-to-toe with other human beings is still essential for effective communication and learning. As we continue to develop new educational technologies, it's crucial that we keep the core components of education intact: personalized feedback, research projects, and class discussions.

**A New Era of Learning**

The future of education will be exciting to watch unfold over the next couple of decades. It's clear that technology will play a larger role in shaping our learning experiences, but I also believe that there's no substitute for human interaction. By combining these approaches, we can create an educational system that is both effective and engaging. As educators and learners, it's our responsibility to ensure that we're doing everything possible to prepare students for success in the years to come.

**Nick:** Hi, I'm Nick. I'm a data scientist at DataCamp, and I'm here today with Jo Hardin. We just finished recording videos for her new course on statistical inference as part of our intro stats series, which is pretty exciting. So, welcome.

**Jo:** Thank you. It's good to be here. It's been fun.

**Nick:** Good. Jo is a morning person, so we started at 6:00 a.m. this morning in Boston...

**Jo:** In Boston, which is 3:00 a.m. in California.

**Nick:** Yep, and she's a trooper. She actually flew here last night and then was up at 6:00 a.m., or 5:00 a.m., our time this morning. So anyway, I just wanted to ask Jo a few questions today, to pick her brain a little bit while we have her here in the studio. The first thing I want to ask is: you're a professor of math and stats. How did you get into statistics? Did you start out as a mathematician and then find your way into stats, or the other way around? And why stats?

**Jo:** My undergraduate degree is in mathematics, mostly because we didn't have a stat department. I actually came to college thinking I wanted to be an actuary, which has some overlap with statistics, but I fell in love with teaching, and I fell in love with college and the academy. So I thought to myself, how can I continue to do this for the rest of my life? Statistics was a really natural fit. I'd always been good at math, and I really loved the applied problems that we were seeing, so I did a lot of statistics in undergrad even though I was a math major, and then I went on to get my graduate degree in statistics.

**Nick:** Okay, cool. So what were some of the applications of stats that you were excited about, that drew you to the field?

**Jo:** I actually did my senior thesis in 1995 on bootstrapping. It was a project that used, I don't even remember, I think it was a logistic regression model, and so we did bootstrapping to think about confidence intervals for the coefficients and
the variability of the coefficients. The data was actually from my adviser, who had collaborators in the medical field, so it was real data and an application to a real problem.

**Nick:** Okay, cool. That's something I didn't even know about you. And that's interesting, because bootstrapping is a big part of the course that we just worked on together.

**Jo:** Yeah, so it's something I've been doing for a long time and thinking about for a long time. I understand it now a lot better than I did as a senior in college.

**Nick:** Really? Why?

**Jo:** Well, part of it was that we didn't have quite the technology that we do now.

**Nick:** So it was hard to simulate.

**Jo:** Right, exactly. I was using S-PLUS as a student...

**Nick:** The precursor to R.

**Jo:** The precursor to R, which we had to pay for. I did simulations, and I was able to generate the intervals and the variability that we needed. But the way that R is written, and the packages that have been created since that time, make it a lot easier to see what's going on and to visualize. The graphics are so much better, and it really takes away the work of having to code. It felt like a lot of weeds when I was a student: we were just trying to get these technical details, and there was less forest, less of just understanding the big concepts.

**Nick:** That makes sense. So, for people who don't know what bootstrapping is: you should take the course and you will know. But maybe just give a quick explanation for somebody who's never been exposed to it before.

**Jo:** For lots of years, we as statisticians have been able to use theory to understand variability: how samples vary from one to the next when you're taking samples from a population.

**Nick:** So, for example, polling a population to understand who's going to be the next president.

**Jo:** That's right. So we understand how one sample proportion could vary over lots of samples when the population value is something different. And
proportions are a great example, because that's one we really understand with the theory. But there are lots of measures out there, ways that we summarize the data, that don't have theory behind them to understand how the samples could vary from one sample to the next given a population. For those statistics, it might be the trimmed mean, where you're worried about extreme observations on one end or the other: you get some observations that are really high and some observations that are really low. The trimmed mean is much harder to say things about theoretically. So instead, what we do is sample from the original data, from the original sample, and it turns out that this process of resampling from the original data is a great approximation for understanding the variability associated with a statistic that might not be one of the traditional statistics we work with.

**Nick:** Sure. So for anybody who's been through an intro stats course in the past, particularly those who had a less-than-optimal experience because they were taught to memorize the formula for a standard... or a t confidence interval, the bootstrap is actually another way to think about confidence intervals.

**Jo:** Right, because it's that variability piece, and that's what a confidence interval buys for you: that variability.

**Nick:** And ultimately it's about saying we think that 55% of the population is going to vote for this president, but of course there's a lot of uncertainty in there. So this is about trying to quantify how certain or uncertain we are about that figure.

**Jo:** That's exactly right.

**Nick:** Great. Okay, cool. So let's focus in on the R piece for a second. This course is in R, and you mentioned that you used to work in S-PLUS some years ago. So maybe just describe, if you could, how you first came to work with S-PLUS and what the
impact was for you.

**Jo:** Yeah, so I used S-PLUS as an undergraduate in the early '90s, which I don't think was particularly common, partly because you had to purchase it. At some point, before S-PLUS sort of went out of business, they did offer free educational licenses. Then I went on and did my graduate work, which had simulation pieces to it, so I was using S-PLUS there as well. Then I started my current job at Pomona College, and when I started I also had the institution buy S-PLUS licenses. But when we have students working with licensed software, it gets complicated: how do they get that software onto their own machines? How do they use the software after they leave the institution? Is the institution really ready to support it forever? All that kind of stuff. So when R started becoming more viable in terms of the help files and the user interface, I immediately started using it in my classrooms. I use R in all of my classes, from my intro stats to even my theoretical classes, where I'll have them simulating things so that they understand how the theory is showing up. And my students love it. They will write to me many years later and tell me they're using R in their job or in their fun activities. They simulate things like NBA pools to figure stuff out.

**Nick:** "How likely am I to win this poker hand?"

**Jo:** That's right.

**Nick:** That's neat. So what do you think it is specifically, as an educator, about R, or about empowering people to be able to program things, that has such a positive impact for students?

**Jo:** I think that, kind of like my experience as an undergraduate, R is really set up to answer questions. So when the students come and they're motivated and excited about solving a particular problem and understanding the
mechanisms behind those problems, I think that R has just a really nice interface for the students to feel accomplished with the problem that they're trying to solve.

**Nick:** But before, you mentioned that for you, in your experience, being able to simulate things and actually visualize the output had an impact on you. It made otherwise abstract things more concrete.

**Jo:** That's right, and I agree, I think that's true for our students. I think the graphics in R are outstanding and pretty easy for the students to pick up quickly. So when they're simulating to understand an abstract concept, that visualization is important, and when they're just making graphics of their own data, that visualization is also important.

**Nick:** Yeah, and I think it's an important point, because at the end of the day, unless you're a developer and your focus is the programming itself, R is a tool. When you're trying to teach statistics, you want everything else to get out of the way...

**Jo:** That's right.

**Nick:** ...so that they can focus on the ideas that you're trying to convey, and not how to program a for loop, for example.

**Jo:** Right, or how to get some dots on a page.

**Nick:** Right. Okay, very cool. I think this is probably a pretty good segue: why specifically did you want to build a course on statistical inference, of all things?

**Jo:** Well, I'm an educator. I teach at a small residential college, and we have a lot of face-to-face interactions. I see my students all the time, and I get to know them really well, and that makes educating a lot easier. I know what they understand, I know what they don't understand, and I can address those needs. But small liberal arts colleges, as fantastic as they are, and I could just go on and on about how wonderful they are, I don't know that they're the
education of the future. I think a lot about where education will be in 20 years or 50 years, and certainly I believe that technology is going to play a big role. So one of the exciting things about coming to DataCamp for me was to see that other side and to understand the process of building a course: how do you make choices about what to talk about, and how do you preempt the questions that the students might have? Because I'm not going to see their faces. I'm not going to know what's going on behind the scenes.

**Nick:** A much different medium.

**Jo:** It is, and I'm not convinced that this is the medium that will exist in 50 years. But I do think that things are changing and that technology will have a big role.

**Nick:** Yeah, absolutely. Okay, very cool. So, maybe addressing the beginners out there, people who are just getting started with R and statistics: what advice would you have for them?

**Jo:** My advice for my students, who I think are quite similar to the students on the DataCamp platform, is to just do a lot of statistics, do a lot of data analysis. I tell my students, for example, to find some data somewhere. There are lots of good public databases and APIs and Kaggle competitions. Find some data and do an analysis, come up with some cool graphics, put it on a blog, get a GitHub site. And if you really feel like you're not quite ready to be actually analyzing data, then do things like this: take some classes, read blogs like Simply Statistics (there are other good blogs too), and just keep trying things. I think that's the best way to think about whether it's right for you and how you can contribute to the field.

**Nick:** Go make a bunch of mistakes, and very publicly.

**Jo:** There you go.

**Nick:** Yeah, absolutely. Cool. Actually, I want to go back to the
last question. Obviously we are actively trying to figure out what the future of education is as well, and I wholeheartedly agree with you: whether it's Pomona or whether it's DataCamp, it's not going to look the same in 20 years as it does today. I know neither of us has a crystal ball, and we're trying to figure this out every single day; that's what we're doing. But broadly speaking, what are the characteristics of a successful educational system or educational process that you think we should be striving toward over the next 20 years? What is missing today that you hope to see in the future, that we could do better at, "we" in the collective?

**Jo:** Well, that might come from a particular perspective, so I'm going to say the things from my perspective, which is that I actually believe that the liberal arts model is the best way to teach. I think that the one-on-one interactions, and the research projects that we do, and the class projects that the students do, where I'm giving them really good feedback on every single paper, are invaluable. It just doesn't scale; that's the problem with it. There's no possible way for it to scale, whether we're talking about U.S. education or internationally. There's just a disconnect there. So I wonder how we can integrate that piece into it. I know that on some of the other online platforms they do things like peer-to-peer grading and peer-to-peer feedback, and I wonder if maybe that's a piece of it: some kind of scaling of communicating about ideas that are not fully formed, or ideas that are wrong. I'm a huge user of things like Stack Exchange, so that's sort of that platform of getting good feedback. It's not always perfect, but getting good feedback quickly and often, I think,
is pretty important.

**Nick:** In a way where other people can actually benefit from a response that was directed at your question.

**Jo:** At my query, right.

**Nick:** Yeah. No, I think that's a great point. And I think realistically it's going to be a combination of all of these things in the future. There is no silver bullet when it comes to education. Technology has its place; getting toe-to-toe with other human beings also has its place. But it will be really exciting to see how things develop over the next couple of decades, because I absolutely agree with you: things will be different.

**Jo:** Yeah.

**Nick:** Okay, cool. That's all I have, so thanks, and it's been really nice having you around.

**Jo:** Yeah, it's been fun.

**Nick:** All right, take care.
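
The resampling idea Jo describes in the interview (sample repeatedly from the original sample, recompute the statistic each time, and use the spread of those values to gauge variability) can be sketched in a few lines. This is only an illustration under stated assumptions: the course itself is taught in R, while this sketch uses Python; the data are made up; the helper names `trimmed_mean` and `bootstrap_percentile_ci` are hypothetical; and the percentile interval shown is one common bootstrap interval, not necessarily the one used in the course.

```python
import random
import statistics


def trimmed_mean(values, trim=0.1):
    """Mean after dropping the lowest and highest `trim` fraction of values."""
    ordered = sorted(values)
    k = int(len(ordered) * trim)  # observations cut from each end
    return statistics.mean(ordered[k:len(ordered) - k])


def bootstrap_percentile_ci(sample, statistic, n_resamples=2000, level=0.95, seed=0):
    """Percentile bootstrap interval: resample `sample` with replacement,
    recompute `statistic` each time, and read off the middle `level` share."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(sample)
    replicates = sorted(
        statistic(rng.choices(sample, k=n)) for _ in range(n_resamples)
    )
    alpha = (1 - level) / 2
    lower = replicates[int(alpha * n_resamples)]
    upper = replicates[int((1 - alpha) * n_resamples) - 1]
    return lower, upper


# Made-up sample (an assumption for illustration): mostly typical values,
# plus a few extreme observations on both ends.
data_rng = random.Random(42)
sample = [data_rng.gauss(50, 5) for _ in range(40)] + [5.0, 8.0, 120.0, 150.0]

estimate = trimmed_mean(sample)
low, high = bootstrap_percentile_ci(sample, trimmed_mean)
print(f"trimmed mean = {estimate:.2f}, 95% bootstrap interval = ({low:.2f}, {high:.2f})")
```

Any other summary without a convenient formula (a median, a ratio, a regression coefficient) could be passed in place of `trimmed_mean`; the resampling loop stays the same, which is the appeal of the approach described in the interview.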