#83 Empowering the Modern Data Analyst (with Peter Fishman)

The Importance of Passion and Analytics in Career Development

For many individuals, having a passion for a particular field can be a key driver of motivation and success. In the context of career development, particularly in the realm of data analysis, this passion can play a crucial role in shaping one's approach to work. As Peter mentioned, he had a passion for baseball and football, which led him to read extensively about analytics in these spaces. That enthusiasm drew him to the analytical thinking, questioning, and breakdowns he found in sports writing, habits he argues carry over to whatever discipline an analyst ends up working in.

Peter's experience is echoed by Michael Lewis's book "Moneyball," which provides a compelling example of how the same principles of analytics can be applied across different disciplines. By breaking down complex problems into their component parts, identifying areas for improvement, and implementing targeted solutions, organizations can gain a competitive edge. This approach is equally relevant in the early stages of startups, where entrepreneurs must navigate uncertain markets and make informed decisions about resource allocation.

The Rise of Citizen Data Scientists

A significant trend shaping the modern data landscape is the emergence of citizen data scientists – individuals who possess technical skills, such as SQL or Python, but may not have traditional data engineering backgrounds. This shift has far-reaching implications for companies looking to harness the power of analytics. As Peter noted, savvy individuals from various departments are now writing code and driving data-driven decision-making processes.

The democratization of data analysis is opening up new opportunities for roles that were previously non-tech oriented. BizOps, marketing ops, and other teams are now equipped with the skills to tackle complex data problems, creating a more level playing field in the industry. This trend has significant implications for companies seeking to solve their data problems, as they can now tap into a wider pool of talent.

The Democratization of Data Infrastructure

Another trend gaining momentum is the democratization of data infrastructure. In the past, accessing advanced data tools and technologies often required significant investments – upwards of $2-3 million. Today, these resources are becoming more accessible, with many solutions available for a fraction of the cost. This shift has enabled smaller companies to compete with larger enterprises, which previously had a monopoly on data-driven insights.

The impact of this trend is multifaceted. With more organizations adopting data infrastructure, the types of roles that were once exclusive to large corporations are now becoming more mainstream. This shift is also driving greater hybridization of jobs, as individuals from various backgrounds come together to tackle complex data problems.

Mozart Data's Approach to Empowering SMBs

At Mozart Data, Peter and his team have seen firsthand how these trends can be harnessed to support small businesses and startups. By giving organizations the tools needed for data-driven decision-making, companies like Mozart Data are helping to level the playing field. Their approach recognizes that even smaller businesses can benefit from advanced data infrastructure, which is why the platform is designed to get teams up and running in under an hour, with connected data sources, a Snowflake data warehouse, and transforms feeding a BI tool, without requiring extensive data engineering support.
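In the interview, the stack behind this kind of quick onboarding is described as three pieces: an extract-and-load tool, a data warehouse, and a transform layer that cleans the data and joins it across sources, much like a VLOOKUP in Excel. A minimal sketch of that ELT pattern is given here, assuming Python's built-in sqlite3 module as a stand-in for a cloud warehouse. The table names, sample rows, and the account_revenue function are hypothetical illustrations and are not taken from the interview or from any vendor's product.

```python
import sqlite3

# Hypothetical raw extracts from two separate sources (a CRM and a billing system).
CRM_ROWS = [
    (1, " Acme Corp "),   # stray whitespace, a typical cleaning problem
    (2, "globex"),
    (3, "Initech"),       # account with no invoices yet
]
BILLING_ROWS = [
    (1, "2024-01", 1200.0),
    (1, "2024-02", 1300.0),
    (2, "2024-01", 400.0),
    (2, "2024-01", 400.0),  # exact duplicate record
]


def build_warehouse(conn):
    """Load the raw data unchanged (E + L), then clean and join it in SQL (T)."""
    conn.execute("CREATE TABLE raw_crm_accounts (account_id INTEGER, name TEXT)")
    conn.execute(
        "CREATE TABLE raw_billing_invoices (account_id INTEGER, month TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO raw_crm_accounts VALUES (?, ?)", CRM_ROWS)
    conn.executemany("INSERT INTO raw_billing_invoices VALUES (?, ?, ?)", BILLING_ROWS)

    # Transform step: drop exact duplicate invoices, trim names, and join the
    # two sources on account_id. Using DISTINCT is a simplification for this
    # sketch; real pipelines usually deduplicate on an invoice key instead.
    conn.execute(
        """
        CREATE TABLE account_revenue AS
        WITH invoices AS (
            SELECT DISTINCT account_id, month, amount
            FROM raw_billing_invoices
        )
        SELECT a.account_id,
               TRIM(a.name) AS name,
               COALESCE(SUM(i.amount), 0.0) AS total_revenue
        FROM raw_crm_accounts AS a
        LEFT JOIN invoices AS i ON i.account_id = a.account_id
        GROUP BY a.account_id, TRIM(a.name)
        """
    )


def account_revenue():
    """Run the sketch end to end and return the cleaned, joined table."""
    conn = sqlite3.connect(":memory:")
    try:
        build_warehouse(conn)
        return conn.execute(
            "SELECT account_id, name, total_revenue "
            "FROM account_revenue ORDER BY account_id"
        ).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for row in account_revenue():
        print(row)
```

The raw tables are kept untouched and all cleaning happens in the transform query, which is the main difference between ELT and the older extract-transform-load ordering: the messy source data stays available in the warehouse if the cleaning rules later change.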

Conclusion

As we close out our conversation, Peter offered some final thoughts on the future of analytics in the modern data stack. The rise of citizen data scientists, democratization of data infrastructure, and hybridization of jobs are all trends that will continue to shape the industry in the years to come. With more organizations embracing advanced data tools and technologies, the distinction between large corporations and smaller businesses is becoming increasingly blurred.

As we look ahead to the future, it's clear that passion, analytical thinking, and a willingness to adapt will be essential for success in this rapidly evolving landscape. By staying connected with Datacamp and exploring their resources, organizations can tap into a wealth of expertise and knowledge, empowering them to unlock the full potential of data-driven decision-making.

References:

Lewis, M. (2003). Moneyball: The Art of Winning an Unfair Game. W.W. Norton & Company.

Peter's Interview Notes

Interviewee: Peter Fishman, co-founder and CEO of Mozart Data

Topic: Data Analytics in Modern Organizations

Key Takeaways:

* Passion and analytical thinking are essential for success in data analytics.

* The rise of citizen data scientists is democratizing access to advanced data tools and technologies.

* Democratization of data infrastructure is enabling smaller businesses to compete with larger enterprises.

* Hybridization of jobs is driving greater collaboration between individuals from various backgrounds.

Final Call to Action:

For organizations looking to get started on their data journey, Mozart Data's all-in-one platform is designed to connect data sources, spin up a warehouse, and get a team querying in under an hour, which makes it a practical starting point for small and medium-sized businesses. The host also recommends DataCamp Workspace for anyone who wants to move from a local notebook environment to a cloud-based collaborative one.

"You're listening to DataFramed, a podcast by DataCamp. In this show, you'll hear all the latest trends and insights in data science. Whether you're just getting started in your data career or you're a data leader looking to scale data-driven decisions in your organization, join us for in-depth discussions with data and analytics leaders at the forefront of the data revolution. Let's dive right in.

Hello everyone, this is Adel, data science educator and evangelist at DataCamp. The past few years have seen an incredible addition of new tools and frameworks that empower even the smallest data teams to do more. These tools are often what is referred to as the modern data stack. One aspect of the modern data stack is that it empowers practitioners like data analysts to deliver insights and prove value at a much faster scale. This is why I'm excited to be speaking with Peter Fishman, CEO of Mozart Data. Mozart Data empowers data analysts by providing them with out-of-the-box data warehouses that allow anyone to connect their disparate data sources easily, apply simple transformations, and start analyzing data, all without any data engineers. Throughout our conversation, we speak about his experience launching Mozart Data, the trials and tribulations most data teams face when trying to hit the ground running, the skills modern data analysts need to have, the importance of developing subject matter expertise in analytics roles, and more. If you enjoyed this podcast, make sure to subscribe and rate the show, but only if you enjoyed it. Also, if you're interested in the modern data stack and want to transition your local notebook environment to a cloud-based collaborative environment, I highly recommend checking out DataCamp Workspace, where you will be able to code in Python and R and use a bunch of templates and data sets to get you started in data science right in the browser. Now let's dive right in.

Peter, it's great to have you on the show. I am excited to talk to you 
about the modern data stack, the skills that define a successful data analyst today, and more. But before that, can you give us a bit of background about yourself and how you ended up where you are today?

Great to be here. I'm Pete Fishman, the co-founder and CEO of Mozart Data. Like many people in the data space, I'm something of a failed academic who transitioned into the world of applying statistical experience and putting that into technology. So I've been working at startups for the last decade plus, mostly in data functions, and ultimately my friend Dan and I decided to build basically ourselves as a service. Then we built Mozart Data, which we call the easiest way to spin up a modern data stack.

That's great. So can you walk us through how these experiences that you've had across industry and academia led you to launch Mozart Data, and can you walk us through the challenges Mozart Data tries to solve?

There's a long thread there, because it does sadly capture many, many, many years, but there is a lot of consistency in the theme. What has basically happened is that data has become essentially bigger over time, not just the buzzword of big data, but basically the computing power ultimately has a lot of downstream effects: people can collect more data because they can get more value out of that data. My arc looks like this: I was doing very early empirical work in grad school. Obviously statistics has been around for a very, very long time, but this was the first time you could really use hundreds of thousands or millions of observations. Today, doing analysis with millions of observations is not just trivial, people would eye-roll at that. But for me, that was the size of the data sets that I was working with during my PhD program, which at the time was almost unthinkably large, exceeding whatever Excel could do. But what ultimately has happened is that you find insight in the data, and then 
companies figure out ways to take advantage of it, and then you have to go find that next insight in the data. So I started my career in the Facebook game space, where a lot of these companies competed over using data in novel ways. Facebook had billions of users, so as a result the data sizes and volumes were gigantic and you could make really novel insights. We paid very close attention to CACs and LTVs, and the game was to build a virtuous cycle of buying people's eyeballs very efficiently and then feeding that into monetizing and getting more people onto your platform. I then saw the opportunity to deploy that in the B2B world. The defining part of my career was this company, Yammer. At Yammer, we took a lot of the B2C approach to software development and applied it in the B2B world and the bottom-up SaaS world, which didn't really exist at the time. It calls for a lot of understanding of what your users are doing, and understanding the attractiveness of the prospects as a function of who's actually using your product. That required data folks and, not only that, data infrastructure. So I built a tool at Yammer called Avocado, along with a really great team. Avocado today is really Mozart Data plus Mode Analytics. From there, I've had a lot of different opportunities to have similar data infrastructure at different companies before ultimately deciding to build it myself.

That's very exciting, and I'm excited to unpack this further. Before that, though, I want to set the stage for the landscape data teams are working in today and the dynamics that really led to the launch of Mozart Data. As you said, and I completely agree with this notion, data science has become table stakes and is no longer a nice-to-have. So I wanted to start off our chat by first asking: how would you define a data-driven organization, and how can an organization integrate data science as a 
table stakes practice today?

Sure. I think most people have a mental image of a data-driven organization as one with lots of TVs all over the office (now there are non-existent offices), and those TVs have time series of KPIs, and people just walking around the building understand what's going on with the company by observing the time series of the KPIs. I set up a straw man, but I very deeply disagree with that. The first thing I'd say is that very few canned ways of looking at the data provide the necessary insights that you're talking about. A data-driven organization is one where data has a very important part at the key decision-making tables. That can mean a very senior executive who's a data person. That can mean that data starts every meeting. That can mean that data analysts have access to all sorts of key decision makers, or ultimately that data becomes the key decision maker, more so than the word "strategic." I often find that non-data-driven organizations talk about strategic investments, ones that almost can't be justified in the data. When you start a company, when there's zero ideas, zero data, zero people, or one of each of those things, you end up really needing to actually be strategic. You need to imagine a world that doesn't exist, that cannot be justified by looking backwards, and you need to apply your own direction and thought and beliefs. Now, data can inform that. One of my favorite examples: when I worked at this Facebook games company called Playdom, we used to sometimes run advertisements on games that were basically half finished. Though you couldn't make a statistically highly confident conclusion about how effective or successful the game might be, you could get a flavor for how difficult it would be to acquire users. So your belief about that could be tested even before the game existed. So that's not to say at a super early company you have to only go on 
gut and strategy. But what I think of as a data-driven organization is one where data is a first-class citizen. Not just that they collect data and have dashboards and look at time series and can go to bed at night because they know that their company is going up and to the right, but rather that key decisions are informed by data, and by cuts and dives and summaries and models of the data.

So, to double down on your point here, a data-driven organization is one where data becomes a habit across the decision-making life cycle rather than something to look at.

Absolutely.

So what are the main challenges affecting organizations today that truly want to make the most of their data?

The way that an organization gets to a place where it can be data-driven is by not being data-driven. The success that brought you here is not going to be a data-driven success. It's going to be a success that's driven often by the founders, but typically by beliefs about the world that couldn't necessarily be justified at the time and that end up proving out to be true. So you typically have this headwind: the thing that brought you here was that you weren't data-driven. How organizations become data-driven tends to be an underlying belief that "our organization must be data-driven," and not because a venture capitalist has told you to be data-driven, not because the world and the podcasts you listen to tell you to be data-driven, but rather because you ultimately truly believe that the signals that the world is giving you are going to be more informative when aggregated and summarized the right way. I teach a class sometimes at Berkeley, where I did my PhD, and I go back and put up a bunch of different ads from games that we ran on Facebook, and I say, "Which of these is the most effective? Which is going to get the best clicks?" People raise their hands, not indiscriminately, but they have some that they like, and the ones that they like actually tend to be the 
better ones. But when you show the ad to 100 million people, their opinion is correct, more so than any true experts'. I think what you need to do is develop that muscle over time. Now, that is not to say it's too late if you haven't done that, if you haven't had it beaten into you that you really need to think about the data the right way and use the data; you can still very quickly adopt that. If I go back to my time at Yammer, we had two very strongly opinionated leaders, two co-founders, David Sacks and Adam Pisoni, who are famously very talented at product and technology. They would have a ton of intuition, and it was in fact that intuition that made Yammer an attractive company for me to join. But early in my career, actually in the first three months of my career, we ran an A/B test on the new user flow which went counter to both of their intuitions. We did it almost haphazardly, by accident, but it really set my career up for success, because the results were very clear and slightly counterintuitive, and you very, very rarely see that in technology. I think even data people like to hype, "Oh, you run these experiments and you get these counterintuitive results and your company becomes better." That happens rarely. Much more often you get no results off of things that you think are almost certainly going to work, rather than counterintuitive, statistically significant results. That's happened to me not a handful of times in my career but very, very, very few times in my career, and it just happened that it was in an early part of my time at Yammer. That basically changed their whole perspective on how important it was to run A/B tests when releasing products, and it became an essential part of the release criteria. I think ultimately that was a little bit of chance but a lot of open-mindedness of those two folks, both of whom are now investors in Mozart Data. But on top of it, I think 
it takes either that it's at your core or that you get a very clear lesson, and that's how you become a data-driven organization.

That's awesome, and I wish we could dedicate an entire episode just to unpack your experience at Yammer and working with people like David Sacks. Now, of course, a key component of becoming data-driven as an organization is the set of tools and supporting infrastructure that enables faster time to insight. This is what is often referred to as the modern data stack. I'd love it if you could break down what you think is meant by the modern data stack, and what are the characteristics that differentiate it from the previous set of tools data teams are used to?

The modern data stack is not really all that modern. The modern data stack is a modernization of existing data tools and data pipeline tools that have been around for a very long time. The branding on it is great, because I hear the words all of the time, and it is well deserved at some level, which is to say cloud data warehousing has become ubiquitous among users of data. So the first thing I'd say is that there are these powerful columnar warehouses that are able to crunch giant amounts of data, not the data sizes that I was working with 20 years ago, but real joins on huge data sets. What that does is enable you to use data from multiple places. So what a modern data stack does is effectively not too different from what a VLOOKUP in Excel would do, which is to say it's joining data from multiple places. The stack that gets you there is an EL tool, a powerful data warehouse, and a T, a transform layer, so a layer to essentially clean up your data. You have to extract and load data from many different sources, and then you have to clean and transform it, so ELT the data. So when people talk about the modern data stack, they are talking about ELT, but the T now has a big meaning: cleaning. Everybody's always known that cleaning 
is a huge part of what a data person does. My old boss at Microsoft was Ronny Kohavi, who has a joke, and I don't know if it's his joke, but I know he loves to use it, which is to say that 95 percent of data science is cleaning data and only five percent of data science is complaining about cleaning data. He lands the punchline a little bit better than I do, but what he's saying is that people think it's all building these incredible models off of these beautiful data sets that you compete on or are given, and in practice so much of the work is cleaning and making sure the data is right or consistent, and so little of the work that a data person does is real data analysis. It's certainly not zero percent, but the joke lands better when you expect the answer to be five percent and it turns out that the rest of the time is actually just about complaining. If I think about what the modern data stack is now, it's all of these tools that represent the cleaning layer. It's not just scheduled tables; it's a variety of different parts of making sure that the data that you are looking at downstream, most likely in your BI tool, has essentially traveled without any problems.

An exciting part of the modern data stack for me is really the emergence of new categories within the data stack. For example, last year we interviewed Barr Moses, CEO of Monte Carlo, about how they're trailblazing the data observability category. What are some of the categories and tools you've seen emerge over the past few years that you've been excited about?

Sure. Of course I'm going to say a managed data pipeline, I think, is the coolest category, and I happen to love one particular company in that. However, beyond that, there are a variety of tools that get a little bit more what I call upstream and up-market, into larger companies that have larger data teams that are using 
their data in a variety of ways. Ultimately, once you have loaded your data into your warehouse, there's a variety of things: there's data observability, there's data cataloging. I remember way back in the day we used to have columns like revenue_final_the_one_to_use_you_really_want_this_one_v6. What I think matters is the ability for larger data teams to come in and understand the world quickly. What you find is that once you actually have a mature data organization, it might take someone weeks, or months even, to come in and understand the stack. DJ Patil has a line about his time at LinkedIn, which was that so much about being successful as a data scientist at LinkedIn was about getting a win in your first 90 days. If it takes you 90 days to get up on the stack, or 89 days, you'd better be amazing; you'd better be able to find something incredible in one day. Whereas if it takes you a week or a day or an hour to get up on the stack, now you have a real chance to be successful at that company. So there's a proliferation of tools that really savvy companies, like a LinkedIn, like a Yammer, all built and used. Obviously Airbnb has built a number of the famous ones. What those tools were about was making data people effective. Now a lot of companies have sprouted up building the tooling that these companies spent countless millions of dollars developing (Airbnb probably spent hundreds of millions of dollars on it, not that it mattered), and making that accessible to companies that don't have the budgets of Airbnb or Facebook or whomever. So I see a lot of development in that space. There are obviously other categories that are popping up. Reverse ETL is a great example of a downstream one that we had built in the bottom-up SaaS world and at, like, subscale. So now having services 
that will do this, or having services that do extract and load, I think is really, really important for companies.

Where does Mozart Data fit within the modern data stack, and how does it solve some of the challenges we've discussed thus far? And can you walk us through some examples of Mozart Data in action?

Mozart Data basically is an all-in-one data platform. What that means is that in under an hour you can start connecting multiple data sources, we spin you up a Snowflake data warehouse, and you can start writing your transforms and connecting a BI tool or reverse ETL tool and start to get insights. The real magic is that this used to take months and a number of data engineer hires, or you'd do a lot of vendor assessment and then pick your potpourri of vendors to do it, or you'd hire a consultant to do it. Today this can all be done in effectively no time, and by the time you're done with the demo you could be up and running and querying your data in your favorite BI tool. Really, there is this challenge of speed to insight, and Mozart wants to empower not just very savvy data engineers but rather everyone in the data landscape to be up and running with this modern data stack, all very quickly, all without being gated by engineering.

What I love about Mozart Data is how much it empowers data analysts and citizen data analysts to get started quickly with data and provide value quickly without depending on data engineering or infrastructure work. You're someone who's led data teams and worked with a lot of data analysts while developing Mozart Data and more. I'd love it if you could break down how you think the data analyst role has evolved over the past few years, and where do you see it heading in the future?

Right at the time when the term data science took off (Jeff Hammerbacher and DJ Patil kicked off this term "data scientist," and then the incredibly rapid growth in that profession happened), the title data scientist 
was being applied everywhere in the data space, and the reason was that working as a data scientist basically meant that you got paid a lot more than working as a data analyst. So everybody started co-opting the term, and then you saw it represent everything: you had folks who were doing ML engineering all the way to folks who were maybe just out of college working with data for the first time, all holding this title data scientist. It represented a vastly different set of skills all encompassed by the same title, and it meant a different thing at different companies. Today you see much greater granularity. You see people who hold RevOps or BizOps titles. You see folks whose specific expertise is distinguished, so an analytics engineer is someone who's very different from a data engineer, a data scientist today has a specific role within a company, and a data analyst tends to have a specific role. Now, if you had a Venn diagram of the skill sets, a lot of that would still overlap. I don't think that one title is better; there's no greater-than sign. I think a lot of the core skill set ends up being the same. What makes for a really great data scientist actually makes for a really great marketing ops analyst, which is to say a deep understanding of causal relationships and of inference. It's a different set of technical skills, it's obviously a different role within the company and the organization, and you do different things on a day-in, day-out basis, but the core is still about data thinking and data capabilities rather than specific technical expertise.

I completely agree here, especially since there's a layer of skills that's to a certain extent invariant as the role evolves over time. What do you think are the defining skills data analysts should cultivate to become successful in a modern data team today?

I'm a little bit biased, because I spent a big part 
of my 20s thinking about causality. I did a PhD in economics and studied behavioral economics, and what was typically true was that you would get great data sets that were not generated by experiment: data sets where you measured things over time and had an understanding of an individual with an ID over time, but you didn't necessarily have what you really wanted, which is to run a scientific experiment, put people in condition A and condition B, have a hypothesis, and see which one wins out. When you don't have that, you have to do almost statistical tricks. You have to think about, okay, what is something like an experiment? I think this is one of the most underrated skills in data. What you're trying to do with your data is essentially assign a causal relationship based on the past that you think applies in the future, for a number of reasons; you think that there was a mechanism that brought it about that still exists today. So I value people who have that deep thinking about causal relationships and an understanding of what typically is wrong with data. The classic example is you say, okay, drowning deaths are always up in the months where ice cream consumption is up. Obviously all novices say, well, that's because in the warm months people are eating ice cream and they're going to the beach or to the pool, and they realize that ice cream is not actually the causal mechanism. But then you divorce it from that specific joking context and bring it into a world where many things are going on, and your job, or in some sense the value you bring to the company, depends on identifying a relationship that you think moves the company, whether it's their marketing, their business, their product, or their users, forward. And then you start abandoning that critical perspective. So in general what I like 
is a sort of dismantling of good work: thinking about all the ways in which a good insight or good work could be flawed. Maybe somebody did a robustness check that proved that it wasn't flawed, but at the very least, when you read it or look at the work that was done, can you be skeptical and say, okay, maybe it's mostly driven by something that won't necessarily repeat itself? Because a lot of these, when they do replication studies... When I worked at Microsoft, I worked at Bing, and at Bing you had the huge luxury of not just millions, not just billions, but trillions of observations, and you could keep tests running and get inference from there. So I think inference is the big skill, but inference with small data is also a real skill. It's a little confusing, because typically you can't make inference with small data. If you see one observation, an n of one or two, you literally can't make a valid statistical inference from that. But the skill is really having a deep thought about mechanism and how you would set things up to actually learn that answer in a space where you're pretty constrained by data. What you find, and we found this at Bing, is that even when your data size is infinity, you always want to cut it and cut it and cut it to a smaller and smaller cohort to make a more and more precise inference, and without fail you run out of data, even when the size seems like infinity. I think those two skills are often the most underrated. They're the ones that I think people should develop and work on, and they're also the ones that we interview for, not just at Mozart but at a lot of the places that I've worked.

And being able to make these inferences and spot these causal relationships within the data set requires a lot of subject matter expertise. Oftentimes what's missed in the discourse around upskilling and breaking into tech is subject matter expertise 
and domain knowledge, especially to be able to succeed in analytics roles and data roles. Can you comment or expand on the importance of subject matter expertise in a data role and how it has helped you in your career?

Well, this picks up great, like you mentioned, off of the last question, which is: if your key insight is thinking about the right mechanism that is driving the causal relationship you're ascribing to your data, then actually understanding what your users are doing and what motivates your users is critical. Again, I worked at Yammer; we were the biggest per capita consumers of our product as a company. So it's not surprising: Dan, my co-founder at Mozart Data, and I started a hot sauce company 13 years ago, and we were also the number one consumers of that hot sauce. So subject matter expertise is 100 percent table stakes thinking that you have to bring in order to understand those relationships. Now, the flip is that sometimes it deeply works against you. It's not linear up, it's not necessarily just concave, as in, as you get more and more subject matter expertise this first derivative remains positive. You can find that sometimes you are so deep in your world that you are missing what the typical user is doing. A lot of times in past jobs we've had that problem, where we are the right tail of usage and expect everybody to understand some of the subtle things that are going on within the tool. What you find is that people have a surprisingly surface-level willingness to pay attention. You're the most important thing to you, and a lot of times you can build software and to you it's incredible, but to the typical user who isn't willing to make that same investment in learning all of your nuances, it might not be the case. So subject matter expertise, first of all, is the table stakes to get started. You can't reasonably understand the mechanisms that are driving your user base without understanding your users in 
the first place. That's why you often see, at consumer companies like Airbnb and Uber, that the people who work there are just nuts about using those products. Brian Chesky famously stayed in Airbnbs for one whole year; he didn't have an apartment, and that was a critical part of essentially developing domain expertise. Yes, it's about empathy for the customer, but it's also about tuning that domain expertise. Everybody I knew that worked in ride sharing was taking ride shares everywhere; if they had to go across the street, they'd take a ride share. I think it's developing not just that subject matter expertise but also really getting into your users' mindset.

So given your experience in startups and working in smaller organizations, how do you instill that subject matter expertise in early-stage startups, when they don't necessarily have these massive user bases, when they make hires, for example?

So I worked at a company called Opendoor, and Opendoor, when I was there, was largely buying and selling homes in Phoenix. I had no desire to buy or sell. I didn't own a home in Phoenix, and I didn't have a desire to buy a home in Phoenix. Obviously now they're in many, many more markets. And I didn't have the ability to gain expertise in essentially that buyer's journey, because I never went through it. You're not always gifted the situation that I discussed with the consumer companies, where if you're a data scientist at, say, Facebook, you're dogfooding it all the time. I think the key is, one, obviously if you can do that, it is a huge advantage. And if you can't, I think you really want to disproportionately invest in the YC tropes, which are talk to customers, talk to customers, talk to customers. So I do think that sitting down, observing customers, talking to customers, talking to prospects that rejected you, all of those things are trying to up your knowledge. Now, the flip is I'm now selling a product that de facto I've worked on
for 20 years. So your subject matter expertise is not necessarily one that happens the second that you sign your offer letter. Hopefully you're leveraging, in my case, over 40 years of subject matter expertise. But beyond that, you want to be able to really understand your customer, whether that customer is you or whether that takes exhaustive research. You shouldn't think of your title as, well, my title says data in it, so I've got to be in a back corner doing data. The term that I like to use is "use your feet," which is: talk to the product or customer-facing folks in your organization, or if you can talk to customers, that's great.

And flipping the question slightly: if I am a data analyst breaking into a new vertical, whether at a startup or an enterprise, what is the fastest way for me to develop subject matter expertise?

So I do think adjacent problems can be helpful. I've loved reading Nate Silver for the longest time, and I do think reading folks that think about data the right way helps. I started my career in the NFL as a statistician, not as a player. I had been into sports statistics my whole life, and I think that there were a lot of parallels to that thinking. Baseball famously had solved a lot of these real problems of figuring out what had tight relationships with performance, what had predictability, et cetera. That thinking very deeply motivated me. I was excited about it, I just had a passion for it, and I read up a lot about it. I think if there's analytics in a space that you love, and for me that was baseball and football, where there's now tons of material though at the time there were just limited amounts, and if you can find those people who love writing in the spaces that you love, I think you're going to find good analytical thought, questioning, and breakdown, and that's going to apply to whatever discipline you end up in. I mean,
reading Moneyball, my favorite book from Michael Lewis, it really is the same type of thinking that I would give early-stage startups. It's in fact the same advice YC gives, which is: write down the equation of success, then break it into its component parts, then measure those parts, and then dive in when one of these pieces is not working. Cohort and cut and summarize. That's how you start analytics anywhere, but it certainly was how the A's did it 20-some-odd years ago, when they were trying to compete with bigger-market teams.

That's awesome. And as we close out our conversation, I'd love it if we can think about the future for a bit, and what you think are some of the trends that are really going to shape how individuals and organizations work with data. I'd love it if you can list some of the trends that you're particularly excited about when it comes to the modern data stack, and how it will affect data-driven organizations.

I think we touched on one of them, which is the real rise of the citizen data scientist. First of all, you see a bunch of savvy people that write SQL that don't have data titles. You know, biz ops, marketing ops: writing SQL or Python or something like that is just not uncommon in roles that were almost exclusively non-tech. I think that is a really exciting moment for anyone in the data space, because the data space is now opening up to many, many more roles at companies, and many, many more people have the chops to do something, to be a little dangerous with their data. I think this is a great trend for companies trying to solve the data problem for SMBs, and obviously I'm very excited about one of those companies, Mozart Data. The other part of the trend that I'm excited about, which also relates to Mozart Data, is that it used to cost you: hire a couple of data engineers and buy a bunch of expensive infrastructure, and you might be out two, three, four, five million
dollars just to get started on your data journey. Today it's a six-dollar swipe of a credit card and you're off to the races. Now, it's metered, and your bills become significant; your investment in data ultimately becomes significant. But the fact that you can get started for close to nothing is incredible. It is a huge difference. If you think about the types of companies that could really afford a multi-million-dollar investment in data so that they could have that advantage, those were the largest companies. You could only get jobs at the biggest companies, because those were the companies that had the data teams, those were the companies that were leveraging the data and could effectively take advantage of their scale in applying those data insights. Today this is becoming table stakes earlier and earlier. So more and more companies like ours, not just ours but like ours, are really empowering and enabling the SMB to use data infra, the types of data tooling that I see more upmarket. In fact, generally I find data stacks to actually be stronger downmarket, before there's a dozen sources of truth. It's actually a little bit paradoxical: almost the more constrained your budget, the more likely you are to end up with effectively a tighter data stack.

That's great, and I love that first trend especially. This is something that we've definitely seen at DataCamp, with the hybridization of jobs and the emergence of data skills in traditional roles like finance, marketing, and a lot more. Now finally, Peter, I had an awesome chat with you today. Do you have any final call to action before we wrap up?

Yeah, obviously I'm rooting for so many people in their data journeys, and we love helping small companies at the start of their data journey get up and running on their data infrastructure, all in under an hour, without really needing any data engineering support. So if you're interested in that, we'd love to talk to you at Mozart Data. So I'm pete
mozartdata.com.

That's awesome. Thank you so much, Peter, for coming on the podcast.

You've been listening to DataFramed, a podcast by DataCamp. Keep connected with us by subscribing to the show in your favorite podcast player. Please give us a rating, leave a comment, and share episodes you love. That helps us keep delivering insights into all things data. Thanks for listening. Until next time.

You're listening to DataFramed, a podcast by DataCamp. In this show, you'll hear all the latest trends and insights in data science. Whether you're just getting started in your data career or you're a data leader looking to scale data-driven decisions in your organization, join us for in-depth discussions with data and analytics leaders at the forefront of the data revolution. Let's dive right in.

Hello everyone, this is Adel, data science educator and evangelist at DataCamp. The past few years have seen an incredible addition of new tools and frameworks that empower even the smallest data teams to do more. These tools are often what is referred to as the modern data stack. One aspect of the modern data stack is that it empowers practitioners like data analysts to deliver insights and improve value at a much faster pace. This is why I'm excited to be speaking with Peter Fishman, CEO of Mozart Data. Mozart Data empowers data analysts by providing them with out-of-the-box data warehouses that allow anyone to connect their disparate data sources easily, apply simple transformations, and start analyzing data, all without any data engineers. Throughout our conversation, we speak about his experience launching Mozart Data, the trials and tribulations most data teams face when trying to hit the ground running, the skills modern data analysts need to have, the importance of developing subject matter expertise in analytics roles, and more. If you enjoyed this podcast, make sure to subscribe and rate the show, but only if you enjoyed it. Also, if you're interested in the modern data stack and want to transition your local
notebook environment to a cloud-based collaborative environment, I highly recommend checking out DataCamp Workspace, where you will be able to code in Python and R and use a bunch of templates and data sets to get you started in data science right in the browser. Now let's dive right in.

Peter, it's great to have you on the show. I am excited to talk to you about the modern data stack, the skills that define a successful data analyst today, and more. But before that, can you give us a bit of background about yourself and how you ended up where you are today?

Great to be here. I'm Pete Fishman, the co-founder and CEO of Mozart Data. Like many people in the data space, I'm something of a failed academic who transitioned into the world of applying statistical experience and putting that into technology. I've been working at startups for the last decade plus, mostly in data functions, and ultimately my friend Dan and I decided to build basically ourselves as a service. Then we built Mozart Data, which we call the easiest way to spin up a modern data stack.

That's great. So can you walk us through how these experiences that you've had across industry and academia led you to launch Mozart Data, and can you walk us through the challenges Mozart Data tries to solve?

There's a long thread there, because it does sadly capture many, many years, but there is a lot of consistency in the theme. What has basically happened is that data has become essentially bigger over time, not just the buzzword of big data, but the computing power ultimately has a lot of downstream effects: people can collect more data because they can get more value out of that data. My arc looks like this: I was doing very early empirical work in grad school. Obviously statistics have been around for a very, very long time, but it was the first time where you could really use hundreds of thousands or millions of observations. You know, today doing analysis with
millions of observations is not just trivial, people would eye-roll at that. But for me that was the size of the data sets that I was working with during my PhD program, which at the time was almost unthinkably large, exceeding whatever Excel could do. What ultimately has happened is that you find insight in the data, then companies figure out ways to take advantage of it, and then you have to go find that next insight in the data. So I started my career in the Facebook game space, where a lot of these companies competed over using data in novel ways. Facebook had billions of users, so as a result the data sizes and volumes were gigantic and you could make really novel insights. We paid very close attention to CACs and LTVs, and the game was to build a virtuous cycle of buying people's eyeballs very efficiently, then feeding that into monetizing and getting more people onto your platform. I then saw the opportunity to deploy that into the B2B world. The defining part of my career was this company Yammer. At Yammer we took a lot of the B2C approach to software development and applied it in the B2B world and the bottom-up SaaS world, which didn't really exist at the time. It calls for a lot of understanding of what your users are doing, and understanding the attractiveness of the prospects as a function of who's actually using your product. That required data folks, and not only that, data infrastructure. So I built a tool at Yammer called Avocado, along with a really great team. Avocado today is really Mozart Data plus Mode Analytics. From there, I've had a lot of different opportunities to have similar data infrastructure at different companies before ultimately deciding to build it myself.

That's very exciting, and I'm excited to unpack this further. Before that, though, I want to set the stage for the landscape data teams are
working in today and the dynamics that really led to the launch of Mozart Data. As you said, and I completely agree with this notion, data science has become table stakes and is no longer a nice-to-have. So I wanted to start off our chat by first asking: how would you define a data-driven organization, and how can an organization integrate data science as a table stakes practice today?

Sure. So I think most people have a mental image of a data-driven organization as one with lots of TVs all over the office, though now there are non-existent offices, and those TVs have time series of KPIs, and people just walking around the building understand what's going on with the company by observing the time series of the KPIs. I set up a straw man, but I very deeply disagree with that. The first thing I'd say is that very few canned ways of looking at the data provide the necessary insights that you're talking about. A data-driven organization is one where data has a very important part at the key decision-making tables. That can mean a very senior executive that's a data person; that can mean that data starts every meeting; that can mean that data analysts have access to all sorts of key decision makers; or ultimately data becomes the key decision maker, more so than the word "strategic." I often find that non-data-driven organizations talk about strategic investments, ones that almost can't be justified in the data. When you start a company, when there's zero ideas, zero data, zero people, or one of each of those things, you end up really needing to actually be strategic. You need to imagine a world that doesn't exist, that cannot be justified by looking backwards, and you need to essentially apply your own direction and thought and beliefs. Now, data can inform that. One of my favorite examples: when I worked at this Facebook games company called Playdom, we used to sometimes run advertisements on games that were basically half
finished. And though you couldn't make a statistically highly confident conclusion about how effective or successful the game might be, you could get a flavor for how difficult it would be to acquire users. So your belief about that could be tested even before the game existed. That's not to say that at a super early company you have to only go on gut and strategy. But what I think of as a data-driven organization is one where data is a first-class citizen: not just that they collect data and have dashboards and look at time series, and can go to bed at night because they know that their company is going up and to the right, but rather that key decisions are informed by data, and by cuts and dives and summaries and models of the data.

So to double down on your point here, a data-driven organization is one where data becomes a habit across the decision-making life cycle, rather than something to look at.

Absolutely.

So what are the main challenges affecting organizations today that truly want to make the most of their data?

The way that an organization gets to a place where it can be data-driven is by not being data-driven. The success that brought you here is not going to be a data-driven success. It's going to be a success that's driven often by the founders, but typically by beliefs about the world that couldn't necessarily be justified at the time and that end up proving to be true. So you typically have this headwind: the thing that brought you here was that you weren't data-driven. How organizations become data-driven tends to be through an underlying belief that our organization must be data-driven, not because a venture capitalist has told you to be data-driven, not because the world and the podcasts you listen to tell you to be data-driven, but rather because you ultimately truly believe that the signals that the world is giving you are going to be more informative when aggregated and summarized the right way. I teach a class
sometimes at Berkeley, where I did my PhD, and I go back and put up a bunch of different ads from games that we ran on Facebook, and I ask: which of these is the most effective? Which is going to get the best clicks? People raise their hands, not indiscriminately, but they have some that they like, and the ones that they like actually tend to be the better ones. But when you show the ad to 100 million people, their opinion is correct, more so than any true expert's. And I think what you need to do is develop that muscle over time. Now, even if you haven't done that, if you haven't had it beaten into you that you really need to be thinking about the data the right way and using the data, you can still very quickly adopt that. If I go back to my time at Yammer, we had two very strongly opinionated leaders, two co-founders, David Sacks and Adam Pisoni, who are famously very talented at product and technology. They would have a ton of intuition, and it was in fact that intuition that made Yammer an attractive company for me to join. But early on, actually in the first three months, we ran an A/B test on the new user flow which went counter to both of their intuitions. We did it almost haphazardly, by accident, but it really set my career up for success, because the results were very clear and slightly counterintuitive, and you very, very rarely see that in technology. I think even data people like to hype, oh, you run these experiments and you get these counterintuitive results and your company becomes better. That happens rarely. Much more often you get no results off of things that you think are almost certainly going to work, rather than counterintuitive, statistically significant results. That's happened to me not a handful of times in my career, but very, very few times in my career. And it just happened that it was in an early part of my
time at Yammer, which basically changed their whole perspective on how important it was to run A/B tests when releasing products, and it became an essential part of the release criteria. I think ultimately that was a little bit of chance but a lot of open-mindedness from those two folks, both of whom are now investors in Mozart Data. But on top of it, I think either it's at your core or you get a very clear lesson, and that's how you become a data-driven organization.

That's awesome, and I wish we could dedicate an entire episode just to unpack your experience at Yammer and working with people like David Sacks. Now of course, a key component of becoming data-driven as an organization is the set of tools and supporting infrastructure that enables faster time to insight. This is what is often referred to as the modern data stack. I'd love it if you can break down what you think is meant by the modern data stack, and what characteristics differentiate it from the previous set of tools data teams are used to.

The modern data stack is not really all that modern. The modern data stack is a modernization of existing data tools and data pipeline tools that have been around for a very long time. The branding on it is great, because I hear the words all of the time, and it is well deserved at some level, which is to say cloud data warehousing has become ubiquitous among users of data. So the first thing I'd say is that there are these powerful columnar databases that are able to crunch giant amounts of data, not the data sizes that I was working with 20 years ago, but real joins on huge data sets. What that does is enable you to use data from multiple places. So what a modern data stack does is effectively not too different from what a VLOOKUP in Excel would do, which is to say it's joining data from multiple places. The stack that gets you there is an EL tool, a powerful data
warehouse, and a T, a transform layer, so a layer to essentially clean up your data. You have to extract and load data from many different sources, and then you have to clean and transform it, so you ELT the data. When people talk about the modern data stack, they are talking about ELT, but the T now has a big meaning: cleaning. Everybody's always known that cleaning is a huge part of what a data person does. My old boss at Microsoft was Ronny Kohavi, who has a joke, and I don't know if it's his joke, but I know he loves to use it, which is that 95 percent of data science is cleaning data and only five percent of data science is complaining about cleaning data. He lands the punchline a little bit better than I do. But what he's saying is that people think, oh, it's all building these incredible models off of these beautiful data sets that you compete on or are given, and in practice so much of the work is cleaning and making sure the data is right or consistent, and so little of the work that a data person does is real data analysis. It's certainly not zero percent, but the joke lands better when you expect the answer to be five percent and it turns out the rest of the time is just complaining. If I think about what the modern data stack is now, it's all of these tools that represent the cleaning layer. And it's not just scheduled tables; it's a variety of different parts of making sure that the data you are looking at downstream, most likely in your BI tool, has essentially traveled without any problems.

An exciting part of the modern data stack for me is really the emergence of new categories within the data stack. For example, last year we interviewed Barr Moses, CEO of Monte Carlo, on how they're trailblazing the data observability category. What are some of the categories and tools you've seen emerge over the
past few years that you've been excited about?

Sure. Of course I'm going to say a managed data pipeline is the coolest category, and I happen to love one particular company in that. However, beyond that, there are a variety of tools that get a little bit more what I call upstream and upmarket, into larger companies that have larger data teams that are using their data in a variety of ways. Ultimately, once you have loaded your data into your warehouse, there's a variety of things: there's data observability, there's data cataloging. I remember way back in the day we used to have columns named revenue_final_the_one_to_use_you_really_want_this_one_v6. And what I think is valuable is obviously the ability for larger data teams to come in and understand the world quickly. What you find is that once you actually have a mature data organization, it might take someone weeks or even months to come in and understand the stack. DJ Patil has a line about his time at LinkedIn, which was that so much about being successful as a data scientist at LinkedIn was about getting a win in your first 90 days. If it takes you 90 days to get up on the stack, or 89 days, you'd better be amazing; you'd better be able to find something incredible in one day. Whereas if it takes you a week or a day or an hour to get up on the stack, well, now you have a real chance to be successful at that company. So there's a proliferation of tools that really savvy companies like a LinkedIn, like a Yammer, all built and used. Obviously Airbnb has built a number of the famous ones. What those tools were about was making data people effective. And now a lot of companies have sprouted up building the tooling that these companies spent countless millions of dollars developing. Airbnb probably spent hundreds of millions of dollars on it, not that it mattered. They're now
making that accessible to companies that don't have the budgets of Airbnb or Facebook or whomever. So I see a lot of development in that space. Obviously there are other categories that are popping up: reverse ETL is a great example of a downstream one that we had built in the bottom-up SaaS world, and at subscale. So now having services that will do this, or having services that do extract and load, I think is really important for companies.

Where does Mozart Data fit within the modern data stack, and how does it solve some of the challenges we've discussed thus far? And can you walk us through some examples of Mozart Data in action?

Mozart Data basically is an all-in-one data platform. What that means is that in under an hour you can start connecting multiple data sources, we spin you up a Snowflake data warehouse, and you can start writing your transforms and connecting a BI tool or reverse ETL tool and start to get insights. The real magic is that this used to take months and a number of data engineer hires, or you'd do a lot of vendor assessment and then pick your potpourri of vendors to do it, or you'd hire a consultant to do it. Today this can all be done in effectively no time, and by the time you're done with the demo you could be up and running and querying your data in your favorite BI tool. Really there is this challenge of speed to insight, and Mozart wants to empower not just very savvy data engineers but rather everyone in the data landscape to be up and running with this modern data stack, all very quickly, all without being gated by engineering.

What I love about Mozart Data is how much it empowers data analysts and citizen data analysts to get started quickly with data and provide value quickly, without depending on data engineering or infrastructure work. You're someone who's led data teams and worked with a lot of data analysts while developing Mozart Data and more. I'd love it if
you can break down how you think the data analyst role has evolved over the past few years, and where you see it heading in the future.

Right at the time when folks like Jeff Hammerbacher and DJ Patil kicked off this term "data scientist," and then the incredibly rapid growth in that profession happened, the title data scientist was being applied everywhere in the data space. The reason was that working as a data scientist basically meant that you got paid a lot more than working as a data analyst, so everybody started co-opting the term. You had folks that were doing ML engineering all the way to folks that were maybe just out of college working with data for the first time, all holding this title of data scientist. It represented a vastly different set of skills all encompassed by the same title, and it meant a different thing at different companies. Today you see much greater granularity. You see people that hold RevOps or BizOps titles; you see folks whose specific expertise is distinguished. So an analytics engineer is someone that's very different from a data engineer, a data scientist today has a specific role within a company, and a data analyst tends to have a specific role. Now, if you had a Venn diagram of the skill sets, a lot of that would still overlap. I don't think there's a greater-than sign between titles. I think a lot of the core skill set ends up being the same: what makes for a really great data scientist actually makes for a really great marketing ops analyst, which is to say a deep understanding of causal relationships and of inference. It's a different set of technical skills, it's obviously a different role within the company and the organization, and you do different things on a day-in, day-out basis, but the core is still about data thinking and data
capabilities rather than specific technical expertise.

I completely agree here, especially since there's a layer of skills that's to a certain extent invariant as the role evolves over time. What do you think are the defining skills data analysts should cultivate to become successful in a modern data team today?

I'm a little bit biased, because I spent a big part of my 20s thinking about causality. I did a PhD in economics, I studied behavioral economics, and what was typically true was that you would get great data sets that were not generated by experiment: data sets where you measured things over time and had an understanding of an individual with an ID over time, but you didn't necessarily have what you really wanted, which is to run a scientific experiment, put people in condition A and condition B, have a hypothesis, and see which one wins out. When you don't have that, you have to basically do almost statistical tricks. You have to think about, okay, what is something like an experiment? And I think this is often one of the most underrated skills in data. What you're trying to do with your data is essentially assign a causal relationship based on the past that you think applies in the future, for a number of reasons: you think that there was a mechanism that brought it about that still exists today. So I look for people that have that deep thinking about understanding causal relationships and understanding what typically is wrong with data. The classic example is you say, okay, well, drowning deaths are always up in the months where ice cream consumption is up. Obviously even novices say, well, that's because in the warm months people are eating ice cream and they're going to the beach or they're going to the pool, and they realize that ice cream is not actually the causal mechanism. But then you divorce it from that specific joking context and then you
bring it into a world where many things are going on, and your job, in some sense the value you bring to the company, depends on identifying a relationship that you think moves the company forward, whether it's their marketing, their business, their product, or their users, and then you start abandoning that critical perspective. So in general what I like is almost a dismantling of good work: thinking about all the ways in which a good insight or good work could be flawed. Maybe somebody did a robustness check that proved that it wasn't flawed, but at the very least, when you read it or look at the work that was done, can you be skeptical and say, okay, well, maybe it's mostly driven by something that won't necessarily repeat itself? Because a lot of these, when they do replication studies... And when I worked at Microsoft, I worked at Bing, and at Bing you had the huge luxury of not just millions, not just billions, but trillions of observations, and you could keep tests running and get inference from there. So I think inference is the big skill, but then inference with small data is also a real skill. It's a little confusing, because typically you can't make inference with small data. If you see one observation, an n of one or two, literally you can't make a valid statistical inference from that. But it's about really having a deep thought about mechanism and how you would set it up to actually learn that answer in a space where you're pretty constrained by data size. And we found this at Bing: even when your data size is infinity, you always want to cut it and cut it and cut it to a smaller and smaller cohort to make a more and more precise inference, and without fail you run out of data, even when the data size seems like infinity. I think those two skills are often the most underrated. They're the ones that I think people should develop and work on, and
it's also the ones that we interview for, not just at Mozart but at a lot of the places that I've worked.

Being able to make these inferences and spot these causal relationships within the data set requires a lot of subject matter expertise. Oftentimes what's missed in the discourse around upskilling and breaking into tech is subject matter expertise and domain knowledge, especially to be able to succeed in analytics roles and data roles. Can you comment or expand on the importance of subject matter expertise in a data role, and how it has helped you in your career?

Well, this literally picks up great, like you mentioned, off of the last question, which is: if your key insight is thinking about the right mechanism that is driving the causal relationship you're ascribing to your data, then actually understanding what your users are doing and what motivates your users is critical. So again, I worked at Yammer, and we were the biggest per capita consumers of our product as a company. So it's not surprising. Dan, my co-founder of Mozart Data, and I started a hot sauce company 13 years ago, and we were also the number one consumers of that hot sauce. So subject matter expertise is 100% table stakes thinking that you have to bring in order to understand those relationships.

Now, the flip is that sometimes that deeply works against you. So it's not linear up; it's not necessarily just concave, as in, as you get more and more subject matter expertise, this first derivative remains positive. You can find that sometimes you are so deep in your world that you are missing what the typical user is doing. Actually, a lot of times in past jobs we've had that problem, where we are the right tail of usage and expect everybody to understand some of the subtle things that are going on within the tool. What you find is that people have a surprisingly surface-level willingness to pay attention. You're the most important thing to you, and a lot of times you can
build software, and to you it's incredible, but to the typical user who isn't willing to make that same investment in learning all of your nuances, it might not be the case. So subject matter expertise, first of all, is the table stakes to get started. You can't reasonably understand the mechanisms that are driving your user base without understanding your users in the first place. That's why you often see that at consumer companies like Airbnb and Uber, the people who work there are just nuts about using those products. Brian Chesky famously stayed in Airbnbs for one whole year, didn't have an apartment, and that was a critical part of essentially developing domain expertise. Yes, it's about empathy for the customer, but it's also about tuning that domain expertise. And everybody I knew who worked in ride sharing was taking ride shares everywhere. If they had to go across the street, they'd take a ride share. I think it's developing not just that subject matter expertise but also really getting into your users' mindset.

So given your experience in startups and working in smaller organizations, how do you instill that subject matter expertise in early-stage startups, when they don't necessarily have these massive user bases, when they make hires, for example?

So I worked at a company called Opendoor, and Opendoor, when I was there, was largely buying and selling homes in Phoenix. I had no desire to buy or sell. I didn't own a home in Phoenix, and I didn't have a desire to buy a home in Phoenix. Obviously, now they're in many, many more markets. And I didn't have the ability to gain expertise in essentially that buyer's journey, because I never went through it. You're not always gifted the situation that I discussed with the consumer companies, where if you're a data scientist at, say, Facebook, you're dogfooding it all the time. I think the key is, one, obviously, if you can do that, it is a huge advantage. And if you can't, I think you really want to disproportionately
invest in talking with customers, the YC trope, which is: talk to customers, talk to customers, talk to customers. So I do think that sitting down, observing customers, talking to customers, talking to prospects that rejected you, all of those things are ways of trying to up your knowledge. Now, the flip is that I'm now selling a product that, de facto, I've worked on for 20 years. So your subject matter expertise is not necessarily something that starts the second you sign your offer letter. Hopefully you're leveraging, in my case, over 40 years of subject matter expertise. But beyond that, you want to be able to really understand your customer, whether that customer is you or whether that takes exhaustive research. You shouldn't think of your title as, "Well, my title says data in it, so I've got to be in a back corner doing data." The term that I like to use is "use your feet," which is: talk to the product or customer-facing folks in your organization, or, if you can, talk to customers.

That's great. And flipping the question slightly: if I am a data analyst breaking into a new vertical, whether at a startup or an enterprise, what is the fastest way for me to develop subject matter expertise?

So I do think adjacent problems can be helpful. I've loved reading Nate Silver for the longest time, and I do think reading folks who think about data the right way helps. I started my career in the NFL, as a statistician, not as a player. I had been into sports statistics my whole life, and I think there were a lot of parallels to the thinking about baseball. Baseball famously had solved a lot of these real problems of figuring out what had tight relationships with performance, what had predictability, et cetera. That thinking then very deeply motivated me, and I was excited about it. I just had a passion for it, and I read up a lot about it. And I think if there's analytics in a space that you love, now for me that
was baseball and football, and there is now tons of material, where at the time there were just limited amounts. But if you can find those people who love writing in the spaces that you love, I think you're going to find good analytical thought, questioning, and breakdown, and that's going to apply to whatever discipline you work in. I mean, reading Moneyball, my favorite book from Michael Lewis, it really is the same type of thinking that I would give early-stage startups. It's, in fact, the same advice YC gives, which is: write down the equation of success, then break it into its component parts, then measure those parts, and then dive in when one of those pieces is not working. Cohort and cut and summarize. That's how you start analytics anywhere, but it certainly was how the A's did it 20-some-odd years ago, when they were trying to compete with bigger-market teams.

That's awesome. And as we close out our conversation, I'd love it if we could think about the future for a bit, and what you think are some of the trends that are really going to shape how individuals and organizations work with data. I'd love it if you could list some of the trends that you're particularly excited about when it comes to the modern data stack, and how they will affect data-driven organizations.

I think we touched on one of them, which is the real rise of the citizen data scientist. First of all, you see a bunch of savvy people who write SQL and don't have data titles. BizOps, marketing ops: writing SQL or Python or something like that is just not uncommon in roles that were almost exclusively non-tech. I think that is a really exciting moment for anyone in the data space, because the data space is now opening up to many, many more roles at companies, and many, many more people have the chops to be a little dangerous with their data. I think this is a great trend for companies trying
to solve the data problem for SMBs, and obviously I'm very excited about one of those companies, Mozart Data. The other part of the trend that I'm excited about, which also relates to Mozart Data, is cost. It used to be that you would hire a couple of data engineers and buy a bunch of expensive infrastructure, and you might be out two, three, four, five million dollars just to get started on your data journey. Today it's a six-dollar swipe of a credit card and you're off to the races. Now, it's metered, and your bills become significant; your investment in data ultimately becomes significant. But the fact that you can get started for close to nothing is incredible. It is a huge difference. So if you think about the types of companies that could really afford a multi-million-dollar investment in data so that they could have that advantage, those were the largest companies. You could only get jobs at the biggest companies, because those were the companies that had the data teams. Those were the companies that were leveraging the data and could effectively take advantage of their scale in applying those data insights. Today this is becoming table stakes earlier and earlier. So more and more companies like ours, not just ours but like ours, are really empowering and enabling the SMB to use data infra, the types of data tooling that I see more upmarket. In fact, generally I find data stacks to actually be stronger downmarket, before there are a dozen sources of truth. It's actually a little bit paradoxical: almost the more constrained your budget, the more likely you are to end up with effectively a tighter data stack.

That's great, and I love that first trend especially. This is something that we've definitely seen at DataCamp, with the hybridization of jobs and the emergence of data skills in traditional roles like finance, marketing, and a lot more. Now finally, Peter, I had an awesome chat with you today. Do you have any final call to
action before we wrap up?

Yeah. Obviously, I'm rooting for so many people in their data journeys, and we love helping small companies at the start of their data journey get up and running on their data infrastructure, all in under an hour, without really needing any data engineering support. So if you're interested in that, we'd love to talk to you at Mozart Data. I'm Pete, at mozartdata.com.

That's awesome. Thank you so much, Peter, for coming on the podcast.

You've been listening to DataFramed, a podcast by DataCamp. Keep connected with us by subscribing to the show in your favorite podcast player. Please give us a rating, leave a comment, and share episodes you love. That helps us keep delivering insights into all things data. Thanks for listening. Until next time.