#209 Effective Data Engineering | Liya Aizenberg, Director of Data Engineering at Away
Creating a Vibe and Atmosphere in the Data Team: A Crucial Role in Driving Innovation and Results
It's essential to create an atmosphere where everyone in the data team feels comfortable coming up with new ideas and trying them out. This vibe should foster innovation, experimentation, and collaboration among team members. At the same time, it's crucial to ensure that the team understands the importance of focusing on ideas that drive real value for the business. By striking a balance between creativity and practicality, teams can unlock new opportunities for growth and improvement.
This approach is rooted in the concept of result-driven innovation, where the primary goal is to bring tangible results rather than just exploring new ideas for their own sake. As a data team leader, it's essential to emphasize this mindset and encourage team members to prioritize outcomes over process. By doing so, teams can avoid getting bogged down in unnecessary experimentation and focus on delivering meaningful value to the business.
In the context of generative AI, Aizenberg sees the data engineering team as primarily responsible for bringing data together. Generative AI requires a lot of data from many different sources to be placed in one spot, and data engineers must stitch these datasets together so that models and data scientists can use them for learning and training.
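The stitching step described above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the three source extracts, their field names, and the `stitch_customer_records` helper are all hypothetical and do not come from the conversation, and a real pipeline would read from actual systems rather than in-memory lists.

```python
from collections import defaultdict

# Hypothetical extracts from three source systems (illustrative data only).
orders = [
    {"customer_id": 1, "order_total": 120.0},
    {"customer_id": 1, "order_total": 80.0},
    {"customer_id": 2, "order_total": 45.5},
]
support_tickets = [
    {"customer_id": 2, "topic": "delivery"},
]
email_signups = [
    {"customer_id": 1, "email": "a@example.com"},
    {"customer_id": 3, "email": "c@example.com"},
]


def stitch_customer_records(orders, support_tickets, email_signups):
    """Combine rows from each source into one record per customer_id."""
    unified = defaultdict(
        lambda: {"order_count": 0, "total_spend": 0.0, "ticket_topics": [], "email": None}
    )
    for row in orders:
        record = unified[row["customer_id"]]
        record["order_count"] += 1
        record["total_spend"] += row["order_total"]
    for row in support_tickets:
        unified[row["customer_id"]]["ticket_topics"].append(row["topic"])
    for row in email_signups:
        unified[row["customer_id"]]["email"] = row["email"]
    return dict(unified)


customers = stitch_customer_records(orders, support_tickets, email_signups)
for customer_id in sorted(customers):
    print(customer_id, customers[customer_id])
```

Keying every source on a shared identifier (here the assumed `customer_id`) is what lets downstream models and data scientists work from one place instead of querying each system separately.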
As generative AI becomes more prevalent, the skill set of data engineering teams will need to adapt. Aizenberg expects data volumes to keep growing, so teams will need skills and tools that can process large datasets, along with plenty of storage and compute. Because what worked ten years ago no longer makes sense today, data engineers need to keep learning and stay on top of what is happening in the field.
Trends in Data Engineering: Chatbots and Beyond
Aizenberg expects to see many chatbots in use over the next 12 months, but she does not think the industry is "there yet" with AI. In her view, adoption is moving in small incremental steps, and not everyone fully understands what is happening in the field. Her advice is to learn, try different things, see what sticks, and reuse what already exists rather than reinvent the wheel.
Because there are few experts in generative AI at this point, everyone is learning together. Aizenberg recommends asking questions and reaching out to teammates and partners, so that teams keep learning while they build tools that drive real value for the business.
Key Takeaways from Liya's Insights
Liya shared several valuable insights during the conversation about generative AI and its impact on the data engineering space. One key takeaway is the importance of experimentation and learning from failure. With generative AI, we are still in the early stages of exploration, and it's essential to be open to new ideas and approaches.
Another crucial aspect of this field is the need for adaptability and flexibility. As the technology advances rapidly, data engineers must be willing to learn from experience and stay up-to-date with emerging trends and technologies. By doing so, they can create tools that drive real value for businesses and help unlock the full potential of generative AI.
Finally, Liya emphasized the importance of having fun and not being afraid to try new things. With generative AI, we are still in uncharted territory, and it's essential to approach this field with a sense of excitement and curiosity. By embracing this mindset and working collaboratively as teams, we can unlock new opportunities for growth and improvement.
"WEBVTTKind: captionsLanguage: enit's important to create a a Vibe an atmosphere where everyone in the data team feels like they can come up with a cool new ideas and try them out but at the same time we need to make sure the team knows that it's important to focus on ideas that actually help the business it's like hey let's try new stuff and see what works but let's also make sure it's useful and get job done right it goes back to result driven you always want to bring the results you don't want to just you wheels for the reason Leah Eisenberg it's great to have you on the show oh glad to be here do so you are the director of data engineering at awaytravel.com and have been managing data engineering teams for a while now uh so maybe set the stage for our conversation what makes an effective data engineering team there are what few things I highlighted for myself that I find important um to be data engineering successful is trust collaborative work not afraid to make a mistakes um transparency and knowledging CH exchange opportunities people should have an opportunity to exchange the knowledge within the team we can um I can go deep deep dive into um each and every I identified so we'll definitely deep dive into those uh what we want to focus on first maybe is you know building a data engineering team from scratch right if we take a step back you know you mentioned uh you know building the right team team knowledge Exchange building trust collaboration right you know if you're building a data team from scratch there's tons of data leaders listening here on the episode right trying to build their own data engineering teams as well uh let's focus maybe first on the type of roles that you would hire right uh what type of profiles do you look for in an early data engineering team or as an early hire in a data engineering team I would say uh the first thing uh the I think I identifi the important trait for myself I'm looking for a spark a spark and passion uh for the 
data, and a passion to learn new things. A good understanding and knowledge of data engineering principles is very important. I also look for a good personality, open-mindedness, and eagerness to learn and help others. I find this important, along with passion about data and empowering your teammates.
Okay, great. And you mentioned personality here. If you deep dive into that a bit more, what are the cultural traits that you look for in an early data engineering hire?
I'm looking for friendly and open-minded people. I value people who take ownership of their work, people who always finish what they start. You know, sometimes people get sidetracked: not finishing things, starting one thing and then getting sidetracked to do something else. So I value people who finish what they started and are not easily distracted. Also good communicators, people who speak out; it's very important to speak out. And also people who are not afraid of change, because our industry evolves all the time. There are a lot of changes; even within one company there can be a lot of changes, and people who are not afraid and easily adapt to change are very important. Result-driven folks are also very important. It's an amazing trait: you want to see the result, and you want to make sure that whatever you do, you produce results and outcomes.
Yeah, that's really great. I couldn't agree more on being results-driven, especially since we're going to talk about how to focus on the right projects that matter, because that's a big trap that data teams can fall into. We talked about the cultural traits, but in an early data engineering hire, I'm sure the technical skill set is so wide. The data engineering ecosystem is so fragmented; there are so many tools and so many skills to adopt here. What are the technical skills you also look for in an early data engineering team?
Good question. Today,
I'm looking for the foundation: every data engineer should have good knowledge of Python, SQL, and relational and non-relational databases. This is foundational: knowledge of relational and non-relational databases as well as Python. Also, there are a lot of hot items on the market today. It's very good to have knowledge of visualization tools such as Looker and Tableau, because you build all your data marts, you build all your data, but you actually need to serve this data to the business. So visualization tools are important; those are the tools you serve your data with, Looker and Tableau to name a few. There are very good integration tools available today: Stitch, Fivetran, and Matillion are currently in demand. Knowledge of dbt is important; this is the tool you build models with. But I want to highlight SQL and Python.
I mean, SQL is the lingua franca of data, whether you're a data engineer, a data scientist, or a data analyst. And when you mention these canonical tools, dbt, Matillion, SQL skills, what stands out as the must-have tool knowledge, especially in a modern data engineering team, outside of traditional tools? Here I say traditional, quote unquote, for Python and SQL.
I think any knowledge of an iPaaS solution, an integration platform as a service. Good knowledge of integration tools, knowledge of Airflow, and knowledge of any cloud: AWS, GCP, Azure. Knowledge of one of these clouds, or multiple clouds, is very important today.
Okay, that's really great. And as the team grows and becomes more complex, at what point do you decide it's time to bring in specialists? What kind of roles do you start looking for, and what do those specialist skills look like? Walk me through that.
First, you always want
to see if you have a potential specialist already within your team, or if you can raise one up. You want to empower the growth of your immediate team first. As a leader, I have periodic check-ins with my team members, so I have a good understanding of what people want to do and how they want to move forward in their careers. But there are cases when your project requires very specific expertise, you don't have it within your team, and you have no time to train your folks. In this situation, I usually source the expertise outside of the company to make sure the project moves forward, we're not blocking anyone, and everything goes to plan. However, it's important to pair this external expert up with your team members, so knowledge stays internal and your folks learn something new. It's very important to pair them up so they can work together, and your team members can learn new things while working alongside this external expert.
Yeah, that's really great. And you mentioned bringing people up to become specialists, growing people and training them. What does that look like in practice? I'd love to know how you've approached upskilling data engineering teams so that they can specialize into their respective specialist roles as you grow your data engineering team.
Once you have your periodic check-ins, you have an understanding: some people would like to be technical managers, and some people would like to be people managers. Not everyone wants to be a people manager; not everyone wants to be a technical manager. So you work with your folks to see what interests them, and you also identify the strengths of your team members. Based on their strengths, you assign them to projects and give
them work to do, based on their interests as well. It's very important to grow your team, keep their interest, and keep them excited about the work they do.
And we've been really focused on the talent side of building a data engineering team, but I think a big question that a lot of data engineering teams, especially new ones, have to face is: what is the tech stack that we want to invest in? We mentioned AWS, Google Cloud, and Azure as cloud services, and there are so many different data pipelining tools that one can use. Can you walk us through how you decide what tech stack to adopt as a data engineering team, and what factors guide these decisions?
I'm a true believer that you don't need tons of different tech to be successful. I believe less is more. I choose tools based on their ability to scale, how future-proof they are, and also what level of expertise I have within my team. I don't want to bring in something completely new that my team has no idea how to deal with. The price tag of these tools is also very important: you want to make sure that you're not overpaying and that you're staying within your budget and being cost-efficient.
Oftentimes, as a data engineering leader, when you're making these decisions on a tech stack, you mentioned simplicity of tools and not having a lot of tools. Why do you think a lot of data teams seem to fall into the trap of buying so many tools, and how do you avoid that as a data engineering leader?
It's very important to choose a stack that is easy to maintain and adopt, and for which it is easy to find the talent that will actually support it moving forward. You don't want to have multiple tools doing the same thing, because it's just a waste of money. So, two things to highlight here: tools that
are easy to maintain and adopt, and for which you can easily find talent; and you don't want tools that overlap, doing the same thing.
So let's switch gears here and talk about what makes a data engineering team value-driven. You mentioned earlier in our discussion, when talking about what makes a data engineering team successful, that it focuses on value and is results-driven. A big risk that we discussed behind the scenes is that data teams can fall into the trap of building shiny toys that generate little business value but are really exciting to put on a resume. Think of a deep learning model or a machine learning pipeline that doesn't necessarily drive a lot of business value. Maybe walk us through why this dynamic still exists today, and how you avoid it as a data engineering leader.
As a leader, I'm responsible for bringing visibility to data engineering across the organization. I usually build partnerships across the organization with various business teams and functions, like marketing, finance, analytics, and product. We work closely together on the company's strategy and roadmap. Eventually, these teams become my team's stakeholders, and data engineering work gets prioritized and aligned based on their business objectives. This approach allows data engineers to understand the bigger picture company-wide, and also helps identify company-wide challenges and goals. It also helps my team prioritize its initiatives. That's how I usually ensure that the data engineering team produces value-driven outcomes.
Yeah, and you're talking here about building interlocks with the finance team, with the revenue operations team, with the different stakeholders within the organization. What does a good
successful interlock look like? What does successful collaboration with other teams look like here?
You want your business stakeholders to be data-savvy, thinking of data as a resource that can be used in many different ways. That's why it is important to educate your business partners and show them what is possible from the data standpoint. A product manager actually plays a critical role in effective collaboration between data engineering and the business, sitting in the middle between the technical side and the business. It's important that a feedback loop is established between the business and data engineering. Feedback is super important, as are regular check-ins to see how things are going and to figure out what could be done better. That definitely improves the collaboration between the business and data engineering teams, and improves how they work over time. Another thing I want to highlight is alignment on goals: the data team and the business are all aligned, on the same page about what we're trying to achieve together. This means understanding how data can help the business succeed, and making sure the data team's goal is to make the business better and to be a good partner across the company.
Okay. And you're talking here about building partnerships across the whole company. I'm sure as a data engineering leader it becomes really hard to prioritize: what is the most valuable thing I can do for the company right now? So how do you quantify which projects will deliver business value? How do you prioritize the roadmap?
Usually the projects that get prioritized first are the ones that address critical business needs, projects that also contribute to revenue generation, cost cutting, or other KPIs. It's always good to figure out if there is any low-hanging fruit, if there is something
we can do with a relatively low level of effort that delivers measurable value. Any quick wins out there: if we find any type of quick win, it's a good thing to prioritize.
And when you look at certain projects a data engineering team does, sometimes a lot of that work is invisible, like building a data platform or improving the integration of one dataset with another. How do you look at the value of these projects? How do you quantify the ROI of these types of projects?
Let's say we implemented a feature that recommends an additional product for the customer to purchase. Let's say you have something in your cart, and we build a project that recommends products based on the current contents of your shopping cart. This feature is planned to improve the conversion rate: you get a product recommendation, and we're recommending that you buy another product, so we're trying to improve the conversion rate. In this case, we would monitor the percentage increase in conversion rate using various A/B testing functionality. That's how you would quantify it: if we build the project and it drives an increase in conversion rate, then it's considered successful.
Okay.
Yeah, there are also some projects that have intangible benefits, which are very hard to measure. For example, let's say the data team develops a new feature that gives customers visibility into the step-by-step progress of their order. I'm in retail, so I'm talking about orders and shopping carts. Let's say you placed an order, but you don't have visibility into how it progresses. Now you have an opportunity to see that your order has been received, it has been processed, the payment went through; you see the detailed progression of your order. We cannot measure the benefit of that project, but we know that it will drive high customer satisfaction,
and we can obtain this information using different surveys and customer feedback. But it's intangible and cannot be measured like conversion rate, for example.
Yeah, that's great, because what I often think about is that a data engineering team makes a dataset available to be analyzed by business users. How do you measure the success or the ROI of a feature that depends on other teams using it? Maybe to reframe the question: how do you ensure, as a data engineering leader, that your partners are leveraging the newly built data that data engineering teams have made available?
Say a customer data platform has been built, and the requirement was to build a unified customer data platform. Working closely with marketing, we can help them build customer segmentation, for example. We give them visualization tools that they can use to segment their customers based on the data in the customer data platform we built. You empower business folks through education about the data, by showcasing what's possible, and again by working together to ensure the data is used. Visualization tools are a good place to start.
We've been talking a lot about how data engineering teams can drive value, prioritize the roadmap, and focus on value-driven projects. But I think an important thing to think about here is how you can be iterative and agile as a data engineering team. Being agile is something data teams across the board need to adopt more, or are slightly less mature in than software engineering in general. So what does agile look like for data engineering teams?
I think it's important to be flexible and adapt quickly. If something unplanned comes up, the team
adjusts and moves forward without creating friction and roadblocks. Collaboration and teamwork are really important. We need to rely on each other to do our part, similar to team sports: you have to rely on your teammates doing what they need to do. Business strategy changes frequently, so the ability to adapt is important, as is flexibility: the ability to pivot in a different direction if need be.
Okay, great. We're talking here about being able to be flexible and to pivot, but sometimes data engineering teams undertake massive projects, such as building a data platform or building an entirely new data collection pipeline. How are data engineering teams able to piecemeal these types of projects in a way that incrementally drives value? How have you approached this in the past?
When I have a big project on my plate, I usually look at it as building a house. Instead of trying to build the entire house at once, you start with the basics: a solid foundation, walls, and the roof. This is your MVP, the minimum version of the house that is functional and serves the purpose of providing shelter. The same goes for the project. As the project progresses, additional features and functionality are added to the solution in small incremental steps. It's like adding rooms, furniture, and decorations to the house over time to make it more comfortable and functional as you go. First we build the required MVP, the minimum valuable product, and after the MVP is delivered, we start to prioritize additional features based on the value they will drive for the organization.
So when you're building an MVP, how do you ensure that the MVP delivers value? How do you define what an MVP looks like, depending on the project, for example?
The MVP has to be defined together with the
stakeholders, so we collaborate to identify what the MVP should look like for a given project, and identify which features are required and which are nice to have and can be delivered later, so they get prioritized. The foundation has to be built regardless: we need the walls, we need the roof, all of this stuff. But then you prioritize the features based on the value they drive for the business. Start small, win big. Don't try to overhaul everything at once. Pick a manageable project and break it down into small, short sprints, like mini projects. That lets you see the benefits of agile quickly and make adjustments as you progress.
And given the above, what advice would you give data engineering leaders looking to adopt agile methodologies? How would you recommend that they start small in their next big project, for example?
Work closely with your product manager. A product manager is, I would say, your business voice. Work closely with your product manager, work closely with your business, and define what is important for the organization at this particular time. And like I said, start small, win big, one step at a time. Prioritization and value are defined together with business stakeholders and product managers.
Okay, that's really great. We've been talking about how data engineering teams can be agile and drive a lot of value, and we also talked about how data engineering teams can be results-driven and focus on projects that matter. We alluded earlier in our discussion to the risk of pursuing shiny toys, and now with generative AI we see a lot of hype about the importance of building generative AI tools and
generative AI product features. Do you find there's an increased risk of building something shiny and not necessarily useful here? How do you balance that as a data engineering leader?
It's important to create a vibe, an atmosphere, where everyone in the data team feels like they can come up with cool new ideas and try them out. But at the same time, we need to make sure the team knows that it's important to focus on ideas that actually help the business. It's like, "Hey, let's try new stuff and see what works, but let's also make sure it's useful and gets the job done." It goes back to being result-driven: you always want to bring results; you don't want to just spin your wheels for no reason. You want to bring a valuable outcome. Let's try, let's experiment, but let's concentrate on tools and functionalities that actually bring value to the business we are in.
And in the context of generative AI, what is the role of the data engineering team in building these tools? What's the role of the data engineering team in building generative AI use cases?
Primarily, I see the data engineering team as responsible for bringing the data over from all the different sources. Generative AI requires a lot of data from a lot of different sources to be placed in one spot for the models to work with and use. The primary focus of data engineering is to bring this data over and stitch it together so the models can use it for learning and training. So the responsibility is to bring this data over and stitch it together for the models and the data science folks to use.
And do you find that the data engineering skill set will have to evolve, for example to build retrieval
augmented generation pipelines, or really specific types of pipelines unique to generative AI use cases? How do you see the skill set of data engineering teams evolving over the next few years as generative AI becomes more prevalent?
What I see is the volume of data changing, so we will need to adapt our skills and tools to process large amounts of data. Like I said, this space and its skills are evolving all the time. What we did 10 years ago doesn't make sense today, and I see the same thing happening 10 years from now. So we need to adapt, we need to learn, and we need to stay on top of what's going on in today's data world. Again, large datasets will need a lot of storage and a lot of horsepower to process and to build something actually valuable.
Yeah, 100%. I think we see a lot of use cases but not a lot of value yet. And maybe as we close out our conversation, Liya, what trends are you seeing in the data engineering space? As we're talking about generative AI and different tools, what trends in the data engineering space are you excited about and looking forward to seeing in the next few years? Or, since the next few years may be hard to predict, in the next 12 months?
I see a lot of chatbots being used. I see a lot of changes in adopting AI, but I don't think we're there yet. I think we're just taking small incremental steps toward AI. I don't think everybody 100% understands what's going on or is 100% oriented in what's happening in the field. I think we just take small steps, learn, try different things, see what sticks and what works, try to be creative, and don't try to reinvent the wheel. If something exists, let's
reuse it let's use it let's see what we can adapt again it's a lot of unknown and a lot to learn still for everyone in this space I could agree more and maybe as we close out our chat uh Leah do you have any final notes to share with the audience I would say experiment don't don't afraid to fail learn from your mistake mistakes and don't forget to have fun with it it's a lot of it's unknown for everyone there are not a lot of experts in the fields at this moment so we're all learning together and ask questions right ask questions reach out to your um team teammates reach out to your part parners and learn learn keep learning and don't forget to have fun that's important definitely have fun that is very very important thank you so much Leah for coming on data friend thank you ad for having me it was pleasureit's important to create a a Vibe an atmosphere where everyone in the data team feels like they can come up with a cool new ideas and try them out but at the same time we need to make sure the team knows that it's important to focus on ideas that actually help the business it's like hey let's try new stuff and see what works but let's also make sure it's useful and get job done right it goes back to result driven you always want to bring the results you don't want to just you wheels for the reason Leah Eisenberg it's great to have you on the show oh glad to be here do so you are the director of data engineering at awaytravel.com and have been managing data engineering teams for a while now uh so maybe set the stage for our conversation what makes an effective data engineering team there are what few things I highlighted for myself that I find important um to be data engineering successful is trust collaborative work not afraid to make a mistakes um transparency and knowledging CH exchange opportunities people should have an opportunity to exchange the knowledge within the team we can um I can go deep deep dive into um each and every I identified so we'll 
definitely deep dive into those uh what we want to focus on first maybe is you know building a data engineering team from scratch right if we take a step back you know you mentioned uh you know building the right team team knowledge Exchange building trust collaboration right you know if you're building a data team from scratch there's tons of data leaders listening here on the episode right trying to build their own data engineering teams as well uh let's focus maybe first on the type of roles that you would hire right uh what type of profiles do you look for in an early data engineering team or as an early hire in a data engineering team I would say uh the first thing uh the I think I identifi the important trait for myself I'm looking for a spark a spark and passion uh for the data and passion to learn new things good understanding and knowledge of data engineering principles are very important um I find this in good personality um open mindness um eager to learn and help others um I find this important and passion passion about data and Empower your um teammates okay great and then you know you mentioned here kind of the personality if you deep dive into the bit more what are the kind of the cultural traits that you look for in an early data engineering hire I'm looking for the uh friendly and open-minded people um I value uh people who take ownership of their work uh people who always finish what they start you know sometimes people get S sidetracked not finishing things starting one thing then sidetracked to do something else so I I I value that people who finish what they started uh not easily distracted also good communicators uh people who speak out it's very important to speak out and not and also the people who not afraid to change because our industry is evolves all the time there are a lot of changes uh even within the one company can be a lot of uh changes and people who Not Afraid and easily adapt to the change are very important and also result 
driven folks are very important. It's an amazing trait. You want to see the result; you want to make sure that whatever you do, you produce results and outcomes.
Okay, yeah, that's really great. I couldn't agree more on being results-driven, especially since we're going to talk about how to focus on the right projects that matter, because that's a big trap that data teams can fall into. We talked about the cultural traits, but for an early data engineering hire, I'm sure the technical skill set is so wide. The data engineering ecosystem is so fragmented; there are so many tools and so many skills to adopt. What are the technical skills you also look for in an early data engineering team?
A good question. Today I'm looking for the foundation: every data engineer should have good knowledge of Python, SQL, and relational and non-relational databases. This is foundational. Also, there are a lot of hot items on the market today. It's very good to have knowledge of visualization tools such as Looker and Tableau, because you build all your data marts, you build all your data, but you actually need to serve this data to the business. So visualization tools are important; those are the tools you serve your data with, to name a few, Looker and Tableau. There are very good integration tools available today: Stitch, Fivetran, and Matillion are currently in demand. Knowledge of dbt is important; this is the tool you build models with. But I want to highlight SQL and Python.
I mean, SQL is the lingua franca of data, whether you're a data engineer, a data scientist, or a data analyst. And when you mention these canonical tools, dbt, Matillion, SQL skills, what stands out as the must-have tool knowledge, especially in
a modern data engineering team, outside of, you know, traditional tools? Here I say "traditional", quote unquote, for Python and SQL.
I think any knowledge of an iPaaS solution, that is, integration platform as a service. Good knowledge of integration tools, knowledge of Airflow, and knowledge of any cloud: AWS, GCP, Azure. Knowledge of one of these clouds, or multiple clouds, is very important today.
Okay, that's really great. And as the team grows and becomes more complex, at what point do you decide it's time to bring in specialists? What kind of roles do you start looking for, and what do those specialist skills look like? Walk me through that.
First, you always want to see if you have a potential specialist already within your team, or if you can raise one up, right? You want to empower the growth of your immediate team first. As a leader, I have periodic check-ins with my team members, so I have a good understanding of what people want to do and how they want to move forward in their careers. But there are cases when your project requires very specific expertise, you don't have it within your team, and you have no time to train your folks. In this situation, I usually source the expertise from outside the company to make sure the project moves forward, we're not blocking anyone, and everything goes to plan. However, it's important to pair this external expert with your team members, so the knowledge stays internal and your folks learn something new. It's very important to pair them up so they can work together and your team members can learn new things while working alongside this external expert.
Yeah, that's really great. And you mentioned here bringing up people to become
specialists, growing people and training them. What does that look like in practice? I'd love to know how you've approached upskilling data engineering teams so that they can specialize into their respective specialist roles as you grow your data engineering team.
Once you have your periodic check-ins, you have an understanding: some people would like to be technical managers, some people would like to be people managers. Not everyone wants to be a people manager; not everyone wants to be a technical manager. So you work with your folks to see what their interests are, and you also identify the strengths of your team members. Based on their strengths and their interests, you assign them to projects and give them the work to do. It's very important to grow your team, keep their interest, and keep them excited about the work they do.
And we've been really focused on the talent side of building a data engineering team, but I think a big question that a lot of data engineering teams, especially new ones, have to face is: what is the tech stack that we want to invest in? We mentioned AWS, Google Cloud, and Azure as cloud services, and there are so many different data pipelining tools that one can use. Can you walk us through how you make a decision about what tech stack to adopt as a data engineering team, and what factors you look at to guide these decisions?
I'm a true believer that you don't need to have tons of different tech to be successful. I believe less is more. I choose tools based on their ability to scale and how future-proof they are, and also on what level of expertise I have within my team. I don't want to bring in something completely new that my team has no idea how to deal with. And the price tag of these tools is also very important. You want to make
sure that you're not overpaying and that you're staying within your budget and being cost-efficient.
You know, oftentimes as a data engineering leader, when you're making these decisions on a tech stack, you mentioned simplicity of tools and not having a lot of tools. Why do you think a lot of data teams tend to fall into the trap of buying so many tools, and how do you avoid that as a data engineering leader?
It's very important to choose a stack that is easy to maintain and adopt, and also one for which it's easy to find the talent who will actually support it moving forward. And you don't want to have multiple tools doing the same thing, because it's just a waste of money, right? So, two things to highlight here: choose tools that are easy to maintain, easy to adopt, and easy to find talent for, and avoid tools that overlap by doing the same thing.
So let's switch gears here and talk about what makes a data engineering team value-driven. You mentioned earlier in our discussion, when talking about what makes a data engineering team successful, that it focuses on value and is results-driven. A big risk that we discussed behind the scenes is that data teams can fall into the trap of building shiny toys that generate little business value but are really exciting to put on a resume. Think of a deep learning model or a machine learning pipeline that doesn't necessarily drive a lot of business value. Maybe walk us through why this dynamic still exists today and how you avoid it as a data engineering leader.
As a leader, I'm responsible for bringing visibility to data engineering across the organization. I usually build partnerships across the organization with various business teams and functions like marketing, finance, analytics, and product.
We work closely together on the company's strategy and roadmap. Eventually these teams become my team's stakeholders, and data engineering work gets prioritized and aligned based on their business objectives. This approach allows data engineers to understand the bigger picture company-wide, and also helps to identify company-wide challenges and goals. It also helps my team prioritize its initiatives. That's how I ensure that the data engineering team produces value-driven outcomes.
Yeah, and you're talking here about building interlocks with the finance team, with the revenue operations team, with the different stakeholders within the organization. What does a good, successful interlock look like? What does successful collaboration look like with other teams here?
You want your business stakeholders to be data-savvy, thinking of data as a resource that can be used in many different ways. That's why it is important to educate your business partners and show them what is possible from the data standpoint. A product manager actually plays a critical role in effective collaboration between data engineering and the business, sitting in the middle between technical and business. It's important that a feedback loop is established between the business and data engineering. Feedback is super important, as are regular check-ins. When you have regular check-ins to see how things are going and figure out what could be done better, that definitely improves the collaboration between business and data engineering teams, and it improves how they work together over time. Another thing I want to highlight is alignment on goals. The data team and the business are aligned, on the same page, about what we are trying to achieve together. This means understanding how data can help the business succeed, and making sure the data
team's goal is to make the business better and to be a good partner across the company.
Okay. And you're talking here about building partnerships with all of the company. I'm sure as a data engineering leader it becomes really hard to prioritize the most valuable thing you can do for the company right now. So how do you quantify which projects will deliver business value? How do you prioritize the roadmap?
Usually the projects that get prioritized first are the ones that address critical business needs, projects that contribute to revenue generation, cost cutting, or other KPIs. It's always good to figure out if there is any low-hanging fruit, anything we can do with a relatively low level of effort that delivers measurable value. Any quick wins out there: if we find any type of quick win, it's a good thing to prioritize that.
And when you look at certain projects a data engineering team does, sometimes a lot of that work is invisible, like building a data platform or improving the integration of one data set with another. How do you look at the value of these projects? How do you quantify their ROI?
Let's say we implemented a feature that recommends additional products for the customer to purchase. Let's say you have something in your cart, and we build a project that recommends products based on the current contents of your shopping cart. This feature is planned to improve the conversion rate: you get a product recommendation, and we're encouraging you to buy another product. In this case, we would be monitoring the percentage increase in conversion rate using A/B testing functionality. That's how you would quantify it. So if we build a product and this project drives
an increase in the conversion rate, then it's considered successful.
Okay.
Yeah, there are some projects that actually don't have tangible benefits, and they're very hard to measure. For example, let's say the data team develops a new feature that gives the customer visibility into the step-by-step progress of an order. I'm in retail, so I'm talking about orders and shopping carts. Let's say you placed an order, but you don't have visibility into how it progresses. Now you have an opportunity to see that your order has been received, that it's been processed, the payment status, so you see the detailed progression of your order. We cannot measure the benefit of that project directly, but we know that it will drive high customer satisfaction. We can obtain this information using different surveys and customer feedback, but it's not tangible and cannot be measured like a conversion rate, for example.
Yeah, that's great, because what I often think about is that a data engineering team makes a data set available to be analyzed by business users. How do you measure the success or the ROI of a feature that is dependent on other teams using it? So maybe to rephrase the question: how do you ensure, as a data engineering leader, that your partners are leveraging the newly built data that is now made available by data engineering teams?
Say a customer data platform has been built, and the requirement was to build a unified customer data platform. Working closely with marketing, we can help them build customer segmentation, for example. We give them visualization tools that they can use to segment their customers based on the data in the customer data platform we built. You empower business folks
through education about the data, by showcasing what's possible, and again by working together to ensure the data is used. Visualization tools are a good place to start.
We've been talking a lot about how data engineering teams can drive value, prioritize the roadmap, and focus on value-driven projects. But I think an important thing to think about here is how you can be iterative and agile as a data engineering team. Being agile is something data teams across the board need to adopt more; they are generally slightly less mature in this than software engineering. So what does agile look like for data engineering teams?
I think it's important to be flexible and adapt quickly. If something unplanned comes up, the team adjusts and moves forward without creating friction and roadblocks. Collaboration and teamwork are really important. We need to rely on each other to do our part, similar to team sports: you have to rely on your teammates doing what they need to do. Business strategy changes frequently, so the ability to adapt is important, as is the flexibility to pivot in a different direction if need be.
Okay, great. And we're talking here about being able to be flexible and to pivot. But sometimes, when looking at massive projects that data engineering teams undertake, such as building a data platform or building an entirely new data collection pipeline, how are data engineering teams able to piecemeal these types of projects in a way that incrementally drives value? How have you approached this in the past?
When you have a big project on your plate, I usually look at it as building a house. Instead of trying to build the entire house at once, you start with the basics: a solid foundation, walls, and the roof. This is your MVP, the minimum version of the house that is functional and serves the
purpose of providing shelter. The same goes for a project. As the project progresses, additional features and functionality are added to the solution in small incremental steps. It's like adding rooms, furniture, and decorations to the house over time, making it more comfortable and functional as you go. First we build the required MVP, the minimum viable product, and after the MVP is delivered, we start to prioritize additional features based on the value that they will drive for the organization.
So when you're building an MVP, how do you ensure that the MVP delivers value? How do you define what an MVP looks like, depending on the project, for example?
The MVP has to be defined together with the stakeholders. We collaborate to identify what the MVP should look like for a given project, and identify which features are required and which features are actually nice to have and can be delivered later, so they get prioritized. The foundation has to be built regardless: we need the walls, we need the roof, all of this stuff. But then you prioritize the features based on the value they drive for the business. Start small, win big. Don't try to overhaul everything at once. Pick a manageable project and break it down into small, short sprints, like mini projects. That lets you see the benefits of agile quickly and make adjustments as you progress.
And given the above, what is the advice that you would give data engineering leaders looking to adopt agile methodologies? How would you recommend that they start small in their next big project, for example?
Work closely with your product manager. A product manager is, I would say, your business voice. Work closely with your product manager, work closely with your business, and define what is
important for the organization at this particular time. And like I said, start small, win big, one step at a time. Prioritization and value are defined together with business stakeholders and product managers.
Okay, that's really great. We've been talking about how data engineering teams can be agile and drive a lot of value, and we also talked about how data engineering teams can be results-driven and focus on projects that matter. We alluded earlier in our discussion to the risk of pursuing shiny toys. And now with generative AI, we see a lot of hype about the importance of building generative AI tools and generative AI product features. Do you find there's an increased risk of building something shiny and not necessarily useful here? How do you balance that as a data engineering leader?
It's important to create a vibe, an atmosphere, where everyone in the data team feels like they can come up with cool new ideas and try them out. But at the same time, we need to make sure that the team knows that it's important to focus on ideas that actually help the business. It's like, hey, let's try new stuff and see what works, but let's also make sure it's useful and gets the job done. It goes back to being result-driven. You always want to bring results; you don't want to just spin your wheels for no reason. You want to bring a valuable outcome. Let's try, let's experiment, but let's concentrate on tools and functionality that actually bring value to the business we are currently in.
And in the context of generative AI, what is the role of the data engineering team in building these tools? How do you see data engineering teams building
generative AI? What's the role of the data engineering team in building generative AI use cases?
Primarily, I see the data engineering team as responsible for bringing the data over from all the different sources. Generative AI requires a lot of data from a lot of different sources to be placed in one spot for the models to actually work with and use. The primary focus of data engineering is to bring this data over and stitch it together, so the models can use it for learning and training, and so data science folks can use it.
And do you find that the data engineering skill set will have to evolve, for example to build retrieval-augmented generation pipelines, or to build really specific types of pipelines unique to generative AI use cases? How do you see the skill set of data engineering teams evolving over the next few years as generative AI becomes more prevalent?
What I see is that the volume of data is changing, so we will need to adopt skills and tools that can process large amounts of data. Like I said, this space and its skills are evolving all the time. What we did 10 years ago doesn't make sense today, and I see the same thing happening 10 years from now. So we need to adapt, we need to learn, we need to stay on top of what's going on in today's data world. Again, large data sets will need a lot of storage and a lot of horsepower to process all of that and actually build something valuable.
Yeah, 100%. I think we see a lot of use cases, but not a lot of value yet. And maybe as we close out our conversation, Liya, what are the trends that you're seeing in the data engineering space? We're talking about generative AI, we're talking about different tools. What are the trends in the data engineering space that you're excited about, that
you're looking forward to seeing in the next few years? Or, since the next few years may be hard to predict, in the next 12 months?
I see a lot of chatbots being used, and I see a lot of changes in how AI is being adopted. I don't think we're there yet. I think we've just taken small incremental steps towards AI. I don't think everybody 100% understands what's going on or is fully oriented on what's actually happening in the field. I think we just take, again, small steps, learn, try different things, and see what sticks and what works. Try to be creative, and don't try to reinvent the wheel, right? If something already exists, let's reuse it, let's use it, let's see what we can adapt. Again, there's a lot that's unknown and a lot still to learn for everyone in this space.
I couldn't agree more. And maybe as we close out our chat, Liya, do you have any final notes to share with the audience?
I would say: experiment, don't be afraid to fail, learn from your mistakes, and don't forget to have fun with it. A lot of it is unknown for everyone. There are not a lot of experts in the field at this moment, so we're all learning together. And ask questions, right? Ask questions, reach out to your teammates, reach out to your partners, and keep learning. And don't forget to have fun. That's important.
Definitely have fun. That is very, very important. Thank you so much, Liya, for coming on DataFramed.
Thank you, Adel, for having me. It was a pleasure.
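The conversion-rate measurement mentioned in the conversation (monitoring the percentage increase in conversion rate with A/B testing) could be sketched roughly as follows in Python. This is only an illustrative sketch, not the method used at Away: the function name `conversion_lift`, the visitor and order counts, and the choice of a two-proportion z-test are all assumptions that do not come from the episode.

```python
import math


def conversion_lift(control_visitors, control_orders,
                    variant_visitors, variant_orders):
    """Compare conversion rates of two A/B test arms (hypothetical helper).

    Returns (control_rate, variant_rate, relative_lift, p_value), where the
    p-value comes from a two-sided two-proportion z-test. Assumes both arms
    have visitors and the control arm has at least one order.
    """
    control_rate = control_orders / control_visitors
    variant_rate = variant_orders / variant_visitors

    # Relative (percentage) increase of the variant over the control.
    relative_lift = (variant_rate - control_rate) / control_rate

    # Pooled conversion rate under the null hypothesis of "no difference".
    pooled = (control_orders + variant_orders) / (control_visitors + variant_visitors)
    std_err = math.sqrt(
        pooled * (1 - pooled) * (1 / control_visitors + 1 / variant_visitors)
    )
    z = (variant_rate - control_rate) / std_err

    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return control_rate, variant_rate, relative_lift, p_value


# Made-up numbers: carts without and with the recommendation feature.
c_rate, v_rate, lift, p = conversion_lift(10_000, 500, 10_000, 600)
print(f"control={c_rate:.2%} variant={v_rate:.2%} lift={lift:.1%} p={p:.4f}")
```

In a setup like this, the project would count as successful in the sense described above if the lift is positive and the p-value is small enough that the difference is unlikely to be noise; the threshold for "small enough" is a judgment call agreed with stakeholders.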