Intel Nervana Update + Productizing AI Research with Naveen Rao and Hanlin Tang - #31

The Developer Experience of Using the Graph Project: A Conversation with Hanlin Tang

Hanlin Tang, a senior algorithms engineer in Intel's AI Products Group, recently sat down to discuss the latest developments in the Graph Project (Intel Nervana Graph), a new project for expressing and running deep learning applications as framework- and hardware-independent computational graphs. As we talked about the project, it became clear that the experience of using the Graph Project is vastly different from that of traditional deep learning frameworks like Neon or TensorFlow.

So, how does the developer experience change when working with the Graph Project? According to Hanlin, representing a neural network explicitly as a computational graph allows optimizations to be performed much the way a compiler or a database query planner would perform them. Because the graph is an explicit object, developers can access and modify it at the node level, which makes for more composable code.
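
To make this concrete, here is a minimal sketch of what node-level graph construction might look like. The `ngraph` module and the call names below are assumptions modeled on how the Nervana Graph Python interface is described in the conversation, not verbatim from the project's API, so consult the GitHub repository for the real names:

```python
# A minimal sketch of building a computational graph node by node.
# The ngraph module and call names here are illustrative assumptions;
# see github.com/NervanaSystems for the project's actual API.
import ngraph as ng
import ngraph.transformers as ngt

# Declare symbolic inputs: these create graph nodes, not arrays.
x = ng.placeholder(())                    # scalar input node
w = ng.variable((), initial_value=2.0)    # trainable parameter node
b = ng.variable((), initial_value=0.5)    # trainable parameter node

# Overloaded arithmetic adds op nodes; y is the root of a small graph.
y = w * x + b

# A transformer compiles the graph for a hardware backend, applying
# optimization passes much as a compiler or query planner would.
transformer = ngt.make_transformer()
compute = transformer.computation(y, x)
print(compute(3.0))   # evaluates the graph: 2.0 * 3.0 + 0.5 = 6.5
</parameter>
```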

For developers who primarily use pre-built components like convolutional and pooling layers, the experience will feel relatively familiar. When developing new layers or applying custom computations, however, they can drop down to the graph level and compose operations themselves. In many cases this brings significant value, since not everyone applies the vanilla models and layers already developed for standard problem sets and data sets.
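
As a hedged illustration of what "composing the ops yourself" might look like, the sketch below builds a custom gated computation out of primitive graph ops instead of a pre-built layer. The op names (`ng.dot`, `ng.sigmoid`) are assumptions based on common graph-framework conventions:

```python
# Hypothetical sketch: a custom "layer" composed from primitive ops.
# Op names are assumptions modeled on the Nervana Graph Python API.
import ngraph as ng

def gated_linear(x, w_out, w_gate):
    """Two linear branches, one squashed into a multiplicative gate.

    Because every op here is just a node in the graph, no backward
    pass needs to be written by hand; the framework can derive
    gradients for the whole composition automatically.
    """
    out = ng.dot(w_out, x)                  # linear branch
    gate = ng.sigmoid(ng.dot(w_gate, x))    # gating branch in (0, 1)
    return out * gate                       # elementwise product node
```

Calling a function like this during model construction simply adds its nodes to the graph, so the same composition can be reused anywhere a conventional layer would be.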

One of the key benefits of the Graph Project is its ability to handle complex topologies with ease. Traditional frameworks require explicit guidance on how to perform the forward and backward passes during training; because the Graph Project knows the full graph, it takes care of much of this work itself. This makes it easier to compose models that meet specific needs, such as concatenating multiple streams of data or applying custom operations.
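
The sketch below, again with assumed API names, shows the kind of topology Hanlin describes: one input forked into two branches that are merged back together, with the framework deriving the backward pass rather than the developer spelling it out. A concat-style op would join multi-dimensional streams; scalar branches are merged by addition here to keep the example small:

```python
# Hypothetical sketch: a forked-and-merged topology where the graph,
# not the developer, supplies the forward and backward passes.
# API names are assumptions modeled on the Nervana Graph interface.
import ngraph as ng
import ngraph.transformers as ngt

x = ng.placeholder(())                      # one incoming data stream
w1 = ng.variable((), initial_value=1.5)     # branch 1 parameter
w2 = ng.variable((), initial_value=-0.5)    # branch 2 parameter

# Fork x into two branches, then merge the sub-graphs back together.
merged = ng.tanh(w1 * x) + ng.sigmoid(w2 * x)

# No hand-written backward pass: ask the graph for a derivative node.
grad_w1 = ng.deriv(merged, w1)

transformer = ngt.make_transformer()
compute = transformer.computation([merged, grad_w1], x)
output, gradient = compute(2.0)
print(output, gradient)
```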

Getting Started with the Graph Project

Want to learn more about the Graph Project? Hanlin recommends starting with the GitHub page at github.com/NervanaSystems, which hosts repositories for both Neon and the Graph Project and receives frequent commits. Intel has also published several blog posts at intelnervana.com that introduce the project and link to pre-trained models you can get started with quickly.

Intel also offers a model zoo of pre-trained models, an excellent resource for developers who want to try out different architectures or techniques without having to develop them from scratch.

Contributing to the Graph Project

The Graph Project is still in its early stages, but the community is already making significant contributions. The project welcomes external contributions and encourages developers to share their ideas and feedback, so if you see a feature you'd like that is missing, don't hesitate to contribute it.

To get started, check out the GitHub page and read through the documentation and guides provided. Once you have a better understanding of how the framework works, explore the issues list and propose new features or bug fixes.

Conclusion

As we wrapped up our conversation with Hanlin, it became clear that the Graph Project represents a significant shift in the way developers approach deep learning frameworks. By thinking of neural networks as computational graphs, developers can access and modify the graph at the node level, enabling more composable code and easier handling of complex topologies.

With its early-stage contributions and welcoming community, this project is sure to be an exciting development in the world of AI. Whether you're a seasoned developer or just starting out, the Graph Project offers a wealth of resources and opportunities for growth. So why not get started today and see what you can build?

"WEBVTTKind: captionsLanguage: enhello and welcome to another episode of twiml talk the podcast where I interview interesting people doing interesting things in machine learning and artificial intelligence I'm your host Sam charington the show you're about to hear is the first of a series of shows recorded in New York City at the O'Reilly AI New York event but before we get to the show I've got a ton of updates and announcements for you first off I want to give a huge thank you to everyone who came out to our very first twiml happy hour in New York City there was a great mix of folks in attendance including listeners from New York O'Reilly AI attendees and NY AI members it was an awesome awesome night I want to especially thank miam and the rest of the team at the nyi Meetup for helping us pull this entire thing together and for clarify for supporting it the O'Reilly AI conference itself was great and of course my favorite part was getting to meet so many listeners I especially enjoyed meeting twiml super fans Bill Baran and bethanne Noble they're both longtime listeners of the show and highly engaged members of this community and it was just amazing to get a chance to hang out with them I did a ton of interviews at the show and I'm pleased to present them to you for your binge listening pleasure we have got your commute covered for the entire week with this series the series is brought to you by our friends at Intel Nirvana I talked about Intel's acquisition of Nirvana systems when it happened almost a year ago and I was super excited to have an opportunity to sit down with Nirvana co-founder naven raal who now leads Intel's newly formed AI products group naven and I talked about how Intel plans to extend its leadership position in general purpose compute into the AI Realm by delivering silicon design specifically for AI endtoend Solutions including the cloud Enterprise Data Center and the edge and tools that let customers quickly productize and scale AI based Solutions I also spoke with hanin tang an algorithms engineer on that team about two such tools announced at the conference version 2.0 of neon Intel Nirvana's deep learning framework and Nirvana graph a new project for expressing and running deep learning appli ation as framework and Hardware independent computational graphs Nirvana graph in particular sounds like a very interesting project not to mention a smart move for Intel and I'd encourage folks to take a look at their GitHub repo AT github.com nirvan Systems as well as their main site at Intel nirvana.com one of the things announced at the conference is that Intel and O'Reilly will be partnering on the AI conference going forward starting with the San Francisco disco event in September they've also changed the name of the event to the AI conference to celebrate all this we at twiml are going to start our the AI conference ticket giveaway early and run it through the end of the month to enter just let us know what you think about any of the podcasts in this series or post your favorite quote from any of them on our series page on Twitter or via any of our social media channels make sure to mention at twiml aai at Intel aai and atthi com so that we know you want to enter full details can be found on the series page at twim ai.com oilly AI by the time this series drops I'll have just return to the States from my trip to Europe I'll be back on the road later this month to check out the Wrangle conference in San Francisco on July 20th you may remember that I recorded the very 
first twiml talk show with Claire Corel at at Wrangle and Cloud era was my very first sponsor for the podcast so I'm really looking forward to getting back there I definitely hope to catch up with some twiml listeners while I'm out there so please check the event out if you'd like to attend I've arranged for a special discount for twiml listeners using the code PC viip that's good for 20% off of registration finally a couple of shows ago I mentioned the idea of starting a paper reading group and it turns out a bunch of you are interested so let's make it happen if you'd like to give some input on the details visit twiml ai.com Meetup and join the discussion in the comments actually I've got one more finally longtime listeners will know that I've been talking about doing a newsletter since the dawn of time well a friend and I were chatting the other day about how we've both been putting off launching our newsletters and ended up challenging each other to just do it so look for it next week and if you're not already signed up please do so at twiml ai.com newsletter okay apologies for the long intro but now a bit more about this series in addition to my conversations with naven and hanlin this series is packed with more interviews that I know you'll love including my conversation with Doug E of Google brains project Magenta in which we discussed the intersection of AI and art in general as well as Google's recently announced performance RNN project which was demonstrated for the first time at the O'Reilly AI conference Ben VOD of gamalon in which we discuss probabilistic programming this one I think is nerd alert worthy raaza Z day of matroid about how his company is scaling video object detection and R El cybi of affectiva about how her company uses emotional AI to allow Brands to better measure the effectiveness of customer experiences all right enough meta let's jump jump right into the first episode of our O'Reilly AI New York series after the bumper you'll hear my brief interview with naven and immediately after that my interview with hanlin enjoy so hey everyone I am here with naven raal naven is the vice president of Intel AI product group and we're here on location at the O'Reilly AI conference where he just delivered a keynote de how are you great it's I think it went over pretty well short and sweet got to announce a few important things around some of the open source projects we have going on as well as our direction of endtoend AI great why don't you tell us a little bit about the announcements you made yeah so one of them was about Intel Nirvana graph this is a almost an abstraction where for Hardware basically collapsing Primitives from different deep learning Frameworks into a common representation ation that we can then optimize for different types of Hardware platforms CPUs gpus our new architecture fpj that kind of thing so it really lessens the burden on optimizing each framework for every new hardware platform out there I think this is something we want to drive forward as a standard in industry and the other one is we release neon 2.0 which is our reference standard framework for deep warning and this supports Intel architectures CPUs the latest CPU that's going to be launched from Intel will be supported by this framework and optimized highly okay great and I've got a conversation schedule with one of the technical folks in at Intel Handlin for later on so we'll dig into some of that but I wanted to also just kind of get a pulse from you on it's been almost exactly a 
year since the acquisition it's been about 10 months now about 10 months how's it going and what have you been up to what's been consuming your time besides from the announcements that you just made yeah it's been a ride actually so when we came in to Intel we were a 50 person startup now we're we've formed an entire new division devoted to AI kind of seated from that 50% startup and it's it's much bigger than that now so it's actually been quite exciting to you know bring together the resources that we have at Intel and actually drive a bigger picture a broader portfolio of products and solutions to the industry the way we think has to change a bit as a startup you're trying to be Scrappy you're trying to get that next deal now we can think in a in much bigger way right we can say well what can we do that'll have the Maximum Impact across the entire industry the sales channels and you know relationships that Intel has with Enterprise is just enormous 6,000 sales people can be unwashed which is just a different way of thinking entirely from a startup absolutely absolutely if I can kind of dig right in one of the you know what I think of like the elephant in the room when I think about Intel is you know at the the chip level Nvidia was kind of at the right place at the right time for a with their gpus and they a lot of people think that they've got a big head start in the market and you know I wonder what's kind of how does Intel think about that and what's the plan I mean there's no doubt they're executing extremely well they're doing a great job they've adapted their architecture for these kinds of problems pretty well you know Head Start sure Yahoo had head start over Google too so there's a there's a lot of examples to the to the contrary there and you know I I I welcome the competition I think we're at a point now where obviously there's not going to be there's never going to be one provider for these things Intel really owns the host processor in the data center the heart of it the huge software Investments that have been made in terms of building the internet right things where you can really scale out infrastructure make it reliable like when you hit a website it works every time because of all the software investment built on top of Intel architecture so we're leveraging that and actually most AI solutions that are deployed in the data center running on Intel right so we have we have those things we're enabling them with our software stack today some of the announcements I made are relevant to that we're adapting our main product lines for these purposes and we're also going straight for AI as a as a preferred workload essentially for acceleration so you'll be seeing some announcements in the next you know 6 months to a year around our our silicon in that space as well okay you guys have made some pre-announcements in terms of the broad picture broad brush road map can you walk us through you know what we should expect to see sure so that was really based on the road map we had from Nirvana as a as a small company so we are developing the Silicon that we were developing at Nirvana you know it's going to be in prototypes this year and we're really taking the learnings from that building a real new architecture for this kind of workload is not simple right takes a few iterations so we're not really announcing products beyond that in terms of road map but we are basically going to have products out in 2018 19 20 we have an entire road map that we're not talking about performance just 
yet but I mean we do have some really important and exciting things on the horizon from the Silicon engineering side as well so AI is is a great showcase for those capabilities because density of compute and power per operation matter a lot right so that's something that in it's right in Intel's wheelous and then speaking of of density a big part of the Nana story was around Cloud how are you guys thinking about the role of cloud with regards to AI so yeah that's a really interesting question so part of it is actually we're continuing our hosted cloud service so we're calling until Nirvana Cloud we look at that as very very much a quick way of getting going on a solution for an Enterprise in addition to that we want to bring those capabilities on Prem for Enterprise customers who don't necessarily want to move data off their premises and so that's kind of the products we'll be you'll be seeing in the next year or so then obviously broader indust industry cloud service providers are a huge customer of Intel so Amazon Google Microsoft the big ones so they're all developing their AI platforms and we're supporting that effort it's basically a different kind of customer for us but we look at is basically they intercept at different points in the stack right and so I think you're going to see a variety of solutions ranging from fully in the cloud to hybrid on Prem cloud and completely on on PR okay oh great great one of the announcements you made was around the Nirvana software sack how does can you talk about how that relates to some of the other Frameworks that are out in the marketplace tensorflow for example has gained a lot of traction I think my impression was that Nirvana stack was initially positioned as an alternative to something like a tensor flow is that still the case and how do you see the kind of the lscape there yeah so when we first started n actually there was no tensor flow right there were there were a few fragment mented Frameworks we put neon out at that time and it is still an alternative two tensor flow it's kind of works at the same semantic level okay we are keeping that development going as a reference standard people can obviously build on it and we're supporting it that's good for us because it allows us to bring the latest optimizations that we have for Hardware to the open source Community quickly we're not beholding to anyone else who owns the database basically so we can get those out Intel Nirvana graph is about supporting everybody else's framework so if you go to Intel nirvana.com you can actually see how we're plugging tensorflow into Intel nervonic graph and allowing it to be optimized on various Hardware platforms so we want to play in the community that way but we can control the ecosystem from the neon side and provide the latest Innovations there and it'll take a little bit longer for the trickle down into the rest of the the open source Community okay what are some of the specific ways today that the hardware Innovations are surfacing in into the neon framework I mean these are some of these things are we can't talk about just yet but the way we're going to do parallelism and distribution of workloads we have some novel constructs and the way we handle memory and things like that okay it's not to say we couldn't make it work in other Frameworks but we'd have to really Fork it and do things a little bit differently so we can get those new Concepts out and I think now what's what's cool about being part of such a big company is that we can actually shape 
how the rest of Industry sees this so we get those things out I think researcher starts playing with it and we start seeing uh changes happening in all the Frameworks probably okay I'm sure it'll be inter in I've seen a similar path happen with the Intel investment in Cloud era and how they pushed a lot of the security and encryption Innovation and other things like that into the Hado ecosystem it'll be interesting to watch so I think we're about at our time anything else you'd like to mention to the listeners well I think you know the partnership with O'Reilly is very exciting for us I think we're at a time in Industry where we're seeing adoption happen quickly and so uh O'Reilly has been and the strata Hadoop side has been really a big player in that and so I see a parallel happening with AI as well and so I hope to see this this grow for O'Reilly and we'll be part of it so and you just announced a strategic partnership where you guys are the exclusive partner for the O'Reilly AI conference going forward not exclusive we'll still take on other Partnerships of course with them but we are the the main headline sponsor yes okay analogous to the cloud era and exactly strata data now got conference okay awesome awesome well looking forward to seeing you in September in at the O'Reilly AI San Francisco looking forward to as well awesome thanks ni all right thank you hey everyone I am here with hanin tang hanlin is a senior algorithms engineer with Intel Nirvana hanlin gave a talk here yesterday at the O'Reilly AI conference and we're here to talk about his talk and what he's been up to how you doing hlin good how are you I'm doing great I'm doing great why don't we start by actually having you talk a little bit about your background and how you got into Ai and algorithms yeah of course I guess it mainly started when I was in graduate school I was doing research in computational neuroscience and that's really where the connection between understanding how the brain works and attempting to transfer some that knowledge into silicon and computer systems really took hold so after graduate school I joined Nirvana which is a deep learning startup and through that I've begun to sort of apply the research that I did in graduate school to some of the applications that that Nirvana is is developing that now of course as Intel we have the opportunity to scale that out quite significantly across all fronts Hardware software algorithms great great and you gave a talk here yesterday at the conference that's right I think I mainly focused on how do we do that exact same process that that I had just described of you know taking research these sort of algorithms and models that you see in the scientific literature and then begin to apply them and deploy them into production settings okay there are sort of unique challenges that you face when trying to do something like that why don't we have you walk us through that I know a lot of myself and a lot of our listeners will read papers and walk through you know the latest Cutting Edge research and try to understand how to implement it but putting it into a production is a whole another issue so how did you frame that up in your talk I think I mainly focused around three key aspects so the first one being the lack of data so I think we've often heard that there is a flood of data in the world today and certainly with Fortune 500 companies and government agencies there's a large Corpus of data but all that data has to be funnel through a very small pipe of of manual 
annotations because existing methods we need a human to actually go through and put you know boxes around all of the cars for thousands of images before a model can learn to do it so we're data rich but labeled data poor that's right and being able to navigate that environment with either in heavy investments in data or some of the newer techniques and generating synthetic data from what you already have is sort of quite critical in building applications that perform well because deep learning particularly requires a large amount of data to reach the level of performance that sort of exceeds what humans can do mhm so on your first point then with regards to data you know we there's clearly you know there are ways to take this on manually by you know just investing in labeling data but on the synthetic data side what's happening in that part of the what kind of activity is happening there so one great example is is from Intel Labs where they have used video games mhm to generate some realistic imagery by sort of getting Graphics artists and such to build out a video game environment to use that sort of build a synthetic data in order to train many of the autonomous driving applications okay or alternatively there have also been advances in using generative ad serial networks to also generate realistic imagery that could be used during the training process oh interesting I think I've seen examples of the using video game data to train autonomous driving programs at the time I thought the results that I saw suggested that for whatever reason the results didn't transfer very well did you guys in the lab research that you're referring to find some ways to address that I think that's still an active area of research is how to generalize but they did find that if you're able to augment your existing rear World data set with the synthetic data set you do get better performance overall because this transferability problem exists for real data sets as well or you may collect large amounts of data in one city but not able to generalize to other cities or or other environments okay oh interesting yeah the other sort of aspect that I highlighted was was building a feedback loop into the into your systems to have annotation occur on the edge and what I mean by that is if you're building say a Aviation Security application when you have sort of detectors at the scanning sites looking for you know dangerous objects and baggage you also want to build in a system for the agents to provide feedback on how the algorithm is doing and in that way you build a sort of cycle of colle collecting data and monitoring data in production right and we've seen that to be quite critical because the world and the can change underneath you so objects that may be more popular during the summer may be less popular in the winter MH so being able to monitor those changes of the distribution of object that you expect to see and modify the algorithm appropriately is quite critical that's an interesting point I know a lot of startups are you know founded basically around this idea of collecting data and allowing consumers to to basically annotate models for them annotate data for them but I can imagine Enterprises building these systems and putting them out and not kind of closing the loop that's right is relatively easily to close the loop when it's just a web interface where the closing the loop is is is quite simple but in autonomous driving or Aviation Security or many of these other applications where physically the inference 
occurs on the edge you actually have to build in the networking and the storage and the memory and all the sort of components that Intel has in order to close that Loop in many of these many of these scenarios okay all right so you talked about data as one of the first elements of being able to put these systems into production what else did you talk about the other point that I really wanted to highlight was around model selection uhhuh it's a difficult challenge these days because for any particular task such as object localization you will find many models in the literature so faster rcnn single shot detection models R fcn and they're always newer ones coming out you know all the time so Intel recently has PVA net as well and how do you make a decision as a data scientist of what models to choose mhm and I guess what we've seen is that many customers May sort of just choose the latest model and run with that where sometimes you have to make very fine grain speed accuracy tradeoffs mhm around your particular use case so a particular model may be more performant but also take longer to train in which case your iteration cycle is slower right or and your training costs are higher yes and your training costs are higher or some models May perform better than another model on sort of an aggregate performance metric but perhaps one model will perform better at small object compared to large objects and so be able to make that fine grained determination not just on sort of an average metric level but also splitting it up into the individual categories depending on what you're interested in is what we found to be valuable for many of our Enterprise customers do you find that that that understanding the the various tradeoffs is it to what degree is it dependent on a very specific use case and specific data set and I guess the broader question is is it possible to to kind of come up with some standard metrics around you know in a given category like object detection or speech recognition and you know rate the different algorithms according to you know some set of standard metrics I haven't seen anything like that but it would certainly be helpful to folks that are you know coming into a space like object detection and trying to figure out where to get started there are certainly ways to do that so there was a recent paper by many of our colleagues at Google on doing exactly that okay measuring the various types of object detection models on performance and speed mhm and that is sort of valuable work to help guide many of our customers and that determination somewhat General across different use cases within object detection however for your particular use case you need to dive much deeper than that it's not enough just to look at the overall mean average Precision which is the metric that they use you then have to split it out by particular pedestrians or motion large objects small objects and that determination is much more use case specific and then in this paper that you're referring to did they also consider practicalities like training time and you know training cost things like that they did not I think they were mostly focused on the INF side of the equation so that's certainly valuable work moving forward mhm interesting so what else did you cover in your talk I think those were the three sort of main points M I made around data closing the loop and model selection okay I guess if I were to add a fourth one that I had mentioned was knowing your model Provence mhm so going back to the 
object detection example there that model and many of those models are designed for specific data sets you know as a grass student you you build a model around specific benchmark data sets to help guide you on on how you are doing MH and so in particular two data sets Pascal VOC and Ms Coco have been very popular for object detection MH but those data sets typically have maybe 5 to 10 objects per image mhm if you're trying to transer that same object detection model to do a different application such as satell imagery mhm suddenly you have a scenario where the data set statistics don't match the Benchmark data set that the model was designed for right right in satellite imagery you have hundreds of objects in a particular image right you have rotational symmetry because an aerial image can can be rotated and retain many of the similar properties MH you have boxes around buildings that are no longer you know that that are rotated and so now you need to additionally predict an angle in addition to the coordinates and so we've been actively developing ways to adapt existing models to that application so even though both tasks are object detection knowing where the model came from and what it was designed for can help you sort of maximize the performance on your particular application where the statistics may be completely different MH yeah this comes up all the time in the context of the research community and the practitioner Community kind of you know quote unquote overfitting on image net right and it sounds like some similar things are happening in the object detection realm where there are some standard data sets that you know folks are building models on and then trying to apply all over the place knowing the Providence of your models is one thing it sounds like you're also then coming up with techniques that allow you to generalize and adapt those models to new situations in the case of satellite imagery for example how exactly do you what's kind of the underlying techniques that you're using to enable the model to adapt to a different use case it's really doing surgery on the model itself so in this particular case existing object detection models in the literature mostly just predict the XY coordinate of the upper left corner and then the width and the height of of a box mhm and so we are working on modifying that for to additionally predict rotation angle for example or dealing with multispectral satellite data where it's not just red green blue in the image but IR near IR as well so there also these other spectrums that we can begin to build a model to to take advantage of okay so we're here we're talking about evolving the models themselves as opposed to we're not training a model on the original data set and then using some techniques with a a train model to give it better inference in a new scenario it's you know we're talking about how do we build a new model that is better adapted to this situation yes exactly I mean don't get me wrong these existing object localization models are are very powerful oh absolutely you know five years ago as a grass I would never have imagined a world where these sort of applications can be done and not only be perform well but at real time speed right is quite incredible and so we're really sort of standing on the shoulders of sort of these papers but then iterating further for particular use cases where you need to make some some changes mhm okay and then I think in your presentation you also talked a little bit about the Nirvana stack and what 
was happening in that area it's an exciting time for us we've spent the last two and a half years building a full stack for for deep learning and I think as Nirvana now as Intel we still firmly believe in that philosophy that you need the full stack in order to get maximal performance and ease of use out of what we're building so it's everything from the custom silicon to our software work up to a cloud or platform service and then finally to to applications and now that we're part of Intel we have an opportunity to supercharge those efforts in a sense mhm and so as naven had mentioned we've released neon 2.0 which neon previously was known you know when we were startup as one of the fastest Frameworks on gpus due to the custom assembly kernels that we wrote and some of the algorithmic innovations that we introduced with winter grad convolution and now we're you know have we've been working very hard with other engineers at Intel to also optimize neon for Intel architecture and so by integrating into Intel's math kernel Library we can achieve quite significant speed UPS so on an image classification model such as googlet inference is about 98 times faster than previously so these are very serious optimizations that Intel has done for other Frameworks as well such as tensorflow and mxnet and we're excited to work with them to now bring many of these optimizations into neon as well okay for those that aren't familiar with neon can you walk us through the the design philosophy and how it differs from some of the other Frameworks that are out in the market of course so neon we really designed for Enterprise use cases where speed was quite important and many of our customers don't really need the model of the week they want a stable and fast and optimized object localization model or speech recognition model and that's really what we provide to many of our customers MH in addition we pay a lot of attention to data loading because the you know the folks that we talk to at O'Reilly you know those that run companies that build applications they don't train their models on imet MH or on Ms coco or on pentry bank they train on their own custom data sets right and so we put a lot of effort into designing modular data loaders that are fast but also flexible okay and easily provide a data API for users to switch between different models so if I'm doing an object detection model we have a couple that we've implemented there's a common data API for loading that kind of data so that you can test different models relatively easily mhm okay are there particular use cases beyond the ones that you've mentioned that you found to be the sweet spot for neon relative to other Frameworks I think we found use cases across a variety of domains so not just an image that I mostly focused on mhm but also in speech recognition where we've developed a model based on by state-ofthe-art Deep speech 2 model OKAY in natural language processing which many of our financial customers are using I think one of our sort of MOS is to keep track of the quickly changing literature for new models coming out that bring new level of capabilities and then porting them into neon and then optimizing them for Speed and for stability and for ease of of data loading and that's really where we find the value to provide to many of the folks that that we work with one of the the challenges for Enterprises putting these types of machine learning models into production is monitoring their performance over time and then building a a feedback loop 
that allows them to improve and enhance their models does neon have anything in particular to offer in that scenario yes so in neon we've built in callbacks okay that allow the model to report back its progress either during the training process MH but also actually mostly during during the training process we don't have anything currently for sort of specially built for monitoring an inference okay but that's certainly a good idea that we can look into okay so in addition to the neon 2.0 announcement you also announced the Nirvana graph product can you talk to us a little bit about that the Nirvana graph project really started even before we were acquired okay at Nirvana when we realized that many of the newer models coming out attention based models for example were much easier expressed in a computational graph mhm and that's really the core of the effort that we're doing where we've really rethought the back end of neon and with the nirvan graph effort we built and we're in in the process of Designing a a nervon graph inmunity representation mhm which different Frameworks can then hook into okay and then on the back and different Hardware backends will take the graph emitted from Nirvana graph and then apply their Hardware specific optimization passes to eventually build an executable execution graph that can run on different devices so at Intel we're fortunate to have a variety of Hardware targets Zeon Zeon fi future Lake Crest movidas fpgas and having a common tool chain to allow folks to train on one Hardware device and Deploy on another one or train on a heterogeneous mixture of Hardware devices we think will really change how models are being developed and make it much easier for industry to transfer those models you know between these different devices can you talk a little bit more about graph as a paradigm for building out these models you mentioned attention based models as an example how what's the relationship between you know a graph based view of the world and attention models and and how does a graph framework support building that kind of model better fundamentally neural networks are graphs of computations M mhm and representing that directly in the Nirvana graph framework and allowing folks to interact with the graph in building these models will allow much more flexibility than what we had originally envisioned with neon okay and so is the idea that for an arbitrary Network my different layers represent nodes in the graph or what does a node in the graph represent so a node in a graph represents a fundamental operation like adding or exponential function or Matrix multiply okay and so layers which is a higher level concept each layer implements a graph of operations okay and when we string together layers we're essentially adding components and nodes to the graph itself and that's during the construction phase MH and after that construction is complete we now have a complete view of the exact operations that you need to do to train your model and from that we can then apply optimization passes to reduce unnecessary computations to optimize memory usage and also take into account the specific data layout requirements of different Hardware mm so how does the developer experience change in using the graph project would you call it a project or a product I would call it a project okay okay but I'm an engineer not a not a product person so I'm not quite sure what the exact semantics are yeah yeah so it's clear that you know thinking about a neural network as a 
computational graph and kind of having that explicitly laid out allows us to you know perform optimizations just like you know a compiler M or like a query planner and a database might is this something that's happening at the developer level or is it kind of dimed in underneath an experience that the developer might already be using like neon or tensor flow or something else if you're a developer that principally uses components that have already been written out so layers like convolution and pooling MH then your experience will be relatively similar mhm if you have to develop new layers or apply some custom computation then you're able to access this node Lev level directly okay and develop on that so you can compose the Ops yourself okay into into a graph and in many cases we find that the ladder brings a lot of value because not everyone is just sort of applying the sort of vanilla models and layers that have already come out there for in part for the reasons that we previously talked about they don't apply as well to you know some problem sets or data sets than a model that's been you know augmented to meet the specific needs of that data set right and in many cases it's not just the different layers that they're composing together but also the way they're composing them together so I have a bunch of layers coming in and I want to Fork that into outputs that get broadcast to multiple streams or I may have multiple streams of data coming in whether it be images and video and text I want able to concatenate that together mhm so the graph also allows that to be much easier than before because if know the graph we know how to do the forward and the backward passes during the training regime whereas before a neon you would have to explicitly guide the model towards okay I have this topology do the forward pass this way and then pass you know the the outputs across this fork and then collect the gradients ETC so the graph takes care a lot of that so you can it's much more composable mhm is one of the key points okay oh super interesting how can folks learn more about the Neon and the graph projects I definitely encourage listeners to check out our GitHub page okay Nirvana systems there is a repository for neon and also for Eng graph the graph okay additionally if you go to www.el nirvana.com we have a couple blog posts that introduce the sort of the framework itself how to use it links to the model Zoo where we have a lot of pre-trained models that folks can easily get started with m and we definitely encourage the community to contribute as well the Nirvana graph effort is still in the early stages we sort of releasing our sort of latest commits quite often but definitely if if users look into that and see a feature that they like that they're missing you know we definitely welcome external contributions awesome awesome anything else you'd like to mention I think that's it I'm excited to be here at Ed O'Reilly talking to you and happy to continue the conversation in future future meetings fantastic thank you hen thank you all right everyone that is our show thanks so much for listening and for your continued support comments and feedback a special Thanks goes out to our series sponsor Intel Nirvana if you didn't catch the first show in this series definitely check it out if you didn't catch the first show in this series where I talk to navine raal the head of Intel's AI product group about how they plan to Leverage their leading position in proven history in Silicon Innovation to 
transform the world of AI you're going to want to check that out next for more information about Intel Nirvana's AI platform visit Intel nirvana.com remember that with this series we've kicked off our giveaway for tickets to the AI conference to enter just let us know what you think about any of the podcast in this series or post your favorite quote from any of them on the show notes page on Twitter or via any of our social media channels make sure to mention at twiml AI at Intel aai and at the AI conf so that we know you want to enter the contest full details can be found on the series page and of course all entrance get one of our slick twiml laptop stickers speaking of the series page you can find links to all of the individual show notes Pages by visiting twiml ai.com oilly a i n y thanks so much for listening and catch you next timehello and welcome to another episode of twiml talk the podcast where I interview interesting people doing interesting things in machine learning and artificial intelligence I'm your host Sam charington the show you're about to hear is the first of a series of shows recorded in New York City at the O'Reilly AI New York event but before we get to the show I've got a ton of updates and announcements for you first off I want to give a huge thank you to everyone who came out to our very first twiml happy hour in New York City there was a great mix of folks in attendance including listeners from New York O'Reilly AI attendees and NY AI members it was an awesome awesome night I want to especially thank miam and the rest of the team at the nyi Meetup for helping us pull this entire thing together and for clarify for supporting it the O'Reilly AI conference itself was great and of course my favorite part was getting to meet so many listeners I especially enjoyed meeting twiml super fans Bill Baran and bethanne Noble they're both longtime listeners of the show and highly engaged members of this community and it was just amazing to get a chance to hang out with them I did a ton of interviews at the show and I'm pleased to present them to you for your binge listening pleasure we have got your commute covered for the entire week with this series the series is brought to you by our friends at Intel Nirvana I talked about Intel's acquisition of Nirvana systems when it happened almost a year ago and I was super excited to have an opportunity to sit down with Nirvana co-founder naven raal who now leads Intel's newly formed AI products group naven and I talked about how Intel plans to extend its leadership position in general purpose compute into the AI Realm by delivering silicon design specifically for AI endtoend Solutions including the cloud Enterprise Data Center and the edge and tools that let customers quickly productize and scale AI based Solutions I also spoke with hanin tang an algorithms engineer on that team about two such tools announced at the conference version 2.0 of neon Intel Nirvana's deep learning framework and Nirvana graph a new project for expressing and running deep learning appli ation as framework and Hardware independent computational graphs Nirvana graph in particular sounds like a very interesting project not to mention a smart move for Intel and I'd encourage folks to take a look at their GitHub repo AT github.com nirvan Systems as well as their main site at Intel nirvana.com one of the things announced at the conference is that Intel and O'Reilly will be partnering on the AI conference going forward starting with the San Francisco disco event 
in September they've also changed the name of the event to the AI conference to celebrate all this we at twiml are going to start our the AI conference ticket giveaway early and run it through the end of the month to enter just let us know what you think about any of the podcasts in this series or post your favorite quote from any of them on our series page on Twitter or via any of our social media channels make sure to mention at twiml aai at Intel aai and atthi com so that we know you want to enter full details can be found on the series page at twim ai.com oilly AI by the time this series drops I'll have just return to the States from my trip to Europe I'll be back on the road later this month to check out the Wrangle conference in San Francisco on July 20th you may remember that I recorded the very first twiml talk show with Claire Corel at at Wrangle and Cloud era was my very first sponsor for the podcast so I'm really looking forward to getting back there I definitely hope to catch up with some twiml listeners while I'm out there so please check the event out if you'd like to attend I've arranged for a special discount for twiml listeners using the code PC viip that's good for 20% off of registration finally a couple of shows ago I mentioned the idea of starting a paper reading group and it turns out a bunch of you are interested so let's make it happen if you'd like to give some input on the details visit twiml ai.com Meetup and join the discussion in the comments actually I've got one more finally longtime listeners will know that I've been talking about doing a newsletter since the dawn of time well a friend and I were chatting the other day about how we've both been putting off launching our newsletters and ended up challenging each other to just do it so look for it next week and if you're not already signed up please do so at twiml ai.com newsletter okay apologies for the long intro but now a bit more about this series in addition to my conversations with naven and hanlin this series is packed with more interviews that I know you'll love including my conversation with Doug E of Google brains project Magenta in which we discussed the intersection of AI and art in general as well as Google's recently announced performance RNN project which was demonstrated for the first time at the O'Reilly AI conference Ben VOD of gamalon in which we discuss probabilistic programming this one I think is nerd alert worthy raaza Z day of matroid about how his company is scaling video object detection and R El cybi of affectiva about how her company uses emotional AI to allow Brands to better measure the effectiveness of customer experiences all right enough meta let's jump jump right into the first episode of our O'Reilly AI New York series after the bumper you'll hear my brief interview with naven and immediately after that my interview with hanlin enjoy so hey everyone I am here with naven raal naven is the vice president of Intel AI product group and we're here on location at the O'Reilly AI conference where he just delivered a keynote de how are you great it's I think it went over pretty well short and sweet got to announce a few important things around some of the open source projects we have going on as well as our direction of endtoend AI great why don't you tell us a little bit about the announcements you made yeah so one of them was about Intel Nirvana graph this is a almost an abstraction where for Hardware basically collapsing Primitives from different deep learning Frameworks into a 
common representation ation that we can then optimize for different types of Hardware platforms CPUs gpus our new architecture fpj that kind of thing so it really lessens the burden on optimizing each framework for every new hardware platform out there I think this is something we want to drive forward as a standard in industry and the other one is we release neon 2.0 which is our reference standard framework for deep warning and this supports Intel architectures CPUs the latest CPU that's going to be launched from Intel will be supported by this framework and optimized highly okay great and I've got a conversation schedule with one of the technical folks in at Intel Handlin for later on so we'll dig into some of that but I wanted to also just kind of get a pulse from you on it's been almost exactly a year since the acquisition it's been about 10 months now about 10 months how's it going and what have you been up to what's been consuming your time besides from the announcements that you just made yeah it's been a ride actually so when we came in to Intel we were a 50 person startup now we're we've formed an entire new division devoted to AI kind of seated from that 50% startup and it's it's much bigger than that now so it's actually been quite exciting to you know bring together the resources that we have at Intel and actually drive a bigger picture a broader portfolio of products and solutions to the industry the way we think has to change a bit as a startup you're trying to be Scrappy you're trying to get that next deal now we can think in a in much bigger way right we can say well what can we do that'll have the Maximum Impact across the entire industry the sales channels and you know relationships that Intel has with Enterprise is just enormous 6,000 sales people can be unwashed which is just a different way of thinking entirely from a startup absolutely absolutely if I can kind of dig right in one of the you know what I think of like the elephant in the room when I think about Intel is you know at the the chip level Nvidia was kind of at the right place at the right time for a with their gpus and they a lot of people think that they've got a big head start in the market and you know I wonder what's kind of how does Intel think about that and what's the plan I mean there's no doubt they're executing extremely well they're doing a great job they've adapted their architecture for these kinds of problems pretty well you know Head Start sure Yahoo had head start over Google too so there's a there's a lot of examples to the to the contrary there and you know I I I welcome the competition I think we're at a point now where obviously there's not going to be there's never going to be one provider for these things Intel really owns the host processor in the data center the heart of it the huge software Investments that have been made in terms of building the internet right things where you can really scale out infrastructure make it reliable like when you hit a website it works every time because of all the software investment built on top of Intel architecture so we're leveraging that and actually most AI solutions that are deployed in the data center running on Intel right so we have we have those things we're enabling them with our software stack today some of the announcements I made are relevant to that we're adapting our main product lines for these purposes and we're also going straight for AI as a as a preferred workload essentially for acceleration so you'll be seeing some 
announcements in the next you know 6 months to a year around our our silicon in that space as well okay you guys have made some pre-announcements in terms of the broad picture broad brush road map can you walk us through you know what we should expect to see sure so that was really based on the road map we had from Nirvana as a as a small company so we are developing the Silicon that we were developing at Nirvana you know it's going to be in prototypes this year and we're really taking the learnings from that building a real new architecture for this kind of workload is not simple right takes a few iterations so we're not really announcing products beyond that in terms of road map but we are basically going to have products out in 2018 19 20 we have an entire road map that we're not talking about performance just yet but I mean we do have some really important and exciting things on the horizon from the Silicon engineering side as well so AI is is a great showcase for those capabilities because density of compute and power per operation matter a lot right so that's something that in it's right in Intel's wheelous and then speaking of of density a big part of the Nana story was around Cloud how are you guys thinking about the role of cloud with regards to AI so yeah that's a really interesting question so part of it is actually we're continuing our hosted cloud service so we're calling until Nirvana Cloud we look at that as very very much a quick way of getting going on a solution for an Enterprise in addition to that we want to bring those capabilities on Prem for Enterprise customers who don't necessarily want to move data off their premises and so that's kind of the products we'll be you'll be seeing in the next year or so then obviously broader indust industry cloud service providers are a huge customer of Intel so Amazon Google Microsoft the big ones so they're all developing their AI platforms and we're supporting that effort it's basically a different kind of customer for us but we look at is basically they intercept at different points in the stack right and so I think you're going to see a variety of solutions ranging from fully in the cloud to hybrid on Prem cloud and completely on on PR okay oh great great one of the announcements you made was around the Nirvana software sack how does can you talk about how that relates to some of the other Frameworks that are out in the marketplace tensorflow for example has gained a lot of traction I think my impression was that Nirvana stack was initially positioned as an alternative to something like a tensor flow is that still the case and how do you see the kind of the lscape there yeah so when we first started n actually there was no tensor flow right there were there were a few fragment mented Frameworks we put neon out at that time and it is still an alternative two tensor flow it's kind of works at the same semantic level okay we are keeping that development going as a reference standard people can obviously build on it and we're supporting it that's good for us because it allows us to bring the latest optimizations that we have for Hardware to the open source Community quickly we're not beholding to anyone else who owns the database basically so we can get those out Intel Nirvana graph is about supporting everybody else's framework so if you go to Intel nirvana.com you can actually see how we're plugging tensorflow into Intel nervonic graph and allowing it to be optimized on various Hardware platforms so we want to play in the 
So we want to play in the community that way, but we can control the ecosystem from the neon side and provide the latest innovations there; it'll take a little longer for those to trickle down into the rest of the open-source community.

What are some of the specific ways today that the hardware innovations are surfacing in the neon framework?

Some of these things we can't talk about just yet, but we have some novel constructs in the way we're going to do parallelism and distribution of workloads, and in the way we handle memory, things like that. It's not that we couldn't make them work in other frameworks, but we'd have to really fork them and do things a little differently. So we can get those new concepts out, and what's cool about being part of such a big company is that we can actually shape how the rest of the industry sees this. We get those things out, researchers start playing with them, and we start seeing changes happen in all the frameworks, probably.

It'll be interesting. I've seen a similar path with the Intel investment in Cloudera, and how they pushed a lot of security and encryption innovation into the Hadoop ecosystem. It'll be interesting to watch. I think we're about at our time — anything else you'd like to mention to the listeners?

Well, the partnership with O'Reilly is very exciting for us. We're at a time in the industry where we're seeing adoption happen quickly, and O'Reilly — on the Strata + Hadoop World side — has been a really big player in that. I see a parallel happening with AI as well, so I hope to see this grow for O'Reilly, and we'll be part of it.

And you just announced a strategic partnership where you're the exclusive partner for the O'Reilly AI conference going forward?

Not exclusive — we'll still take on other partnerships with them, of course — but we are the main headline sponsor, yes.

Analogous to the Cloudera relationship with the Strata — now Strata Data — conference.

Exactly.

Awesome. Well, looking forward to seeing you in September at O'Reilly AI San Francisco.

Looking forward to it as well.

Awesome — thanks, Naveen.

All right, thank you.

Hey everyone, I'm here with Hanlin Tang. Hanlin is a senior algorithms engineer with Intel Nervana. He gave a talk here yesterday at the O'Reilly AI conference, and we're here to talk about his talk and what he's been up to. How are you doing, Hanlin?

Good — how are you?

I'm doing great. Why don't we start by having you talk a little bit about your background and how you got into AI and algorithms?

Of course. It mainly started when I was in graduate school. I was doing research in computational neuroscience, and that's really where the connection between understanding how the brain works and attempting to transfer some of that knowledge into silicon and computer systems took hold. After graduate school I joined Nervana, which was a deep learning startup, and through that I began to apply the research I did in graduate school to some of the applications Nervana was developing. Now, of course, as Intel, we have the opportunity to scale that out quite significantly across all fronts: hardware, software, algorithms.

Great. And you gave a talk here yesterday at the conference?

That's right. I mainly focused on exactly that process I just described: taking research — the algorithms and models you see in the scientific literature — and beginning to apply and deploy them in production settings.
There are unique challenges you face when trying to do something like that. Why don't you walk us through it? I know a lot of our listeners — myself included — will read papers, walk through the latest cutting-edge research, and try to understand how to implement it, but putting it into production is a whole other issue. How did you frame that up in your talk?

I mainly focused on three key aspects. The first is the lack of data. We've often heard that there's a flood of data in the world today, and certainly Fortune 500 companies and government agencies have a large corpus of data, but all that data has to be funneled through a very small pipe of manual annotation, because with existing methods we need a human to actually go through and put boxes around all of the cars in thousands of images before a model can learn to do it.

So we're data rich but labeled-data poor.

That's right. Being able to navigate that environment — whether through heavy investments in data, or through some of the newer techniques for generating synthetic data from what you already have — is quite critical to building applications that perform well, because deep learning in particular requires a large amount of data to reach the level of performance that exceeds what humans can do.

On that first point, with regard to data: there are clearly ways to take this on manually by just investing in labeling data, but on the synthetic-data side, what kind of activity is happening?

One great example is from Intel Labs, where they've used video games to generate realistic imagery — getting graphics artists and such to build out a video-game environment, and using that to build a synthetic data set for training many of the autonomous-driving applications. Alternatively, there have also been advances in using generative adversarial networks to generate realistic imagery that can be used during the training process.

Interesting. I think I've seen examples of using video-game data to train autonomous-driving programs, and at the time the results I saw suggested that, for whatever reason, they didn't transfer very well. Did the lab research you're referring to find ways to address that?

How to generalize is still an active area of research, but they did find that if you augment your existing real-world data set with the synthetic data set, you get better performance overall. And this transferability problem exists for real data sets as well: you may collect large amounts of data in one city but not be able to generalize to other cities or other environments.

Interesting.

The other aspect I highlighted was building a feedback loop into your systems, so that annotation occurs at the edge. What I mean by that is: if you're building, say, an aviation-security application, where you have detectors at the scanning sites looking for dangerous objects in baggage, you also want to build in a system for the agents to provide feedback on how the algorithm is doing. That way you build a cycle of collecting data and monitoring data in production. We've seen that to be quite critical, because the world can change underneath you: objects that are more popular during the summer may be less popular in the winter, so being able to monitor those changes in the distribution of objects you expect to see, and modify the algorithm appropriately, is quite critical.
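As a minimal sketch of what "annotation on the edge" might look like in code — every class and function name here is hypothetical, invented for illustration rather than taken from any Intel product — the key idea is simply that each prediction the edge device makes is paired with the operator's verdict and logged, both for retraining and for drift monitoring:

```python
# Hypothetical sketch of a closed feedback loop at an edge inference site.
# All names (FeedbackStore, monitor_label_distribution) are illustrative.
import json
import time
from collections import Counter

class FeedbackStore:
    """Append-only log pairing model predictions with human verdicts."""
    def __init__(self, path):
        self.path = path

    def record(self, image_id, prediction, agent_verdict):
        entry = {
            "ts": time.time(),
            "image_id": image_id,
            "prediction": prediction,        # e.g. {"label": "knife", "score": 0.91}
            "agent_verdict": agent_verdict,  # True if the agent confirms
        }
        with open(self.path, "a") as f:
            f.write(json.dumps(entry) + "\n")

def monitor_label_distribution(log_path, window=1000):
    """Track the recent label mix so seasonal drift becomes visible."""
    with open(log_path) as f:
        entries = [json.loads(line) for line in f][-window:]
    return Counter(e["prediction"]["label"] for e in entries)

# In the scanning loop: run inference, show the result to the agent,
# and log their confirmation or correction for later retraining:
#   store = FeedbackStore("/var/log/detector_feedback.jsonl")
#   store.record(image_id, model.predict(image), agent_confirms)
```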
That's an interesting point. I know a lot of startups are founded basically around this idea of collecting data and letting consumers annotate data for them, but I can imagine enterprises building these systems, putting them out, and not closing the loop.

That's right. It's relatively easy to close the loop when it's just a web interface — there, closing the loop is quite simple — but in autonomous driving or aviation security, or many of these other applications where the inference physically occurs at the edge, you actually have to build in the networking, the storage, the memory, and all the components that Intel has, in order to close that loop in those scenarios.

Okay. So you talked about data as the first element of putting these systems into production. What else did you cover?

The other point I really wanted to highlight was model selection. It's a difficult challenge these days, because for any particular task, such as object localization, you'll find many models in the literature — Faster R-CNN, single-shot detection (SSD) models, R-FCN — and there are always newer ones coming out; Intel recently has PVANet as well. How do you decide, as a data scientist, which model to choose? What we've seen is that many customers just pick the latest model and run with it, whereas sometimes you have to make very fine-grained speed/accuracy trade-offs around your particular use case. One model may be more accurate but take longer to train, in which case your iteration cycle is slower —

And your training costs are higher.

Yes, and your training costs are higher. Or one model may beat another on an aggregate performance metric, but the other may perform better on small objects than on large objects. Being able to make that fine-grained determination — not just at the level of an average metric, but split into individual categories depending on what you're interested in — is what we've found valuable for many of our enterprise customers.
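Here is a worked sketch of that "split it into categories" point: instead of reporting only an aggregate mean average precision, compute per-class average precision and compare models class by class. The sketch assumes you already have per-class AP scores from your evaluation harness; the model names and numbers are invented for illustration.

```python
# Illustrative comparison of two detectors at the per-class level.
# The AP values are made up; in practice they come from your eval pipeline.
per_class_ap = {
    "model_a": {"car": 0.82, "pedestrian": 0.71, "cyclist": 0.55},
    "model_b": {"car": 0.79, "pedestrian": 0.64, "cyclist": 0.68},
}

def mean_ap(ap_by_class):
    return sum(ap_by_class.values()) / len(ap_by_class)

for name, scores in per_class_ap.items():
    print(f"{name}: mAP = {mean_ap(scores):.3f}")

# The aggregate numbers are close, but the per-class view tells you which
# model to pick if, say, cyclists are what your use case actually cares about.
best_for = {
    cls: max(per_class_ap, key=lambda m: per_class_ap[m][cls])
    for cls in per_class_ap["model_a"]
}
print(best_for)  # {'car': 'model_a', 'pedestrian': 'model_a', 'cyclist': 'model_b'}
```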
To what degree does understanding those trade-offs depend on a very specific use case and data set? The broader question: is it possible to come up with standard metrics within a given category, like object detection or speech recognition, and rate the different algorithms against them? I haven't seen anything like that, but it would certainly help folks coming into a space like object detection who are trying to figure out where to get started.

There are certainly ways to do that. There was a recent paper by many of our colleagues at Google doing exactly that — measuring the various object-detection models on accuracy and speed. That's valuable work that helps guide many of our customers in making that determination, and it's somewhat general across different use cases within object detection. For your particular use case, however, you need to dive much deeper: it's not enough to look at the overall mean average precision, which is the metric they use; you then have to split it out by particular categories — pedestrians, large objects, small objects — and that determination is much more use-case specific.

And in the paper you're referring to, did they also consider practicalities like training time and training cost?

They did not; I think they were mostly focused on the inference side of the equation. So that's certainly valuable work still to be done.

Interesting. What else did you cover in your talk?

Those were the three main points — data, closing the loop, and model selection. If I were to add a fourth one I mentioned, it was knowing your model's provenance. Going back to the object-detection example: many of those models were designed around specific data sets. As a grad student, you build a model around specific benchmark data sets to help gauge how you're doing, and two data sets in particular, PASCAL VOC and MS COCO, have been very popular for object detection. But those data sets typically have maybe five to ten objects per image. If you try to transfer that same object-detection model to a different application, such as satellite imagery, you suddenly have a scenario where the data-set statistics don't match the benchmark the model was designed for. In satellite imagery you have hundreds of objects in a single image; you have rotational symmetry, because an aerial image can be rotated and retain many of the same properties; you have boxes around buildings that are rotated, so now you need to predict an angle in addition to the coordinates. We've been actively developing ways to adapt existing models to that application. Even though both tasks are object detection, knowing where the model came from and what it was designed for helps you maximize performance on your particular application, where the statistics may be completely different.

Yeah, this comes up all the time in the research and practitioner communities — quote-unquote "overfitting" on ImageNet — and it sounds like similar things are happening in the object-detection realm: there are standard data sets that folks build models on and then try to apply everywhere. Knowing the provenance of your models is one thing; it sounds like you're also developing techniques to generalize and adapt those models to new situations, satellite imagery being the example. What are the underlying techniques that let a model adapt to a different use case?

It's really doing surgery on the model itself. In this particular case, existing object-detection models in the literature mostly predict the x, y coordinates of the upper-left corner plus the width and height of a box, so we're working on modifying that to additionally predict a rotation angle, for example — or on handling multispectral satellite data, where the image isn't just red, green, and blue but includes IR and near-IR as well, other spectra we can build the model to take advantage of.
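A minimal sketch of the two pieces of "model surgery" Hanlin describes, written in PyTorch-style Python purely for illustration (PyTorch is my choice here, not a library named in the conversation): a box-regression head widened from four outputs (x, y, w, h) to five (adding a rotation angle), and a first convolution widened to accept multispectral input channels beyond RGB.

```python
# Illustrative "model surgery" for satellite imagery, sketched in PyTorch
# (my choice of library, not one named in the conversation).
import torch
import torch.nn as nn

class RotatedBoxHead(nn.Module):
    """Box-regression head predicting (x, y, w, h, angle) per anchor,
    instead of the usual axis-aligned (x, y, w, h)."""
    def __init__(self, in_features, num_anchors):
        super().__init__()
        # 5 regression targets per anchor: the fifth is the rotation angle.
        self.fc = nn.Linear(in_features, num_anchors * 5)

    def forward(self, x):
        out = self.fc(x)
        return out.view(x.shape[0], -1, 5)  # (batch, anchors, [x, y, w, h, angle])

def widen_first_conv(conv_rgb: nn.Conv2d, extra_channels: int) -> nn.Conv2d:
    """Accept multispectral input (e.g. RGB + IR + near-IR) by widening the
    first conv; RGB weights are copied, new channels start at zero so the
    pretrained behaviour is preserved until fine-tuning."""
    new_conv = nn.Conv2d(
        conv_rgb.in_channels + extra_channels,
        conv_rgb.out_channels,
        kernel_size=conv_rgb.kernel_size,
        stride=conv_rgb.stride,
        padding=conv_rgb.padding,
        bias=conv_rgb.bias is not None,
    )
    with torch.no_grad():
        new_conv.weight[:, :conv_rgb.in_channels] = conv_rgb.weight
        new_conv.weight[:, conv_rgb.in_channels:] = 0.0
        if conv_rgb.bias is not None:
            new_conv.bias.copy_(conv_rgb.bias)
    return new_conv

# e.g. widen a pretrained RGB stem to take two extra spectral bands:
#   model.conv1 = widen_first_conv(model.conv1, extra_channels=2)
```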
So here we're talking about evolving the models themselves, as opposed to training a model on the original data set and then using some technique with the trained model to give it better inference in a new scenario — we're talking about building a new model that's better adapted to the situation.

Yes, exactly. And don't get me wrong, these existing object-localization models are very powerful. Five years ago, as a grad student, I would never have imagined a world where these applications could not only perform well but run at real-time speed — it's quite incredible. So we're really standing on the shoulders of these papers, and then iterating further for particular use cases where you need to make some changes.

Okay. I think in your presentation you also talked a little about the Nervana stack and what's happening in that area.

It's an exciting time for us. We've spent the last two and a half years building a full stack for deep learning, and now, as Nervana within Intel, we still firmly believe in that philosophy: you need the full stack to get maximal performance and ease of use out of what we're building. It's everything from the custom silicon, to our software work, up to a cloud platform service, and finally to applications. Now that we're part of Intel, we have an opportunity to supercharge those efforts. As Naveen mentioned, we've released neon 2.0. When we were a startup, neon was known as one of the fastest frameworks on GPUs, thanks to the custom assembly kernels we wrote and algorithmic innovations such as Winograd convolution. Now we've been working very hard with other engineers at Intel to optimize neon for Intel architecture as well, and by integrating Intel's Math Kernel Library we can achieve quite significant speed-ups: on an image-classification model such as GoogLeNet, inference is about 98 times faster than previously. Intel has done these kinds of serious optimizations for other frameworks too, such as TensorFlow and MXNet, and we're excited to work with them to bring many of these optimizations into neon as well.

For those who aren't familiar with neon, can you walk us through the design philosophy and how it differs from some of the other frameworks on the market?

Of course. We really designed neon for enterprise use cases where speed is important. Many of our customers don't need the model of the week; they want a stable, fast, optimized object-localization model or speech-recognition model, and that's what we provide. In addition, we pay a lot of attention to data loading, because the folks we talk to at O'Reilly — people who run companies that build applications — don't train their models on ImageNet or MS COCO or Penn Treebank; they train on their own custom data sets. So we put a lot of effort into designing modular data loaders that are fast but also flexible, and into providing a common data API so users can switch between different models easily. If I'm doing object detection, we have a couple of models implemented, and there's a common data API for loading that kind of data, so you can test different models relatively easily.
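As a rough sketch of what that looks like in neon 2.x — recalled from memory of the API, so treat the exact names as assumptions and verify against the neon documentation — you pick a backend (the MKL-optimized CPU path among them), wrap your own arrays in the common data-iterator API, and fit a model as usual:

```python
# Hedged sketch of neon 2.x usage, from memory of the API -- verify names
# against the neon documentation before relying on them.
import numpy as np
from neon.backends import gen_backend
from neon.data import ArrayIterator
from neon.initializers import Gaussian
from neon.layers import Affine, GeneralizedCost
from neon.models import Model
from neon.optimizers import GradientDescentMomentum
from neon.transforms import Rectlin, Softmax, CrossEntropyMulti
from neon.callbacks.callbacks import Callbacks

# 'mkl' selects the Intel MKL-optimized CPU backend added in neon 2.0;
# 'cpu' and 'gpu' were the other common choices.
be = gen_backend(backend='mkl', batch_size=128)

# The common data API: wrap your own custom arrays, not a benchmark set.
X = np.random.rand(1024, 784).astype(np.float32)
y = np.random.randint(10, size=1024)
train = ArrayIterator(X, y, nclass=10)

layers = [
    Affine(nout=100, init=Gaussian(scale=0.01), activation=Rectlin()),
    Affine(nout=10, init=Gaussian(scale=0.01), activation=Softmax()),
]
model = Model(layers=layers)
cost = GeneralizedCost(costfunc=CrossEntropyMulti())
opt = GradientDescentMomentum(learning_rate=0.1, momentum_coef=0.9)

model.fit(train, optimizer=opt, num_epochs=2, cost=cost,
          callbacks=Callbacks(model))
```

The design choice the conversation highlights is visible here: `ArrayIterator` presents custom in-memory data through the same interface the bundled loaders use, so swapping models or data sets doesn't change the fit loop.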
Are there particular use cases beyond the ones you've mentioned that you've found to be the sweet spot for neon relative to other frameworks?

We've found use cases across a variety of domains — not just the image ones I've mostly focused on, but also speech recognition, where we've developed a model based on the state-of-the-art Deep Speech 2 model, and natural language processing, which many of our financial customers are using. One of our mottos is to keep track of the quickly changing literature for new models that bring new levels of capability, and then port them into neon and optimize them for speed, for stability, and for ease of data loading. That's really where we find the value we provide to the folks we work with.

One of the challenges for enterprises putting these types of machine learning models into production is monitoring their performance over time and then building a feedback loop that allows them to improve and enhance their models. Does neon have anything in particular to offer in that scenario?

Yes — in neon we've built in callbacks that allow the model to report back its progress during the training process. Actually, it's mostly during training; we don't currently have anything specially built for monitoring inference, but that's certainly a good idea we can look into.
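Neon's callback mechanism, as best I recall it (the hook names below are from memory and should be checked against the neon docs), also lets you register your own callback, which is one way to push training metrics out to an external monitoring system:

```python
# Hedged sketch of a custom neon callback for reporting training progress;
# the Callback base-class hooks shown are from memory of the neon API.
from neon.callbacks.callbacks import Callbacks, Callback

class ProgressReporter(Callback):
    """Log the training cost after every epoch, e.g. to a metrics service."""
    def __init__(self, epoch_freq=1):
        super(ProgressReporter, self).__init__(epoch_freq=epoch_freq)

    def on_epoch_end(self, callback_data, model, epoch):
        # model.total_cost holds the accumulated epoch cost in neon (assumed);
        # a real reporter would ship this to your monitoring backend.
        print("epoch %d finished, cost %s" % (epoch, model.total_cost.get()))

# Usage, continuing the earlier sketch:
#   callbacks = Callbacks(model)
#   callbacks.add_callback(ProgressReporter())
#   model.fit(train, optimizer=opt, num_epochs=10, cost=cost,
#             callbacks=callbacks)
```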
So, in addition to the neon 2.0 announcement, you also announced the Nervana Graph project. Can you talk to us a little bit about that?

The Nervana Graph project really started even before we were acquired, when we realized at Nervana that many of the newer models coming out — attention-based models, for example — were much more easily expressed as a computational graph. That's really the core of the effort: we've rethought the back end of neon, and with the Nervana Graph effort we're in the process of designing a Nervana Graph intermediate representation that different frameworks can hook into. On the back end, different hardware back ends take the graph emitted from Nervana Graph and apply their hardware-specific optimization passes, eventually building an executable execution graph that can run on different devices. At Intel we're fortunate to have a variety of hardware targets — Xeon, Xeon Phi, the future Lake Crest, Movidius, FPGAs — and we think a common tool chain that lets folks train on one hardware device and deploy on another, or train on a heterogeneous mixture of hardware devices, will really change how models are developed and make it much easier for industry to move models between these devices.

Can you talk a little more about the graph as a paradigm for building these models? You mentioned attention-based models as an example — what's the relationship between a graph-based view of the world and attention models, and how does a graph framework support building that kind of model better?

Fundamentally, neural networks are graphs of computations. Representing that directly in the Nervana Graph framework, and letting folks interact with the graph in building these models, allows much more flexibility than what we had originally envisioned with neon.

And is the idea that, for an arbitrary network, the different layers represent nodes in the graph? What does a node in the graph represent?

A node in the graph represents a fundamental operation — an add, an exponential function, a matrix multiply. Layers are a higher-level concept: each layer implements a graph of operations, and when we string layers together, we're essentially adding components and nodes to the graph itself. That's the construction phase. Once construction is complete, we have a complete view of the exact operations needed to train your model, and from that we can apply optimization passes to remove unnecessary computations, optimize memory usage, and take into account the specific data-layout requirements of different hardware.

So how does the developer experience change when using the graph project — would you call it a project or a product?

I'd call it a project — but I'm an engineer, not a product person, so I'm not quite sure about the exact semantics.

It's clear that thinking about a neural network as a computational graph, and having it explicitly laid out, lets you perform optimizations the way a compiler or a query planner in a database might. Is that something happening at the developer level, or is it subsumed beneath an experience the developer might already be using, like neon or TensorFlow or something else?

If you're a developer who principally uses components that have already been written — layers like convolution and pooling — your experience will be relatively similar. If you have to develop new layers or apply some custom computation, you can access this node level directly and develop on it, composing the ops yourself into a graph. In many cases we find the latter brings a lot of value, because not everyone is just applying the vanilla models and layers already out there — in part for the reasons we talked about earlier: they don't apply as well to some problem sets or data sets as a model that's been augmented to meet the specific needs of that data set. And often it's not just which layers people compose, but how they compose them: I may have a bunch of layers coming in and want to fork the output and broadcast it to multiple streams, or I may have multiple streams of data coming in — images and video and text — that I want to concatenate together. The graph makes that much easier than before, because once we know the graph, we know how to do the forward and backward passes during training, whereas before, in neon, you had to explicitly guide the model: I have this topology, do the forward pass this way, pass the outputs across this fork, collect the gradients, et cetera. The graph takes care of a lot of that, so it's much more composable — that's one of the key points.
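For flavor, here is what composing ops directly at the graph level looked like in the ngraph-python project, as best I can reconstruct it — the axis/placeholder/transformer names are recalled from early releases and should be treated as assumptions, with the NervanaSystems GitHub repositories as the authority:

```python
# Hedged sketch of composing ops in Nervana Graph (ngraph-python);
# API names are recalled from early releases and may not be exact.
import numpy as np
import ngraph as ng
import ngraph.transformers as ngt

# Axes give tensors named, typed dimensions.
N = ng.make_axis(length=4, name='N')   # batch
D = ng.make_axis(length=3, name='D')   # input features
H = ng.make_axis(length=2, name='H')   # hidden units

x = ng.placeholder(axes=[D, N])
W = ng.variable(axes=[H, D], initial_value=np.random.rand(2, 3))

# Composing fundamental ops (dot, tanh) builds the graph; nothing runs
# yet -- this is the construction phase described above.
h = ng.tanh(ng.dot(W, x))

# A transformer applies hardware-specific optimization passes and
# compiles the graph into an executable computation.
transformer = ngt.make_transformer()
fwd = transformer.computation(h, x)
print(fwd(np.random.rand(3, 4)))
```

Because the full graph is explicit, forking one output to several consumers or concatenating several input streams is just adding edges, and the backward pass falls out of automatic differentiation over the same graph rather than hand-written per-layer code.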
Super interesting. How can folks learn more about the neon and graph projects?

I definitely encourage listeners to check out our GitHub page, NervanaSystems, where there are repositories for neon and for ngraph, the graph project. Additionally, if you go to www.intelnervana.com, we have a couple of blog posts that introduce the framework itself and how to use it, with links to the model zoo, where we have a lot of pre-trained models that folks can easily get started with. And we definitely encourage the community to contribute: the Nervana Graph effort is still in its early stages — we're releasing our latest commits quite often — but if users look into it and see a feature they like that's missing, we definitely welcome external contributions.

Awesome. Anything else you'd like to mention?

I think that's it. I'm excited to be here at O'Reilly talking with you, and happy to continue the conversation in future meetings.

Fantastic. Thank you, Hanlin.

Thank you.

All right, everyone, that's our show. Thanks so much for listening and for your continued support, comments, and feedback. A special thanks goes out to our series sponsor, Intel Nervana. If you didn't catch the first show in this series, where I talk to Naveen Rao, the head of Intel's AI Products Group, about how they plan to leverage their leading position and proven history in silicon innovation to transform the world of AI, you're going to want to check that out next. For more information about Intel Nervana's AI platform, visit intelnervana.com. Remember that with this series we've kicked off our giveaway of tickets to the AI conference. To enter, just let us know what you think about any of the podcasts in this series, or post your favorite quote from any of them, on the show-notes page, on Twitter, or via any of our social media channels. Make sure to mention @twimlai, @IntelAI, and @TheAIConf so we know you want to enter the contest. Full details can be found on the series page, and of course all entrants get one of our slick TWiML laptop stickers. Speaking of the series page, you can find links to all of the individual show-notes pages by visiting twimlai.com/oreillyainy. Thanks so much for listening, and catch you next time.