Feature Visualization and Its Implementation: A New Frontier in Neural Network Interpretability
The concept of feature visualization has gained significant attention in recent years, particularly among researchers and practitioners in deep learning. The goal of feature visualization is to provide insight into the inner workings of complex neural networks, helping us understand how they process input data and arrive at their predictions. In this article, we delve into feature visualization and explore its implementation with Captum, the model interpretability library for PyTorch.
One of the key challenges in feature visualization is ensuring that the resulting images are not only informative but also meaningful. Naively running gradient descent on the input tends to produce images dominated by high-frequency artifacts rather than images that capture what the network is actually detecting. We can address this with a combination of a pre-conditioner and robustness transforms. A pre-conditioner is essentially an image parameterization module: instead of optimizing raw pixels, we optimize an alternative set of parameters that is decoded into an image, which improves the conditioning of the optimization objective.
Concretely, the pre-conditioner is a module whose learnable parameters are, for example, Fourier coefficients, and whose forward pass decodes those parameters into an image in a differentiable way; gradients computed with respect to the image are thereby applied to the underlying parameters. In addition to pre-conditioning, we can use robustness transforms to further improve the interpretability of the visualization. These are stochastic transformations, such as small rotations and rescalings, applied to the image before it is fed to the network; they help us avoid high-frequency artifacts and make the result robust to small changes in the input.
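To make this concrete, here is a minimal sketch of such a parameterization module, assuming a plain PyTorch setup; the class name and initialization scale are illustrative, and the frequency scaling and color decorrelation used in practice are omitted for brevity.

```python
import torch
import torch.nn as nn


class FourierImageParam(nn.Module):
    """Hypothetical sketch of a pre-conditioner (image parameterization).

    The learnable parameters live in the Fourier domain; the forward pass
    decodes them into an RGB image, so gradients with respect to the image
    flow back into the spectrum coefficients.
    """

    def __init__(self, size=224, channels=3):
        super().__init__()
        self.size = size
        # One real and one imaginary coefficient per (channel, freq_y, freq_x).
        freqs = (channels, size, size // 2 + 1)
        self.spectrum_re = nn.Parameter(torch.randn(freqs) * 0.01)
        self.spectrum_im = nn.Parameter(torch.randn(freqs) * 0.01)

    def forward(self):
        spectrum = torch.complex(self.spectrum_re, self.spectrum_im)
        # Inverse real FFT back to pixel space, then squash into [0, 1].
        image = torch.fft.irfft2(spectrum, s=(self.size, self.size))
        return torch.sigmoid(image).unsqueeze(0)  # shape (1, channels, size, size)
```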
One of the most significant advantages of robustness transforms is that they let us define a notion of "robustness" tied to the specific problem we are trying to solve. For example, if we want the feature visualization to remain effective under small rotations, we can apply random rotations during optimization. These are sensible invariances to demand because such transformations do not change the semantic content of an image: a dog stays a dog whether it is shifted a pixel to the left or rotated slightly.
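A robustness transform can likewise be sketched as an ordinary `nn.Module` that jitters the image slightly on every forward pass. The example below is a hypothetical implementation built on `affine_grid`/`grid_sample`; in practice a library of differentiable augmentations such as Kornia can be used instead.

```python
import math
import random

import torch
import torch.nn as nn
import torch.nn.functional as F


class RandomRotateScale(nn.Module):
    """Minimal sketch of a robustness transform: each forward pass applies a
    small random rotation and rescaling, so the optimized image has to
    activate the objective robustly rather than at one exact pose."""

    def __init__(self, max_deg=5.0, scale_range=(0.95, 1.05)):
        super().__init__()
        self.max_deg = max_deg
        self.scale_range = scale_range

    def forward(self, img):  # img: (N, C, H, W)
        angle = math.radians(random.uniform(-self.max_deg, self.max_deg))
        scale = random.uniform(*self.scale_range)
        cos, sin = math.cos(angle) / scale, math.sin(angle) / scale
        theta = torch.tensor([[cos, -sin, 0.0], [sin, cos, 0.0]],
                             device=img.device, dtype=img.dtype)
        theta = theta.unsqueeze(0).expand(img.size(0), -1, -1)
        grid = F.affine_grid(theta, list(img.shape), align_corners=False)
        return F.grid_sample(img, grid, align_corners=False)
```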
To implement these techniques in practice, we integrate them into an optimization loop. The key idea is to decode the learnable parameters into an image, apply the robustness transforms, and pass the transformed image through the frozen, pre-trained network. We then define a loss function over the network's activations, for example the negative activation of a chosen neuron or channel, and update the image parameters by gradient descent so that the activation is maximized.
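For illustration, such a loss function can be built with a forward hook, as in this hypothetical helper (not part of Captum's API); it returns the negative mean activation of one channel of a layer, so that minimizing the loss maximizes that channel's activation.

```python
import torch


def make_channel_objective(net, layer, channel):
    """Return a loss(image) callable equal to minus the mean activation of
    `channel` in `layer` for a forward pass of `image` through `net`.
    Hypothetical helper for illustration only."""
    saved = {}

    def hook(module, inputs, output):
        saved["act"] = output

    layer.register_forward_hook(hook)  # handle intentionally kept alive

    def loss(image):
        net(image)  # forward pass records the layer's activations via the hook
        return -saved["act"][:, channel].mean()

    return loss
```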
In terms of implementation, we set up a pre-trained network, specify a loss function pointed at a particular module of the network together with a channel offset, create a robustness transform (such as a rotation or scale operation) and wrap it in a sequential module, and finally define an image parameterization module that uses a Fourier-space representation and color decorrelation to produce the input image.
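Putting the sketches above together, the setup might look roughly like the following. The choice of GoogLeNet, the `inception4b` layer, and channel 17 are illustrative assumptions, and ImageNet input normalization is omitted for brevity.

```python
import torch
import torch.nn as nn
from torchvision.models import googlenet

# Pre-trained network, frozen and in eval mode.
net = googlenet(weights="DEFAULT").eval()
for p in net.parameters():
    p.requires_grad_(False)

# Loss function pointed at one module of the network, with a channel offset.
loss_fn = make_channel_objective(net, net.inception4b, channel=17)

# Robustness transforms, wrapped in a Sequential so several can be chained.
transforms = nn.Sequential(RandomRotateScale(max_deg=5.0))

# Image parameterization ("pre-conditioner") from the earlier sketch.
param = FourierImageParam(size=224)
```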
Once all these components are in place, we run the optimization loop for a couple of hundred steps, refining the feature visualization and improving its interpretability. The output of this process is an optimized input image produced by the parameterization and shaped by the robustness transforms applied during optimization. We can then use this optimized input to generate further visualizations or to explore other aspects of the neural network's behavior.
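Under the same assumptions, the optimization loop itself is an ordinary PyTorch loop over the parameterization's parameters, reusing the objects set up above; the final image is obtained by calling the parameterization once more without the random transforms.

```python
import torch

optimizer = torch.optim.Adam(param.parameters(), lr=0.05)

for step in range(256):  # "a couple of hundred steps"
    optimizer.zero_grad()
    image = transforms(param())   # decode parameters, then apply random jitter
    loss = loss_fn(image)         # negative channel activation
    loss.backward()               # gradients flow back into the Fourier coefficients
    optimizer.step()

result = param().detach()         # optimized input image, shape (1, 3, 224, 224)
```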
The resulting feature visualization can be a powerful tool for understanding complex neural networks, providing insights into their inner workings and helping us identify potential areas for improvement. By building on libraries like Captum and PyTorch, we can create more interpretable and informative visualizations that help us better understand the strengths and weaknesses of our models.
In conclusion, feature visualization is a rapidly evolving field that holds significant promise for improving the interpretability of deep learning models. By integrating pre-conditioning and robustness transforms into an optimization loop, we can generate high-quality feature visualizations that capture the essence of complex neural networks. As researchers and practitioners continue to push the boundaries of this field, we can expect even more innovative techniques to emerge, further advancing our understanding of these powerful tools.
Thank you to the Captum team for making a production-quality library out of what was once cutting-edge research. Your work has been instrumental in enabling us to explore new frontiers in neural network interpretability. We also want to express our gratitude to the PyTorch core team for their responsiveness to our requests about small API changes that can make interpretability research easier for all of us.
Lastly, we would like to thank you for your attention and for listening to this talk. We hope that you have a lovely rest of your virtual GTC conference experience, and we look forward to seeing the insights and innovations that you will produce by using these techniques on your own models.
"WEBVTTKind: captionsLanguage: enhello everyone this talk is about model understanding with captain and Python I love to start a talk by briefly motivating model interpretability and introduced in a novel interpretability library for python I'll give a brief overview of the algorithms that we currently support in the library and also show how you can distribute the computations across multiple GPUs then I'll talk about how to debug more complex models and visualize attributions lastly I'll briefly go over the limitations and challenges about revision methods and future directions the definition of model interpretability is rather elusive and many of you probably have heard about various different definitions of it but in order to bring more clarity let's define it as the ability to describe AI model internals and their predictions in human understandable terms you might ask why is model interpretability important it is important because it helps us to better understand our models predictions and understand how our models reason most importantly it facilitates with debugging misclassified predictions and lastly the better we understand our models the more likely it is that we'll improve them and they're of the much more likely it is that we'll push the boundaries of cutting-edge research so how can we make interpretability algorithms available for high-performing models and accessible to all fighters model developers to do so we developed model interpretability library called captain captain means comprehension the in Latin and the library has three main focus areas the first area focus area is that it is multi model meaning that the library can be used for any type of models and features secondly it is extensible meaning that you can extend it add new features and algorithms and thirdly it is it to use this means that you can use it with couple lines of code a current version zero point two point zero contains a number of well tested gradient and perturbation based attribution algorithms those algorithms allow us to interpret output predictions with respect to the inputs the output predictions with respect to all neurons in the layers and the neurons with respect to the inputs we plan to expand the library and add more algorithms beyond attribution basement approaches the following diagram summarizes all attribution algorithms that we currently have in captain library divided into two groups the first group on the left side of the diagram features the algorithms that are to put the output predictions or internal neurons to the inputs of the model the second group on the right side features the algorithms that allow to attribute the outputs of the network to the internal layers some of the algorithms in there on the right side represent slight variations of the ones on the left side besides that it we can say that the algorithms are highlighted using different colors the gradient based attribution approaches are highlighted in orange the perturbation based attribution approaches are highlighted in green and the third group that is neither gradient nor perturbation based approaches are highlighted in blue within these algorithms you'll recognize a number of them that are well known simple baseline approaches such as well-known 7c Maps and layer activations there are also a number of algorithms that are well known from computer vision community although in the literature these approaches are well known and used for computer vision models our implementation our implementations are generic and 
they can be used for any model that meets the criteria for the model to be used with that specific attribution algorithm for example for example if the models are from the CNN families then they can be used with grad cam and guided grad cam most algorithms require also so-called baseline which is also known as reference or background in the literature baseline is an important concept in a world of attribution attributions and I will spend couple minutes talking about it oftentimes when we want to understand what properties characterize the best certain object we compare it with other objects and seek for differences and contrast baseline or reference is based on that concept it helps helps us to blame particular parts of our inputs for a prediction based on the comparison with the reference or baseline in case of an image and this particular example if we want to attribute to the dark glass it is obvious to compare it with an image where there is no dog oftentimes people will ugly the dog and replace it with a rectangle and observe prediction drops for dog class but one would think that this can create a baseline sample that out of that is out of training and test data distributions a more nature way of operating the dog could be to replace it with a background and like in this example similar concept applies to text we can compare the text with a sequence of uninformative tokens or replace one of the tokens with a random token or a pad in Dhaka in a general case when we have a numerical representation of any input we can think of setting some of those features to constant values such as zeros or permuting them or performing other operations on as we can see the choice of baseline it's very challenging because there are many different ways of choosing baseline and there is no ideal way of choosing it in this part of the presentation I'd like to walk you through a simple neural network and demonstrate how we can apply model interpretability algorithms from captain library on it this network takes an input of 3 features followed by a linear layer reloj and an output of 2 target let's say we would like to attribute one of our targets namely target 0 to the inputs of our feature of our network namely input features to do so we choose an attribution algorithm in this case integrated gradients and imported from captain dot attr package and the gradient radians is similar to almond shaft limited from cooperative game theory for infinite games and non-atomic measures where each feature is in a single point in a feature space but it is an infinitesimal subset of it this approach has many interesting properties and I'll talk about one of those properties in my next lights as next we create an instance of integrated gradients using the forward function of our model and define our input in this case the input is a random tensor to perform attributions we call attribute on the attribution algorithm by providing our inputs and the target index that we would like to attribute to and also we can provide baseline but in this case I chose to use default baseline which is zero baseline the greatest gradients performs forward and backward passes and computes the path integral of our gradients from baseline to input their eternal attributes have the same shape and dimensionality as the inputs the magnitude of education score signifies the strength of the important signal which in this case is color coded in green and red green means that those particular features are contribute in predicting the target zero 
red means that those particular features are negatively correlated with target zero we can also return integral approximation error which is which we can compute it using return converges Delta argument the Delta is computed based on one of the properties of integrated gradients that property is called a completeness property which states that the sum of the attribution is equal to the differences of our function at its input and baseline if our delta is large in this case it is zero point zero on 1 and if we think that it's large we can reduce the Delta by increasing the number of integral approximation steps as I did on this slide we can also choose to define a baseline in this case I decided to use a random baseline instead of zero baseline and as we can see when we change the baseline than our attribution also changes and one of our important picture features now became a less important in general case we can similarly switch from one attribution algorithm to another and compare their performance and different properties all attribution algorithms have the same signature we can use Python model data parallel to distribute the computations across multiple GPUs we can do this also for later attribution approaches by setting a hook on a layer and aggregating final results in the hook I will show on this example how we use data parallel for layer activation method so activation sets a hook on a particular layer of interest and allows to access the activations of that cell in this case since we want to use layer activations with a the parallel we wrap our model with data parallel and also define the GPUs the GPUs in this case we use 3d GPU devices let's say that we'd like to look into layer activations of the first linear layer in this case we create an instance of layer activation pass our model which is wrapped with data parallel and also passed our first linear layer then we define our inputs I chose three input examples and we call attribute on our layer activation algorithm and pass our random inputs when we pass the random input under the hood data parallel split our three examples on three different GPUs and it distributes those three examples on three different views and computes performs the forward passes on each GPU for each example and also activations are available for each example on its separate GPU so then we can ultimately collect those activations from all the GPUs and return final activations as you can see on this slide so for all examples the first activation is negative and the second activation is positive similarly if an algorithm requires both forward and backward passes then both of those passes are performed on separated use and their results are ultimately aggregated some of the algorithms internally expand the inputs depending on some input we don't set as number of steps for integrated gradients conductance internal influence to avoid out of memory situations we perform internal batching using internal bytes as parameter similar to similarly for the perturbation algorithms that require only forward pass and perturbations of all features we perform perturbations patch wise for multiple features together and we do it distributed across multiple GPUs in one of our experiments we run integrated gradient using pre trained vgg 19 model that we wrapped with data parallel and adjusted internal batch size accordingly and we use fixed number of approximation steps which was a 2990 and as we can see the execution time is declining as we increase the number of GPUs 
gradually in this in this example we also use a pre trained vgg model and a single image to perform feature appellation using different number of operations per forward pass we can see that the execution time decreases as we increase the number of operations per forward pass meaning that we oblate multiple features together in a badge and distribute it across multiple GPUs okay so until now we were talking about simple toy models and how we could distribute competitions across multiple GPUs now let's look into more complex models and see how we can apply captain on those models and visualize attributions in this example we use pre-trained resnet 152 model and occlusion algorithm to attribute to the top class red pixels corresponds to negative attribution negging that those pixels on the image they pull away from the dock class or negatively contribute to the attributed class what means that those pixels do not contribute to the prediction and green means that those pixels are very important for predicting dock and as we can see the green pixels are concentrated on the head of the dock that means that they are important pixels predicting dog now when we attribute to the cat class we observe that the pixels corresponding to dog on an image turned red because they pull away from the cat class and they are all red and the pixels which are on the cat they turned green that means that they pull towards categorize those are important pixels for predicting cat we can also perform feature appellation based on image segmentation in this example we segmented the image into three segments bottles monitors and the background we constructed the feature mask based on those three segments and attribute it to monitor class in this example we can see that the background is neutral bottles pull away from them attributed monitor class and the pixels on the monitors are very important for predicting the monitor it is also interesting to observe that the borders of the monitor that separate one segment from another are also red meaning that they pull away from one third of course to make those visualizations more interactive and being able to debug our models we developed an interactive model debugging and understanding tool called captain insights captain insight support different types of models and input features the visualizations can be also embedded into Jupiter notebooks or collapse and books captain means ice has built in renderers for some feature types image is one of those we can interpret the predictions of any image and any computer vision model using our API and visualization tool in this case we use pre-trained ResNet 15 model and the tool also allows to attribute the predicted classes to different input pixels it helps to understand incorrectly predicted samples using its interactive functionality similarly we have renders for text captain insights can help us to debug and understand which tokens in text are important for the prediction this particular example is a model that we trained on IMDB data set the magnitude of attribution scores ranges in a color spectrum from blue very important to red least important intensity of the colors signifies the strength of the important signal for the features that currently do not have any building renders we use bar charts to visualize that revisions in this case we visualize the attributions of the vigils from Titanic data set for a simple three layer MLP model similar to a previous example we color code the importance of each feature in each prediction 
predicted sample we can also extend the list of renderers and define new ones for any specific type of feature that we choose to and most importantly we can also visualize the attributions of multi modal models in this case visual question answering this is especially interesting because we can where is the important signal coming from from which modality and the tool can also help us dig deeper into specific modality and see where is the important important signal coming from and this in this section of the presentation I would like to briefly talk about a case study for bird models we fine-tuned a bird model on squads a question answering dataset and overall we reached f1 score of 86 and staggered my to 78 our goal is to understand the importance of different types of tokens in different layers for our fine-tuned model there has been already some work done in visualizing and understanding bird models that was mostly around visualizing and understanding the attention matrices but in this particular case study would like to use attribution algorithms to understand different layers for our predictions we choose a simple text I would like to use for our question answering squad classification model and that text is it is important to us to include empower and support humans of all kinds and a simple question that we would like to ask about the text and that question is what is important to us we want that our model finds the answer to our question in that text more specifically like it predicts the start and end tokens of the answer in order to use the text and a question as input to our model we need to concatenate them together and tokenize them as shown on this slide then we load our fine-tuned question-answering model which you can see on the following slide on the right side we have a section of the loaded model now let's predict the start and end positions using our model and we can see that the model is able to correctly predict start and end positions of our answer now we are ready to apply some of our attribution algorithms on birth layers and understand which layers are important for our predictions to do so we need to set interpretations hooks on all 12 layers and we set hook per layer then we use layer conductors to add to compute attributions for each layer both for start and end position predictions that allows us to generate heat maps shown on this slide each cell on the heat map represent represents the accumulated importance score of a layer for a given input token for predicting the start position as we can see the question token what an answer token to have high importance course the answer token became especially important for the last three layers we can perform some similar exercise for the prediction of end token position and we can see that here the question token what is still also very important and the end token kind has a very high a important score especially in the last two layers as you can see we can do many interesting experiments and understand our models using attribution methods however attribution methods come also with their limitations they do not capture feature correlations and interactions finding good baselines is challenging as I also showed in on my previous slides other Bayesian methods are very difficult to evaluate and compare with each other and they do not explain the model globally in the future we plan to expand a captain library beyond attribution approaches and add captain robust package which will focus on the adversary fastness and 
attacks and studying the intersection between model robustness and interpretability captain magic that will focus on model sensitivity trust in fidelity matrix for both the model and attribution and captain benchmarks that will provide benchmarking for different data sets using different methodologies including sanity checks and captain Optum that will focus on optimization algorithms and one of the first approaches that we would like to add here is the optimization based visualizations and the second part of the talk Ludwig will talk about optimization based feature visualizations in Python which will become part of captain optimal packing thank you Thank You Nina my name is Loic I work at open the I in a team led by Christopher Ola called clarity that broadly tries to understand the inner workings of neural networks a vital part of our tool train in that endeavor has been featured visualization and so in the remainder of this talk I hope to explain to you what we believe feature realization is right what we mean by the word I will then try to convince you that it is useful by showing you how we've been using feature visualization in a variety of projects and finally I hope to give you a sneak preview of how we plan to implement feature visualization entitled captain in a way that both works robustly but also isn't surprising to use from an API point of view right it should it should work like another tool in Python now to help explain what something is it can sometimes be helpful to contrast it with what it's not and when you hear about interpretability research and interpretability methods often times you'll see attribution methods right methods that for a specific example try to explain what parts of that example caused the network to behave in a particular way feature visualization on the other hand is slightly different in that it's trying to answer questions about network behavior or any like what specific parts of network are looking for but they're detecting by generating examples synthetic examples through optimization right in the simplest case that might be a question as simple as like what is a specific neuron looking for but they can be more complicated questions and I'll show you some when we're going over some for break so fundamentally feature visualization is an optimization process right we're optimizing an input turn your network with an objective function that's a function of the activations of that network right so if you look at the diagram you have an input image that's fed to the network it produces some activations see activations become part of a loss function which is then back propagated through the network into the input image let me show you one example of what you can do with it you can maximize the activation of a single neuron and here I compare them to data set examples that also strongly activate the same neuron now those are both valid approaches and can can give you different insights and can be used in a variety of scenarios but the optimization approach can sometimes be helpful in situations where data set examples are not available now our team we're not the first ones to think of that I would say that in the modern deep learning eras they are broadly speaking two papers that ought to be mentioned as like one of the first ones to think of this idea that is Mahendra and Fidel D in 2014 over at VG they fed images into networks and then try to restore those images from the activations of the network very similar approach and then about a year later J's 
Nia sinski and co-authors here really inspected the representations at writers depth of a network using these types of optimizer techniques but what really got the team excited that I would later join was deep dream I am sure if you remembered deep dream hitting the internet but people really enjoyed it I believe there there seem to be these fantastic worlds of representations inside our neural nets and we just got to see them we created the funniest dog slug pictures and I believe there were lots of like artistic approaches of using deep dream but what excited us about it is that it allowed us a look into the representations of network it just wasn't very steerable right we couldn't really tell it where to look but from that point on we were sort of hooked and try to get it to work for more precise questions to write if you read this diagram from right to left you'll see a deep dream which was just maximizing the square of the activations of all the channels in an entire layer and we managed slowly to make it more precise to be able to look for an individual channel right so that's a neuron in all spatial positions or even an individual neuron right at a single spatial position but that's just the background let me try to convince you that this technique is actually helpful in understanding what neural nets do by showing you some of the things that we've been able to do with it the simplest idea that you can have I would argue is optimizing for the activation of a single neuron in one of these networks right that's what you see here you start from random noise you take the activations of this orange highlighted neuron in one of the branches of the 4b module of the inception or even architecture and you stochastic gradient descent your way to the image you see in the upper right-hand corner now on its own that might look more like an idle curiosity but you can combine it with for example an actual input image and look at the activations of the network at every spatial position of that input image and explain what those numbers mean by using these little little icons these little icons for each neuron alright so if you look at the animation in the bottom you see a sort of a blur a dog ear with a strong activation here and then you get more of a fox snout animal snouts behavior right and those represent individual neurons let's wait until we see the foreground a little bit right so there's this green perspective grass but of course you don't need to just optimize for single neurons behavior the simplest next step is maybe to optimize to for linear combinations of nodes right here in fact you see like two neurons at a time as you can see that they combine in somatically reasonable seeming ways so look at this furriness and art for example or the squareness at art or maybe the black and white neuron that turns the art on the right black and white and in fact you can optimize for whole linear combinations right so rather than representing each of these activations as a combination of these individual neurons you can optimize for the entire activation vector at once these activation vectors can be taken from real images right so you take an image protein network for a given spatial position you can extract an activation vector can turn it back into a feature visualization and if you do that for all the spatial positions of the image at once you can get an interesting overview so if you look at these visualizations from left to right those are lower layers coming to deeper layers you see that the 
network represents edges and curves in the lower layers and then transitions to these snout parts and maybe a little bit of fur textures where towards the right which is maybe like two thirds through the network you have these fully formed snout parts and cat heads and maybe dog legs now let me attempt to categorize a little bit what I've been showing you when we think of the activation space of a neural network as a vector space then these individual neurons and the visualizations that I've been showing you would correspond to a basis direction of that vector space the pairwise interaction diagram I showed you would correspond to planes of those neurons in activation space and these special activation grids that I just showed you would a set of points in activation space specifically ones that lie on a sub manifold of likely activations right coming from the natural data distribution the distribution of networks also trained on now if you want to understand that entire manifold and not just some points from it we use a technique we call activation Atlas the attempt to explain to create an activation Atlas we run a data set through a network and sample activation vectors from random spatial positions on those images right so you end up with a million of these activation vectors we then run those through a dimensionality reduction algorithm and project them down into a two dimensional plane I don't remember that the activation space might be hundreds of dimensions so it's nice to get them laid out in a 2d plane here we then make a histogram over it right where it's a 2d histogram so it's like a grid within which we then average all of these activations and that average activation we then run through feature visualization so for each grid cell we create one single feature visualization when we put them together we end up with an overview of the representations of the network that we call an activation atlas here's one in all its glory and you can do that at different scales actually can zoom into those and use a finer gridding so let me zoom it to the top right corner there a little bit forgive some human heads and maybe in the center there's more like fingers and arms let's examine some some points in the neck Direction is less here are dog snouts and you can see them sort of smoothly go from our various orientations and various fur colors in a very different corner there's this gas pump scoreboard riding maybe a little bit of water marks in there and the very right other areas will have fruit of Marius texture and color and you can even find these Nidal interpolations through the activation space that go from like fruits in the distance to fruits close up and like many people in the distance to a single person close up and if you do one of these for each layer you can even see how the representations evolve through the layers right going from green grass and some high-frequency textures through those actual leaf like shapes to fully formed zucchinis and cabbages and green bell peppers and stuff now of course you can subset an activation Atlas to only those activations found on images with a certain label okay so you have a snorkeler activation Atlas on the left here and a scuba diver activation levels on the right and one thing to maybe notice that these snorkeler in the bottom left corner has these transparent snorkels while the scuba diver in the top right as it is like black respirators in tools that actual divers use and nothing you can do is that you're not bound to project the 
activations down to 2d you can project them down to 1d and then use the dimension that you freed up to layout any other attribute in this particular case attribution to the to put either snort ler or scuba diver and the two features that I just talked about in the bottom row here you can really see there's these transparent snort offs that have high attribution to slow play and these non transparent black respirators that have high attribution to scuba data there's a highlight itself on the right hand side something that looks a little bit round and metallic it's probably one of those tanks that like scuba divers wear on their back but to us it looked a little bit like a locomotive like a steam locomotive one of these old-timey black ones and so we tried to see how good our understanding was and see whether we could adversary you flip the classification and it turns out we can so you take an interval snorkeler yeah classify it with only 55 percent confidence but hey you add a tiny little locomotive and we're able to flip it to scuba diver with an height with a higher confidence than the original convention okay that was an overview of some of the applications that we've used feature realization for if you're interested in reading more we have all these papers out on distilled pub a journal that allows us to publish these interactive articles where you can actually play with the techniques yourself but now let me try to focus on how we plan to implement feature realization in pi torch in a way that feels natural to pi torch users so if you remember the core setup we're optimizing an image based on objective that is a function of the activations of a neural network that means we'll need to extract activations from a neural network for which will be using hooks fell away and we'll be optimizing an input image so we'll need something that has a parameter that will output an image and allows us to back problem to it now I'm gonna try to translate this diagram into code and the API that I'll be showing you right it's not finalized or anything like that but it's trying to give you an idea so as your model you'll you'll load a normal pre train model in in PI George and for your objective function the way we'll set it up is we'll have an objective model in which there is a simple function what's called an your own activation right so it'll maximize the excavation of a single neuron and you point it at a specific module of your network all right so when I type net dot mix 3 a dot 3 by 3 that's a 3 by 3 branch in one of the inception modules and I'm pointing it at channel 17 then we have this input optimization objectives that takes the net and the loss function and can run the optimization for a number of steps and the call to optimize here really just hides like a normal optimization loop you know optimizer dot 0 grad and things like that and at the end you want to get a result out of it right so there's Gallegly this kind of try station that we're actually optimizing that will output an image which we can then look at so if you put these parts together that's that's actual working code um at the moment it's in a preview stage and we'll make sure to adhere to the caption API better but unfortunately it's also a little bit of a simplification if you do that you get the result on the top now if grading descent gives you a somewhat high frequency dominated image for reasons that we partly exploring in the papers that I showed you earlier but the high-level takeaway is that we can improve the 
conditioning of that optimization objective by adding a pre-conditioner which we would call an image parameter ization here and I'll go into detail about that in the next slides as well as something we call robustness transforms I stochastic transformation that we put the image through before we feed it into the network and compute the gradients so here's how that look like in height wash will have a module whose parameters are just the coefficients of some Fourier parameters and it outputs an image right but in a differentiable way so that if we get grains with respect to the image the parameter ization knows how to apply those gradients to its parameters internally and the other stuff that we'll add are those robustness transforms and let me explain what I mean by those what we do here is we did a rotate or scale an image before we put it into the network and it helps us avoid these high frequency artifacts conceptually what it means at least that's the intuition that I use in my head is to say that look I want the input that we're optimizing for here to robustly activate or to robustly maximize my objective function even if it's moved around a little bit even if it's rotated a little bit even if it's a little bit smaller or bigger right and the reason these seem like good invariants to ask for is that those transformations don't change the semantic content of an image right a dog stays a dog whether it's one pixel further to the left or further to the right and I was really glad that we didn't have to reinvent the wheel here we're using transformations straight from the wonderful library called cornea so here's how we're going to translate this into code your robustus transforms are just a normal and in module they take an image they output a slightly transformed image you can even put them in in a sequential if you've got multiples of those and our privatization is also just going to be an NN module it'll have parameters that are wrapped and parameters and those are the in our case for your coefficients that will actually get optimized and when you call it on its forward pass it outputs an image these to you then put into the optimization loop that we already looked at let's put it all together on one slide you set up a pre trained net you set up a loss function in this case neural activation pointed at one of the modules of your net with a channel offset you set up a proboscis transformation that you want your optimization result to be invariant to in this case we're wrapping it in sequential because we maybe want multiple of those we specify an input parameter ization here I'm just saying natural image behind the scenes that uses a Fourier transform and a color D correlation I then put them in this optimization loop which I run for a couple hundred steps and I get out my optimized input image just by calling the input parameter ization and here's how all of that looks together in a diagram now that code isn't quite ready it'll be on a branch and the captain engineers will look at it and probably improve it a lot but I promise that despite looking a little bit like pseudocode all the code you saw it's actually already running today wrapping up I've been talking about feature visualization and its implementation but I want to I want to point out that the result of that code are these visualizations these little glimpses inside the working of those complicated neural nets and we can put those glimpses together we can assemble larger explanations and stories about how they work and I 
can't wait to see the insights that you'll produce by using these techniques on your own models by thinking about better techniques more insightful ideas build on those primitives or fully on their own I hope you'll try it out you can read all about captain Matt captain that I wear on github github coms left Pytor slash captain is where you can contribute to the library read the code report issues give us your ideas and you can also participate in online discussion forums thats at discuss dot pi toward org / c / captain I'd like to thank the captain team for making a production quality library out of what just a couple years ago was was still cutting edge research and even today I don't think the story's over and also say thank you to the PI torch core team they have been very responsive to our requests about small changes to the API that might make interpretability research easier for us all in the future lastly thank you thank you for your attention for listening to this talk and I hope you have a lovely rest of this virtual GTChello everyone this talk is about model understanding with captain and Python I love to start a talk by briefly motivating model interpretability and introduced in a novel interpretability library for python I'll give a brief overview of the algorithms that we currently support in the library and also show how you can distribute the computations across multiple GPUs then I'll talk about how to debug more complex models and visualize attributions lastly I'll briefly go over the limitations and challenges about revision methods and future directions the definition of model interpretability is rather elusive and many of you probably have heard about various different definitions of it but in order to bring more clarity let's define it as the ability to describe AI model internals and their predictions in human understandable terms you might ask why is model interpretability important it is important because it helps us to better understand our models predictions and understand how our models reason most importantly it facilitates with debugging misclassified predictions and lastly the better we understand our models the more likely it is that we'll improve them and they're of the much more likely it is that we'll push the boundaries of cutting-edge research so how can we make interpretability algorithms available for high-performing models and accessible to all fighters model developers to do so we developed model interpretability library called captain captain means comprehension the in Latin and the library has three main focus areas the first area focus area is that it is multi model meaning that the library can be used for any type of models and features secondly it is extensible meaning that you can extend it add new features and algorithms and thirdly it is it to use this means that you can use it with couple lines of code a current version zero point two point zero contains a number of well tested gradient and perturbation based attribution algorithms those algorithms allow us to interpret output predictions with respect to the inputs the output predictions with respect to all neurons in the layers and the neurons with respect to the inputs we plan to expand the library and add more algorithms beyond attribution basement approaches the following diagram summarizes all attribution algorithms that we currently have in captain library divided into two groups the first group on the left side of the diagram features the algorithms that are to put the output 
predictions or internal neurons to the inputs of the model the second group on the right side features the algorithms that allow to attribute the outputs of the network to the internal layers some of the algorithms in there on the right side represent slight variations of the ones on the left side besides that it we can say that the algorithms are highlighted using different colors the gradient based attribution approaches are highlighted in orange the perturbation based attribution approaches are highlighted in green and the third group that is neither gradient nor perturbation based approaches are highlighted in blue within these algorithms you'll recognize a number of them that are well known simple baseline approaches such as well-known 7c Maps and layer activations there are also a number of algorithms that are well known from computer vision community although in the literature these approaches are well known and used for computer vision models our implementation our implementations are generic and they can be used for any model that meets the criteria for the model to be used with that specific attribution algorithm for example for example if the models are from the CNN families then they can be used with grad cam and guided grad cam most algorithms require also so-called baseline which is also known as reference or background in the literature baseline is an important concept in a world of attribution attributions and I will spend couple minutes talking about it oftentimes when we want to understand what properties characterize the best certain object we compare it with other objects and seek for differences and contrast baseline or reference is based on that concept it helps helps us to blame particular parts of our inputs for a prediction based on the comparison with the reference or baseline in case of an image and this particular example if we want to attribute to the dark glass it is obvious to compare it with an image where there is no dog oftentimes people will ugly the dog and replace it with a rectangle and observe prediction drops for dog class but one would think that this can create a baseline sample that out of that is out of training and test data distributions a more nature way of operating the dog could be to replace it with a background and like in this example similar concept applies to text we can compare the text with a sequence of uninformative tokens or replace one of the tokens with a random token or a pad in Dhaka in a general case when we have a numerical representation of any input we can think of setting some of those features to constant values such as zeros or permuting them or performing other operations on as we can see the choice of baseline it's very challenging because there are many different ways of choosing baseline and there is no ideal way of choosing it in this part of the presentation I'd like to walk you through a simple neural network and demonstrate how we can apply model interpretability algorithms from captain library on it this network takes an input of 3 features followed by a linear layer reloj and an output of 2 target let's say we would like to attribute one of our targets namely target 0 to the inputs of our feature of our network namely input features to do so we choose an attribution algorithm in this case integrated gradients and imported from captain dot attr package and the gradient radians is similar to almond shaft limited from cooperative game theory for infinite games and non-atomic measures where each feature is in a 
single point in a feature space but it is an infinitesimal subset of it this approach has many interesting properties and I'll talk about one of those properties in my next lights as next we create an instance of integrated gradients using the forward function of our model and define our input in this case the input is a random tensor to perform attributions we call attribute on the attribution algorithm by providing our inputs and the target index that we would like to attribute to and also we can provide baseline but in this case I chose to use default baseline which is zero baseline the greatest gradients performs forward and backward passes and computes the path integral of our gradients from baseline to input their eternal attributes have the same shape and dimensionality as the inputs the magnitude of education score signifies the strength of the important signal which in this case is color coded in green and red green means that those particular features are contribute in predicting the target zero red means that those particular features are negatively correlated with target zero we can also return integral approximation error which is which we can compute it using return converges Delta argument the Delta is computed based on one of the properties of integrated gradients that property is called a completeness property which states that the sum of the attribution is equal to the differences of our function at its input and baseline if our delta is large in this case it is zero point zero on 1 and if we think that it's large we can reduce the Delta by increasing the number of integral approximation steps as I did on this slide we can also choose to define a baseline in this case I decided to use a random baseline instead of zero baseline and as we can see when we change the baseline than our attribution also changes and one of our important picture features now became a less important in general case we can similarly switch from one attribution algorithm to another and compare their performance and different properties all attribution algorithms have the same signature we can use Python model data parallel to distribute the computations across multiple GPUs we can do this also for later attribution approaches by setting a hook on a layer and aggregating final results in the hook I will show on this example how we use data parallel for layer activation method so activation sets a hook on a particular layer of interest and allows to access the activations of that cell in this case since we want to use layer activations with a the parallel we wrap our model with data parallel and also define the GPUs the GPUs in this case we use 3d GPU devices let's say that we'd like to look into layer activations of the first linear layer in this case we create an instance of layer activation pass our model which is wrapped with data parallel and also passed our first linear layer then we define our inputs I chose three input examples and we call attribute on our layer activation algorithm and pass our random inputs when we pass the random input under the hood data parallel split our three examples on three different GPUs and it distributes those three examples on three different views and computes performs the forward passes on each GPU for each example and also activations are available for each example on its separate GPU so then we can ultimately collect those activations from all the GPUs and return final activations as you can see on this slide so for all examples the first activation is 
negative and the second activation is positive similarly if an algorithm requires both forward and backward passes then both of those passes are performed on separated use and their results are ultimately aggregated some of the algorithms internally expand the inputs depending on some input we don't set as number of steps for integrated gradients conductance internal influence to avoid out of memory situations we perform internal batching using internal bytes as parameter similar to similarly for the perturbation algorithms that require only forward pass and perturbations of all features we perform perturbations patch wise for multiple features together and we do it distributed across multiple GPUs in one of our experiments we run integrated gradient using pre trained vgg 19 model that we wrapped with data parallel and adjusted internal batch size accordingly and we use fixed number of approximation steps which was a 2990 and as we can see the execution time is declining as we increase the number of GPUs gradually in this in this example we also use a pre trained vgg model and a single image to perform feature appellation using different number of operations per forward pass we can see that the execution time decreases as we increase the number of operations per forward pass meaning that we oblate multiple features together in a badge and distribute it across multiple GPUs okay so until now we were talking about simple toy models and how we could distribute competitions across multiple GPUs now let's look into more complex models and see how we can apply captain on those models and visualize attributions in this example we use pre-trained resnet 152 model and occlusion algorithm to attribute to the top class red pixels corresponds to negative attribution negging that those pixels on the image they pull away from the dock class or negatively contribute to the attributed class what means that those pixels do not contribute to the prediction and green means that those pixels are very important for predicting dock and as we can see the green pixels are concentrated on the head of the dock that means that they are important pixels predicting dog now when we attribute to the cat class we observe that the pixels corresponding to dog on an image turned red because they pull away from the cat class and they are all red and the pixels which are on the cat they turned green that means that they pull towards categorize those are important pixels for predicting cat we can also perform feature appellation based on image segmentation in this example we segmented the image into three segments bottles monitors and the background we constructed the feature mask based on those three segments and attribute it to monitor class in this example we can see that the background is neutral bottles pull away from them attributed monitor class and the pixels on the monitors are very important for predicting the monitor it is also interesting to observe that the borders of the monitor that separate one segment from another are also red meaning that they pull away from one third of course to make those visualizations more interactive and being able to debug our models we developed an interactive model debugging and understanding tool called captain insights captain insight support different types of models and input features the visualizations can be also embedded into Jupiter notebooks or collapse and books captain means ice has built in renderers for some feature types image is one of those we can interpret the 
predictions of any image and any computer vision model using our API and visualization tool in this case we use pre-trained ResNet 15 model and the tool also allows to attribute the predicted classes to different input pixels it helps to understand incorrectly predicted samples using its interactive functionality similarly we have renders for text captain insights can help us to debug and understand which tokens in text are important for the prediction this particular example is a model that we trained on IMDB data set the magnitude of attribution scores ranges in a color spectrum from blue very important to red least important intensity of the colors signifies the strength of the important signal for the features that currently do not have any building renders we use bar charts to visualize that revisions in this case we visualize the attributions of the vigils from Titanic data set for a simple three layer MLP model similar to a previous example we color code the importance of each feature in each prediction predicted sample we can also extend the list of renderers and define new ones for any specific type of feature that we choose to and most importantly we can also visualize the attributions of multi modal models in this case visual question answering this is especially interesting because we can where is the important signal coming from from which modality and the tool can also help us dig deeper into specific modality and see where is the important important signal coming from and this in this section of the presentation I would like to briefly talk about a case study for bird models we fine-tuned a bird model on squads a question answering dataset and overall we reached f1 score of 86 and staggered my to 78 our goal is to understand the importance of different types of tokens in different layers for our fine-tuned model there has been already some work done in visualizing and understanding bird models that was mostly around visualizing and understanding the attention matrices but in this particular case study would like to use attribution algorithms to understand different layers for our predictions we choose a simple text I would like to use for our question answering squad classification model and that text is it is important to us to include empower and support humans of all kinds and a simple question that we would like to ask about the text and that question is what is important to us we want that our model finds the answer to our question in that text more specifically like it predicts the start and end tokens of the answer in order to use the text and a question as input to our model we need to concatenate them together and tokenize them as shown on this slide then we load our fine-tuned question-answering model which you can see on the following slide on the right side we have a section of the loaded model now let's predict the start and end positions using our model and we can see that the model is able to correctly predict start and end positions of our answer now we are ready to apply some of our attribution algorithms on birth layers and understand which layers are important for our predictions to do so we need to set interpretations hooks on all 12 layers and we set hook per layer then we use layer conductors to add to compute attributions for each layer both for start and end position predictions that allows us to generate heat maps shown on this slide each cell on the heat map represent represents the accumulated importance score of a layer for a given input token for 
As we can see, the question token "what" and the answer token "to" have high importance scores, and the answer token becomes especially important in the last three layers. We can perform a similar exercise for the prediction of the end token position, and we can see that here the question token "what" is still very important, and the end token "kinds" has a very high importance score, especially in the last two layers. As you can see, we can run many interesting experiments and understand our models using attribution methods. However, attribution methods also come with limitations: they do not capture feature correlations and interactions, finding good baselines is challenging (as I also showed on my previous slides), the methods are difficult to evaluate and compare with each other, and they do not explain the model globally. In the future we plan to expand the Captum library beyond attribution approaches and add a Captum Robust package, which will focus on adversarial robustness and attacks and on studying the intersection between model robustness and interpretability; Captum Metrics, which will focus on sensitivity and infidelity metrics for both the model and the attributions; Captum Benchmarks, which will provide benchmarking for different datasets using different methodologies, including sanity checks; and Captum Optim, which will focus on optimization algorithms, where one of the first approaches we would like to add is optimization-based visualization. In the second part of the talk, Ludwig will talk about optimization-based feature visualizations in PyTorch, which will become part of the Captum Optim package. Thank you.

Thank you, Narine. My name is Ludwig. I work at OpenAI on a team led by Christopher Olah called Clarity that broadly tries to understand the inner workings of neural networks. A vital part of our toolchain in that endeavor has been feature visualization, and so in the remainder of this talk I hope to explain what we believe feature visualization is, what we mean by the word; I will then try to convince you that it is useful by showing you how we've been using feature visualization in a variety of projects; and finally I hope to give you a sneak preview of how we plan to implement feature visualization in Captum, in a way that both works robustly and isn't surprising to use from an API point of view; it should work like any other tool in PyTorch. Now, to help explain what something is, it can sometimes be helpful to contrast it with what it's not. When you hear about interpretability research and interpretability methods, oftentimes you'll see attribution methods: methods that, for a specific example, try to explain what parts of that example caused the network to behave in a particular way. Feature visualization, on the other hand, is slightly different in that it tries to answer questions about network behavior, like what specific parts of the network are looking for or what they are detecting, by generating synthetic examples through optimization. In the simplest case that might be a question as simple as "what is a specific neuron looking for?", but they can be more complicated questions, and I'll show you some as we go through examples. So fundamentally, feature visualization is an optimization process: we're optimizing an input to a neural network with an objective function that is a function of the activations of that network. So if you look at the diagram, you have an input image
that's fed to the network; it produces some activations, and these activations become part of a loss function, which is then backpropagated through the network into the input image. Let me show you one example of what you can do with this: you can maximize the activation of a single neuron, and here I compare the result to dataset examples that also strongly activate the same neuron. Those are both valid approaches, can give you different insights, and can be used in a variety of scenarios, but the optimization approach can sometimes be helpful in situations where dataset examples are not available. Now, our team was not the first to think of this. I would say that in the modern deep learning era there are, broadly speaking, two papers that ought to be mentioned as among the first to explore this idea. That is Mahendran and Vedaldi in 2014, over at VGG; they fed images into networks and then tried to restore those images from the activations of the network, a very similar approach. And then, about a year later, Jason Yosinski and co-authors really inspected the representations at various depths of a network using these types of optimization techniques. But what really got the team excited, the team I would later join, was DeepDream. I'm sure you remember DeepDream hitting the internet; people really enjoyed it. There seemed to be these fantastic worlds of representations inside our neural nets, and we just got to see them. We created the funniest dog-slug pictures, and there were lots of artistic uses of DeepDream. What excited us about it is that it allowed us a look into the representations of the network; it just wasn't very steerable, we couldn't really tell it where to look. But from that point on we were sort of hooked and tried to get it to work for more precise questions. If you read this diagram from right to left, you'll see DeepDream, which was just maximizing the square of the activations of all the channels in an entire layer, and we slowly managed to make it more precise: to look at an individual channel, that is, a neuron across all spatial positions, or even an individual neuron at a single spatial position. But that's just the background; let me try to convince you that this technique is actually helpful in understanding what neural nets do by showing you some of the things we've been able to do with it. The simplest idea you can have, I would argue, is optimizing for the activation of a single neuron in one of these networks. That's what you see here: you start from random noise, you take the activation of this orange-highlighted neuron in one of the branches of the 4b module of the InceptionV1 architecture, and you stochastic-gradient-descent your way to the image you see in the upper right-hand corner. On its own that might look like an idle curiosity, but you can combine it with, for example, an actual input image: look at the activations of the network at every spatial position of that input image and explain what those numbers mean by using these little icons, one icon per neuron. So if you look at the animation at the bottom, you see a sort of blurry dog ear with a strong activation here, and then you get more of a fox snout, animal-snout behavior; those represent individual neurons. Let's wait until we see the foreground a little bit: there's this green, perspective-view grass.
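For reference, a single-neuron (or single-channel) objective of this kind can be sketched in a few lines of plain PyTorch. This is not the Captum implementation; torchvision's GoogLeNet stands in for InceptionV1, and the layer (inception4b) and channel index are arbitrary choices for illustration.

```python
import torch
from torchvision.models import googlenet

model = googlenet(pretrained=True).eval()   # stand-in for InceptionV1
for p in model.parameters():
    p.requires_grad_(False)

acts = {}
def save_activations(module, inputs, output):
    acts["value"] = output
model.inception4b.register_forward_hook(save_activations)  # assumed layer choice

img = torch.randn(1, 3, 224, 224, requires_grad=True)      # start from random noise
optimizer = torch.optim.Adam([img], lr=0.05)
channel = 17                                                # arbitrary channel to visualize

for step in range(256):
    optimizer.zero_grad()
    model(img)
    a = acts["value"]                                       # shape (1, C, H, W)
    # "Neuron" objective: one channel at the center spatial position.
    # For a whole-channel objective, use a[0, channel].mean() instead.
    loss = -a[0, channel, a.shape[2] // 2, a.shape[3] // 2]
    loss.backward()
    optimizer.step()

result = img.detach().clamp(0, 1)  # naive pixel-space result (high-frequency, as noted later in the talk)
```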
But of course you don't need to optimize only for single neurons' behavior. The simplest next step is maybe to optimize for linear combinations of neurons; here, in fact, you see two neurons at a time, and you can see that they combine in semantically reasonable-seeming ways. Look at this furriness one, for example, or the square one, or maybe the black-and-white neuron that turns the one on the right black and white. And in fact you can optimize for whole linear combinations: rather than representing each of these activations as a combination of individual neurons, you can optimize for the entire activation vector at once. These activation vectors can be taken from real images: you take an image, put it through the network, and for a given spatial position you can extract an activation vector and turn it back into a feature visualization. If you do that for all the spatial positions of the image at once, you get an interesting overview. So if you look at these visualizations from left to right, they go from lower layers to deeper layers. You see that the network represents edges and curves in the lower layers, then transitions to these snout parts and maybe a little bit of fur texture, and towards the right, which is maybe two thirds of the way through the network, you have these fully formed snout parts and cat heads and maybe dog legs. Now let me attempt to categorize a little bit what I've been showing you. When we think of the activation space of a neural network as a vector space, then these individual neurons, and the visualizations I've been showing you, correspond to basis directions of that vector space. The pairwise interaction diagrams I showed you correspond to planes spanned by those neurons in activation space. And these spatial activation grids that I just showed you would be a set of points in activation space, specifically ones that lie on a sub-manifold of likely activations coming from the natural data distribution, the distribution the network was also trained on. Now, if you want to understand that entire manifold and not just some points from it, we use a technique we call an activation atlas. Let me attempt to explain. To create an activation atlas, we run a dataset through a network and sample activation vectors from random spatial positions on those images, so you end up with something like a million of these activation vectors. We then run those through a dimensionality reduction algorithm and project them down into a two-dimensional plane; remember that the activation space might be hundreds of dimensions, so it's nice to get them laid out in a 2D plane. We then make a 2D histogram over it, so it's like a grid, within which we average all of these activations, and that average activation we then run through feature visualization. So for each grid cell we create one single feature visualization, and when we put them together we end up with an overview of the representations of the network that we call an activation atlas. Here's one in all its glory, and you can do that at different scales; you can actually zoom into those regions and use a finer gridding. So let me zoom in to the top right corner a little bit: there are some human heads, and maybe in the center there's more like fingers and arms. Let's examine some other points. Here are dog snouts, and you can see them smoothly vary across various orientations and various fur colors. In a very different corner there's this gas-pump, scoreboard, writing region, maybe a little bit of watermarks in there, and at the very right, other areas have fruits of various textures and colors.
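Here is a rough sketch of the atlas construction described a moment ago: sample activation vectors, project them to 2D, bin them into a grid, and average each cell. UMAP (via the umap-learn package) is used as the dimensionality reduction step purely as an assumption; any reduction method would fit, and each averaged vector would then be rendered with feature visualization to produce one atlas tile.

```python
import numpy as np
import umap  # umap-learn; an assumption for the 2D projection step

def build_atlas_cells(activations, grid=20):
    """activations: (N, D) array of activation vectors sampled from many images."""
    coords = umap.UMAP(n_components=2).fit_transform(activations)   # (N, 2)
    # Normalize coordinates to [0, 1) and assign each vector to a grid cell.
    span = coords.max(axis=0) - coords.min(axis=0) + 1e-8
    coords = (coords - coords.min(axis=0)) / span
    cell_idx = np.minimum((coords * grid).astype(int), grid - 1)
    cells = {}
    for key, vec in zip(map(tuple, cell_idx), activations):
        cells.setdefault(key, []).append(vec)
    # One averaged activation vector per occupied cell; each of these would
    # then be fed to feature visualization to render one tile of the atlas.
    return {k: np.mean(v, axis=0) for k, v in cells.items()}

# Example with random stand-in data (a real atlas uses on the order of a million vectors):
atlas_cells = build_atlas_cells(np.random.randn(10000, 512), grid=20)
```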
You can even find these natural interpolations through the activation space that go from, say, fruits in the distance to fruits close up, or from many people in the distance to a single person close up. And if you make one of these for each layer, you can even see how the representations evolve through the layers, going from green grass and some high-frequency textures, through actual leaf-like shapes, to fully formed zucchinis and cabbages and green bell peppers and so on. Now, of course, you can subset an activation atlas to only those activations found on images with a certain label. So we have a snorkeler activation atlas on the left here and a scuba diver activation atlas on the right, and one thing to notice is that the snorkeler in the bottom left corner has these transparent snorkels, while the scuba diver in the top right has these black respirators, the tools that actual divers use. Another thing you can do is, rather than projecting the activations down to 2D, project them down to 1D and then use the dimension that you freed up to lay out any other attribute, in this particular case attribution toward either snorkeler or scuba diver. For the two features I just talked about, in the bottom row here, you can really see there are these transparent snorkels that have high attribution to snorkeler and these non-transparent black respirators that have high attribution to scuba diver. There's a highlighted cell on the right-hand side, something that looks a little bit round and metallic; it's probably one of those tanks that scuba divers wear on their back, but to us it looked a little bit like a locomotive, a steam locomotive, one of those old-timey black ones. So we tried to see how good our understanding was and whether we could adversarially flip the classification, and it turns out we can: you take an image of a snorkeler, which is classified with only 55 percent confidence, you add a tiny little locomotive, and we're able to flip it to scuba diver with a higher confidence than the original classification. Okay, that was an overview of some of the applications we've used feature visualization for. If you're interested in reading more, we have all these papers out on Distill.pub, a journal that allows us to publish interactive articles where you can actually play with the techniques yourself. But now let me focus on how we plan to implement feature visualization in PyTorch in a way that feels natural to PyTorch users. If you remember the core setup, we're optimizing an image based on an objective that is a function of the activations of a neural network. That means we'll need to extract activations from the neural network, for which we'll be using hooks, and we'll be optimizing an input image, so we'll need something that has parameters, outputs an image, and allows us to backpropagate to it. Now I'm going to try to translate this diagram into code. The API that I'll be showing you is not finalized or anything like that; it's just trying to give you an idea. As your model, you'll load a normal pre-trained model in PyTorch, and for your objective function, the way we'll set it up is that we'll have an objectives module in which there is a simple function called neuron activation: it will maximize the activation of a single neuron, and you point it at a specific module of your network. So when I type net dot mixed3a dot 3x3, that's
a 3x3 branch in one of the inception modules, and I'm pointing it at channel 17. Then we have this input optimization object that takes the net and the loss function and can run the optimization for a number of steps; the call to optimize here really just hides a normal optimization loop, you know, optimizer.zero_grad() and things like that. And at the end you want to get a result out of it, so there's actually this kind of parameterization that we're optimizing, which will output an image that we can then look at. So if you put these parts together, that's actual working code. At the moment it's in a preview stage, and we'll make sure to adhere to the Captum API better, but unfortunately it's also a little bit of a simplification. If you do just that, you get the result at the top: naive gradient descent gives you a somewhat high-frequency-dominated image, for reasons that we partly explore in the papers I showed you earlier. The high-level takeaway is that we can improve the conditioning of that optimization objective by adding a pre-conditioner, which we would call an image parameterization here, and I'll go into detail about that in the next slides, as well as something we call robustness transforms: stochastic transformations that we put the image through before we feed it into the network and compute the gradients. So here's how that would look in PyTorch: we'll have a module whose parameters are just the coefficients of some Fourier parameterization, and it outputs an image, but in a differentiable way, so that when we get gradients with respect to the image, the parameterization knows how to apply those gradients to its parameters internally. The other thing we'll add are those robustness transforms, and let me explain what I mean by those. What we do here is rotate or scale the image before we put it into the network, and that helps us avoid these high-frequency artifacts. Conceptually, what it means, at least the intuition I use in my head, is to say: look, I want the input that we're optimizing for to robustly activate, to robustly maximize my objective function, even if it's moved around a little bit, even if it's rotated a little bit, even if it's a little bit smaller or bigger. The reason these seem like good invariances to ask for is that those transformations don't change the semantic content of an image: a dog stays a dog whether it's one pixel further to the left or further to the right. And I was really glad we didn't have to reinvent the wheel here; we're using transformations straight from the wonderful library called Kornia. So here's how we're going to translate this into code: your robustness transforms are just a normal nn.Module, they take an image and output a slightly transformed image, and you can even put them in a Sequential if you've got multiple of them. Our parameterization is also just going to be an nn.Module; it'll have parameters that are wrapped in nn.Parameter, in our case the Fourier coefficients that will actually get optimized, and when you call it on its forward pass it outputs an image. These you then put into the optimization loop that we already looked at. Let's put it all together on one slide: you set up a pre-trained net; you set up a loss function, in this case neuron activation pointed at one of the modules of your net with a channel offset; you set up a robustness transformation that you want your optimization result to be invariant to, in this case wrapped in a Sequential because we maybe want multiple of those; you specify an input parameterization, here I'm just saying "natural image", which behind the scenes uses a Fourier transform and a color decorrelation; I then put them into this optimization loop, which I run for a couple hundred steps, and I get out my optimized input image just by calling the input parameterization. And here's how all of that looks together in a diagram.
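As a standalone approximation of that recipe, not the Captum API previewed on the slides, here is one way the whole loop could be written today with plain PyTorch and Kornia. The Fourier parameterization is simplified (it omits the color decorrelation mentioned above), and the network, target layer, channel, and transform settings are illustrative assumptions.

```python
import torch
import torch.nn as nn
import kornia.augmentation as K
from torchvision.models import googlenet

class FourierImage(nn.Module):
    """Image parameterization: trainable Fourier coefficients -> RGB image (simplified)."""
    def __init__(self, size=224):
        super().__init__()
        self.size = size
        # Real/imaginary parts of an rFFT spectrum, stored as ordinary parameters.
        self.coeffs = nn.Parameter(0.01 * torch.randn(1, 3, size, size // 2 + 1, 2))

    def forward(self):
        spectrum = torch.view_as_complex(self.coeffs)
        img = torch.fft.irfft2(spectrum, s=(self.size, self.size))
        return torch.sigmoid(img)  # keep pixels in [0, 1]

net = googlenet(pretrained=True).eval()   # stand-in for InceptionV1
for p in net.parameters():
    p.requires_grad_(False)

acts = {}
net.inception4b.register_forward_hook(lambda m, i, o: acts.update(value=o))

transforms = nn.Sequential(               # robustness transforms from Kornia
    K.RandomRotation(degrees=10.0),
    K.RandomAffine(degrees=0.0, translate=(0.05, 0.05), scale=(0.9, 1.1)),
)

param = FourierImage()
optimizer = torch.optim.Adam(param.parameters(), lr=0.05)
channel = 17                               # illustrative channel offset

for step in range(400):                    # "a couple hundred steps"
    optimizer.zero_grad()
    net(transforms(param()))               # ImageNet normalization omitted for brevity
    loss = -acts["value"][:, channel].mean()  # maximize the mean activation of one channel
    loss.backward()
    optimizer.step()

image = param().detach()                   # the optimized input image
```

The structural point is the same as on the slides: the parameterization and the robustness transforms are ordinary nn.Modules, so they slot into a standard training-style loop.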
Now, the code on the slides isn't quite ready; it'll be on a branch, and the Captum engineers will look at it and probably improve it a lot, but I promise that despite looking a little bit like pseudocode, all the code you saw is actually already running today. Wrapping up: I've been talking about feature visualization and its implementation, but I want to point out that the result of that code is these visualizations, these little glimpses inside the workings of those complicated neural nets, and we can put those glimpses together and assemble larger explanations and stories about how they work. I can't wait to see the insights that you'll produce by using these techniques on your own models, by thinking about better techniques and more insightful ideas, built on these primitives or fully on your own. I hope you'll try it out. You can read all about Captum at captum.ai or on GitHub: github.com/pytorch/captum is where you can contribute to the library, read the code, report issues, and give us your ideas, and you can also participate in the online discussion forums at discuss.pytorch.org/c/captum. I'd like to thank the Captum team for making a production-quality library out of what just a couple of years ago was still cutting-edge research, and even today I don't think the story is over. I'd also like to say thank you to the PyTorch core team; they have been very responsive to our requests about small changes to the API that might make interpretability research easier for us all in the future. Lastly, thank you for your attention, for listening to this talk, and I hope you have a lovely rest of this virtual GTC.