Graph Convolutional Operators in the PyTorch JIT | PyTorch Developer Day 2020

The Use of Graph Neural Networks (GNNs) in High Energy Physics: A Scalable and Efficient Approach

Graph neural networks (GNNs) have become increasingly popular tools in many fields, including high energy physics, where they are used to learn the relationships within large amounts of point cloud data. The researchers behind this project wanted to explore the potential of GNNs in high energy physics and find a way to deploy these models at scale without introducing a cumbersome maintenance burden.

One of the challenges in deploying GNNs is their complexity and their need for significant computational resources. To overcome this, the researchers developed extensions to PyTorch Geometric that make its graph convolutional operators compatible with the PyTorch JIT, so a trained model can be serialized and served without switching frameworks. This allowed them to take advantage of the computational power of NVIDIA GPUs, which resulted in a substantial speedup.
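As a rough sketch of what that workflow looks like, following PyTorch Geometric's TorchScript support as it existed around this time (the talk mentions a single jittable call; the layer choice and sizes here are illustrative):

    import torch
    from torch import Tensor
    from torch_geometric.nn import GCNConv


    class ExampleGNN(torch.nn.Module):
        # Two-layer GNN whose operators are rewritten for TorchScript.
        def __init__(self, in_channels: int, hidden_channels: int, out_channels: int):
            super().__init__()
            # .jittable() dynamically rewrites the operator class so the JIT
            # can statically analyze its propagate()/message() calls.
            self.conv1 = GCNConv(in_channels, hidden_channels).jittable()
            self.conv2 = GCNConv(hidden_channels, out_channels).jittable()

        def forward(self, x: Tensor, edge_index: Tensor) -> Tensor:
            x = self.conv1(x, edge_index).relu()
            return self.conv2(x, edge_index)


    model = torch.jit.script(ExampleGNN(16, 64, 8))
    model.save("gnn.pt")  # serialized TorchScript model, ready for an inference server

The resulting file is a self-contained TorchScript artifact, so an inference server can load it without carrying along the original Python model code.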

The researchers also used inference-as-a-service tools such as NVIDIA Triton to deploy their GNN models at scale. This let them process large amounts of data quickly and efficiently without maintaining a complex software stack inside their experiment framework: the serialized model runs on the GPU server, while the experiment code only sends inputs and receives results.
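To make the inference-as-a-service idea concrete, below is a hedged sketch of what a client request to a Triton server might look like using the tritonclient package. The server URL, the model name calo_gnn, and the tensor names x, edge_index, and cluster_scores are hypothetical and would have to match the server's model repository configuration.

    import numpy as np
    import tritonclient.grpc as grpcclient

    # Connect to a Triton server hosting the TorchScript GNN.
    client = grpcclient.InferenceServerClient(url="localhost:8001")

    hits = np.random.rand(5000, 4).astype(np.float32)            # point cloud: (x, y, z, energy)
    edges = np.random.randint(0, 5000, (2, 20000)).astype(np.int64)

    inputs = [
        grpcclient.InferInput("x", list(hits.shape), "FP32"),
        grpcclient.InferInput("edge_index", list(edges.shape), "INT64"),
    ]
    inputs[0].set_data_from_numpy(hits)
    inputs[1].set_data_from_numpy(edges)

    outputs = [grpcclient.InferRequestedOutput("cluster_scores")]

    response = client.infer(model_name="calo_gnn", inputs=inputs, outputs=outputs)
    scores = response.as_numpy("cluster_scores")  # per-node output computed on the GPU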

The researchers validated their approach with a proof of concept that demonstrated the potential of GNNs in high energy physics. It showed that GNNs can be deployed on GPUs without adding a significant maintenance burden or new dependencies to the experiment software, and that inference-as-a-service tools like NVIDIA Triton make it straightforward to scale the models to the large data volumes involved.

To further improve the performance of their GNN models, the researchers plan to experiment with different architectures and optimize their execution. They also want to explore the use of more advanced machine learning techniques to tackle complex problems in high energy physics. The development of efficient and scalable GNN-based solutions is crucial for advancing fundamental science and pushing the boundaries of what is possible in this field.

In conclusion, the researchers have demonstrated the potential of GNNs in high energy physics and developed a scalable and efficient way to deploy these models. By utilizing PyTorch Geometric's extensions and inference-as-a-service tools like NVIDIA Triton, they were able to take advantage of the significant computational power available on GPUs. The use of GNNs has opened up new avenues for research in high energy physics, and further development is expected to lead to exciting breakthroughs in this field.

High Energy Physics and Imaging Calorimetry: A Real-World Application

One application of GNNs in high energy physics is imaging calorimetry. Here the detector acts as a "giant 3D camera": the point cloud data represents energy deposits recorded in a complex geometry. Each color in the plot corresponds to a cluster assignment, and those assignments are not known until reconstruction algorithms are run on the input points.
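A minimal sketch of how such a point cloud might be turned into a graph for a GNN, assuming each hit is a row of (x, y, z, energy) and using the k-nearest-neighbour graph builder from torch_cluster (shapes and the choice of k are illustrative, not taken from the talk):

    import torch
    from torch_cluster import knn_graph  # also re-exported by torch_geometric.nn

    # Hypothetical calorimeter hits: one row per energy deposit.
    hits = torch.randn(5000, 4)  # columns: x, y, z, energy

    # Connect each hit to its k nearest neighbours in (x, y, z) space;
    # the GNN then learns which neighbours belong to the same cluster.
    edge_index = knn_graph(hits[:, :3], k=16)

    # 'model' would be the scripted GNN from earlier; its per-node output is
    # what a downstream step turns into cluster assignments.
    # cluster_scores = model(hits, edge_index)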

The researchers' approach uses GNNs to analyze this point cloud data and predict the correct cluster assignments: instead of a hand-written reconstruction algorithm, a trained graph neural network is compiled with the JIT, served through NVIDIA Triton, and returns the cluster assignments computed on the GPU. This let them deploy their models at scale without introducing a significant maintenance burden.

The use of GNNs in imaging calorimetry has several advantages. It lets researchers analyze complex geometries quickly and efficiently, which is essential for making accurate predictions in high energy physics experiments, and running the models on GPUs enables the processing of the large amounts of data needed to reconstruct complex events.

The development of efficient GNN-based solutions is crucial for advancing fundamental science in high energy physics. By leveraging PyTorch Geometric's extensions and inference-as-a-service tools like NVIDIA Triton, researchers can focus on developing new machine learning techniques without worrying about the underlying infrastructure.

Static Analysis of Graph Neural Networks

Another aspect of deploying GNNs at scale is static analysis of the functions used inside a model's graph convolutional operators. The goal is to determine, ahead of time, the types of the variables those functions use, so the PyTorch JIT can compile the operator rather than having to infer types during execution.

To do this, the researchers build a concrete, typed version of all the variables used in the functions of a graph convolutional operator, and then rewrite the class so that it uses those known-ahead-of-time definitions by default instead of inferring them during execution. In practice, the user only adds Python type hints (plus a hint for the propagate call) and requests a jittable version of the module; the rewriting happens automatically underneath.
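In the style of PyTorch Geometric's TorchScript examples, a custom operator then looks roughly like the sketch below: full type hints on forward and message, plus a propagate_type comment that tells the analysis what flows through propagate. The class name, channel sizes, and edge-convolution-style message are illustrative.

    import torch
    from torch import Tensor
    from torch_geometric.nn import MessagePassing


    class EdgeConvSketch(MessagePassing):
        def __init__(self, in_channels: int, out_channels: int):
            super().__init__(aggr='max')
            self.mlp = torch.nn.Sequential(
                torch.nn.Linear(2 * in_channels, out_channels),
                torch.nn.ReLU(),
                torch.nn.Linear(out_channels, out_channels),
            )

        def forward(self, x: Tensor, edge_index: Tensor) -> Tensor:
            # propagate_type: (x: Tensor)
            return self.propagate(edge_index, x=x, size=None)

        def message(self, x_i: Tensor, x_j: Tensor) -> Tensor:
            # The message combines a node's own features with the difference
            # to its neighbour, in the spirit of edge convolutions.
            return self.mlp(torch.cat([x_i, x_j - x_i], dim=-1))


    # Asking for a jittable version triggers the static analysis and dynamic
    # rewrite, after which the operator can be scripted.
    conv = torch.jit.script(EdgeConvSketch(4, 32).jittable())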

This approach has several advantages: the model code stays almost unchanged yet becomes compilable, which improves optimization, deployment, and scalability, and it lets researchers focus on developing new machine learning techniques rather than on the compilation machinery underneath.

Together with the inference-as-a-service deployment path, this keeps GNN-based solutions efficient and scalable, letting researchers push the boundaries of what is possible in high energy physics without taking on the underlying infrastructure themselves.

Future Directions

The next steps for this project involve moving from a proof-of-concept to a battle-tested solution that meets the requirements of high energy physics: in production, experiments run inference on thousands of times more events than they train on, and the workload has to scale across tens of thousands of workers. The researchers plan to continue exploring new machine learning techniques and optimizing their GNN-based solutions for deployment on GPUs.

To further improve the performance of their models, they will experiment with different architectures, with how the data is presented to the models, and with the optimization of their execution. Understanding the scaling behavior of these deployments is key to success, and that feedback can in turn help PyTorch and Triton grow; efficient and scalable GNN-based solutions are crucial for advancing fundamental science in high energy physics.


"WEBVTTKind: captionsLanguage: enhello i'm lindsey gray i'm a scientist at fermi national accelerator laboratory and i've been working with mathias fey to bring graph convolutional operators to the pi torch jit to frame this uh the discussion about this i'll first start with my field of research which is high energy particle physics and give the problem context so we know why we're going after this and why we needed graph convolutional operators in the jit then next we'll talk about how we did it and how we make it deployable at scale and then finally we'll talk about some initial results and what's next for this project in high energy particle physics we're facing a challenge of an assault of information in the coming years soon we'll be taking about an exabyte of data per year of highly structured data with thousands of particles of per event to identify and make use of we've been looking into graph neural networks because they're very good at processing point clouds which can describe a lot of our data and we need to make sure that we can keep up with the compute demand that's coming from that because there's been a trend in machine learning in high energy physics where we've been integrating it integrating machine learning deeper and deeper into how we do our science so we really have to think carefully about how we use these models at scale and make sure that we aren't introducing any efficient any inefficiencies into the system that we propose to use and this is where we get into wanting to use pi torch and then wanting to use the pi torch kit for graph neural networks so graph neural networks are neural networks that operate on structured data of arbitrary form so you have the list of data the x i that you see in this diagram to the left here and then you have the relationships between those data which are the arrows pointing between them and a graph neural network can learn to make associations between each of those data using a neural network at some point of some sort and then scatter and or gather and scatter all of that data between all the nodes over these associations between those data and in pytorch geometric it's very straightforward to write down what a graph neural net or what the pieces of a graph neural network are so you can see here this is an edge convolutional operator and we're generating a message to pass with a multi-layer perceptron we propagate or the only thing we do in order to uh propagate the message and calculate the the forward pass is just uh pass the messages and the message that is generated is uh written down explicitly so you can see that it's taking pairs of no or pairs of pieces of data and looking at the difference between two pieces of data as well as the local features of that data itself and then you make a new message pass it around and try to classify the graph based off of that information this means that pytorch geometric is really following the typical pytorch principles and it's quite straightforwardly pythonic experimentation-centric uh and has a very clear api for writing these graph neural networks and or in addition to all of that it is backed by well-performing code uh on top of that it's well adopted it's got hundreds of citations for people doing experiments on graph neural networks and over a thousand forks to date um with an active community of contributors behind that and then on uh particularly it's useful for high energy physics reconstruction tasks since most of our data can be described as point clouds and we want to learn 
PyTorch Geometric is also incredibly flexible: you can define almost any graph convolutional operator you can think of, and it does this by exploiting a number of features of Python that aren't available to the PyTorch JIT. So we needed to think about a way to get from the graph convolutional operators as they were in PyTorch Geometric to something the JIT could use.

The reason we want to use the JIT is that it makes it incredibly simple to go from working on your models and experimenting on them, to using your model in a trial and doing some first evaluations, and finally to deploying it at scale. On paper the JIT is actually very well suited to graph neural networks, because it is able to capture very rich control flow, which is the bread and butter of graph neural networks. If we can get the JIT to synthesize a graph neural network, then we have this ease of deployment where we can go from experimenting on the model all the way to deploying it at scale without needing to switch frameworks or change the models in substantial ways, and it makes the model very easy to maintain, which is of vast importance if you want to work on a three-thousand-person experiment.

How did we achieve this? We started by using Python type hinting, in order to stay as Pythonic as possible while making the inputs and outputs of the operators concrete enough to be statically analyzed by the JIT and compiled down. To get there, we do a simple static analysis on what the user provides: we dynamically rewrite the user code when they ask for a jittable version of their module, while maintaining exactly the operations they expect. We also made sure there are minimal code changes to the model to make it deployable through the JIT; in particular, once you have a convolutional operator ready to go, all you have to do is call jittable and the model can be interpreted by the PyTorch JIT.

What does this look like in terms of changing a PyTorch Geometric operator to make it jittable? Essentially, you have to obey all the rules, and then we do a little bit of dynamic rewriting underneath. In the example, every single variable going into and coming out of this particular convolutional operator is type hinted, and there is an additional propagate type hint that tells the JIT exactly what sort of data is passed in and out of the propagate function, which is otherwise determined dynamically inside PyTorch Geometric's message passing class. That's just the user interface you see when writing something.

So how did we do it? The first thing that needed to be done, which Matthias worked on extensively, was a campaign to get all the custom ops into jittable form: Torch Geometric uses a huge number of sparse operations and quite a number of clustering algorithms as well, and all of those needed to be made available. Finally, there is the static analysis we do: we look at all of the functions in a user's graph convolutional operator and make a concrete version of all of the variables used in those functions, so that we know the types ahead of time, and then we rebuild the class so that it uses these known-ahead-of-time definitions by default instead of having to infer them during execution.
By doing this quick rewrap of the class, we're able to take the exact same class and turn it into something that is jittable and usable at scale.

One example of this from high energy physics is imaging calorimetry. What you see in the plot on the left is a whole bunch of energy deposits inside what is effectively a giant 3D camera, and each of the different colors you see is not known until we run our reconstruction algorithms on all of those input points. Now, instead of having to write a reconstruction algorithm by hand, we can train a graph neural network, compile it down into JIT code, deploy it on NVIDIA Triton, pass our data off to an inference-as-a-service engine, bring back all the cluster assignments as executed on the GPU, and then go about our business of doing physics with the outputs. This is really cool because we didn't need to add PyTorch to our experiment framework, which is already a couple of million lines of code with a large number of dependencies, and we also didn't need to add or maintain CUDA. That is a significant ease of operation for us, while still getting these powerful algorithms into CMSSW so we can use them. We get seamless GPU integration through inference as a service, the typical 100x speedups you expect, and a factorized software stack that really eases our maintenance for the foreseeable future when using these operators.

In conclusion, GNNs are becoming more widely used tools, and we have a vast amount of point cloud data whose relationships we want to assess. We've been looking into GNNs with PyTorch, which are widely accessible via PyTorch Geometric, and we needed a way to make these models deployable at scale in a very easy way that didn't incur anything cumbersome in terms of maintenance burden; the PyTorch JIT is one way toward that. As things stood before, GNNs were really difficult to deploy, so we wrote extensions to PyTorch Geometric that make this transition almost seamless. You can check out the example if you follow the link in the slides; all of the currently implemented convolutional operators are there, and it's very easy to add more, so we can really cover the GNN literature out of the box and make those models deployable at scale. We've tested this with inference-as-a-service tools like NVIDIA Triton to make the serialized GNN easy to scale; we only had to add one additional container layer on top of NVIDIA Triton to make this work, and it gives us scalable event processing on GPUs that is factorized from our typical compute, which is good for us.

The next steps are to move from the proof of concept I showed you into something that can really be battle tested and does exactly what we want, because what I showed you wasn't the whole story. In particular, in high energy physics we run inference on thousands of times more events than we train on, so we really have to make sure we have a robust algorithm, and we also need to make sure it can scale, because we have to run this on tens of thousands of workers. Really understanding the scaling behavior is key to our success, and that kind of feedback can also help
PyTorch and Triton grow. On top of that, we're going to start adopting more models into the inference-as-a-service architecture, since it eases the maintenance burden and very straightforwardly exposes the GPU as accessible compute for our processing systems. Finally, we want to optimize the execution of these high energy physics reconstruction machine learning models. We're really only at the beginning of building the first efficacious models for this, and we now have to take our time experimenting on those models, working with the JIT, and working with the structure of the models themselves and how we're presenting the data, in order to get what we really want in the end. That means there's a ton of really exciting developments ahead of us in using cutting-edge machine learning for fundamental science. Thank you.