State of PyTorch - Community & Partner Talks at PyTorch Conference 2022

The State of PyTorch: A Year in Review

Welcome back everyone, and thank you for joining us today! I'm Joe, an engineering manager at Meta on the PyTorch team, and I'm excited to co-present the State of PyTorch with Geeta Chauhan. Continuing our annual tradition of giving a lightning overview of our major feature launches, we're thrilled to share the developments that have taken place in the world of PyTorch over the past year.

First and foremost, I'd like to acknowledge the incredible community that has made PyTorch so successful. From active contributors on GitHub to enthusiastic users around the world, we couldn't do it without your support and engagement. As we look back on the past 12 months, we're proud of the progress we've made towards our goals of making PyTorch faster, more efficient, and more accessible to everyone.

One of the most significant developments this year came through our collaboration with AWS: the launch of the Trainium ML chips for training large language models, which have shown near-linear scaling across training clusters. Our work on distributed training has also continued to gain momentum, with AWS contributing an XLA backend to PyTorch Distributed and with our support for fully sharded data parallelism (FSDP), which shards model parameters, gradients, and optimizer states across multiple machines.
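FSDP builds on the same process-group setup as PyTorch's existing DistributedDataParallel (DDP). As a minimal sketch of that workflow, the single-process gloo setup and toy model below are illustrative stand-ins for a real multi-rank launch via torchrun, not anything from the talk:

```python
import os
import torch
import torch.distributed as dist
import torch.nn as nn
from torch.nn.parallel import DistributedDataParallel as DDP

# Single-process stand-in for illustration; real jobs launch many ranks
# with torchrun, and FSDP would replace DDP to shard parameters,
# gradients, and optimizer state instead of replicating them per rank.
os.environ.setdefault("MASTER_ADDR", "127.0.0.1")
os.environ.setdefault("MASTER_PORT", "29501")
dist.init_process_group("gloo", rank=0, world_size=1)

model = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 4))
ddp_model = DDP(model)  # synchronizes gradients across ranks in backward()
optimizer = torch.optim.SGD(ddp_model.parameters(), lr=0.1)

inputs = torch.randn(8, 16)
loss = ddp_model(inputs).sum()
loss.backward()  # gradients are all-reduced across ranks here
optimizer.step()
dist.destroy_process_group()
```

Switching this sketch to FSDP is, in principle, a one-line change (wrapping the model with `FullyShardedDataParallel` from `torch.distributed.fsdp` instead of DDP), with sharding behavior controlled by its wrapping policy.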

In addition to these technical advancements, we've made significant strides in collaboration with leading companies across the industry. From cloud providers like AWS and Google Cloud to startups like Stability.ai and Predibase, we're committed to building a vibrant ecosystem around PyTorch that enables innovation and growth. Our partnerships have resulted in exciting new applications of PyTorch, such as Amazon's use of large-scale models across multiple modalities to power their discovery engine.

But what about the community itself? How have you all been using PyTorch over the past year? We're excited to share some highlights, including more than 20,000 replies to community questions, concerns, and bug reports on the PyTorch discussion forums, and more than 3,000 unique contributors to the project on GitHub. We also want to recognize the many organizations and individuals who have demonstrated exceptional dedication to PyTorch, whether through open-source projects or innovative applications.

As we wrap up our presentation today, I'd like to extend a heartfelt thank you to each and every one of you for being part of the PyTorch community. Your passion, creativity, and hard work are what make PyTorch so special, and we're honored to be on this journey with all of you.

---

Industry Usage: Collaborations and Success Stories

As I hand it over to Geeta Chauhan to talk about industry usage, I'd like to acknowledge the incredible progress we've made in collaboration with leading companies across the industry. From cloud providers like AWS and Google Cloud to startups like Stability.ai and Predibase, our partnerships have resulted in exciting new applications of PyTorch that are driving innovation and growth.

One of the most notable successes we've seen is from Stability.ai, the startup best known for open-sourcing the Stable Diffusion models. They're building open AI tools for developing cutting-edge models for generative art, music, language, video, 3D, and biology, and PyTorch helps them accelerate development without compromising model quality.

Another standout story is Predibase, a startup that offers a declarative alternative to AutoML. Their Ludwig library, a Linux Foundation project, has migrated from TensorFlow to PyTorch, and we collaborated with the team to add support for scriptable tokenizers to torchtext. We're excited to see their work and look forward to continuing our collaboration.

We've also seen significant progress from Microsoft Azure, which launched PyTorch Azure containers and the DeepSpeed MII library, a new open-source library that speeds up model inference at a lower cost point, with built-in support for more than 24,000 models. We're thrilled to see this work and look forward to continuing our collaboration.

In addition to these success stories, many other organizations are leveraging PyTorch in innovative ways, from Tesla's use of PyTorch in their self-driving cars and Dojo superclusters to Amazon's deployment of large-scale models across multiple modalities.

As I hand it back to Joe, I'd like to take a moment to acknowledge the many organizations and individuals who have demonstrated exceptional dedication to PyTorch over the past year. Your hard work and passion are what make PyTorch so special, and we're honored to be on this journey with all of you.

---

The Future of PyTorch: Trends, Opportunities, and Looking Ahead

As I hand it back to Joe to summarize our presentation, I'd like to take a moment to reflect on the incredible progress we've made as a community. From technical advancements like fully sharded data parallelism and AWS's Trainium chips to exciting collaborations with leading companies across the industry, we're proud of what we've accomplished.

Looking ahead to the future, there are several trends that we can expect to see continue to shape the world of PyTorch. One area of significant focus will be on accelerating the development of large-scale models, which have already shown tremendous promise in areas like computer vision and natural language processing. We're excited to explore new applications of PyTorch in these areas and to collaborate with leading companies and startups to drive innovation.

Another trend that we can expect to continue is the growth of open-source projects and communities around PyTorch. With more than 2 million views of the discussion forums and over 20,000 replies to community questions and concerns, it's clear that our community is passionate and engaged. We're committed to supporting this growth and to providing tools and resources that enable developers to build amazing things with PyTorch.

As we look ahead to the future, I'd like to leave you all with a few final thoughts. Firstly, thank you again to every single one of you for being part of the PyTorch community. Your passion, creativity, and hard work are what make PyTorch so special, and we're honored to be on this journey with all of you.

Secondly, I'd like to emphasize the importance of collaboration and community in driving innovation and growth. By working together, we can achieve far more than we ever could alone. Whether through open-source projects or innovative applications, your contributions are invaluable and will shape the future of PyTorch for years to come.

Finally, I'd like to leave you all with a message of hope and excitement for what's to come. The world of AI is rapidly evolving, and PyTorch is at the forefront of this revolution. With your help, we can build an even more amazing future together.

---

Full Talk Transcript

Thank you, hey everyone, welcome back. Excited to be here today. My name is Joe and I'm an engineering manager at Meta on the PyTorch team, and I'm joined virtually by Geeta Chauhan, who will be presenting the second half of the talk. Excited to co-present the State of PyTorch, which has become a little bit of an annual tradition: give a lightning overview of some of our major feature launches, talk a little bit about metrics and growth, and help celebrate all of you who have helped to make PyTorch the success that it is today. So, in that order: talk about some feature launches, do a little deep dive into the research community using PyTorch, back out to the community at large, and then finally hand it over to Geeta for industry.

First up, we're excited to have had three major releases this year, 1.11, 1.12, and 1.13, constituting somewhere around 10,000 commits from hundreds of developers. Thank you all for that. You've heard about many of these major feature releases throughout the day today in talks and in the poster sessions, but just to re-highlight: we have TorchData and TorchArrow, new libraries providing common, modular data-loading primitives for constructing flexible and performant data pipelines and for preprocessing batch data. We've upstreamed functorch, a library that adds composable function transforms into PyTorch, with familiar APIs like vmap, grad, jvp, and vjp, enabling things like per-sample gradients. We're excited about our launch of FSDP, fully sharded data parallel, enabling training of large AI models by sharding model parameters, gradients, and optimizer states across multiple machines. One of our most highly requested and commented-on issues on all of GitHub was Apple M1 support, so we're excited to launch acceleration on Mac, taking advantage of Apple's native silicon GPUs for significantly faster model training. We had a poster session on Better Transformer: out-of-the-box performance wins on CPU and GPU for fastpath implementations of TransformerEncoderLayer and multi-head attention. nvFuser is our new deep learning compiler for PyTorch, supporting a wider range of operations and faster speeds, and TorchScript has been updated to use it by default. And finally, we're incredibly excited about our new domain library launches, TorchRec and TorchMultimodal: on the rec side, common sparsity and parallelism primitives for high-speed recommendation systems, and on the multimodal side, composable building blocks for training state-of-the-art multimodal models.

Backing up a little bit, we're excited that we can now observe over 350,000 GitHub repos that use PyTorch, about 45% growth year over year. I don't think it would be fair to be adjacent to the NeurIPS conference without taking a little time to dive into researcher usage. We see over 9,000 GitHub repos that can be associated with AI research papers from this year via Papers with Code, which constitutes somewhere around 63 percent of all AI research implementations choosing PyTorch, a four percent growth. With the Cambrian explosion of AI, it would be impossible to get into every single domain that is currently using PyTorch, but just to highlight a few: on the generative modeling side, researchers at FAIR have built Make-A-Video, which takes a text prompt as input and generates whimsical, one-of-a-kind videos. PyTorch is being used in next-generation therapeutics, with researchers at Asimov using PyTorch to generate optimal programs for living cells. And on the graph neural network side, Kumo AI is using PyTorch alongside the popular PyTorch Geometric to build, train, and scale large-scale graph neural networks.

Talking a bit about the community: we have over 11,000 commits over the past 12-month period, with more than 3,000 unique contributors, which is crazy, and a 20% increase year over year. I just want to thank all of you for contributing to PyTorch. Due to a little technical issue we aren't able to show a scroll with the username of everyone who's contributed over the last year, so this is a random sample of individuals, but whether you've contributed documentation, bug fixes, low-level performance optimizations, or high-level API design, we appreciate all of your contributions. There are contributors from major organizations as well, who have helped with everything from fast kernel development to supporting different devices and integrations into cloud providers, making it easier for AI researchers and developers around the world to use PyTorch. Mostly I just want to highlight one of my favorite aspects of PyTorch, which is community engagement. It's incredible to me that we have over 2 million views of our PyTorch discussion forums, with over 20,000 replies to community questions, concerns, and bugs from more than 3,000 contributors. Thank you for being active on there and helping the community. With that, let me hand it over to Geeta to talk about industry usage.

Thanks, Joe. We collaborate with leading companies across the industry on PyTorch adoption and integrations, and I will share some of the key highlights from our work in 2022.

From the latest LinkedIn talent report, we are seeing more than 44,000 professionals who have tagged PyTorch as a skill on their LinkedIn profile, and there are 2,500-plus PyTorch-related jobs on LinkedIn now; overall this is a 50% year-over-year increase in PyTorch professionals. With the cloud providers, our focus this year has been on large foundation models. With AWS, we enabled the launch of the Trainium ML chips for training large language models, seeing near-linear scaling across the training clusters. Further, in a three-way partnership with Google, AWS contributed the XLA backend to PyTorch Distributed to ease migration of training workloads onto the Trainium chips. With Microsoft Azure, we launched the PyTorch Azure containers and the DeepSpeed MII library, a new open-source library that speeds up model inference at a lower cost point, with built-in support for more than 24,000 models. With Google Cloud, we launched the PyTorch/XLA profiler for the Cloud TPUs to help with troubleshooting and profiling large training runs. We also added support for fully sharded data parallel to PyTorch/XLA and launched the OpenXLA project for community-based ML compiler and infrastructure projects, which includes the XLA compiler and StableHLO.

We saw many exciting startups emerge in the AI space, and I would like to highlight a few. Stability.ai, best known for open-sourcing the Stable Diffusion models, is building open AI tools for developing cutting-edge AI models for generative art, music, language, video, 3D, and biology. Predibase offers an alternative to AutoML, using a declarative approach to building AI models; the Ludwig library, a Linux Foundation project, has migrated to PyTorch from TensorFlow, and we collaborated with them to add support for scriptable tokenizers to torchtext. Fashable is a startup focusing on AI-generated fashion for clothing design in the fashion industry, for a faster time to market and reduced waste; we helped their team speed up their GAN-based model training in collaboration with Microsoft Azure.

We are seeing a rise in software vendors building libraries and solutions on top of PyTorch. Hugging Face, a long-standing partner known for the Transformers library, this year added Accelerate to speed up distributed training and inference, and Diffusers for diffusion models; we have collaborated with them to integrate Better Transformer and fully sharded data parallel. Elasticsearch, the search engine platform, has added embedded support for NLP models by integrating libtorch into the Elasticsearch stack, making it easy for developers to perform tasks like named entity recognition, sentiment analysis, and more right inside Elasticsearch itself, without having to make an external call. MosaicML is another emerging startup focused on efficient model training; their PyTorch-based library Composer enables training neural networks faster and at a lower cost point. They recently launched their own cloud offering, and we are collaborating with them on integrations for FSDP and TorchData.

We are also seeing a rise in production use cases of PyTorch across the industry, and I would like to highlight a few. At Tesla AI Day, Tesla shared details of how they are using PyTorch not just in the self-driving car but also as part of their Dojo superclusters and their new AI accelerator, with their extensions for PyTorch and their new compiler engine. In the media production field, we collaborated on a new multimodal activity recognition solution that combines insights from audio, video, and subtitles to get better results. Amazon Search is using large-scale models across multiple modalities to power the discovery engine for Amazon; their M5 team runs thousands of experiments and training jobs each month and is quickly able to roll them into production. They make use of PyTorch's distributed PS3 plugin and PyTorch Profiler, and are migrating to FSDP. We are excited to see these great uptakes of PyTorch across the industry. You will hear from many of our partners in the talks today, along with the poster sessions; for more case studies, please check out the PyTorch community stories and Medium channel. I will now hand it back to Joe.

Thanks, Geeta, and thank you all for coming out today and for being part of the PyTorch community. I hope to see you all on GitHub.
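The composable function transforms mentioned in the talk (functorch's vmap and grad, which live in the `torch.func` namespace in current PyTorch releases) can be sketched for the per-sample-gradient use case. The tiny linear model and batch sizes here are illustrative assumptions, not anything from the talk:

```python
import torch
from torch.func import functional_call, grad, vmap

# Toy model; any nn.Module works the same way (illustrative assumption).
model = torch.nn.Linear(3, 1)
params = dict(model.named_parameters())

def loss_fn(params, x, y):
    # Treat the module as a pure function of its parameters.
    pred = functional_call(model, params, (x.unsqueeze(0),)).squeeze(0)
    return ((pred - y) ** 2).mean()

# grad() differentiates the loss w.r.t. params for one example; vmap()
# maps that computation over the batch dimension, yielding one gradient
# per sample instead of one gradient summed over the batch.
xs, ys = torch.randn(8, 3), torch.randn(8, 1)
per_sample_grads = vmap(grad(loss_fn), in_dims=(None, 0, 0))(params, xs, ys)
print(per_sample_grads["weight"].shape)  # torch.Size([8, 1, 3])
```

Per-sample gradients computed this way are the building block for techniques such as differentially private training, where each example's gradient must be clipped individually.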