Mapping Dark Matter with Bayesian Neural Networks with Yashar Hezaveh - TWiML Talk #250

### Article: Gravitational Lensing and Machine Learning: A Conversation with Yashar Hezaveh

---

#### Introduction to Gravitational Lensing and Machine Learning

In a recent episode of *TWiML Talk*, Sam Charrington welcomed Dr. Yashar Hezaveh, an assistant professor at the University of Montreal and a research fellow at the Center for Computational Astrophysics at the Flatiron Institute. The discussion centered on the intersection of machine learning and astrophysics, with a focus on gravitational lensing, a phenomenon in which the gravity of a massive object bends the light from a more distant galaxy, producing distorted or magnified images of it.

Dr. Hezaveh explained that gravitational lensing is not just a scientific curiosity but a powerful tool for mapping the distribution of mass in the universe, including dark matter. He traced his path from undergraduate studies in physics and astrophysics at the University of Victoria to a Ph.D. at McGill University, and eventually postdoctoral research at Stanford University. His recent work has shifted toward applying machine learning methods to the analysis of astronomical data.

---

#### The Role of Machine Learning in Gravitational Lensing

One of the key challenges in gravitational lensing is determining the properties of both the lensing object (e.g., a foreground galaxy) and the source (e.g., a background galaxy). Traditionally, this is done by generating large numbers of simulations and searching for those that match the observed data, which is computationally expensive. Dr. Hezaveh described how machine learning, particularly convolutional neural networks (CNNs), is transforming this process by learning the inverse mapping directly: given a distorted image, the network predicts the lens and source properties without a fresh round of simulations.

He provided an analogy: "Imagine you have a candle flame and a wineglass. If you look at the flame through the wineglass, the light bends around the glass, creating rings or arcs. In astronomy, the wineglass is the massive object causing the lensing effect, and the candle flame is the distant galaxy we're observing. Using machine learning, we can train models to predict the shape of the wineglass (the lens) based on the distorted image of the flame."
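The forward direction of this problem, simulating the distorted image given a lens and a source, is straightforward ray tracing. Below is a minimal sketch assuming a singular isothermal sphere (SIS) lens and a Gaussian blob as the background source; both are standard textbook toy choices, not Dr. Hezaveh's actual models:

```python
import numpy as np

def sis_deflection(x, y, theta_e=1.0):
    """Deflection angle of a singular isothermal sphere (SIS) lens."""
    r = np.hypot(x, y) + 1e-12          # avoid division by zero at the center
    return theta_e * x / r, theta_e * y / r

def gaussian_source(x, y, x0=0.1, y0=0.0, sigma=0.2):
    """A blobby 'candle flame': a Gaussian background galaxy."""
    return np.exp(-((x - x0) ** 2 + (y - y0) ** 2) / (2 * sigma ** 2))

def lensed_image(n=128, fov=3.0, theta_e=1.0):
    """Ray-trace each image-plane pixel back to the source plane."""
    grid = np.linspace(-fov / 2, fov / 2, n)
    tx, ty = np.meshgrid(grid, grid)    # image-plane angles (arbitrary units)
    ax, ay = sis_deflection(tx, ty, theta_e)
    bx, by = tx - ax, ty - ay           # lens equation: beta = theta - alpha
    return gaussian_source(bx, by)

img = lensed_image()
```

Pixels near the Einstein radius trace back close to the source and light up as a ring or arcs; the inverse problem, recovering the lens and source from `img` alone, is what the networks are trained to solve.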

Dr. Hezaveh emphasized that this approach not only speeds up the analysis but also maintains accuracy, which is essential for the large datasets expected from upcoming surveys like the Large Synoptic Survey Telescope (LSST).

---

#### Data Processing and Simulation Challenges

Before applying machine learning, researchers must preprocess the data to remove distortions introduced by the telescope and its environment. Cosmic rays, high-energy particles striking the camera, leave bright streaks in images, and the telescope's point spread function determines how light is blurred. Dr. Hezaveh explained that raw data must be carefully processed, including subtracting the light of the foreground galaxy itself, to isolate the lensing signal.
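As an illustration of the cosmic-ray step, here is a hedged sketch that flags pixels standing far above their local median; real pipelines are considerably more careful, and the 5-sigma threshold is an arbitrary illustrative choice:

```python
import numpy as np

def local_median(img):
    """3x3 median filter built from shifted copies (edges handled by padding)."""
    p = np.pad(img, 1, mode="edge")
    h, w = img.shape
    stack = [p[i:i + h, j:j + w] for i in range(3) for j in range(3)]
    return np.median(np.stack(stack), axis=0)

def mask_cosmic_rays(img, nsigma=5.0):
    """Flag pixels that stand far above their neighborhood, as cosmic-ray
    hits do, and replace them with the local median."""
    med = local_median(img)
    resid = img - med
    hits = resid > nsigma * np.std(resid)
    cleaned = np.where(hits, med, img)
    return cleaned, hits
```

Flagged pixels are replaced by the local median so that isolated bright hits do not masquerade as lensing features downstream.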

He also discussed the central role of simulations in training machine learning models. Real examples are scarce: only a few hundred strong gravitational lenses are currently known, far too few to train a large deep network. Simulations let researchers generate hundreds of thousands of labeled images with known ground truth, enabling robust supervised training of CNNs while avoiding overfitting. However, he noted that the simulations must be made realistic, and domain adaptation techniques may be needed, for these models to generalize to real observations.
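Making simulated training images representative of real telescope data largely means injecting the same corruptions real images carry. A minimal sketch, with illustrative and deliberately uncalibrated choices for the blur width, noise level, and cosmic-ray streaks:

```python
import numpy as np

def add_observational_effects(img, rng, psf_sigma=1.5, noise_level=0.05, n_rays=3):
    """Corrupt a clean simulated lens image with telescope-like effects:
    PSF blurring (Gaussian, applied in Fourier space), pixel noise, and
    short bright streaks standing in for cosmic-ray hits. Parameter values
    are illustrative, not calibrated to any instrument."""
    n = img.shape[0]
    # Gaussian PSF: multiply by exp(-2 pi^2 sigma^2 f^2) in frequency space
    f = np.fft.fftfreq(n)
    kx, ky = np.meshgrid(f, f)
    psf_ft = np.exp(-2 * np.pi ** 2 * psf_sigma ** 2 * (kx ** 2 + ky ** 2))
    blurred = np.real(np.fft.ifft2(np.fft.fft2(img) * psf_ft))
    # additive Gaussian pixel noise
    noisy = blurred + rng.normal(0.0, noise_level, img.shape)
    # cosmic-ray streaks: short bright horizontal segments
    for _ in range(n_rays):
        r, c = rng.integers(0, n, 2)
        length = int(rng.integers(2, 6))
        noisy[r, c:min(c + length, n)] += 10 * noise_level * (1 + rng.random())
    return noisy
```

Training on images corrupted this way is one simple way to narrow the gap between simulation and observation before resorting to heavier domain adaptation machinery.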

---

#### Recurrent Neural Networks for Lensing Analysis

In addition to CNNs, Dr. Hezaveh explored the use of recurrent neural networks (RNNs) in gravitational lensing research. Where a CNN makes a single feed-forward prediction from an image, an RNN can model a sequence of refinement steps, repeatedly comparing a candidate solution against the data and improving it.

He described an experiment where his team trained an RNN to refine initial guesses about the lensing object's properties. Starting with random parameters, the network iteratively improved its predictions by incorporating physical constraints from simulations. The results were promising: the RNN-generated images of background galaxies were more accurate than those produced by traditional methods.
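The flavor of that loop can be shown on a toy linear inverse problem. Here the learned recurrent update is replaced by a plain gradient step on the data misfit; this is a sketch of the refinement idea, not the team's actual network, and the operator, step size, and iteration count are all illustrative:

```python
import numpy as np

def refine(data, A, x0, n_steps=2000, lr=0.05):
    """Iteratively refine an estimate of the source x from data = A @ x_true.
    Each step simulates (A @ x), compares to the data, and nudges the guess
    along the gradient of 0.5 * ||A x - data||^2. In the RNN version this
    hand-written update is replaced by a learned one."""
    x = x0.copy()
    for _ in range(n_steps):
        resid = A @ x - data        # simulation minus observation
        x -= lr * (A.T @ resid)     # gradient step on the misfit
    return x

# toy "lensing" operator: a blurring matrix acting on a 1-D source
rng = np.random.default_rng(0)
n = 20
A = np.eye(n) + 0.3 * np.eye(n, k=1) + 0.3 * np.eye(n, k=-1)
x_true = rng.random(n)
data = A @ x_true
x_hat = refine(data, A, rng.normal(size=n))   # start from a random guess
```

Starting from random parameters, the estimate converges toward the true source; the appeal of the learned version is that the network discovers a far more efficient update rule from training data.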

---

#### Challenges and Future Directions

Dr. Hezaveh highlighted several challenges in applying machine learning to astrophysics. One is interpretability: researchers need to understand why a model makes a particular prediction, especially when dealing with phenomena as complex as dark matter. Another is computational cost, as training sophisticated models requires significant resources.
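One interpretability tool raised in the conversation is the saliency map: the sensitivity of the network's prediction to each input pixel, which revealed that cosmic-ray hits rather than lensing arcs were dominating an early model's decisions. A framework-free sketch, using finite differences in place of autodiff (the toy `model` below is a stand-in, not the actual network):

```python
import numpy as np

def saliency_map(model, img, eps=1e-4):
    """Finite-difference saliency: perturb each pixel and record how much the
    scalar prediction moves. Bright regions mark the pixels driving the
    decision. Real deep-learning saliency uses autodiff gradients instead."""
    base = model(img)
    sal = np.zeros_like(img)
    for idx in np.ndindex(img.shape):
        bumped = img.copy()
        bumped[idx] += eps
        sal[idx] = abs(model(bumped) - base) / eps
    return sal

# toy linear "model": its saliency should simply recover the weight map
rng = np.random.default_rng(0)
w = rng.random((8, 8))
model = lambda im: float(np.sum(w * im))
sal = saliency_map(model, rng.random((8, 8)))
```

For the linear toy model the saliency map reproduces the weights exactly, which is a convenient sanity check before trusting the technique on a real network.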

Looking ahead, he expressed optimism about the field's potential. With upcoming surveys expected to generate unprecedented amounts of data, machine learning will play a critical role in extracting meaningful insights. He also mentioned collaborations between astrophysicists and computer scientists as key to advancing the field.

---

#### Performance Metrics and Case Studies

When evaluating machine learning models for lensing analysis, Dr. Hezaveh focuses on two main metrics: speed and accuracy. Traditional maximum-likelihood modeling can take days to analyze a single lens, which would add up to centuries for the roughly 200,000 lenses expected from upcoming surveys, while trained machine learning models return results in seconds or minutes. The accuracy of these models has proven comparable to traditional analyses.

He shared a case study in which his team used CNNs to analyze data from the Hubble Space Telescope. The model matched the accuracy of manual analyses at a fraction of the cost. This success has encouraged broader adoption of machine learning techniques among astrophysicists.

---

#### Conclusion: The Future of Gravitational Lensing and Machine Learning

In conclusion, Dr. Yashar Hezaveh's work demonstrates how machine learning is transforming gravitational lensing research. By leveraging algorithms like CNNs and RNNs, researchers can process large datasets far more efficiently and gain new insights into the universe's structure.

As Dr. Hezaveh noted, "The combination of astrophysics and machine learning is still in its early stages, but the potential is enormous. We're just beginning to scratch the surface of what's possible."

For those interested in learning more about Dr. Hezaveh's research or the topics discussed in this episode, visit [TWiML AI](https://twimlai.com) for additional resources and updates.

---

This article provides a detailed exploration of gravitational lensing and machine learning, drawing on insights from Dr. Yashar Hezaveh. It highlights the challenges, opportunities, and future directions of this exciting field at the intersection of astrophysics and artificial intelligence.

"WEBVTTKind: captionsLanguage: enhello and welcome to another episode of twirl talk the podcast why interview interesting people doing interesting things in machine learning and artificial intelligence I'm your host Sam Carrington you may have seen the news yesterday that MIT researcher Katie Bowman produced the first image of a black hole what's been less reported is that the algorithm she developed to accomplish this is based on machine learning machine learning is having a huge impact in the fields of astronomy and astrophysics and I'm excited to bring you interviews with some of the people innovating in this area today we're joined by yasser hisab a assistant professor at the university of montreal and research fellow at the center for computational astrophysics at Flatiron Institute yeah sure and I caught up to discuss his work on gravitational lensing which is the bending of light from distant sources due to the effects of gravity in our conversation Joshua and I discussed how the machine learning can be applied to undistorted including some of the various techniques used and how the data is prepared to get the best results we also discussed the intertwined roles of simulation and machine learning and generating images incorporating other techniques such as domain transfer or Gans and how he assesses the results of this project for more of our astronomy and astrophysics coverage be sure to check out the following interviews - a mole talk number 117 with Chris shaloo where we discussed the discovery of exoplanets when we'll talk number 184 with Viviana aqua vive where we explore dark energy and star formation and if you want to go way back to when we'll talk number 5 with Joshua bloom which provides a great overview of the application of machine learning in astronomy I'll be sure to link to these episodes in the show notes I'd like to thank everyone who entered our AI conference and tensorflow edge device giveaways today I'm excited to announce the winner of 
our AI conference giveaway mark tee from Indiana mark I'm looking forward to seeing you in New York next week today's show is sponsored by our friends at Pegasus Tom's pega World the company's annual digital transformation conference will be held at the MGM Grand in Las Vegas from June 2nd through 5th I'll be attending the event as I did last year and I'm looking forward to presenting again in addition to hearing from me the event is a great opportunity to learn how a AI is being applied to the customer experience and real pega systems customers as a twin will listener you can use the promo code 1219 at Pegah WorldCom for $200 off of your registration again that code is 2019 hope to see you there enjoy all right everyone I am on the line with Yasser hey szávay Yassir is an assistant professor at the University of Montreal and a research fellow at the Center for computational astrophysics at Flatiron Institute yeah sure welcome to this week in machine learning and AI thanks thank you very much for inviting me let's start by talking a little bit about your background you recently joined University of Montreal as an assistant professor but tell us a little bit about the arc of your studies and research yeah so I'm an astrophysicist and for most of my research career I've been primarily doing researching as for physics I did my undergrad in physics and astrophysics and Universal Victoria in Canada and then my PhD at McGill University in Montreal yeah I got my PhD in 2013 and I moved to Stanford as a hobby fellow until recently I just moved from Stanford about three months ago and during this whole period of you know 10 years you know in as a you know graduate student and a researcher I've been working on specifically Astrophysical data analysis and I need it past couple years you know with all the you know buzz about machine learning I kind of like started you know to look into the application of machine learning methods to Astrophysical data analysis and so now a good 
fraction of my research has kind of like you know focused on on developing new machine learning methods for the analysis of Astrophysical data so like telescope data and your particular research area is focused on strong gravitational lensing what's the strong lensing is the distortion in the images of distant objects done by the gravity of intervening object structures so just think about it that you know gravity actually bends light so here on the earth we don't notice it because it's such a tiny effect that you know you don't notice it but in reality you know if you had a flashlight and you know it's flashlight the light rays instead of going like straight they would actually blend a little bit because of the gravity of the earth it's the same reason that black holes are black because they can absorb all this slide because the light falls into the black hole so at cosmological distances you can have two galaxies one sitting far away say five billion light years from us and if second out it could be like much farther away it's a 12 billion light rays but right behind this middle galaxy so we have this scenario it's us a galaxy we call it the middle galaxy we call it like the foreground and a background galaxy and so the light rates of that background galaxy as they come and they pass near the foreground galaxy they get benched they get deflected because of its gravity and so they come to us from these different angles different directions because of you know the spending of light and as a result what happens here is that here on the earth we see these distorted images of that background galaxy that look like rings and arcs around the middle galaxy so you can have you know instead of like one galaxy completely being in front of the other one you would see one galaxy and around it you would see these rings and arcs which are actually the images of the distant background galaxies and so how does the use of machine learning play into their study of these these 
lensing effects so there are two things so given an analogy about this you know I like to make an analogy to lensing of a candle flame with a wineglass so think about it you know you have a candle sitting on the table and you have a wineglass if you look at the candle flame through the foot of the wineglass you can see that the image of the flame kind of wraps around the foot of the wineglass and it makes like rings around that so that's why this is called gravitational lensing because it acts you know the galaxies like acts like a lens and so this is kind of the thing that we have and we usually have two questions in each of these for each of these observations the first thing that we want to figure out is what is the shape of the foot of the wineglass what is the distortion that is caused to the image and so this relates to you know how much matter there is in the middle galaxy so we're trying to use these image distortions to learn about how matter you need to map the distribution of matter in these lens in galaxies and then the second question is that I see a distorted image of this background source but what does it truly look like you know if I'm looking this arc that is a stretched-out image of the candle flame I'm interested to know what what does the kind of flame truly look like how could i undestood this image and so you can see that all of this kind of relates to image analysis and image processing so so one thing that you know works really well you know for us is you know this development of convolutional neural networks that are specifically tailored to image analysis problems and so we've been kind of like you know hijacking them and using them for this application to answer these two questions you know if I get it and a distorted image of these background things can I predict what you know the distortion is that's been casted and can i undistorted struct the true image of this background galaxy and so what are some of the techniques that you're 
using to do this so traditionally this is done by something called like I'll just throw the name and I'll explain what it is you know using like maximum likelihood lens modeling so the way that this works is that you say well let's think about it I have this you know candle in the background I am putting a lens in front of it so I see this distorted image you know magnified image of the candle and and I have an observation so I see that image but I need to know what that candle truly looks like nor do I know what is the soren that's been added to its image right so so like if I see that the data I cannot predict these two together but what I can do is that I can produce a lot of simulations so if you gave me an image of a candle and you gave me a lens it's easy to simulate it and say you know because you can go from one to the other so I can get a lens and predict what is the distortion that it calls us to space and then I can put the image of the candle and I can make a simulation so say you know it's a lens damaged a distorted image of the candle would look like that it's the problem is that you know it's difficult to undo this so the way to stop traditionally has been that you know I would just produce a lot of different simulations with like you know perhaps random candles and random lenses and try to find out which one of them really looks like the data and so if I find one simulation that really looks like the data you know I can kind of infer that the parameters that are assumed for it you know the background source addition for the foreground lens that ice should be a kind of a correct description of their you know realities and so this kind of like falls into this you know general umbrella like you know inverse problems where it's easy to go from the ingredients of the model like if I knew the truth about the truth about the candle and the lens I can go forward and make a simulation but the inverse problem is difficult if I he gave me the output then I 
cannot figure out you know what was the initial ingredients that it was made from so with machine learning what is exciting about it is that we can construct these inverse you know mappings so but using a lot of simulated data or true data I can learned to just kind of like directly predict what these background sources or the foreground lenses look like just directly looking at the data we doesn't need to produce in a lot of simulations for every data analysis in describing the the simulation based approach there's something kind of intuitively unsatisfying about that the idea that you're going to just randomly generate a bunch of candles and randomly a bunch of lenses and if you get something that kind of looks like or looks very close to the result that you've got you assume that it's that specific lens and candle configuration is it that the the chance that you get a good match you know without the candle and the lens being exactly right or or close so small that that gives you comfort and choosing that particular configuration or I guess this part of me that says you know there could be any number of configurations of candles and lenses that yeah that's a great question that's a great question so in a statistical way what you really do is that you would say I'm actually gonna find out every candle and every lens that will kind of produce so there might be as you said there might be like multiple answers and I'm gonna find every one of them that will match the data within my uncertainties and so it means that you have to have a statistical description of your data to understand your whether you're on certain pieces on certain these could come from for example noise and you would write a statistical model to say well this particular candle and this particular glass it fits my data to some you know you know in a probabilistic way so and this other one could also match it and that's one of the things that makes it even more difficult because the problem is not 
only even finding a single answer the problem is that now you have to kind of explore all the different answers and kind of give a range of these answers that are consistent with the data so it works well it's just that computationally it's very expensive because now it means that you have to try millions and millions of these simulations to give an answer for one specific data set and what are you assuming as as known in the way you've formulated the problem here is you kind of know what a candle looks like and you know what a lens kind of looks like how what assumptions are you making about the candle and the lens yeah that's again that's a really good question so these assumptions the language of you know statistical analysis you would call them priors so these are the prior information the prior knowledge that we have about these things and so the way that these priors are specifically encoded in the analysis could different from you know mildly different but you can imagine that for example in the case of strong lensing you want the background source to be something looking like a galaxy you know so you would impose some sort of like you know prior knowledge you would say I have a prior knowledge of the background source you know it's a galaxy so for example when I make images of this thing I will you know kind of enforce that for example it is you know it's blobby or it's concentrated at the center or that you know it's Peaks you know it's density at this you know its brightness at the center but the way that you define it could be tricky and specifically when you were doing it's you know people kind of like forward you know lens modeling procedures it is difficult to actually specify these priors one of the cool things about machine learning is that these prior information you can actually learn them from data itself and that's one of the really like great advantages of the machine learning approach is that if I had a large data set of these lenses or 
galaxies in general if I think that the background source is a galaxy I can get millions of images of other galaxies in the universe from you know all sorts of telescopes and I can put it through a machine learning procedure and I can actually learn what are the kind of structural priors that I need to respect and then try to kind of like find out what are the solutions within that kind of like in a range of you know possibilities that match this particular data set and so maybe share a little bit about the data collection and preparation aspect of these types of problems assuming the data that you're working with comes from these large radio telescopes and you're able to collect that very fairly readily but you have to do a lot of processing to it yes so we work both with radio telescopes and and right optical telescopes like you know how about Space Telescope so throughout the first like machine-learning papers that we wrote we're basically just using you know how about Space Telescope images so these images usually there's like a few steps of Korea processing that you need to do with it the telescope that comes you know from the the image that comes from the telescope might for example have a lot of cosmic rays these are just like particles you knowing around the earth that hit did you know the cameras on the telescope and they just leave these traces their high-energy particles and so you know you might need to remove those you might need to for example subtract the light from the lens and galaxy itself because remember what we're interested in is a distortion of this background galaxy and how it's been distorted one of the nuisance phase here is that the middle get up see that is distorting also has a lot of stars it has a lot of light that's added to this image so a lot of times you would try to kind of subtract this light first and and then kind of like look at the remaining the arcs that come from the backgrounds or so how do you do that you know you might 
take advantage of the fact that the background galaxies and the foreground galaxies have different colors and so use the color information from these things you might need to estimate what is the blurring of the telescope's so the images are never perfectly sharp there's some amount of blurring that's your resolution so if you have a bigger telescope you know big that our camera leaning you're going to get sharper images so that's kind of the blurring of the telescope or the point spread function so you might want to you know estimate that for their houses and all of these things so when you're doing foreign for radio data it's kind of different things but typically there's a lot of these like steps and pre-processing steps that you need to do before you even move on to that final stage where we're actually doing these simulations and comparing them I got the impression earlier that the simulation was kind of the way you used to do it and you're using machine learning as an alternative is that it sounds like that's not the case you're using the simulations and machine learning to get there kind of in conjunction with one another to solve this problem is that right yes sir they said their roles have kind of changed so in the so in in a in what I call like maximum likes with lens modeling just you know lens modeling in the traditional way if you gave me a specific data set you know one image of the gravitational lens and I wanted to analyze it I need to produce you know millions of simulations and one by one compare them to the data and then based on that comparison I will pick my next simulation to produce so you know there is a systematic kind of you know way to kind of like go through these simulations and just say well this one is not good in this particular way so the next simulation that I need to produce will look to be you know will have to be something that looks like this if candle X looks like this or whatever that's kind of the corrector you know 
direction for me to go so and then once you're done with that procedure so let's say you got your answer and you say well these are the ranges of answers from you know background candles and the foreground lenses that are consistent with the data so if you move on the next that you come and you have a new example a new data set from a new telescope and you want to get the answer so you need to go through the whole thing one more time so you need to produce like millions of simulations again for this specific system to analyze it with machine learning what we do is that we produce a lot of these simulations in one go we train a machine learning model and then we're done forever because this machine learning model from all these simulations in one go it learned how to do that mapping how to get how to predict these parameters of very interesting from the data set and then so I can apply to this data set and then tomorrow can apply to another data set and I never ever have to run another simulation again so we're using the same sort of simulations to train the machine learning methods but we only need to do it once returning to the machine learning models that you're using to predict the lensing you know one of the things that kind of immediately comes to mind when I think of imaging or processing images is deep learning and convolutional neural nets is that a part of the solution here yeah yeah so we're using a lot of deep learning and convolution neural nets for the analysis of these data that's right is what the simulation is doing is allowing you to kind of set up standard supervised learning training of CNN's or is there is that the right way to think of what you're doing yeah yeah yeah absolutely so what we were doing is that we were producing training sets where so I can I can for example pick you know a particular image of a background galaxy and I pick you know a galaxy that's suing the lens in the lensing galaxy with certain parameters for example its 
electricity or how massive it is or you know where it is in the sky and so I know the truth because this is a simulation it's a control simulation I know what the truth is and I will produce an image and so then I will put this image as the input of my convolutional neural network and train it to predict the particular outputs that I have because this is yes supervised training because I know what the truth is for this particular case and so this is you know one of the approaches that we've done for this is is exactly that for just training sets from simulated data and and trained convolutional neural nets in a supervised way with with these things and the reason that we use simulations by the way one of the things to say is that there are two reasons why we use simulations one is that we could produce really realistic looking simulations so and so in a really realistic looking simulation the good thing is that the labels that we have the truth that we have are the apps are the truth because these are actually producing a you know controlled simulation so whereas if I actually get a realistic you know a real data set from this side that has been analyzed they prediction for that itself had some error in it depending on various parameters and the second thing is that currently we only know of a few hundred gravitation lenses all together so strong gravity so we probably know off about like you know something I've ordered like five hundred lens which is really really not enough for training these like large deep networks so when we were doing simulations we would produce like half a million of these things and it would only take like you know about a day to produce this and so that gives us you know a lot of data set to avoid you know issues like overfitting do you find that the models that you produce as a result of this simulation and training process apply well to real-world images or do you need to incorporate something like domain transfer or some of these other 
techniques yeah we do so it's so this the lensing simulation aspect of in itself is fine it's just that telescopes usually have a lot of funny things happening today so there's you know various forms of noise it could be cosmic rays like what I described earlier it could be various so first of noise and element in the cosmic rays yeah so cosmic rays is another kind of like you know corruption that happens to the data so the first time that we tried it's on real data that that's what happened was that we trained it you know as CNN and then and then we looked at its predictions for real data for the first time and we knew the answer for this real data said because we had modeled it before so we had a kind of a rough comparison and we're like it's you know the answer was complete garbage and so what we did is that we took these saliency mats which means that we took kind of like we looked at the gradient of our predictions with respect to the data set so anywhere in the data that is you know making a huge contribution to our decision for what the truth you know what the what the prediction is it would kind of like you know shiny bright and immediately we notice that any word that there was a hint of kind of like a low intensity kind of dot in the image these are cosmic rays that kind of like you know put kind of like a dots or you know imprints in the images it was shining bright and we were like okay like that's kind of a sign that you know all these other corruptions that are in the data you really have impact the decisions of this key and then so the challenging aspect was exactly that that domain adaptation to to try to or you know in this case like we really simulate you know realistic looking images that bring every aspect representative of the data that we were going to analyze another technique that comes to mind for this type of a problem it sounds in in many ways to some of the problems like you know correcting missing pixels and or distortion and 
photographic images is generative models is that something that you've been looking at for this problem we have been discussing this what we haven't we haven't done anything about it so in terms of the generative part of this problem there is like to two parts of it that could be very interesting so the first one is the background sources so the background source that's being lens is the image of an actual real galaxy so the thing that we did so far was a via actually got a bunch of a large data set of real images of galaxies and so these are from you know these are gases that are not strong lines so some of them are galaxies you know in the local universe you know we have in beautiful images from the Hubble Space Telescope's some of them are more distant galaxies in the universe but you know we got like you know a hundred thousand images of galaxies and then we put those through simulations and made these arcs and you know lense them but you know we were still limited you know by these datasets so one thing is that that we've been discussing is using these generative models to produce an unless image of a galaxy just produce an image of a galaxy and then putting that through a simulation lens it hmm so for the lensing aspect of it you could also do this you can imagine the well I'll just train a generative model that produces a lens image to begin with what the thing about that is that the lensing aspect of things is easy is relatively easy you know it just involves running is something called it ray tracing simulation which is not the most efficient thing in the universe but it's not too bad it's it's not the bottleneck and but then the third aspect of it that could also get interesting with generative models is exactly the point that you brought up about all the other source corruptions that goes into data you know can I produce a generative model that actually gives these simple stimulations and adds the various effects of you know different instruments and 
**Sam Charrington:** How do you envision the future application of machine learning in this space? We just talked about generative models and the applications of those techniques, but are there other areas you see as interesting ones to explore?

**Yashar Hezaveh:** Oh yeah. This is becoming a popular topic in astrophysics, and there are a lot of young people looking into applications of it for different things. For almost any question you could use a neural network to answer it, but the real question is whether it's particularly useful in that particular case. One thing that really made it worth it for us is that in the next few years we're expecting to discover about 200,000 strong gravitational lenses. There are a few new surveys planned to come into operation — LSST, which is a huge project, and the Euclid telescope, a European satellite. These will map huge chunks of the sky; LSST in particular will map the whole visible part of the sky every three nights. They will produce a ridiculous amount of data, and we're expecting to discover upward of two hundred thousand lenses. Now, with traditional methods, if I wanted to fit a lens model to every one of those 200,000 lenses, even if it took me only two or three days to come up with the answer for one lens — which is optimistic — it would take something like fourteen hundred years. So for strong lensing it really is a matter of speed, and of the sheer number of lenses we have to analyze.
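The arithmetic behind that estimate checks out, as a quick back-of-the-envelope calculation shows (2.5 days per lens is one reading of "two, three days"):

```python
n_lenses = 200_000
days_per_lens = 2.5                       # "two, three days ... which is optimistic"
years = n_lenses * days_per_lens / 365.25
print(f"{years:.0f} years")               # ~1369 years, i.e. "fourteen hundred years"
```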
In other fields — for example, imaging of the cosmic microwave background — there are papers being written right now, people looking at applications of machine learning. But there the problem is not really speed; it's sometimes about accuracy: could you train neural networks that are more accurate than the maximum-likelihood methods, because they can deal with complex noise, for example? My general feeling is that it's becoming an active field in astrophysics, and more and more people are getting excited about it. When I started giving talks about this two years ago, I saw some level of skepticism, primarily because people worried that these are black boxes — you cannot control exactly what happens, and you don't understand why they're making the decisions they're making. But as people have seen the excellent performance of these methods, I think more and more of them are warming up to it, and a lot of research is going that way.

**Sam Charrington:** Awesome. Maybe a quick note before we close out: you mentioned the excellent performance of these methods in your case, looking at gravitational lensing. How do you characterize that performance, and what types of results have you seen?

**Yashar Hezaveh:** For us there are two main metrics of performance, and the most important one has been speed. When we train a neural network, the training itself takes just a couple of days, and after that, the estimation of the parameters of a single lens takes about a hundredth of a second on a single GPU. If you assume that the analysis of a lens of a typical galaxy would otherwise take a few days — an expert's time sitting on it, running quite a few CPUs — it's something like
a ten-million-times improvement in analysis speed. The analysis of the data set that I discussed with you earlier, for example, could be done in half an hour on a single laptop, which is orders of magnitude faster than whatever you get from traditional modeling. Then, in terms of the accuracy of the models themselves, we've shown that you can get accuracies that are basically within the uncertainties of the parameters — excellent accuracies on these predictions. Remember that we're actually interested in a range of answers that match the data, and the precision of these models is really comparable to the kinds of uncertainties we get from maximum-likelihood modeling anyway. The other thing is that there are certain cases where they can actually outperform those methods. Another direction we've taken recently is using recurrent neural networks — networks typically used for things like speech analysis, because they're good at modeling sequences of data. What we're teaching them here is to model a sequence of steps. For example, we're interested in predicting the distribution of matter in these lensing galaxies, and we'll start from an unknown, random guess and take a series of steps to get closer to the answer. Maybe the first guess is that this galaxy has some ellipticity in some direction and some mass, and then I refine my answer as I go. So we're putting these through a recurrent neural network — this particular architecture is called a recurrent inference machine — and at every step it looks at its own prediction and then passes that through a module that uses the actual physical model and updates its answer.
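The iterative scheme being described can be sketched with a linear toy problem. This is only the skeleton of a recurrent inference machine: the operator `A` stands in for the physical (lensing) forward model, and the "learned" update cell is replaced here by a plain gradient step — in the real architecture that update (plus a hidden state) is a trained recurrent network.

```python
import numpy as np

rng = np.random.default_rng(0)

# Known physical forward model (stand-in for lensing + instrument effects).
A = rng.normal(size=(40, 20)) / np.sqrt(40)
x_true = rng.normal(size=20)      # the unknown source/lens parameters
data = A @ x_true                 # noiseless observation, for simplicity

step = 1.0 / np.linalg.eigvalsh(A.T @ A).max()   # safe step size

def update_cell(x, grad):
    # Stand-in for the trained recurrent cell: a fixed gradient step.
    # A real RIM learns this mapping (and carries a hidden state).
    return x - step * grad

x = np.zeros(20)                  # "start from an unknown guess"
for _ in range(1000):
    grad = A.T @ (A @ x - data)   # how the physical model says to move
    x = update_cell(x, grad)      # refine the answer step by step
```

The point of the learned cell is to take far fewer, far smarter steps than the fixed gradient descent shown here, while the physical model keeps every step consistent with the data.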
One thing we've been showing is that this can actually predict background-source images — the undistorted images of the background sources — that are a better representation of the data than the maximum-likelihood models. The reason is the same thing you mentioned at the very beginning of the talk, about priors: this network can learn the complex prior of what a background source — what a galaxy — really looks like from the training data set. Whereas when you try to define that in a statistical way, it's really difficult to say, on a pixel-by-pixel basis, what a galaxy looks like. If I show you something that has a little more fluctuation here, or is a little more spread out, what score do you give it to say how galaxy-like it is? The networks can learn that from the training data, and they perform really, really well.

**Sam Charrington:** Sure. Thanks so much for taking the time to share this with us — really interesting work. I always love talking to folks working on astrophysics and cosmology; the scale of the use cases is just enormous.

**Yashar Hezaveh:** It was really fun talking to you. I didn't expect this to become kind of a high-level conversation — I thought I'd be throwing around words like CNN or RNN — but it was just fun for me too.

**Sam Charrington:** Fantastic. Thanks so much, Yashar.

All right, everyone, that's our show for today. For more information on Yashar or any of the topics covered in this show, visit twimlai.com/talk/250. Make sure to register for PegaWorld using the code 1219 for $200 off of registration at pegaworld.com. As always, thanks so much for listening, and catch you next time.
---

Hello and welcome to another episode of TWiML Talk, the podcast where I interview interesting people doing interesting things in machine learning and artificial intelligence. I'm your host, Sam Charrington.

You may have seen the news yesterday that MIT researcher Katie Bouman produced the first image of a black hole. What's been less reported is that the algorithm she developed to accomplish this is based on machine learning. Machine learning is having a huge impact in the fields of astronomy and astrophysics, and I'm excited to bring you interviews with some of the people innovating in this area.

Today we're joined by Yashar Hezaveh, an assistant professor at the University of Montreal and a research fellow at the Center for Computational Astrophysics at the Flatiron Institute. Yashar and I caught up to discuss his work on gravitational lensing, which is the bending of light from distant sources due to the effects of gravity. In our conversation, we discussed how machine learning can be applied to undistort gravitationally lensed images, including some of the various techniques used and how the data is prepared to get the best results. We also discussed the intertwined roles of simulation and machine learning in generating images, incorporating other techniques such as domain transfer and GANs, and how he assesses the results of this project.

For more of our astronomy and astrophysics coverage, be sure to check out TWiML Talk #117 with Chris Shallue, where we discussed the discovery of exoplanets; TWiML Talk #184 with Viviana Acquaviva, where we explored dark energy and star formation; and, if you want to go way back, TWiML Talk #5 with Joshua Bloom, which provides a great overview of the application of machine learning in astronomy. I'll be sure to link to these episodes in the show notes.

I'd like to thank everyone who entered our AI conference and TensorFlow edge-device giveaways. Today I'm excited to announce the winner of our AI conference
giveaway: Mark T. from Indiana. Mark, I'm looking forward to seeing you in New York next week.

Today's show is sponsored by our friends at Pegasystems. PegaWorld, the company's annual digital transformation conference, will be held at the MGM Grand in Las Vegas from June 2nd through 5th. I'll be attending the event, as I did last year, and I'm looking forward to presenting again. In addition to hearing from me, the event is a great opportunity to learn how AI is being applied to the customer experience at real Pegasystems customers. As a TWiML listener, you can use the promo code 1219 at pegaworld.com for $200 off of your registration. Hope to see you there. Enjoy!

**Sam Charrington:** All right everyone, I am on the line with Yashar Hezaveh. Yashar is an assistant professor at the University of Montreal and a research fellow at the Center for Computational Astrophysics at the Flatiron Institute. Yashar, welcome to This Week in Machine Learning and AI.

**Yashar Hezaveh:** Thanks — thank you very much for inviting me.

**Sam Charrington:** Let's start by talking a little bit about your background. You recently joined the University of Montreal as an assistant professor, but tell us a little bit about the arc of your studies and research.

**Yashar Hezaveh:** I'm an astrophysicist, and for most of my research career I've been primarily doing research in astrophysics. I did my undergrad in physics and astrophysics at the University of Victoria in Canada, and then my PhD at McGill University in Montreal; I got my PhD in 2013 and moved to Stanford as a Hubble fellow. I just moved from Stanford about three months ago. During this whole period of about ten years, as a graduate student and a researcher, I've been working specifically on astrophysical data analysis, and in the past couple of years, with all the buzz about machine learning, I started to look into applying machine learning methods to astrophysical data analysis. So now a good fraction of my research is focused on developing new machine learning methods for the analysis of astrophysical data — telescope data.
**Sam Charrington:** And your particular research area is focused on strong gravitational lensing. What's that?

**Yashar Hezaveh:** Strong lensing is the distortion in the images of distant objects caused by the gravity of intervening structures. Gravity actually bends light. Here on Earth we don't notice it, because it's such a tiny effect, but in reality, if you had a flashlight, the light rays, instead of going straight, would actually bend a little bit because of the gravity of the Earth. It's the same reason that black holes are black: the light falls into the black hole. At cosmological distances, you can have two galaxies — one sitting far away, say five billion light-years from us, and a second that could be much farther away, say twelve billion light-years, but sitting right behind the first. So we have this scenario: us, a middle galaxy — we call it the foreground — and a background galaxy. As the light rays of that background galaxy pass near the foreground galaxy, they get bent, they get deflected, because of its gravity, and so they come to us from different angles, different directions. As a result, here on Earth we see distorted images of that background galaxy that look like rings and arcs around the middle galaxy. Instead of one galaxy sitting completely in front of the other, you see one galaxy, and around it these rings and arcs, which are actually the images of the distant background galaxy.

**Sam Charrington:** So how does the use of machine learning play into the study of these lensing effects?
**Yashar Hezaveh:** Let me start with an analogy. I like to compare this to the lensing of a candle flame by a wineglass. Think about a candle sitting on a table. If you look at the candle flame through the foot of the wineglass, you can see the image of the flame wrap around the foot of the glass, making rings around it. That's why this is called gravitational lensing: the galaxy acts like a lens. We usually have two questions for each of these observations. The first thing we want to figure out is the shape of the foot of the wineglass — what distortion is being caused to the image. That relates to how much matter there is in the middle galaxy: we're trying to use these image distortions to map the distribution of matter in the lensing galaxies. The second question is: I see a distorted image of this background source, but what does it truly look like? If I'm looking at an arc that is a stretched-out image of the candle flame, I want to know what the flame truly looks like — how could I undistort this image? You can see that all of this relates to image analysis and image processing. So one thing that works really well for us is the development of convolutional neural networks, which are specifically tailored to image-analysis problems. We've been hijacking them and using them for this application, to answer those two questions: given a distorted image of a background source, can I predict what distortion has been cast on it, and can I reconstruct the true image of the background galaxy?

**Sam Charrington:** And what are some of the techniques that you're using to do this?
**Yashar Hezaveh:** Traditionally, this is done by something called maximum-likelihood lens modeling — I'll throw out the name and then explain what it is. The way it works is this. I have a candle in the background and a lens in front of it, so I see a distorted, magnified image of the candle. I have that observation, but I don't know what the candle truly looks like, nor do I know what distortion has been added to its image; looking at the data alone, I cannot predict the two together. What I can do is produce a lot of simulations. If you gave me an image of a candle and you gave me a lens, it's easy to simulate, because you can go in the forward direction: I take the lens, predict the distortion it causes, put in the image of the candle, and make a simulation — "the lensed, distorted image of the candle would look like that." The problem is that it's difficult to undo this. So the traditional approach has been to produce a lot of different simulations — with, say, random candles and random lenses — and try to find the one that really looks like the data. If I find one simulation that really looks like the data, I can infer that the parameters assumed for it — the background source and the foreground lens — should be a roughly correct description of reality. This falls under the general umbrella of inverse problems: it's easy to go from the ingredients of the model to a simulation — if I knew the truth about the candle and the lens, I could go forward — but the inverse is difficult: if you gave me the output, I could not figure out what initial ingredients it was made from.
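That try-many-simulations loop can be illustrated with a deliberately tiny, hypothetical forward model: a 1-D "arc" whose position plays the role of the lens (an Einstein-radius-like parameter) and whose width plays the role of the candle (the source size).

```python
import numpy as np

rng = np.random.default_rng(1)

def simulate(theta_e, sigma, n=50):
    # Hypothetical 1-D stand-in for a lensing simulation: an "arc" whose
    # position depends on theta_e and whose width depends on sigma.
    x = np.linspace(-2, 2, n)
    return np.exp(-(np.abs(x) - theta_e) ** 2 / (2 * sigma ** 2))

# "Observed" data made from hidden true parameters plus noise.
true_params = (1.0, 0.25)
data = simulate(*true_params) + rng.normal(0, 0.02, 50)

# Try many random candidate (lens, source) configurations and keep the
# one whose simulation best matches the data (a chi-square comparison).
best, best_chi2 = None, np.inf
for _ in range(20_000):
    cand = (rng.uniform(0.2, 1.8), rng.uniform(0.05, 0.6))
    chi2 = np.sum((simulate(*cand) - data) ** 2)
    if chi2 < best_chi2:
        best, best_chi2 = cand, chi2
# `best` lands close to the hidden truth (1.0, 0.25).
```

Even in this toy, 20,000 forward simulations are spent on a single "data set" — which is exactly the cost the machine learning approach described next amortizes away.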
With machine learning, what's exciting is that we can construct these inverse mappings. Using a lot of simulated data, or real data, a model can learn to directly predict what the background sources or the foreground lenses look like, just by looking at the data — we don't need to produce a lot of simulations for every new analysis.

**Sam Charrington:** In describing the simulation-based approach, there's something intuitively unsatisfying about the idea that you randomly generate a bunch of candles and a bunch of lenses, and if you get something that looks very close to the result you've got, you assume it's that specific lens-and-candle configuration. Is it that the chance of getting a good match without the candle and the lens being exactly right, or close, is so small that it gives you comfort in choosing that configuration? Part of me says there could be any number of configurations of candles and lenses.

**Yashar Hezaveh:** That's a great question. In a statistical treatment, what you really do is say: I'm going to find every candle and every lens that could produce this. As you said, there might be multiple answers, and I'm going to find every one of them that matches the data within my uncertainties. That means you need a statistical description of your data, to understand where your uncertainties come from — for example, from noise — and you write a statistical model that says "this particular candle and this particular glass fit my data" in a probabilistic way, and this other one could also match it.
That's one of the things that makes it even more difficult: the problem is not just finding a single answer; you have to explore all the different answers and give the range of answers that are consistent with the data. It works well — it's just computationally very expensive, because it means trying millions and millions of simulations to give an answer for one specific data set.

**Sam Charrington:** And what are you assuming as known? In the way you've formulated the problem, you kind of know what a candle looks like and what a lens looks like. What assumptions are you making about the candle and the lens?

**Yashar Hezaveh:** Again, that's a really good question. In the language of statistical analysis, these assumptions are called priors — the prior information, the prior knowledge that we have about these things. The way these priors are encoded in the analysis can differ slightly, but you can imagine that, for example, in the case of strong lensing, you want the background source to be something that looks like a galaxy. So you impose some prior knowledge: you say, "I know the background source is a galaxy, so when I make images of it, I'll enforce that it's blobby, or concentrated at the center, or that its brightness peaks at the center." But the way you define that can be tricky, and specifically when you're doing these forward lens-modeling procedures, it is difficult to actually specify these priors. One of the cool things about machine learning is that you can actually learn this prior information from the data itself. That's one of the really great advantages of the machine learning approach.
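In symbols, the priors being described enter in the standard Bayesian way: the range of acceptable candles and lenses is weighted both by how well each candidate fits the data and by how plausible it is a priori,

```latex
p(\theta \mid d) \;\propto\;
\underbrace{p(d \mid \theta)}_{\text{likelihood: fit to the data}}
\;\times\;
\underbrace{p(\theta)}_{\text{prior: what lenses and galaxies look like}}
```

where $\theta$ stands for the lens and source parameters and $d$ for the observed image. The difficulty discussed here is writing down a pixel-level $p(\theta)$ for "looks like a galaxy" by hand — which is exactly the piece a learned model can supply.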
If I think the background source is a galaxy, and I have a large data set — millions of images of other galaxies in the universe, from all sorts of telescopes — I can put them through a machine learning procedure and actually learn the structural priors I need to respect, and then find the solutions, within that range of possibilities, that match a particular data set.

**Sam Charrington:** Maybe share a little bit about the data collection and preparation aspect of these types of problems. I assume the data you're working with comes from large telescopes and you're able to collect it fairly readily, but you have to do a lot of processing to it.

**Yashar Hezaveh:** Yes. We work with both radio telescopes and optical telescopes like the Hubble Space Telescope; for the first machine learning papers that we wrote, we were basically using Hubble Space Telescope images. There are usually a few pre-processing steps you need to apply. The image that comes from the telescope might, for example, have a lot of cosmic rays — high-energy particles flying around the Earth that hit the cameras on the telescope and leave traces — and you might need to remove those. You might need to subtract the light of the lensing galaxy itself, because remember, what we're interested in is the distortion of the background galaxy. One of the nuisances here is that the middle galaxy doing the distorting also has a lot of stars, a lot of light that's added to this image, so a lot of the time you first try to subtract that light and then look at the remaining arcs that come from the background source.
How do you do that? You might take advantage of the fact that the background galaxy and the foreground galaxy have different colors, and use that color information. You might also need to estimate the blurring of the telescope: images are never perfectly sharp — there's some amount of blurring that sets your resolution, and a bigger telescope or a better camera gives you sharper images. That blurring of the telescope is the point-spread function, and you might want to estimate it for these analyses. For radio data it's a somewhat different set of things, but typically there are a lot of these pre-processing steps you need to do before you even move on to the final stage, where you actually run the simulations and compare them.

**Sam Charrington:** I got the impression earlier that simulation was the way you used to do it and you're using machine learning as an alternative, but it sounds like that's not the case — you're using the simulations and machine learning in conjunction with one another to solve this problem. Is that right?

**Yashar Hezaveh:** Yes — their roles have changed. In what I call maximum-likelihood lens modeling — lens modeling the traditional way — if you gave me one specific data set, one image of a gravitational lens, and I wanted to analyze it, I would need to produce millions of simulations and compare them to the data one by one, and based on each comparison I would pick the next simulation to produce. There's a systematic way of going through these simulations: "this one is not good in this particular way, so the next simulation I produce has to look like this" — that's the correct direction for me to go.
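The point-spread-function step mentioned a moment ago can be sketched like this. A Gaussian PSF is a common simple approximation (real HST PSFs are more structured), and all the sizes here are arbitrary:

```python
import numpy as np

def gaussian_psf(size=15, fwhm=4.0):
    # Toy point-spread function: telescope blurring modeled as a Gaussian.
    sigma = fwhm / 2.355
    ax = np.arange(size) - size // 2
    xx, yy = np.meshgrid(ax, ax)
    psf = np.exp(-(xx ** 2 + yy ** 2) / (2 * sigma ** 2))
    return psf / psf.sum()           # normalize so total flux is preserved

def observe(sky, psf):
    # FFT-based convolution: what the telescope records is the true sky
    # blurred by its point-spread function (noise omitted here).
    return np.real(np.fft.ifft2(np.fft.fft2(sky) * np.fft.fft2(psf, s=sky.shape)))

sky = np.zeros((64, 64))
sky[32, 32] = 1.0                    # a point source, e.g. a star
img = observe(sky, gaussian_psf())
# The delta-function star becomes a spread-out blob; total flux is unchanged.
```

In analysis the same operator is applied to every simulated lens image, so that simulations and data are compared at the same effective resolution.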
And then, once you're done with that procedure — say you've got your answer, the ranges of background candles and foreground lenses consistent with the data — you move on to the next example, a new data set from a new telescope, and you need to go through the whole thing one more time: millions of simulations again, for that specific system. With machine learning, what we do is produce a lot of these simulations in one go, train a machine learning model, and then we're done forever, because from all those simulations in one go the model has learned the mapping — how to predict the parameters we're interested in from the data. I can apply it to this data set today and to another data set tomorrow, and I never have to run another simulation again. We're using the same sort of simulations to train the machine learning methods, but we only need to do it once.

**Sam Charrington:** Returning to the machine learning models that you're using to predict the lensing: one of the things that immediately comes to mind when I think of processing images is deep learning and convolutional neural nets. Is that a part of the solution here?

**Yashar Hezaveh:** Yeah, we're using a lot of deep learning and convolutional neural nets for the analysis of these data, that's right.

**Sam Charrington:** Is what the simulation is doing allowing you to set up standard supervised training of CNNs? Is that the right way to think of what you're doing?

**Yashar Hezaveh:** Yeah, absolutely. We produce training sets where I pick, for example, a particular image of a background galaxy and a lensing galaxy with certain parameters — its ellipticity, how massive it is, or where it is in the sky.
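This simulate-then-supervise recipe can be made concrete with a toy version: simulate many images with known labels, then fit a predictor. Everything here is hypothetical — a ring-shaped "lens image" generator and a linear least-squares fit standing in for the CNN used in the actual work:

```python
import numpy as np

rng = np.random.default_rng(2)

def simulate_image(theta_e, n=16):
    # Hypothetical mini "lens image": a ring of radius theta_e, plus noise.
    ax = np.linspace(-2, 2, n)
    xx, yy = np.meshgrid(ax, ax)
    r = np.hypot(xx, yy)
    return np.exp(-(r - theta_e) ** 2 / 0.1) + rng.normal(0, 0.01, (n, n))

# Supervised training set: simulated images as inputs, true labels as targets.
labels = rng.uniform(0.5, 1.5, 2000)
X = np.stack([simulate_image(t).ravel() for t in labels])

# A linear least-squares regressor stands in for the CNN.
X1 = np.hstack([X, np.ones((len(X), 1))])     # add a bias column
w, *_ = np.linalg.lstsq(X1, labels, rcond=None)

# Once "trained", predicting the parameter for a new lens is a single
# matrix product -- no per-object simulations needed.
test_img = simulate_image(1.2).ravel()
pred = np.append(test_img, 1.0) @ w
```

The one-time training cost replaces the per-object millions of simulations, which is the whole economic argument made in this part of the conversation.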
And I know the truth, because this is a controlled simulation. I produce an image, put that image in as the input to my convolutional neural network, and train it to predict those particular outputs. This is supervised training, because I know what the truth is for each case. So one of the approaches we've taken is exactly that: build training sets from simulated data and train convolutional neural nets in a supervised way. By the way, there are two reasons we use simulations. One is that we can produce really realistic-looking simulations, and in a controlled simulation the labels we have really are the truth, whereas if I took a real data set that had already been analyzed, the prediction for it would itself carry some error, depending on various parameters. The second is that currently we only know of a few hundred strong gravitational lenses altogether — probably on the order of five hundred — which is really not enough for training these large deep networks. With simulations we can produce half a million of these, and it only takes about a day, which gives us a large enough data set to avoid issues like overfitting.

**Sam Charrington:** Do you find that the models you produce as a result of this simulation and training process apply well to real-world images, or do you need to incorporate something like domain transfer or some of these other techniques?
this the lensing simulation aspect of in itself is fine it's just that telescopes usually have a lot of funny things happening today so there's you know various forms of noise it could be cosmic rays like what I described earlier it could be various so first of noise and element in the cosmic rays yeah so cosmic rays is another kind of like you know corruption that happens to the data so the first time that we tried it's on real data that that's what happened was that we trained it you know as CNN and then and then we looked at its predictions for real data for the first time and we knew the answer for this real data said because we had modeled it before so we had a kind of a rough comparison and we're like it's you know the answer was complete garbage and so what we did is that we took these saliency mats which means that we took kind of like we looked at the gradient of our predictions with respect to the data set so anywhere in the data that is you know making a huge contribution to our decision for what the truth you know what the what the prediction is it would kind of like you know shiny bright and immediately we notice that any word that there was a hint of kind of like a low intensity kind of dot in the image these are cosmic rays that kind of like you know put kind of like a dots or you know imprints in the images it was shining bright and we were like okay like that's kind of a sign that you know all these other corruptions that are in the data you really have impact the decisions of this key and then so the challenging aspect was exactly that that domain adaptation to to try to or you know in this case like we really simulate you know realistic looking images that bring every aspect representative of the data that we were going to analyze another technique that comes to mind for this type of a problem it sounds in in many ways to some of the problems like you know correcting missing pixels and or distortion and photographic images is generative models is 
that something that you've been looking at for this problem we have been discussing this what we haven't we haven't done anything about it so in terms of the generative part of this problem there is like to two parts of it that could be very interesting so the first one is the background sources so the background source that's being lens is the image of an actual real galaxy so the thing that we did so far was a via actually got a bunch of a large data set of real images of galaxies and so these are from you know these are gases that are not strong lines so some of them are galaxies you know in the local universe you know we have in beautiful images from the Hubble Space Telescope's some of them are more distant galaxies in the universe but you know we got like you know a hundred thousand images of galaxies and then we put those through simulations and made these arcs and you know lense them but you know we were still limited you know by these datasets so one thing is that that we've been discussing is using these generative models to produce an unless image of a galaxy just produce an image of a galaxy and then putting that through a simulation lens it hmm so for the lensing aspect of it you could also do this you can imagine the well I'll just train a generative model that produces a lens image to begin with what the thing about that is that the lensing aspect of things is easy is relatively easy you know it just involves running is something called it ray tracing simulation which is not the most efficient thing in the universe but it's not too bad it's it's not the bottleneck and but then the third aspect of it that could also get interesting with generative models is exactly the point that you brought up about all the other source corruptions that goes into data you know can I produce a generative model that actually gives these simple stimulations and adds the various effects of you know different instruments and telescopes and gives me images that are 
---

#### The Future of Machine Learning in Astrophysics

Asked how he envisions future applications of machine learning in this space, Dr. Hezaveh observed that it is becoming a popular topic in astrophysics, with many young researchers applying it to different problems. For almost any question you could use a neural network to produce an answer; the real question is whether it is particularly useful in a given case.

For his team, the deciding factor was scale. Several planned surveys, including the LSST project and Euclid, a European space telescope, will map huge chunks of the sky; LSST in particular will map the entire visible sky every three nights, producing an enormous volume of data, and upward of 200,000 strong gravitational lenses are expected to be discovered in the next few years. With traditional methods, fitting a lens model to a single system takes two to three days, and even that is optimistic; at that rate, analyzing 200,000 lenses would take roughly fourteen hundred years. For strong lensing, then, the case for machine learning is fundamentally about speed and the sheer number of lenses to analyze.

In other fields the motivation is different. In imaging of the Cosmic Microwave Background, for example, papers are being written on machine learning applications, but the problem there is not really speed; it is often accuracy: can neural networks be trained to outperform maximum likelihood methods because they can cope with complex noise?
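The time estimate quoted above is easy to verify with back-of-the-envelope arithmetic (the inputs are the rough figures from the conversation, not precise survey forecasts):

```python
n_lenses = 200_000       # expected strong lenses from upcoming surveys
days_per_lens = 2.5      # optimistic traditional-modeling time per lens
years = n_lenses * days_per_lens / 365.25
print(round(years))      # → 1369, i.e. the "fourteen hundred years" quoted
```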
Dr. Hezaveh's general feeling is that machine learning is becoming an active field in astrophysics, with more and more people getting excited about it. When he started giving talks on the subject two years ago, he met some skepticism, mainly the worry that neural networks are black boxes: you cannot control exactly what happens, and you do not understand why they make the decisions they make. As people have seen the excellent performance of these methods, however, more and more have warmed to them.

---

#### Characterizing the Performance

Before closing, Sam asked how Dr. Hezaveh characterizes that performance in the gravitational-lensing setting. He described two metrics. The most important has been speed: training the network takes only a couple of days, and afterward, estimating the parameters of a single lens takes about a hundredth of a second on a single GPU. Compared with the few days of expert time and the many CPUs a traditional analysis of a typical lensed galaxy requires, that is roughly a ten-million-fold improvement in analysis speed.
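The quoted speedup follows directly from the two timings (taking "a few days" as three days; both numbers are the rough figures from the conversation):

```python
traditional_s = 3 * 24 * 3600   # "a few days" of traditional modeling, in seconds
cnn_s = 0.01                    # ~a hundredth of a second per lens on one GPU
print(f"{traditional_s / cnn_s:.1e}")  # → 2.6e+07, on the order of ten million
```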
At that rate, the analysis of the dataset he discussed earlier in the episode could be completed in half an hour on a single laptop, orders of magnitude faster than traditional modeling. The second metric is accuracy: the models yield predictions whose errors fall within the parameter uncertainties themselves. Since the networks report a range of answers consistent with the data, their precision is comparable to the uncertainties obtained from maximum likelihood modeling anyway.

There are even cases where neural networks can outperform traditional methods. One recent direction uses recurrent neural networks, architectures typically applied to speech because they are good at modeling sequences of data. Here the sequence being modeled is a series of refinement steps. To predict the distribution of matter in a lens, for example, the network starts from a random guess, perhaps that the galaxy has a certain ellipticity in some direction and a certain mass, and then refines the answer step by step. This particular architecture, called the recurrent inference machine, looks at its own prediction at every iteration, passes it through a component that uses the actual physical model, and updates its answer.
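The iterate-and-refine loop of the recurrent inference machine can be caricatured as follows. In this self-contained sketch, the learned recurrent update is replaced by plain gradient descent on a stand-in forward model; in the real architecture, a recurrent network proposes each update after seeing the current prediction and the physics-based comparison with the data.

```python
import numpy as np

rng = np.random.default_rng(1)

# Stand-in "physical model": maps two parameters (think Einstein radius
# and a flux gradient) to a 1-D mock observation. Purely illustrative.
x_grid = np.linspace(-1, 1, 50)

def forward_model(params):
    a, b = params
    return a * np.exp(-2 * x_grid**2) + b * x_grid

true_params = np.array([1.3, 0.4])
data = forward_model(true_params)

params = rng.normal(size=2)           # random initial guess
for step in range(200):
    residual = forward_model(params) - data
    # gradient of 0.5 * ||residual||^2 w.r.t. (a, b), written out by hand
    grad = np.array([
        np.sum(residual * np.exp(-2 * x_grid**2)),
        np.sum(residual * x_grid),
    ])
    params -= 0.01 * grad             # a real RIM would *learn* this update
print(np.round(params, 2))            # → close to [1.3, 0.4]
```

The key idea survives the simplification: the physical model sits inside the loop, and the answer improves over a sequence of steps rather than in a single forward pass.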
One result the team has shown is that this approach can predict background source images, the undistorted images of the lensed galaxies, that represent the data better than maximum likelihood models do. The reason goes back to the discussion of priors at the beginning of the conversation: the network can learn the complex prior of what a galaxy really looks like directly from the training dataset. Defining that statistically, on a pixel-by-pixel basis, is genuinely difficult; if an image shows a few more fluctuations here, or is a little more spread out, what score should it receive for how galaxy-like it is? Neural networks can learn exactly that from training data and perform very well.

---

#### Closing

Sam thanked Dr. Hezaveh for sharing the work, noting that the use cases in astrophysics and cosmology operate at an enormous scale. Dr. Hezaveh replied that the conversation was fun and had stayed more high-level than he expected, without leaning on terms like CNN or RNN.

For more information on Dr. Hezaveh or any of the topics covered in this episode, visit twimlai.com/talk/250.