Develop an AI to play Connect Four

Develop an AI to play Connect Four - Python Tutorial

The Development of Alpha Beta Pruning in the Minimax Algorithm

Creating an AI for a game like Connect Four involves implementing a minimax algorithm with alpha-beta pruning to optimize performance and efficiency. In this article, we will explore how to implement alpha-beta pruning in the context of Connect Four.

Defining the Mini Max Function

---------------------------------

The first step is to define a mini-max function that takes into account the current state of the game board and the possible moves that can be made. This function should return the maximum or minimum value of the game state, depending on whether it's the maximizing player's turn or not. In our case, we are implementing the minimax algorithm from the perspective of the minimizing player.

We start by initializing `alpha` and `beta` to negative infinity and math infinity, respectively. These values will be updated as we iterate through the game tree.

Implementing Alpha Beta Pruning

---------------------------------

The key insight behind alpha-beta pruning is that we don't need to evaluate the entire game tree in order to determine whether a particular move is suboptimal. By keeping track of `alpha` and `beta`, we can prune branches of the game tree that are guaranteed not to be part of the optimal solution.

Here's how it works: when evaluating a node in the game tree, we update `alpha` and `beta` as follows:

* If `alpha` is greater than or equal to `beta`, then we can prune the branch and move on to the next node.

* Otherwise, we continue evaluating the node.

In our implementation, we use the following code snippet to demonstrate this process:

```c

alpha = max(alpha, value); // Update alpha if necessary

if (alpha >= beta) {

break; // Prune the branch if possible

}

```

The New Mini Max Function with Alpha Beta

------------------------------------------

Now that we have a better understanding of how alpha-beta pruning works, let's implement it in our mini-max function. We'll update the `value` variable to take into account both the `alpha` and `beta` values.

Here's the updated code snippet:

```c

// Define the new mini-max function with alpha-beta

int miniMax(int board[10][7], int depth, bool isMaximizingPlayer) {

// Initialize alpha and beta

double alpha = -math.inf;

double beta = math.inf;

// Evaluate the game tree and update alpha and beta

// ...

}

```

Testing the Alpha Beta Pruning Implementation

-----------------------------------------------

Now that we have implemented alpha-beta pruning, let's test it by running a few iterations of the game. We can use the following code snippet to simulate a game:

```c

int main() {

int board[10][7]; // Initialize the game board

for (int i = 0; i < 10; i++) {

for (int j = 0; j < 7; j++) {

board[i][j] = 0; // Initialize the board with zeros

}

miniMax(board, 5, true); // Run a few iterations of the game

return 0;

}

```

Tuning the Algorithm and Improving Performance

-------------------------------------------------

While alpha-beta pruning is an effective optimization technique for the minimax algorithm, there are still some ways to improve its performance. Here are a few ideas:

* **Weighting lower values**: In our current implementation, we weight all values equally. However, if you think of the game as having "lower" rows that are more likely to be filled first, you can assign higher weights to those values.

* **Watching even-odd Connect Four strategies**: Implementing an even-odd strategy for Connect Four could provide a significant performance boost.

* **Tuning scores and variables**: By adjusting the scores and variables used in the algorithm, we may be able to further improve its performance.

Conclusion

----------

In conclusion, alpha-beta pruning is a powerful optimization technique that can significantly improve the performance of the minimax algorithm. By implementing this technique in our Connect Four AI, we were able to achieve better performance and efficiency. However, there are still some ways to further optimize the algorithm, such as weighting lower values or implementing an even-odd strategy for Connect Four.

Note: The code snippets provided above are simplified examples and may not be suitable for production use without additional modifications and optimizations.

"WEBVTTKind: captionsLanguage: enhow's it going everyone and welcome back to another video this video is long overdue but today we will be programming an expert level connect for ai which is gonna be a lot of fun if you haven't already i recommend you check out the video i did before like a while ago on i guess just programming the connect four game and what we're gonna be doing today is building off of that and actually implementing the ai um if you don't wanna like go through all the effort of what like programming the game from scratch you also can just go to my github page and i'll link to this in the description but there's this file connect for py so if you just click on the raw there and just copy all of this this is going to be the starting point from this video i'm going to be programming in python 3 and i'm using sublime text as my editor so i'm just gonna call this connect4original.py also if you haven't seen it already i really recommend you watch my video that i did called like how does a board game ai work because basically what we're going to be doing in this video is following along kind of the steps that i go through in that video on the theory behind a connect4ai so it really will help you with the background information on like how everything works because it can be a little bit complicated at times once you have the file locally just make sure real quick that it runs properly so when you run this original file you should ah why is it not running you should get a screen that pops up looks like connect four board and you just take turns being the yellow and red player dropping pieces i'll finish this up real quick and it should do something like that when you you win uh you don't have any errors to start off so just make sure that everything's working properly before we actually get into the ai and with this ai we're going to start off just like i did in the how does a board game ai video work with the simplest approach we can think of for the ai so we're just gonna have it the player two just drop a piece randomly somewhere on the screen so that's what we're gonna start out with before we begin take a second to review the code and make sure you kind of understand how all the pieces fit together because as we go through the steps of the ai if you don't really understand how the game is working it's going to be kind of confusing the steps that we're implementing and to start off we are implementing the random dropped piece ai so we're gonna have to go down to our game loop which is here and in the initial game we were checking for the mouse button down event and then based on whose turn it was so turn equals zero was the red piece i think and then the else statement was player 2 and that was the yellow piece we are making the drops based off of that in this random ai situation we are now not going to have player 2 drop when a mouse click happens is gonna just we'll have player two drop whenever um right after player one has made the move so once player one makes the move that should trigger player two to make the move and to kind of make some of this clear i don't like having like turn equals equals zero popping up in the code all over the place these are called magic numbers so what i'm going to do real quick is just to make things a little bit more clear is i'm going to go up here and define a static variable player which equals zero and i'm going to have a static variable or kind of just a yeah a variable called a i equal to one and i'll just when i making these variables will make it a little bit easier for me to read the code and also as we implement some of these functions with the ai it'll make it a little bit easier for us to understand how to implement it so now turn equals equals 0 is turn equals equals player and then for this we don't want it in the let me make this a little bit smaller just for a sec oh this can be too small i'll make it bigger but we don't want it to be in the if statement if mouse button down anymore we want it to be outside of this for loop because it shouldn't depend on event should just depend on when this i guess happens which we can store separately so we're going to move this back a few okay so that now should be in line with the 4 event and instead of an else we can't have an else without an if above it so what we're going to do now is we're going to do if turn equals equals a i and i guess we'll also do and not game over because we don't want this a piece to drop after the game has already been finished so if journey equals equals ai and not game over then we have the random ai making the decision and let's see there might be some other changes we have to make now that we did that one thing we might want to do is throw the print board somewhere differently actually it's good right here and let's see where else it was i didn't actually move it okay so i'm gonna move this stuff outside of the loop too let's see okay i'm thinking about this okay if game over that should be in the loop the turn what we're gonna do with the turns now is only have them change when a valid location was chosen so here the turn is changing and same thing for the player turn so if valid location then we'll change the turn okay uh no not that one less so inside the is valid location okay so now we what we have just done we haven't really done don't worry too much about the i guess logistics of what we're doing but basically we're setting it up so that we can pick a random spot to drop a piece we're just kind of having to do some i guess extra work to clean up the code and set it up for what we're trying to do so if turn equals ai and not game over cool and so now what we want to do is that initially right here we this was looking at where our mouse was and picking the column based off of that what we want to do instead is now just pick a random column so i'm going to do column equals random dot rand int 0 which is the farthest to the left column and we have column count columns right but indexing is gonna put that off by one so we're gonna do a column count minus one that'll pick a random number uh between zero and i guess six uh inclusive to drop our piece in and i have to i did not import random so i have to do that real quick all right i feel like i screwed something up and maybe we'll get some simple ai right here okay it seems to be working but it's immediate and that's kind of annoying but that's a good uh we're doing well so far oh and it's screwed up i just dropped a piece in a winning location and it screwed up there too so we have some things we need to do so the first thing i noticed that was really annoying was right when we dropped our piece the ai dropped their piece and that just doesn't seem very like it's not a good user experience so the first thing i'm going we're going to add is some time delay for the ai piece so down here we have the ai move and so basically what we're going to do is once this column has been selected and is a valid location so if it selects a column that's like not valid like already filled up or something it will run the loop again like nothing will have changed the turn won't have changed or anything and it will find a column that's valid and so in the is valid location we're going to add a a delay so i'm gonna do pygame.time.weight500 and this is similar to what we did before um when we ended the game just to like hold the player one wins on the screen so just like we did there so we're gonna add that and so let's see how it looks now quickly what the heck okay so i noticed another issue so i think it's working but what's not happening is the board is not getting re-printed after the we select our piece so the delay was there but it was being weird because it wasn't actually drawing the the board again so what we're going to do here is once we select like a is valid location we should update the board because it will have dropped a piece so i'm going to add this here see what it looks like now yay that looks pretty good and it's moving around okay it seems random but obviously as i mentioned or as yeah as i mentioned in the how does a board game ai work it's when you just select a random column it is such a stupid ai and it will not block my obvious three in a row so i can easily win right now also real quick while i'm thinking about it i don't know if you noticed this but right now i always get to go first so we probably would want it to be random who gets to go first it just makes it a little bit more realistic and we don't want to we want the code to work if the ai goes first so we can really easily add some logic that will allow us to initial randomly initialize who goes first and all i'm going to do is turn equals random.randint01 or i guess we could even do player ai which is going to be zero to one as well so what will happen now is instead of turn where was turn initialized somewhere in here turn equals zero now turn is going to be either one or zero and the reason we can do this just fine is because the only place turn can change is after a valid piece has been dropped so basically what's going to happen now is we're gonna end we're gonna skip over this event this code right here on the first iteration if we get one in that initialization step and then the ai will drop a piece it will switch to the player's turn and then we'll start back up at the top of the loop and we'll wait until the user clicks and then run the code so let's just see if that worked oh shoot yeah it looks like it worked i'll run it one more time just to make sure it's kind of weird that it's starting yellow it's a little bit laggy or something but it is working and i want to just make sure that i can get a i should be able to go first at some point too okay yeah so after three tries we initialized the zero and i get to go first as before one more thing before we move on to the slightly more advanced ai is that when i was running this one thing that i found that was annoying is come on run dude i have to click first is when the yellow piece is first so you see how when the yellow piece is first the yellow block popped up top here it draws the yellow block up here i don't really want that to happen now that only we're always red so i'm going to change this to turn equals player and then i'm going to actually get rid of this drawing the yellow circle on the mouse motion event so let's see what happens now yeah so now it just pops up immediately and i could add another delay but i'm fine with this cool yay all right let's move on to something smarter and actually start implementing the scoring heuristic for our board so what we're going to be doing is evaluating whether or not a board state is good or bad based on you know how many two in rows there are how many three in a rows there are whether or not you're playing in the center column and you know vice versa for the opponent having three in rows two rows etc and so this is exactly what we did in the how does a board game ai work video except the one difference is that a lot of the scores i think i was going through in that video were relative to the piece you dropped so like you'd only look at the three in a rows created or the two in rows created from the specific piece you dropped that turn what we're gonna have to do when we're implementing that so that we can use the minimax algorithm is do that relative the score will be not it will be independent from which piece was just most recently dropped so you'll kind of look at all of the board and count how many three in a rows how many two in rows etc let's define a function called score position which will give the we'll assign the score to our board so i'm going to define this right under the winning move function so score position and if we're scoring position we're going to need to take in two things uh in my opinion we're gonna have to take in the board obviously we'll also want to take in a piece so we have to know whether it's red or yellow going whether it's the player or the ai um okay cool and this is going to be pretty similar to how we did winning move or in a way basically what we're going to check is look at the whole entire board and basically be counting the number of twos and rows three in a rows etc i think to start off and make things easier let's just count whether you win or whether uh you win or lose so we'll just be looking at four in a rows and we'll start off with just horizontal so horizontal or score horizontal i'll say and let's see one thing i want to do real quick just it's on my mind is pieces defined as one or two so one is the player's piece value and two is the ai's piece value and that's another like magic number so i'm gonna just try to like i'm gonna define two more variables player piece and ai piece player piece equals one and this will just probably make our code a little bit easier to read again and aips equals two so let me just go through and try to switch things out so this would be player piece here this is ai piece defined in several locations go down player piece right here drop piece this would be player piece winning move this would be player piece i think this one is something different here let's see what else we have down here would be a eyepiece and i think down here would also be aips i think that might have been it not positive but just cleans up the code a little bit okay so we want to score the horizontal locations the way that i think we should do this is basically what we're going to do is it we could do it the exact same way we did winning move but it gets kind of tough with these if statements to count whether or not like you have two in a row exactly or three in a row exactly because then you'd have to like you have to factor in all the different ways you can have three in a row so what we're gonna do is i'll run the game real quick oh no what did i do sorry i just need to pass this function real quick what we are gonna do instead of that is basically imagine so look at my mouse down here in the corner we're gonna look at with horizontal starting we're gonna look at window sizes of four and just basically in those window sizes of four count how many empty squares and how many filled in squares there are and we'll basically be able to count then the lines of two lines of three lines of four from that so this is how it's going to work so we have our board and that's you know the matrix the row of rows and columns so what we're going to say is for r in range row count so for each row we're going to want to do this and then what we're going to do is we are going to define this row array and yes row array the best way yeah row array so this is going to be all of the pieces so if we're looking at a single row this is just going to be the seven tiles but just in a single list so it's easier and we're going to define that as row array equals into i for i in list board and then so we want the specific row we're on so it's going to be board index of row but then we want all the all the column positions for that row so that's why we use this colon right here i think that's what we want let me just see and the inta is just to make sure that we're always getting something that's can be indexed into the um okay ah so what am i saying sorry and i i think i just added this just to be cautious that we didn't try to index something with a float value or something so that's our row array and so then what we're going to do is get if this is like let's say we looked at the board down here the row array would be like this single list in here what we're going to do is look at all the pairs of all the window sizes of four so to do that we will do for c in range column count minus three and the reason we do minus three is because the last one we want to look at is this so we don't look it like we don't start with these last three as the farthest to the left if that makes sense i'm highlighting great in this area if you missed that okay so we're getting the horizontal windows and so the window then will be a four will be the row array and we're going to index that from our starting position so that's going to be from c to c plus i guess 4 would it be four or three i'm trying to think it's gonna be four yeah c plus four and we can actually make this a variable two just so you know where that four is coming from i'm just gonna call this window length so window length equals four so window length so we're getting four at a time and then what we're going to want to do with that is count the number of ones and twos so ai pieces and you know player pieces what not so what we're going to do with this window is there's a built-in function of lists in python dot count so what i'm going to do is count the piece that we're passing into the score position function and if that equals equals 4 so that would be we win so i guess if window dot count piece equals equals four let's say we have a score that we're defining so score is zero initially we'll say score plus equals like a hundred right so if we ever find four in a row on the board that's a really good board obviously um and so we should try to get that move okay so we have score plus equals 100 and then once we're done we're just going to do horizontal for now and i'll fill in we'll fill in the diagonal and vertical later turn score okay so now we have this score position function that looks at a single board state and gives the score of a hundred if it is a connect four and gives zero otherwise and i guess for the sake of this let's also preference uh three in a rows why not or maybe yeah three in a row so if i'll define this above or i'll define it below lf window.count piece equals equals three and window.count of zero so we should define this as a piece too so empty equals zero this is just how we are storing the values behind the scenes we know that count of empty equals equals one then let's add i don't know 10 to our score so this is three in rows this is four in rows right and then we need to find one more function and that is just like pick best move and that will give taking a board what else does this need to take in i'll start with just the board potentially i guess the piece too yeah why not pick best move so what we're going to do with this is we need to look at all the different columns we can make the move and then basically run the score position on all those columns and then pick the highest score that is returned so we will write that as follows so we have a function called i think valid locations if i'm not mistaken okay no we don't but we can find a function called valid locations real quick so i'm going to do that def valid load or let's just call it get valid locations so this will show us which columns we can actually drop it in and then evaluate from there that is a board we don't need to actually have a piece to figure out if it's a valid location or not and so valid locations is an empty list to begin and for column and range column count we don't have to do the minus one here because this is exclusive that that last value if is valid location of board and column then we want to append that specific column because that is a valid location this is just getting us a list of where we can drop just because we'll be doing a lot this a lot so it's going to abstract it into a function valid locations dot append column and then finally when we're done we'll want to return the valid locations okay so now we have valid occasions so in our pick best move we'll need to use this so valid locations equals get valid locations of the board and let's think what else uh we need to do so what we're going to want to do is for column in valid locations we want to figure out what the row so basically what we're going to do is to make our life easier is we're going to pretty much simulate dropping a piece and then have this new board that actually gets passed into the score position because we actually just taken a board object there so we have to kind of simulate each move to actually evaluate using that function so we're going to do is use our function get next open row and that takes in the board and the column and then this is something that's tricky and not intuitive is we can't just drop the piece into our current board and or we can't even do something like this we can't say like temp board equals board and then say okay let's uh drop drop piece in the temp board and row column and i guess peace we can't use this temp board to simulate the drop piece because what actually happens with the numpy objects that we're using to store all the values is that this temp board ends up pointing to the same exact memory location as the original board so what we actually need to do is make a copy of this board so this is creating a new memory location where our board is sitting and what we're doing is passing that new memory location into our drop piece function so any modifications don't modify our original board hopefully that made sense so drop piece um and then we want to basically find the score of the new board so i guess what happens is this drop piece automatically modifies temp board so we can just go ahead and do score equals score position of the temp board and whatever piece that we dropped and then what we'll want to do is basically keep track of a best score so we'll say best score and we'll set that to zero initially and we'll have best column and that'll just be like we'll just put it as a random choice of the valid locations and i defined valid locations below this so i'm going to just move these remove that down here okay so we have our best score and our best column best score is initially zero best column is just random but what we can do is if this score that we scored with this new board is greater than if score is greater than best score then we can set our new best score to score and we can set the new best column to whatever column we were checking when we got this new score okay does that make sense and then finally once we do this we'll do return best column i think that looks right all right let's try that out real quick so all we should see here is we didn't add that anything too complex we just made it preference winning horizontally and getting three in a rows horizontally so we shouldn't be looking at any sort of vertical maneuver okay so instead of the random location here i'm going to comment this out and do column equals pick best move of the board and i guess we're the ai here so i need to pass in the ai piece come on moment of truth i mean let's just stack on top of okay so now we should see it connect three in a row that's what we're hoping for and it did it cool so i'm going to just stack a top again the way we scored it too it should now preference filling this in after i drop which it did we got something a little bit smarter okay now that we have the score working for the horizontal moves we can kind of expand upon this and make it work for diagonal and vertical as well also do a little probably a little bit of cleaning so it ah where where where's the code okay pick us move struggling okay yeah so this is horizontal so we need to also do vertical vertical is also easy so score vertical do next so that would be force column and range column count the ones that are tricky are the diagonals so we're doing the exact same thing we did with horizontal uh but we're going to do it with vertical and then once we have it for each one we'll actually probably improve how we're actually scoring this make it more like i did in the we did in the other video all right for c and range column count let's see what else we should do we should then get the column array ray which is going to be similar to the last time it's going to be int i for i in list so this time we have the we want to get every row position for a specific column so we can do it like this i believe i think that's what we want and then we'll want to iterate through the windows just like we did last time but now we have to iterate through the the row window so for r in range row count minus three window equals row array c or r to r plus window length and we can copy this once we now that we have the window this code is oh shoot i didn't want to delete it this code is the exact same so i can just add this code and what we're going to ultimately do is abstract away this window counting stuff to its own function so okay so now it should if we run this again it should preference vertical as well so i'm going to block the horizontal and it's still stupid and it's oh it actually did block me that's interesting but i'm going to let it get okay so now it has two verticals in a row it should preference making the move right on top of it next if we did this right and it didn't so i'm guessing we did something screwy it's weird this is weird maybe it's just getting lucky okay it should hopefully win vertically let's see i don't know we did something wrong window count piece score plus equals hundred what did i do wrong here oh shoot i grabbed a row array this needs to now be column array okay sorry about that and also for r and range row count yeah okay that should be good hopefully let's see if it preferences vertical moves now too so i'm blocking the horizontal and now basically once it stacks oh shoot i'm going to block the horizontal again but come on okay let me play it again so you can see exactly what i was trying to show you right so it goes i'm going to block the horizontal and hopefully it stacks on top okay now that it's stacked on top we should because it doesn't have any sort of horizontal options it should stack again and it does and so it now should win vertically okay cool so that we kind of mainly validated that it was getting these windows correctly for the vertical okay so this was getting if we think about the same graph it was looking at like the sliver if we just cut off this one column it it was making a list out of these ah sorry these numbers zero zero two two two two okay now let's do diagonals diagonals are a little bit weird um so we'll start with the positively sloped diagonals so the ones that are low on the left side and rise up to the right side so score positive sloped diagonal all right so let's just run the game real quick just to see that how we're going to do this so basically if you look at my mouse we're going to look at all the possible ways we can do this diagonal slope so we can do like this we can move over do this move over do this move over do this and then rise up one row and do the same thing so basically now we have to um go from like row count minus three as well as column count minus three and do a little bit of manipulating like that and it's a little bit weird to get our windows but it shouldn't be too too bad so let's start with this so we're gonna iterate starting with the rows so for row r and range row count minus three because we're going to be cut off at the top because we're going up each time once and then for c and range column count count minus three because we can't just have a okay so and that's because we're going to the right each time four so we have to cut off early for that as well and then what we need to do is our window is going to now be we're going to do a little list comprehension to do this it's going to be the board at the row position and then at the column position and what we're going to do is we're going to have to add a little bit of value each time so it should start with 0 so we're going to we're going to just add some i plus i and that's going to be for i and range window length so basically what's happening is if you think about what i takes on if window length is 4 i is going to start out at zero so that's going to be just the arth the position at r comma c but then we're gonna this is gonna become one so we're gonna increase positively one and one so we're going to go up and across and that will continue and we'll actually get our window in the proper um we'll get we'll get a windows has a 4 that is a diagonal and this will repeat this window right here we'll repeat for all of these values so we'll end up getting all of the positively sloped diagonals and so we're going to just you can just copy this code here and we can do the same for negatively sloped diagonals so for r and range we'll do the exact same thing oh my gosh stop texting me ah sorry for c in range column count minus three and this is a little bit trickier because now we need to stop start at the top and go downwards so to do that bear with me we're going to want our window to be equal to we'll do a list comprehension again board r so this is going to be the same exact as before but now oh my gosh this is annoying wow i've never what is this crap sorry i'm gonna turn off the volume there we go okay so what i was saying was now the first thing we check is up here so that's gonna be the row plus one plus two plus three to get that position so that's going to be rho plus three and then the c i think will stay the same i know we're going to be going downwards so that's going to be c plus 3 as well to get the position we're looking for because we're going downwards so c plus three do we want c plus three uh yeah that should work sorry i'm just confusing myself now i guess the c stays the same so c is normal and rho plus 3 and then we'll minus i and we will in this case we will plus i for i in range window length i think that should work let's think so if we think about all the different combinations start with this position that's the sear the zeroth column that's good then we go down one the row decreased but the column increased down again row increase but the column decreased down again row increase or row decreases but column increases so i think what we did here is good and we can score that just like we scored the other ones it's a little bit tougher to test so what we'll do to test these is we'll temporarily comment out the horizontal and vertical and just see if it when it has the option and sees the diagonal that it preferences it so i'll make it easier for it so if it puts it here or here we should see on the next time it putting a three in a row okay yep it did i made the moves a little bit quick but as we can continue basically oh shoot don't make that if i put it right here it should if we implemented these diagonals properly connect it yay we can go ahead and uncomment all this now that we've checked that diagonal i guess we should ideally check positively sloped two but i'm pretty sure we're good there okay and now what i recommend we do is we're going to add a couple more different ways we can score this and just so we don't copy all this into each one of these individually let's abstract out a new function called like i don't know uh find it called window evaluate window how about evaluate window and we'll take in the board and the piece to that or i guess not the board we just need the window and i guess the piece for that so this is going to be the exact same thing as we did before so if window dot count piece equals equals four score plus equals 100 alif window.count piece equals equals three and window dot count is empty equals equals one score plus c equals ten let's think what else we should have should also have an lf for window dot count piece equals equals two i guess we it doesn't really matter if we use if or l if i don't think here and window dot count empty actually it does oh no it doesn't i don't think it does empty equals equals two reduced score plus equals five and these values these are kind of arbitrary and i'm fine leaving them as kind of arbitrary guys i don't really need to extract these out into their own variables i guess i could if i wanted to but basically these are the type of values you play around with as you kind of tune and try to make your ai better let's see what else we could do we could also have like i guess inverse piece so this would be the opponent piece so i can just call it component piece equals we'll just say player piece opponent piece equals player piece and if piece equals equals player piece then we just need to switch it so that would be then opponent piece would be equal to ai piece if that made sense what i did i just was saying that okay let's assume we're the ai and set the opponent piece to be player piece but if we're not the ai and we're the player and we're adding the window then the opponent piece should be aips so we could also do is something like if window dot count opponent piece equals equals three and window count dot opponent or win account empty equals equals one and then that means the opponent has a three in a row and we could do like score minus equals eight let's say so it's like we weigh us getting a three in a row more than we weigh the opponent getting a three in a row okay just to finish up this function we should initialize score equals zero we'll return score at the end and this is just going to be so then when we actually go into the score position we can delete these lines and what we'll actually do is do score plus equals evaluate window of window piece and we just copy this line here for each of the other locations that we added these logic cool cool cool all right let's see what happens oh no plus equals none type oh yeah should probably return the score okay so we should be seeing some preference so it should probably want to get this three in a row yep okay will it block me so it doesn't seem to be blocking me let's see if it preferences diagonal downwards i guess it's diagonal that way let's see if it preferences diagonal downwards now cool it's looking pretty good however it's not blocking me so that is an issue so it's not blocking me and the reason it's not blocking me i just realized is if we look at this if we think about how this evaluate window is working we're only calling it for the ai so if the ai ever sees the opponent has three and the window count is one empty is one then the opponent already has three so would be able to win in the next case so in this case it would be preferencing putting three in a row over blocking the opponent so watch what happens if i make this like negative 80. now it's going to preference and i also will have to we'll have to change the best score to be some sort of low negative number to start just so it doesn't screw up if we get a negative here now it should block always block if i ever get three in a row yeah see that cool i don't know if we'll keep this in when we go to mini max and we're gonna go to min max right around right really soon the last thing i want to add before we go to implementing the min max algorithm is just scoring the center columns and you know adding preference for center pieces because that's what's going to create more opportunities with the diagonals and the horizontals is if you have the center the center pieces okay so we can get our center array so score center and i guess this should be i'm going to move this comment right below the score equals zero move this up a bit and now i'm going to add the score center column so we're going to create a center array just like we were kind of making the row or the column arrays within the score vertical so we're going to do equals int i for i and list we want the board and we want every row position but we only want the center column so we can do column count and then we could just you know mainly think about it 0 1 2 3 would be the middle but just to kind of not have magic numbers i'm going to do floor division by 2. so that will get the the middle column and then what we want to do is just do center count i guess equals center array dot count piece and then we'll just add to our score i don't know three points or so let's see i i kind of did some weird numbers here i'm going to add like six for each piece in the center so center or score plus equals six let's say and these numbers are all going to change so don't worry too much about all the score increases this is something you play around with okay let's see if it's preferencing center now it should not have gone there for us referencing center score plus equals six let's really crank up this score real quick and see if it works it's weird yeah something's goofing oh okay sorry i was doing plus equals six but i was it should be plus equals center count times six right so if you had two it'd be twelve plus twelve it would be um if you had three you'd be plus 18 etc okay now it looks like it's preferencing center we'll see should block okay cool now i think it should go center if i'm not mistaken yeah cool and probably should go center again if i'm not mistaken okay i guess that created more diagonals but okay center cool should block and go center cool i think that's accounting for the center all right with that i think we're ready to move on to the min max algorithm and what i think is helpful at least when i was implementing this is i went ahead and just was reviewing the wikipedia page on minimax so you know if you didn't watch the video i posted minibacks is basically a way so that we can look down the branches of any sort of rule based game like chess checkers and in our case connect four so we can basically look down at branches and evaluate using our score function the best branch that we can guarantee getting so what i recommend we do is go ahead and go to the pseudocode and this is all we really have to implement so function minimax node depth maximizing player so the node in the r case is going to be the board depth is how far we want to search down in our game and maximizing player is going to be true for the ai and false when we're looking at the player's move so let's go ahead and implement this and i'll be kind of popping this window back in and out okay so score position is good let's define mini max uh uh okay def mini max so we're going to pass in the board we're going to pass in the depth and we're going to pass up in the maximizing player and i'm going to just do pass for now okay what do we need to do next in this if depth depth equals 0 or node is a terminal node so what is a terminal node in our case well that's whenever a game is one or i guess if we get to the end of the game those would be terminal node conditions so then we want to return the heuristic value of that node and else we want to kind of recursively check our tree like you see in this diagram down in the bottom right and find the best score so that's what's happening down here let's start off though with the i guess if steps equals zero condition that's when we're going to be returning the score position function score okay so what are okay so let's do this if depth equals equals zero or terminal node so this terminal node we haven't defined yet so let's just define it so what we just said was terminal nodes would be us winning the opponent winning or you know you've used all the pieces in the game so let's do those so i'm going to find another function sorry is terminal node and that'll just be true if it is a terminal node and false if it is not now taken aboard so what are the terminal node conditions well we have this winning move function right here so basically we can use this to our advantage so if it's a if this returns true for either of the pieces then it would be a terminal node so those will be two of our conditions so we want to return winning move board player piece or winning move board uh the ai piece or the final condition is that there's no more valid moves in the game so the board is completely filled up so what we can do in this is we can say length get valid locations of the board and that length should be zero if it's a terminal node so this statement as is will return true if it is a terminal node and false otherwise it's nice one liner so what we're now going to do is i guess valid locations equals get valid locations of the board and then we're going to do is terminal will be equal to is terminal node of the board so now we can do if depth equals zero or terminal node or i guess or is terminal okay what do we want to do in this case well there's three conditions so if we are the ai then the first thing we want to do is we want to if it is a winning move for us the ai we want to do if winning move so this is basically just figuring out which case of these three we're in if winning move uh board ai piece then we want to return a high score so return like you know a really big number lf winning move board player we want to return a really our player piece sorry we want to return a very low score and we're going to augment this so it also has a position but you'll see that in a sec okay turn a really low score if it's losing condition and then else this is going to be when we evaluate our score position this is a board that is not a winning condition so and it's not a terminal condition so we actually want to get the score for it so we're going to do and i'm really just following if you're getting confused with what i'm doing i'm doing this step of the mini max algorithm and i'm just finding the heuristic value of the node and these are kind of two edge cases the winning move cases and i'm just handling them separately i guess they could be kind of factored into my mini max but just because you know we're only really looking at the opponents uh it might work if i didn't handle these separately i just in my head it makes it easier if i handle these separately then finally the else condition is when we want to i guess the else condition here would be the game is over so that's just kind of like a weird case where we would return zero i suppose because you can't really do anything from there all right cool so we have the that's part of it and now the final thing is if it is not oh sorry if is terminal that's the is terminal cases either we win the opponent wins or the game is over those are the three cases so i can just even mark this game is over comma no more valid moves the other case is the depth of zero so we can just do else here and that's going to be depth is zero and in that case we want to find the heuristic value of the board so what we're going to do is just do return score positions uh score position of the board and i guess whatever piece we dropped so in our case it would be the ai's piece i guess maybe it would be good to specify this but i'm just going to keep it as the ai piece right now you can fix it if you need to all right so that's the first part of the mini max and sorry if this i hopefully this is not too confusing it'll all kind of fit together once we you know get through the rest of it all right let's finish implementing the minimax so let's do the bottom half of that what i was just highlighting on the wikipedia page and also this should be ai piece the ai and the player without the piece is just only was used for the turn um okay all right so else let's see what does that say if maximizing player then and we returned out of this so this week if you get into this if statement you'll not get into the other ones that we're about to define so if maximizing player and i'm going to just actually i think what will be easiest is if i just keep this on the screen where's the pseudocode pseudocode okay cool all right so we have the pseudocode on the screen and we have the code over here so oh gosh now i can make this the other half okay not too bad all right so if maximizing player i'm following right here it's amazing player then we want to initialize the score to some really low value so what we could do here is i think we actually use if we actually want to specify negative infinity and positive infinity i think we can do instead is we can do value equals math dot infinity that works and i can actually just do negative math dot infinity okay so that covers that and then for each child of node so that's for each position we can drop in our connect four game so what we're going to do here is we're going to do for column invalid locations which i already have to find up here fortunately for us column invalid locations comma row equals get next open row of the board and that specific column that we're looking at and then as before with when we're just doing this the score position function we need to copy the board so board dot copy just because otherwise we'll use the exact same memory location that the board is on so it will really screw up when we're you know recursively doing this okay and we can go ahead and drop a piece in that position v copy row column and we are the if we're the maximizing player then we're the ai piece all right and so what's going to happen here is we're going to have a new score score which is going to be equal to the mini max which is going to be the max so if you look at this it's going to be the max of the current value which is negative math infinity and minimax of depth minus 1. so new score i guess would just be the max of current score which was value and minimax of this is our recursive call here of the board copy depth minus 1 and i guess we do false now because we're no longer the maximizing player now the minimizing player cool and then finally what we'll want to do is return the new score so that's the max of the value and the minimax and we want to do the same thing for the minimizing player so else we know we are the minimizing player and that is the code right here we want to initialize the value to positive infinity and now copy the same thing for column invalid locations same as before row equals get next open row of board and column we need to copy the board again because we don't want to use that same memory location otherwise we're going to have issues when we try to recurse back up the tree drop piece equals oh shoot board copy row column and now we are going to be the minimizing player so what we need to do is actually make this piece the player piece and same before the new score though is going to be because when we are if we remember this the minimizing player is trying to take the lowest value so if you see this node right here this 10 10 and plus infinity 10 is the lower node so it always take that so in the minimizing player case we want to take the the new score would be the min of the value and minimax of word copy depth minus one and then we want to do true for now switching to the maximizing player so this this false and this true is what's allowing us to switch back and forth between this maximizing player and the minimizing player in the context of the mini max algorithm and make sure you understand the mini max algorithm watch the video that i've posted previously on it or just kind of review this wikipedia page to kind of get a feel for how it works because that is crucially important for understanding how this new ai is working okay and we want to return value or the new score cool i think this is about it right i think so however the one issue i have right now as i look at this is we're returning the score which is good to know like how good of a move can we make at a certain location but we also need to augment the score with the column that produces that score so in addition to just producing or just figuring out what the best score is we need to figure out which column produces that score so to do that we're going to just actually get rid of this max and min we're going to just break it out a bit just it's easier to keep track of what we need so we're going to say that the new score is equal to the mini max that same thing for this and instead of doing it in just one line we're going to just bring it to the new line so if new score is greater than the value value equals new score so this is the same thing right now if we kept it like this this is the same thing as the max the one liner we just had but we also want to do something like column so this this column right here is like the best column you can get so we could initialize this to just a random choice of the valid locations random.choicevalid locations and same thing for down here column equals random.choice of the valid locations all right so now what we have is column equals whatever column we are currently iterating on that gave us this new score when we recursively did the mini max algorithm so that would be call and okay that looks i think this is good here we also know need to now implement it down here same thing so this time it would be if new score is less than the value we want to do value equals new score and column equals call and we want to okay and then finally what we have to do is instead of just returning the score we want to return both the score and the column that produces that score so we'll return a two pole so we'll do column and new score or i guess value is our final thing that we return same thing here turn column and value and then up here these are the terminal conditions so if we get a terminal condition we don't actually know which column produces the terminal location because it's just looking at the board particular but we can get that very easily because we would get it through this so basically what we can do yeah so basically when we returned this new score all right yeah so we're gonna augment these none just has to basically all be the same format even though these wouldn't actually produce a column value none zero okay so we don't know which column produces those and basically what we need to do now is if we want to adjust the score we need to get the first index we need to get the first index here is that it i think that might be it and then i guess right here too this would just give us a score so in this case two we need to do none comma score positions so basically we're always because this is a recursive function we always need it to return the same format so we're now just saying turn us the the best score in the column that produces the best score cool i think this is it we can delete this pass so let's go ahead and real quick test out our s you know our first attempt at the mini max and to do that we can just go down to here and instead of best pick best move what we're going to do is do column equals minimax of zero or actually what am i thinking it's bored then it is depth so let's just for sake of simplicity we'll start with depth equals two and then we are the maximizing player so that is true let's see if this works nope damn it is the location where we get the error where do we get this error too many indices for array 267. if is valid location oh shoot let's see oh all right yeah sorry i was just trying to get the column out of here but we actually get the column and the mini max score so i kind of have to unpack the tuple now it's not just one thing it returns okay all right so what we should see now i think at depth of two it should be smart enough to block oh i guess not yet i was hoping that it would block me from getting this situation where i could win on either side i'm gonna try increasing the depth to three and see if it does that all right so what i'm trying to see is if it's looking ahead in the future and how we can tell that is if it's looking ahead in the future it should prevent us from getting winds that can we can kind of have two spots where we win so if i drop it here it needs to block me so that i can't get the horizontal three on the bottom still is not doing it what the heck it might be some sort of error i'm going to just try one more time maybe seeing if it's not looking far enough in the future and do this depth four so this is how far of a branch it's checking yeah it's definitely having some issues here it did win that time but it's always picking zero it's always picking the farthest to the left to move why is that the reason it is always picking the farthest to the left even at depth of four is let's go up and just check why is my mouse being so fidgety today won't scroll up for me is okay i see it if we look at where we are returning this column and value it is inside of the for loop so it's on the first column basically and also we initialize the value to negative infinity so it's basically okay it's figuring out that whatever score is on the farthest to the left column it's going to be bigger than negative infinity so it's setting that column here and then returning it immediately what we need to do is just backspace this once and hopefully now it should be pretty good and i probably don't even need to go depth of four think about i'd go to like depth of three and it will still block just fine okay i kind of want to go first just because i feel like i'll see it easier okay so now it needs to block one of my horizontals which it does now which is good it needs a block cool that's looking good okay let's see if it's looking into the future it should put its piece right here next if i put it up here cool it's looking good it's like smart enough to know that like this position right here where my mouse is was the best spot to put it because that gives us a win here and here so if it looks in the future and knows it's going to win it's not going to be easily blocked by me looking good cool all right uh real quick um just because i'm not positive how i feel about the values that i currently set all these things up here in the score positions function too or i guess evaluate window i'm going to just change these to what i set when i was playing around with this so the window count equals 4. this actually is pretty irrelevant at this point i think the only reason this would play in is if we set the depth equal to zero and it had to use just this evaluate window so i'm going to leave this in here but this know that this is probably not too important anymore i'll still keep this value okay three and one so three in a row i'm going to give this a weight of five two in a row i'm going to give this a weight of two but remember that it's in multiple directions so you could have multiple twos in a row and it adds up uh opponent equaling three and one this is not as important anymore because and also i have to kind of make this a smaller value because it's this is factored into the terminal condition case so it will look ahead in the future and make sure that it's not losing so i'm going to make that minus four then the other thing that i change to is instead of making this times six i'm going to make the center column worth plus three and these are values you should play around with so if you really want an expert level ai play around with these values but honestly i feel like this ai right now like i'm gonna just play a game like try to win i guess it's still not amazing i guess it's depth three right now but yeah like look it like it's making pretty good moves and it's like gonna know to win if i put it there i think this is a pretty good ai at this point uh to make it even better what we can do is increase the depth of how far it's looking uh where is that so if you make it four it's looking if we think about how the mini max works uh it's looking farther down in this tree so this four would represent how far it's looking down it's like right now we're at depth four so that's why it's looking this far down and you can think of this graph as being all the possible ways a connect 4 board can work so depth 4 let's see this would be even better of an ai because it's looking farther in the future and we could even make this better let's just go to depth 5 and see what happens okay i kind of have a problem i don't know if you noticed it but i did depth five and it seems like it's just oh it finally made a move it just takes forever to make a move at this point try to like think about why that is you can even pause the video if you're trying to like think about why that is but yeah i mean if we're going it if we're doing depth five and we think about what the minimax is doing it is branching out at every step so from the top of my hands is the terminal node every time we do a higher oh my god i'm trying to be even every time we do a higher depth we expand exponentially more branches so this is basically like seven to the fifth power or maybe even six power because we also look at zero so it is a lot of branches to expand so it's going to run slowly so 5 is probably the max you can really do otherwise if you go to 6 your computer is just going to explode not really but basically it's probably going to freeze up and crash on you if you go to 6. so this gets us to the final thing i'm going to cover in the video and that is the alpha beta pruning so it's basically allowing us to go to deeper depth by eliminating a lot of the moves we have to look at and i didn't actually cover this in the video i posted about how does a board game ai work so check out a video i posted in the description on kind of the overview of it uh at a very high level yeah basically it is figuring out that there are certain branches that a computer is just or an ai is just not likely to take if it realizes like in a move if it goes a certain position that it loses it's not going to go down that branch so you can just not look further down into the future there yeah so check out that video real quick on alpha beta pruning and then once you watch the video we will implement alpha beta pruding to allow us to more quickly search the depths we want okay alpha beta pruning so just like when we are doing the mini max what i recommend is look going to the alphabetical pruning wikipedia page and there's also pseudocode there so i think that's helpful the pseudocode is helpful to follow but it's just basically an adaptation of what we've already seen with the minimax but now we have these two additional parameters alpha and beta and you should understand what these are now that you've hopefully watched the video or already had some knowledge about how it worked so we're gonna go up to our sorry i have this on another screen because i don't want to screw this step up um you should yeah go to your mini max function and we're going to be adding our alpha and our beta to that function and so we're also going to have to pass this into all of our other calls of it alpha beta and right here alpha beta and we'll also have to do it down here when we actually call this and if we remember the alpha and beta should be initialized to negative math infinity and this is just i'm looking at right here the initial call right here negative infinity negative math infinity math infinity and true so that's the initial call and then all right so let's define this new mini max function with alpha beta this should be a comma here all right so let's see what it's doing so basically all we're doing is that in addition to getting our max value we also are just evaluating the alpha so we can do this below the new score stuff so alpha equals max of the new score or i guess the value at this point we can do value because it would have changed value and alpha and then which we want to do is or i'll just do it in the same format that wikipedia is doing so alpha and the value okay and then what we want to do is if alpha is greater than or equal to beta then we want to break out of our loop and that is good and then same thing down here now we are the minimizing player so we want to do beta equals the min of the beta and the value and if we find that alpha is greater than or equal to beta then we've reached a condition where we're going to just break off because we don't need to look any more further in the tree and we'll break so i think that's all we need to do to implement alpha beta it's pretty straightforward after you have the mini max up and so let's run this real quick i'll probably get an error we'll see ah shoot i need to not use my track pad because keep accidentally dropping pieces okay it has a piece there this should drop quickly if before depth five was taking a while now that we did this it should drop quickly which was pretty good it was pretty good reasonable more reasonable than it was before for sure so i think that is looking pretty good and what we can honestly even do is now that we're searching so far what we can do is we have that initial time delay i can actually just comment that out at this point so this should be pretty quick and we're going at depth 5 which is pretty far down yeah it's not bad i think even if we go to depth six it's not too bad either but like before if we went to depth six it would take forever we'll see it also depends on what type of processing power you have so i'm going to try depth six here it's taken a bit for sure did i do this right it but not nearly as long as it was taken before yeah the the alpha beta pruning looks good and one thing i just realized is that um you're gonna see more performance gains kind of as the game goes on further because there's gonna be at the start of the game there's more possibilities and less places that can break out of that alpha beta thing but as a game goes kind of on you'll have more breaks frequently so the performance gains will be definitely seen as the game continues on all right that's we're going to end um what i would say too is like we have a pretty dang good ai as is but i didn't there are additional things we could do to make it even better so a couple ideas that i'm thinking about is uh one thing is that currently it doesn't weight lower like so if you think of the game i'm gonna run it real quick if you think of these rows down at the bottom it's like this first row here the second row etc it should we should wait um we should wait what am i trying to say we should wait the lower wins so like when you have three in a row that's lower that should probably be weighted higher than if you have a three in a row that's at the top here and that's because you know as the game fills up people have to push them on the lower rows before the upper rows so like if you are strategically playing as an ai you would want to like put your pieces down lower so you get more chances to win also what i recommend is you can make it even better than that by watching the even odd connect four strategy that i posted that's like a really cool strategy that you could try to implement with the ai um you could also let me think i mean there's other things you could play around with off the top of my head i'm kind of spacing right now but the values all could be tuned of the scores um but yeah just have fun with it have fun thank you guys for watching and peace outhow's it going everyone and welcome back to another video this video is long overdue but today we will be programming an expert level connect for ai which is gonna be a lot of fun if you haven't already i recommend you check out the video i did before like a while ago on i guess just programming the connect four game and what we're gonna be doing today is building off of that and actually implementing the ai um if you don't wanna like go through all the effort of what like programming the game from scratch you also can just go to my github page and i'll link to this in the description but there's this file connect for py so if you just click on the raw there and just copy all of this this is going to be the starting point from this video i'm going to be programming in python 3 and i'm using sublime text as my editor so i'm just gonna call this connect4original.py also if you haven't seen it already i really recommend you watch my video that i did called like how does a board game ai work because basically what we're going to be doing in this video is following along kind of the steps that i go through in that video on the theory behind a connect4ai so it really will help you with the background information on like how everything works because it can be a little bit complicated at times once you have the file locally just make sure real quick that it runs properly so when you run this original file you should ah why is it not running you should get a screen that pops up looks like connect four board and you just take turns being the yellow and red player dropping pieces i'll finish this up real quick and it should do something like that when you you win uh you don't have any errors to start off so just make sure that everything's working properly before we actually get into the ai and with this ai we're going to start off just like i did in the how does a board game ai video work with the simplest approach we can think of for the ai so we're just gonna have it the player two just drop a piece randomly somewhere on the screen so that's what we're gonna start out with before we begin take a second to review the code and make sure you kind of understand how all the pieces fit together because as we go through the steps of the ai if you don't really understand how the game is working it's going to be kind of confusing the steps that we're implementing and to start off we are implementing the random dropped piece ai so we're gonna have to go down to our game loop which is here and in the initial game we were checking for the mouse button down event and then based on whose turn it was so turn equals zero was the red piece i think and then the else statement was player 2 and that was the yellow piece we are making the drops based off of that in this random ai situation we are now not going to have player 2 drop when a mouse click happens is gonna just we'll have player two drop whenever um right after player one has made the move so once player one makes the move that should trigger player two to make the move and to kind of make some of this clear i don't like having like turn equals equals zero popping up in the code all over the place these are called magic numbers so what i'm going to do real quick is just to make things a little bit more clear is i'm going to go up here and define a static variable player which equals zero and i'm going to have a static variable or kind of just a yeah a variable called a i equal to one and i'll just when i making these variables will make it a little bit easier for me to read the code and also as we implement some of these functions with the ai it'll make it a little bit easier for us to understand how to implement it so now turn equals equals 0 is turn equals equals player and then for this we don't want it in the let me make this a little bit smaller just for a sec oh this can be too small i'll make it bigger but we don't want it to be in the if statement if mouse button down anymore we want it to be outside of this for loop because it shouldn't depend on event should just depend on when this i guess happens which we can store separately so we're going to move this back a few okay so that now should be in line with the 4 event and instead of an else we can't have an else without an if above it so what we're going to do now is we're going to do if turn equals equals a i and i guess we'll also do and not game over because we don't want this a piece to drop after the game has already been finished so if journey equals equals ai and not game over then we have the random ai making the decision and let's see there might be some other changes we have to make now that we did that one thing we might want to do is throw the print board somewhere differently actually it's good right here and let's see where else it was i didn't actually move it okay so i'm gonna move this stuff outside of the loop too let's see okay i'm thinking about this okay if game over that should be in the loop the turn what we're gonna do with the turns now is only have them change when a valid location was chosen so here the turn is changing and same thing for the player turn so if valid location then we'll change the turn okay uh no not that one less so inside the is valid location okay so now we what we have just done we haven't really done don't worry too much about the i guess logistics of what we're doing but basically we're setting it up so that we can pick a random spot to drop a piece we're just kind of having to do some i guess extra work to clean up the code and set it up for what we're trying to do so if turn equals ai and not game over cool and so now what we want to do is that initially right here we this was looking at where our mouse was and picking the column based off of that what we want to do instead is now just pick a random column so i'm going to do column equals random dot rand int 0 which is the farthest to the left column and we have column count columns right but indexing is gonna put that off by one so we're gonna do a column count minus one that'll pick a random number uh between zero and i guess six uh inclusive to drop our piece in and i have to i did not import random so i have to do that real quick all right i feel like i screwed something up and maybe we'll get some simple ai right here okay it seems to be working but it's immediate and that's kind of annoying but that's a good uh we're doing well so far oh and it's screwed up i just dropped a piece in a winning location and it screwed up there too so we have some things we need to do so the first thing i noticed that was really annoying was right when we dropped our piece the ai dropped their piece and that just doesn't seem very like it's not a good user experience so the first thing i'm going we're going to add is some time delay for the ai piece so down here we have the ai move and so basically what we're going to do is once this column has been selected and is a valid location so if it selects a column that's like not valid like already filled up or something it will run the loop again like nothing will have changed the turn won't have changed or anything and it will find a column that's valid and so in the is valid location we're going to add a a delay so i'm gonna do pygame.time.weight500 and this is similar to what we did before um when we ended the game just to like hold the player one wins on the screen so just like we did there so we're gonna add that and so let's see how it looks now quickly what the heck okay so i noticed another issue so i think it's working but what's not happening is the board is not getting re-printed after the we select our piece so the delay was there but it was being weird because it wasn't actually drawing the the board again so what we're going to do here is once we select like a is valid location we should update the board because it will have dropped a piece so i'm going to add this here see what it looks like now yay that looks pretty good and it's moving around okay it seems random but obviously as i mentioned or as yeah as i mentioned in the how does a board game ai work it's when you just select a random column it is such a stupid ai and it will not block my obvious three in a row so i can easily win right now also real quick while i'm thinking about it i don't know if you noticed this but right now i always get to go first so we probably would want it to be random who gets to go first it just makes it a little bit more realistic and we don't want to we want the code to work if the ai goes first so we can really easily add some logic that will allow us to initial randomly initialize who goes first and all i'm going to do is turn equals random.randint01 or i guess we could even do player ai which is going to be zero to one as well so what will happen now is instead of turn where was turn initialized somewhere in here turn equals zero now turn is going to be either one or zero and the reason we can do this just fine is because the only place turn can change is after a valid piece has been dropped so basically what's going to happen now is we're gonna end we're gonna skip over this event this code right here on the first iteration if we get one in that initialization step and then the ai will drop a piece it will switch to the player's turn and then we'll start back up at the top of the loop and we'll wait until the user clicks and then run the code so let's just see if that worked oh shoot yeah it looks like it worked i'll run it one more time just to make sure it's kind of weird that it's starting yellow it's a little bit laggy or something but it is working and i want to just make sure that i can get a i should be able to go first at some point too okay yeah so after three tries we initialized the zero and i get to go first as before one more thing before we move on to the slightly more advanced ai is that when i was running this one thing that i found that was annoying is come on run dude i have to click first is when the yellow piece is first so you see how when the yellow piece is first the yellow block popped up top here it draws the yellow block up here i don't really want that to happen now that only we're always red so i'm going to change this to turn equals player and then i'm going to actually get rid of this drawing the yellow circle on the mouse motion event so let's see what happens now yeah so now it just pops up immediately and i could add another delay but i'm fine with this cool yay all right let's move on to something smarter and actually start implementing the scoring heuristic for our board so what we're going to be doing is evaluating whether or not a board state is good or bad based on you know how many two in rows there are how many three in a rows there are whether or not you're playing in the center column and you know vice versa for the opponent having three in rows two rows etc and so this is exactly what we did in the how does a board game ai work video except the one difference is that a lot of the scores i think i was going through in that video were relative to the piece you dropped so like you'd only look at the three in a rows created or the two in rows created from the specific piece you dropped that turn what we're gonna have to do when we're implementing that so that we can use the minimax algorithm is do that relative the score will be not it will be independent from which piece was just most recently dropped so you'll kind of look at all of the board and count how many three in a rows how many two in rows etc let's define a function called score position which will give the we'll assign the score to our board so i'm going to define this right under the winning move function so score position and if we're scoring position we're going to need to take in two things uh in my opinion we're gonna have to take in the board obviously we'll also want to take in a piece so we have to know whether it's red or yellow going whether it's the player or the ai um okay cool and this is going to be pretty similar to how we did winning move or in a way basically what we're going to check is look at the whole entire board and basically be counting the number of twos and rows three in a rows etc i think to start off and make things easier let's just count whether you win or whether uh you win or lose so we'll just be looking at four in a rows and we'll start off with just horizontal so horizontal or score horizontal i'll say and let's see one thing i want to do real quick just it's on my mind is pieces defined as one or two so one is the player's piece value and two is the ai's piece value and that's another like magic number so i'm gonna just try to like i'm gonna define two more variables player piece and ai piece player piece equals one and this will just probably make our code a little bit easier to read again and aips equals two so let me just go through and try to switch things out so this would be player piece here this is ai piece defined in several locations go down player piece right here drop piece this would be player piece winning move this would be player piece i think this one is something different here let's see what else we have down here would be a eyepiece and i think down here would also be aips i think that might have been it not positive but just cleans up the code a little bit okay so we want to score the horizontal locations the way that i think we should do this is basically what we're going to do is it we could do it the exact same way we did winning move but it gets kind of tough with these if statements to count whether or not like you have two in a row exactly or three in a row exactly because then you'd have to like you have to factor in all the different ways you can have three in a row so what we're gonna do is i'll run the game real quick oh no what did i do sorry i just need to pass this function real quick what we are gonna do instead of that is basically imagine so look at my mouse down here in the corner we're gonna look at with horizontal starting we're gonna look at window sizes of four and just basically in those window sizes of four count how many empty squares and how many filled in squares there are and we'll basically be able to count then the lines of two lines of three lines of four from that so this is how it's going to work so we have our board and that's you know the matrix the row of rows and columns so what we're going to say is for r in range row count so for each row we're going to want to do this and then what we're going to do is we are going to define this row array and yes row array the best way yeah row array so this is going to be all of the pieces so if we're looking at a single row this is just going to be the seven tiles but just in a single list so it's easier and we're going to define that as row array equals into i for i in list board and then so we want the specific row we're on so it's going to be board index of row but then we want all the all the column positions for that row so that's why we use this colon right here i think that's what we want let me just see and the inta is just to make sure that we're always getting something that's can be indexed into the um okay ah so what am i saying sorry and i i think i just added this just to be cautious that we didn't try to index something with a float value or something so that's our row array and so then what we're going to do is get if this is like let's say we looked at the board down here the row array would be like this single list in here what we're going to do is look at all the pairs of all the window sizes of four so to do that we will do for c in range column count minus three and the reason we do minus three is because the last one we want to look at is this so we don't look it like we don't start with these last three as the farthest to the left if that makes sense i'm highlighting great in this area if you missed that okay so we're getting the horizontal windows and so the window then will be a four will be the row array and we're going to index that from our starting position so that's going to be from c to c plus i guess 4 would it be four or three i'm trying to think it's gonna be four yeah c plus four and we can actually make this a variable two just so you know where that four is coming from i'm just gonna call this window length so window length equals four so window length so we're getting four at a time and then what we're going to want to do with that is count the number of ones and twos so ai pieces and you know player pieces what not so what we're going to do with this window is there's a built-in function of lists in python dot count so what i'm going to do is count the piece that we're passing into the score position function and if that equals equals 4 so that would be we win so i guess if window dot count piece equals equals four let's say we have a score that we're defining so score is zero initially we'll say score plus equals like a hundred right so if we ever find four in a row on the board that's a really good board obviously um and so we should try to get that move okay so we have score plus equals 100 and then once we're done we're just going to do horizontal for now and i'll fill in we'll fill in the diagonal and vertical later turn score okay so now we have this score position function that looks at a single board state and gives the score of a hundred if it is a connect four and gives zero otherwise and i guess for the sake of this let's also preference uh three in a rows why not or maybe yeah three in a row so if i'll define this above or i'll define it below lf window.count piece equals equals three and window.count of zero so we should define this as a piece too so empty equals zero this is just how we are storing the values behind the scenes we know that count of empty equals equals one then let's add i don't know 10 to our score so this is three in rows this is four in rows right and then we need to find one more function and that is just like pick best move and that will give taking a board what else does this need to take in i'll start with just the board potentially i guess the piece too yeah why not pick best move so what we're going to do with this is we need to look at all the different columns we can make the move and then basically run the score position on all those columns and then pick the highest score that is returned so we will write that as follows so we have a function called i think valid locations if i'm not mistaken okay no we don't but we can find a function called valid locations real quick so i'm going to do that def valid load or let's just call it get valid locations so this will show us which columns we can actually drop it in and then evaluate from there that is a board we don't need to actually have a piece to figure out if it's a valid location or not and so valid locations is an empty list to begin and for column and range column count we don't have to do the minus one here because this is exclusive that that last value if is valid location of board and column then we want to append that specific column because that is a valid location this is just getting us a list of where we can drop just because we'll be doing a lot this a lot so it's going to abstract it into a function valid locations dot append column and then finally when we're done we'll want to return the valid locations okay so now we have valid occasions so in our pick best move we'll need to use this so valid locations equals get valid locations of the board and let's think what else uh we need to do so what we're going to want to do is for column in valid locations we want to figure out what the row so basically what we're going to do is to make our life easier is we're going to pretty much simulate dropping a piece and then have this new board that actually gets passed into the score position because we actually just taken a board object there so we have to kind of simulate each move to actually evaluate using that function so we're going to do is use our function get next open row and that takes in the board and the column and then this is something that's tricky and not intuitive is we can't just drop the piece into our current board and or we can't even do something like this we can't say like temp board equals board and then say okay let's uh drop drop piece in the temp board and row column and i guess peace we can't use this temp board to simulate the drop piece because what actually happens with the numpy objects that we're using to store all the values is that this temp board ends up pointing to the same exact memory location as the original board so what we actually need to do is make a copy of this board so this is creating a new memory location where our board is sitting and what we're doing is passing that new memory location into our drop piece function so any modifications don't modify our original board hopefully that made sense so drop piece um and then we want to basically find the score of the new board so i guess what happens is this drop piece automatically modifies temp board so we can just go ahead and do score equals score position of the temp board and whatever piece that we dropped and then what we'll want to do is basically keep track of a best score so we'll say best score and we'll set that to zero initially and we'll have best column and that'll just be like we'll just put it as a random choice of the valid locations and i defined valid locations below this so i'm going to just move these remove that down here okay so we have our best score and our best column best score is initially zero best column is just random but what we can do is if this score that we scored with this new board is greater than if score is greater than best score then we can set our new best score to score and we can set the new best column to whatever column we were checking when we got this new score okay does that make sense and then finally once we do this we'll do return best column i think that looks right all right let's try that out real quick so all we should see here is we didn't add that anything too complex we just made it preference winning horizontally and getting three in a rows horizontally so we shouldn't be looking at any sort of vertical maneuver okay so instead of the random location here i'm going to comment this out and do column equals pick best move of the board and i guess we're the ai here so i need to pass in the ai piece come on moment of truth i mean let's just stack on top of okay so now we should see it connect three in a row that's what we're hoping for and it did it cool so i'm going to just stack a top again the way we scored it too it should now preference filling this in after i drop which it did we got something a little bit smarter okay now that we have the score working for the horizontal moves we can kind of expand upon this and make it work for diagonal and vertical as well also do a little probably a little bit of cleaning so it ah where where where's the code okay pick us move struggling okay yeah so this is horizontal so we need to also do vertical vertical is also easy so score vertical do next so that would be force column and range column count the ones that are tricky are the diagonals so we're doing the exact same thing we did with horizontal uh but we're going to do it with vertical and then once we have it for each one we'll actually probably improve how we're actually scoring this make it more like i did in the we did in the other video all right for c and range column count let's see what else we should do we should then get the column array ray which is going to be similar to the last time it's going to be int i for i in list so this time we have the we want to get every row position for a specific column so we can do it like this i believe i think that's what we want and then we'll want to iterate through the windows just like we did last time but now we have to iterate through the the row window so for r in range row count minus three window equals row array c or r to r plus window length and we can copy this once we now that we have the window this code is oh shoot i didn't want to delete it this code is the exact same so i can just add this code and what we're going to ultimately do is abstract away this window counting stuff to its own function so okay so now it should if we run this again it should preference vertical as well so i'm going to block the horizontal and it's still stupid and it's oh it actually did block me that's interesting but i'm going to let it get okay so now it has two verticals in a row it should preference making the move right on top of it next if we did this right and it didn't so i'm guessing we did something screwy it's weird this is weird maybe it's just getting lucky okay it should hopefully win vertically let's see i don't know we did something wrong window count piece score plus equals hundred what did i do wrong here oh shoot i grabbed a row array this needs to now be column array okay sorry about that and also for r and range row count yeah okay that should be good hopefully let's see if it preferences vertical moves now too so i'm blocking the horizontal and now basically once it stacks oh shoot i'm going to block the horizontal again but come on okay let me play it again so you can see exactly what i was trying to show you right so it goes i'm going to block the horizontal and hopefully it stacks on top okay now that it's stacked on top we should because it doesn't have any sort of horizontal options it should stack again and it does and so it now should win vertically okay cool so that we kind of mainly validated that it was getting these windows correctly for the vertical okay so this was getting if we think about the same graph it was looking at like the sliver if we just cut off this one column it it was making a list out of these ah sorry these numbers zero zero two two two two okay now let's do diagonals diagonals are a little bit weird um so we'll start with the positively sloped diagonals so the ones that are low on the left side and rise up to the right side so score positive sloped diagonal all right so let's just run the game real quick just to see that how we're going to do this so basically if you look at my mouse we're going to look at all the possible ways we can do this diagonal slope so we can do like this we can move over do this move over do this move over do this and then rise up one row and do the same thing so basically now we have to um go from like row count minus three as well as column count minus three and do a little bit of manipulating like that and it's a little bit weird to get our windows but it shouldn't be too too bad so let's start with this so we're gonna iterate starting with the rows so for row r and range row count minus three because we're going to be cut off at the top because we're going up each time once and then for c and range column count count minus three because we can't just have a okay so and that's because we're going to the right each time four so we have to cut off early for that as well and then what we need to do is our window is going to now be we're going to do a little list comprehension to do this it's going to be the board at the row position and then at the column position and what we're going to do is we're going to have to add a little bit of value each time so it should start with 0 so we're going to we're going to just add some i plus i and that's going to be for i and range window length so basically what's happening is if you think about what i takes on if window length is 4 i is going to start out at zero so that's going to be just the arth the position at r comma c but then we're gonna this is gonna become one so we're gonna increase positively one and one so we're going to go up and across and that will continue and we'll actually get our window in the proper um we'll get we'll get a windows has a 4 that is a diagonal and this will repeat this window right here we'll repeat for all of these values so we'll end up getting all of the positively sloped diagonals and so we're going to just you can just copy this code here and we can do the same for negatively sloped diagonals so for r and range we'll do the exact same thing oh my gosh stop texting me ah sorry for c in range column count minus three and this is a little bit trickier because now we need to stop start at the top and go downwards so to do that bear with me we're going to want our window to be equal to we'll do a list comprehension again board r so this is going to be the same exact as before but now oh my gosh this is annoying wow i've never what is this crap sorry i'm gonna turn off the volume there we go okay so what i was saying was now the first thing we check is up here so that's gonna be the row plus one plus two plus three to get that position so that's going to be rho plus three and then the c i think will stay the same i know we're going to be going downwards so that's going to be c plus 3 as well to get the position we're looking for because we're going downwards so c plus three do we want c plus three uh yeah that should work sorry i'm just confusing myself now i guess the c stays the same so c is normal and rho plus 3 and then we'll minus i and we will in this case we will plus i for i in range window length i think that should work let's think so if we think about all the different combinations start with this position that's the sear the zeroth column that's good then we go down one the row decreased but the column increased down again row increase but the column decreased down again row increase or row decreases but column increases so i think what we did here is good and we can score that just like we scored the other ones it's a little bit tougher to test so what we'll do to test these is we'll temporarily comment out the horizontal and vertical and just see if it when it has the option and sees the diagonal that it preferences it so i'll make it easier for it so if it puts it here or here we should see on the next time it putting a three in a row okay yep it did i made the moves a little bit quick but as we can continue basically oh shoot don't make that if i put it right here it should if we implemented these diagonals properly connect it yay we can go ahead and uncomment all this now that we've checked that diagonal i guess we should ideally check positively sloped two but i'm pretty sure we're good there okay and now what i recommend we do is we're going to add a couple more different ways we can score this and just so we don't copy all this into each one of these individually let's abstract out a new function called like i don't know uh find it called window evaluate window how about evaluate window and we'll take in the board and the piece to that or i guess not the board we just need the window and i guess the piece for that so this is going to be the exact same thing as we did before so if window dot count piece equals equals four score plus equals 100 alif window.count piece equals equals three and window dot count is empty equals equals one score plus c equals ten let's think what else we should have should also have an lf for window dot count piece equals equals two i guess we it doesn't really matter if we use if or l if i don't think here and window dot count empty actually it does oh no it doesn't i don't think it does empty equals equals two reduced score plus equals five and these values these are kind of arbitrary and i'm fine leaving them as kind of arbitrary guys i don't really need to extract these out into their own variables i guess i could if i wanted to but basically these are the type of values you play around with as you kind of tune and try to make your ai better let's see what else we could do we could also have like i guess inverse piece so this would be the opponent piece so i can just call it component piece equals we'll just say player piece opponent piece equals player piece and if piece equals equals player piece then we just need to switch it so that would be then opponent piece would be equal to ai piece if that made sense what i did i just was saying that okay let's assume we're the ai and set the opponent piece to be player piece but if we're not the ai and we're the player and we're adding the window then the opponent piece should be aips so we could also do is something like if window dot count opponent piece equals equals three and window count dot opponent or win account empty equals equals one and then that means the opponent has a three in a row and we could do like score minus equals eight let's say so it's like we weigh us getting a three in a row more than we weigh the opponent getting a three in a row okay just to finish up this function we should initialize score equals zero we'll return score at the end and this is just going to be so then when we actually go into the score position we can delete these lines and what we'll actually do is do score plus equals evaluate window of window piece and we just copy this line here for each of the other locations that we added these logic cool cool cool all right let's see what happens oh no plus equals none type oh yeah should probably return the score okay so we should be seeing some preference so it should probably want to get this three in a row yep okay will it block me so it doesn't seem to be blocking me let's see if it preferences diagonal downwards i guess it's diagonal that way let's see if it preferences diagonal downwards now cool it's looking pretty good however it's not blocking me so that is an issue so it's not blocking me and the reason it's not blocking me i just realized is if we look at this if we think about how this evaluate window is working we're only calling it for the ai so if the ai ever sees the opponent has three and the window count is one empty is one then the opponent already has three so would be able to win in the next case so in this case it would be preferencing putting three in a row over blocking the opponent so watch what happens if i make this like negative 80. now it's going to preference and i also will have to we'll have to change the best score to be some sort of low negative number to start just so it doesn't screw up if we get a negative here now it should block always block if i ever get three in a row yeah see that cool i don't know if we'll keep this in when we go to mini max and we're gonna go to min max right around right really soon the last thing i want to add before we go to implementing the min max algorithm is just scoring the center columns and you know adding preference for center pieces because that's what's going to create more opportunities with the diagonals and the horizontals is if you have the center the center pieces okay so we can get our center array so score center and i guess this should be i'm going to move this comment right below the score equals zero move this up a bit and now i'm going to add the score center column so we're going to create a center array just like we were kind of making the row or the column arrays within the score vertical so we're going to do equals int i for i and list we want the board and we want every row position but we only want the center column so we can do column count and then we could just you know mainly think about it 0 1 2 3 would be the middle but just to kind of not have magic numbers i'm going to do floor division by 2. so that will get the the middle column and then what we want to do is just do center count i guess equals center array dot count piece and then we'll just add to our score i don't know three points or so let's see i i kind of did some weird numbers here i'm going to add like six for each piece in the center so center or score plus equals six let's say and these numbers are all going to change so don't worry too much about all the score increases this is something you play around with okay let's see if it's preferencing center now it should not have gone there for us referencing center score plus equals six let's really crank up this score real quick and see if it works it's weird yeah something's goofing oh okay sorry i was doing plus equals six but i was it should be plus equals center count times six right so if you had two it'd be twelve plus twelve it would be um if you had three you'd be plus 18 etc okay now it looks like it's preferencing center we'll see should block okay cool now i think it should go center if i'm not mistaken yeah cool and probably should go center again if i'm not mistaken okay i guess that created more diagonals but okay center cool should block and go center cool i think that's accounting for the center all right with that i think we're ready to move on to the min max algorithm and what i think is helpful at least when i was implementing this is i went ahead and just was reviewing the wikipedia page on minimax so you know if you didn't watch the video i posted minibacks is basically a way so that we can look down the branches of any sort of rule based game like chess checkers and in our case connect four so we can basically look down at branches and evaluate using our score function the best branch that we can guarantee getting so what i recommend we do is go ahead and go to the pseudocode and this is all we really have to implement so function minimax node depth maximizing player so the node in the r case is going to be the board depth is how far we want to search down in our game and maximizing player is going to be true for the ai and false when we're looking at the player's move so let's go ahead and implement this and i'll be kind of popping this window back in and out okay so score position is good let's define mini max uh uh okay def mini max so we're going to pass in the board we're going to pass in the depth and we're going to pass up in the maximizing player and i'm going to just do pass for now okay what do we need to do next in this if depth depth equals 0 or node is a terminal node so what is a terminal node in our case well that's whenever a game is one or i guess if we get to the end of the game those would be terminal node conditions so then we want to return the heuristic value of that node and else we want to kind of recursively check our tree like you see in this diagram down in the bottom right and find the best score so that's what's happening down here let's start off though with the i guess if steps equals zero condition that's when we're going to be returning the score position function score okay so what are okay so let's do this if depth equals equals zero or terminal node so this terminal node we haven't defined yet so let's just define it so what we just said was terminal nodes would be us winning the opponent winning or you know you've used all the pieces in the game so let's do those so i'm going to find another function sorry is terminal node and that'll just be true if it is a terminal node and false if it is not now taken aboard so what are the terminal node conditions well we have this winning move function right here so basically we can use this to our advantage so if it's a if this returns true for either of the pieces then it would be a terminal node so those will be two of our conditions so we want to return winning move board player piece or winning move board uh the ai piece or the final condition is that there's no more valid moves in the game so the board is completely filled up so what we can do in this is we can say length get valid locations of the board and that length should be zero if it's a terminal node so this statement as is will return true if it is a terminal node and false otherwise it's nice one liner so what we're now going to do is i guess valid locations equals get valid locations of the board and then we're going to do is terminal will be equal to is terminal node of the board so now we can do if depth equals zero or terminal node or i guess or is terminal okay what do we want to do in this case well there's three conditions so if we are the ai then the first thing we want to do is we want to if it is a winning move for us the ai we want to do if winning move so this is basically just figuring out which case of these three we're in if winning move uh board ai piece then we want to return a high score so return like you know a really big number lf winning move board player we want to return a really our player piece sorry we want to return a very low score and we're going to augment this so it also has a position but you'll see that in a sec okay turn a really low score if it's losing condition and then else this is going to be when we evaluate our score position this is a board that is not a winning condition so and it's not a terminal condition so we actually want to get the score for it so we're going to do and i'm really just following if you're getting confused with what i'm doing i'm doing this step of the mini max algorithm and i'm just finding the heuristic value of the node and these are kind of two edge cases the winning move cases and i'm just handling them separately i guess they could be kind of factored into my mini max but just because you know we're only really looking at the opponents uh it might work if i didn't handle these separately i just in my head it makes it easier if i handle these separately then finally the else condition is when we want to i guess the else condition here would be the game is over so that's just kind of like a weird case where we would return zero i suppose because you can't really do anything from there all right cool so we have the that's part of it and now the final thing is if it is not oh sorry if is terminal that's the is terminal cases either we win the opponent wins or the game is over those are the three cases so i can just even mark this game is over comma no more valid moves the other case is the depth of zero so we can just do else here and that's going to be depth is zero and in that case we want to find the heuristic value of the board so what we're going to do is just do return score positions uh score position of the board and i guess whatever piece we dropped so in our case it would be the ai's piece i guess maybe it would be good to specify this but i'm just going to keep it as the ai piece right now you can fix it if you need to all right so that's the first part of the mini max and sorry if this i hopefully this is not too confusing it'll all kind of fit together once we you know get through the rest of it all right let's finish implementing the minimax so let's do the bottom half of that what i was just highlighting on the wikipedia page and also this should be ai piece the ai and the player without the piece is just only was used for the turn um okay all right so else let's see what does that say if maximizing player then and we returned out of this so this week if you get into this if statement you'll not get into the other ones that we're about to define so if maximizing player and i'm going to just actually i think what will be easiest is if i just keep this on the screen where's the pseudocode pseudocode okay cool all right so we have the pseudocode on the screen and we have the code over here so oh gosh now i can make this the other half okay not too bad all right so if maximizing player i'm following right here it's amazing player then we want to initialize the score to some really low value so what we could do here is i think we actually use if we actually want to specify negative infinity and positive infinity i think we can do instead is we can do value equals math dot infinity that works and i can actually just do negative math dot infinity okay so that covers that and then for each child of node so that's for each position we can drop in our connect four game so what we're going to do here is we're going to do for column invalid locations which i already have to find up here fortunately for us column invalid locations comma row equals get next open row of the board and that specific column that we're looking at and then as before with when we're just doing this the score position function we need to copy the board so board dot copy just because otherwise we'll use the exact same memory location that the board is on so it will really screw up when we're you know recursively doing this okay and we can go ahead and drop a piece in that position v copy row column and we are the if we're the maximizing player then we're the ai piece all right and so what's going to happen here is we're going to have a new score score which is going to be equal to the mini max which is going to be the max so if you look at this it's going to be the max of the current value which is negative math infinity and minimax of depth minus 1. so new score i guess would just be the max of current score which was value and minimax of this is our recursive call here of the board copy depth minus 1 and i guess we do false now because we're no longer the maximizing player now the minimizing player cool and then finally what we'll want to do is return the new score so that's the max of the value and the minimax and we want to do the same thing for the minimizing player so else we know we are the minimizing player and that is the code right here we want to initialize the value to positive infinity and now copy the same thing for column invalid locations same as before row equals get next open row of board and column we need to copy the board again because we don't want to use that same memory location otherwise we're going to have issues when we try to recurse back up the tree drop piece equals oh shoot board copy row column and now we are going to be the minimizing player so what we need to do is actually make this piece the player piece and same before the new score though is going to be because when we are if we remember this the minimizing player is trying to take the lowest value so if you see this node right here this 10 10 and plus infinity 10 is the lower node so it always take that so in the minimizing player case we want to take the the new score would be the min of the value and minimax of word copy depth minus one and then we want to do true for now switching to the maximizing player so this this false and this true is what's allowing us to switch back and forth between this maximizing player and the minimizing player in the context of the mini max algorithm and make sure you understand the mini max algorithm watch the video that i've posted previously on it or just kind of review this wikipedia page to kind of get a feel for how it works because that is crucially important for understanding how this new ai is working okay and we want to return value or the new score cool i think this is about it right i think so however the one issue i have right now as i look at this is we're returning the score which is good to know like how good of a move can we make at a certain location but we also need to augment the score with the column that produces that score so in addition to just producing or just figuring out what the best score is we need to figure out which column produces that score so to do that we're going to just actually get rid of this max and min we're going to just break it out a bit just it's easier to keep track of what we need so we're going to say that the new score is equal to the mini max that same thing for this and instead of doing it in just one line we're going to just bring it to the new line so if new score is greater than the value value equals new score so this is the same thing right now if we kept it like this this is the same thing as the max the one liner we just had but we also want to do something like column so this this column right here is like the best column you can get so we could initialize this to just a random choice of the valid locations random.choicevalid locations and same thing for down here column equals random.choice of the valid locations all right so now what we have is column equals whatever column we are currently iterating on that gave us this new score when we recursively did the mini max algorithm so that would be call and okay that looks i think this is good here we also know need to now implement it down here same thing so this time it would be if new score is less than the value we want to do value equals new score and column equals call and we want to okay and then finally what we have to do is instead of just returning the score we want to return both the score and the column that produces that score so we'll return a two pole so we'll do column and new score or i guess value is our final thing that we return same thing here turn column and value and then up here these are the terminal conditions so if we get a terminal condition we don't actually know which column produces the terminal location because it's just looking at the board particular but we can get that very easily because we would get it through this so basically what we can do yeah so basically when we returned this new score all right yeah so we're gonna augment these none just has to basically all be the same format even though these wouldn't actually produce a column value none zero okay so we don't know which column produces those and basically what we need to do now is if we want to adjust the score we need to get the first index we need to get the first index here is that it i think that might be it and then i guess right here too this would just give us a score so in this case two we need to do none comma score positions so basically we're always because this is a recursive function we always need it to return the same format so we're now just saying turn us the the best score in the column that produces the best score cool i think this is it we can delete this pass so let's go ahead and real quick test out our s you know our first attempt at the mini max and to do that we can just go down to here and instead of best pick best move what we're going to do is do column equals minimax of zero or actually what am i thinking it's bored then it is depth so let's just for sake of simplicity we'll start with depth equals two and then we are the maximizing player so that is true let's see if this works nope damn it is the location where we get the error where do we get this error too many indices for array 267. if is valid location oh shoot let's see oh all right yeah sorry i was just trying to get the column out of here but we actually get the column and the mini max score so i kind of have to unpack the tuple now it's not just one thing it returns okay all right so what we should see now i think at depth of two it should be smart enough to block oh i guess not yet i was hoping that it would block me from getting this situation where i could win on either side i'm gonna try increasing the depth to three and see if it does that all right so what i'm trying to see is if it's looking ahead in the future and how we can tell that is if it's looking ahead in the future it should prevent us from getting winds that can we can kind of have two spots where we win so if i drop it here it needs to block me so that i can't get the horizontal three on the bottom still is not doing it what the heck it might be some sort of error i'm going to just try one more time maybe seeing if it's not looking far enough in the future and do this depth four so this is how far of a branch it's checking yeah it's definitely having some issues here it did win that time but it's always picking zero it's always picking the farthest to the left to move why is that the reason it is always picking the farthest to the left even at depth of four is let's go up and just check why is my mouse being so fidgety today won't scroll up for me is okay i see it if we look at where we are returning this column and value it is inside of the for loop so it's on the first column basically and also we initialize the value to negative infinity so it's basically okay it's figuring out that whatever score is on the farthest to the left column it's going to be bigger than negative infinity so it's setting that column here and then returning it immediately what we need to do is just backspace this once and hopefully now it should be pretty good and i probably don't even need to go depth of four think about i'd go to like depth of three and it will still block just fine okay i kind of want to go first just because i feel like i'll see it easier okay so now it needs to block one of my horizontals which it does now which is good it needs a block cool that's looking good okay let's see if it's looking into the future it should put its piece right here next if i put it up here cool it's looking good it's like smart enough to know that like this position right here where my mouse is was the best spot to put it because that gives us a win here and here so if it looks in the future and knows it's going to win it's not going to be easily blocked by me looking good cool all right uh real quick um just because i'm not positive how i feel about the values that i currently set all these things up here in the score positions function too or i guess evaluate window i'm going to just change these to what i set when i was playing around with this so the window count equals 4. this actually is pretty irrelevant at this point i think the only reason this would play in is if we set the depth equal to zero and it had to use just this evaluate window so i'm going to leave this in here but this know that this is probably not too important anymore i'll still keep this value okay three and one so three in a row i'm going to give this a weight of five two in a row i'm going to give this a weight of two but remember that it's in multiple directions so you could have multiple twos in a row and it adds up uh opponent equaling three and one this is not as important anymore because and also i have to kind of make this a smaller value because it's this is factored into the terminal condition case so it will look ahead in the future and make sure that it's not losing so i'm going to make that minus four then the other thing that i change to is instead of making this times six i'm going to make the center column worth plus three and these are values you should play around with so if you really want an expert level ai play around with these values but honestly i feel like this ai right now like i'm gonna just play a game like try to win i guess it's still not amazing i guess it's depth three right now but yeah like look it like it's making pretty good moves and it's like gonna know to win if i put it there i think this is a pretty good ai at this point uh to make it even better what we can do is increase the depth of how far it's looking uh where is that so if you make it four it's looking if we think about how the mini max works uh it's looking farther down in this tree so this four would represent how far it's looking down it's like right now we're at depth four so that's why it's looking this far down and you can think of this graph as being all the possible ways a connect 4 board can work so depth 4 let's see this would be even better of an ai because it's looking farther in the future and we could even make this better let's just go to depth 5 and see what happens okay i kind of have a problem i don't know if you noticed it but i did depth five and it seems like it's just oh it finally made a move it just takes forever to make a move at this point try to like think about why that is you can even pause the video if you're trying to like think about why that is but yeah i mean if we're going it if we're doing depth five and we think about what the minimax is doing it is branching out at every step so from the top of my hands is the terminal node every time we do a higher oh my god i'm trying to be even every time we do a higher depth we expand exponentially more branches so this is basically like seven to the fifth power or maybe even six power because we also look at zero so it is a lot of branches to expand so it's going to run slowly so 5 is probably the max you can really do otherwise if you go to 6 your computer is just going to explode not really but basically it's probably going to freeze up and crash on you if you go to 6. so this gets us to the final thing i'm going to cover in the video and that is the alpha beta pruning so it's basically allowing us to go to deeper depth by eliminating a lot of the moves we have to look at and i didn't actually cover this in the video i posted about how does a board game ai work so check out a video i posted in the description on kind of the overview of it uh at a very high level yeah basically it is figuring out that there are certain branches that a computer is just or an ai is just not likely to take if it realizes like in a move if it goes a certain position that it loses it's not going to go down that branch so you can just not look further down into the future there yeah so check out that video real quick on alpha beta pruning and then once you watch the video we will implement alpha beta pruding to allow us to more quickly search the depths we want okay alpha beta pruning so just like when we are doing the mini max what i recommend is look going to the alphabetical pruning wikipedia page and there's also pseudocode there so i think that's helpful the pseudocode is helpful to follow but it's just basically an adaptation of what we've already seen with the minimax but now we have these two additional parameters alpha and beta and you should understand what these are now that you've hopefully watched the video or already had some knowledge about how it worked so we're gonna go up to our sorry i have this on another screen because i don't want to screw this step up um you should yeah go to your mini max function and we're going to be adding our alpha and our beta to that function and so we're also going to have to pass this into all of our other calls of it alpha beta and right here alpha beta and we'll also have to do it down here when we actually call this and if we remember the alpha and beta should be initialized to negative math infinity and this is just i'm looking at right here the initial call right here negative infinity negative math infinity math infinity and true so that's the initial call and then all right so let's define this new mini max function with alpha beta this should be a comma here all right so let's see what it's doing so basically all we're doing is that in addition to getting our max value we also are just evaluating the alpha so we can do this below the new score stuff so alpha equals max of the new score or i guess the value at this point we can do value because it would have changed value and alpha and then which we want to do is or i'll just do it in the same format that wikipedia is doing so alpha and the value okay and then what we want to do is if alpha is greater than or equal to beta then we want to break out of our loop and that is good and then same thing down here now we are the minimizing player so we want to do beta equals the min of the beta and the value and if we find that alpha is greater than or equal to beta then we've reached a condition where we're going to just break off because we don't need to look any more further in the tree and we'll break so i think that's all we need to do to implement alpha beta it's pretty straightforward after you have the mini max up and so let's run this real quick i'll probably get an error we'll see ah shoot i need to not use my track pad because keep accidentally dropping pieces okay it has a piece there this should drop quickly if before depth five was taking a while now that we did this it should drop quickly which was pretty good it was pretty good reasonable more reasonable than it was before for sure so i think that is looking pretty good and what we can honestly even do is now that we're searching so far what we can do is we have that initial time delay i can actually just comment that out at this point so this should be pretty quick and we're going at depth 5 which is pretty far down yeah it's not bad i think even if we go to depth six it's not too bad either but like before if we went to depth six it would take forever we'll see it also depends on what type of processing power you have so i'm going to try depth six here it's taken a bit for sure did i do this right it but not nearly as long as it was taken before yeah the the alpha beta pruning looks good and one thing i just realized is that um you're gonna see more performance gains kind of as the game goes on further because there's gonna be at the start of the game there's more possibilities and less places that can break out of that alpha beta thing but as a game goes kind of on you'll have more breaks frequently so the performance gains will be definitely seen as the game continues on all right that's we're going to end um what i would say too is like we have a pretty dang good ai as is but i didn't there are additional things we could do to make it even better so a couple ideas that i'm thinking about is uh one thing is that currently it doesn't weight lower like so if you think of the game i'm gonna run it real quick if you think of these rows down at the bottom it's like this first row here the second row etc it should we should wait um we should wait what am i trying to say we should wait the lower wins so like when you have three in a row that's lower that should probably be weighted higher than if you have a three in a row that's at the top here and that's because you know as the game fills up people have to push them on the lower rows before the upper rows so like if you are strategically playing as an ai you would want to like put your pieces down lower so you get more chances to win also what i recommend is you can make it even better than that by watching the even odd connect four strategy that i posted that's like a really cool strategy that you could try to implement with the ai um you could also let me think i mean there's other things you could play around with off the top of my head i'm kind of spacing right now but the values all could be tuned of the scores um but yeah just have fun with it have fun thank you guys for watching and peace out\n"

Random Videos