What's Behind Port Smash - Computerphile

PortSmash: A Side-Channel Attack on Modern CPUs

PortSmash is a side-channel attack, described in the paper "Port Contention for Fun and Profit", that exploits the execution resources two logical cores share when hyper-threading is enabled. Inside a core, decoded micro-operations are sent through a small number of scheduler ports to execution units such as ALUs, a multiplier, or a divider, and both hardware threads compete for those same ports. By timing its own instructions, an attacker running on one logical core can tell when the other thread is using a given port, and from that infer which kinds of operations the other process is performing.

To understand how PortSmash works, consider two programs, A and B, running at the same time on the two logical cores of one physical core. Program A is the attacker's spy; program B is the victim. If both need the same execution unit, such as a single multiplier, they have to take turns and each runs more slowly. So if program A keeps issuing multiplies and times them, a slowdown indicates that program B is multiplying too, and full speed indicates that it is not.
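The scenario above can be sketched as a toy simulation. This is only an illustration under stated assumptions, not a model of any real CPU: each thread tries to issue one micro-op per cycle, a port accepts one micro-op per cycle, and conflicts are settled by alternating priority, as in the round-robin described in the video. The function name `spy_cycles` and the port numbers are hypothetical.

```python
def spy_cycles(spy_port, spy_count, victim_pattern):
    """Count cycles until the spy has issued spy_count micro-ops to spy_port.

    Toy model (an assumption, not real hardware): one micro-op per thread
    per cycle, one micro-op per port per cycle, round-robin on conflict.
    victim_pattern is the repeating list of ports the victim's micro-ops
    need; an empty list means no victim is running.
    """
    issued = 0
    cycles = 0
    victim_index = 0
    spy_has_priority = True
    while issued < spy_count:
        cycles += 1
        victim_port = (
            victim_pattern[victim_index % len(victim_pattern)]
            if victim_pattern
            else None
        )
        if victim_port == spy_port:
            # Both threads want the same port: only one may issue this cycle.
            if spy_has_priority:
                issued += 1
            else:
                victim_index += 1
            spy_has_priority = not spy_has_priority
        else:
            # No conflict: spy and victim both make progress.
            issued += 1
            victim_index += 1
    return cycles


if __name__ == "__main__":
    print("spy alone:           ", spy_cycles(1, 64, []))
    print("victim on other port:", spy_cycles(1, 64, [5]))
    print("victim on same port: ", spy_cycles(1, 64, [1]))
```

In this model the spy needs 64 cycles when it runs alone or when the victim uses a different port, and 127 cycles when the victim hammers the same port. That difference in elapsed time is the side channel.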

The attack relies on patterns in how long the spy's own instructions take on a chosen port. The spy repeats its measurement many times while the victim runs. The resulting sequence of timings is a noisy signal that tracks when the victim was using that port, and therefore what type of instructions it was executing at each point.

The researchers demonstrated the technique against OpenSSL, a widely used cryptographic library. Their spy program ran alongside OpenSSL while it performed an operation with a private key, and recorded a timing sequence that reflected contention on the ports it was probing. By applying signal processing to this noisy sequence, they were able to recover the private key OpenSSL was using.
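The video does not say which signal processing the researchers applied, so the following is only a generic sketch of the idea: smooth a noisy latency trace with a moving average, then threshold it into "contended" and "not contended" samples. The function names, the window size, and the cutoff are assumptions chosen for illustration.

```python
def moving_average(samples, window):
    """Average each run of `window` consecutive samples to damp short spikes."""
    return [
        sum(samples[i:i + window]) / window
        for i in range(len(samples) - window + 1)
    ]


def threshold(samples, cutoff):
    """Map each sample to 1 (slow, port contended) or 0 (fast, port free)."""
    return [1 if s > cutoff else 0 for s in samples]


if __name__ == "__main__":
    # Hypothetical latencies: ~100 when the port is free, ~160 when the
    # victim is using it, plus one unrelated noise spike (190).
    trace = [100, 100, 190, 100, 100, 160, 160, 160, 160]
    print("raw:     ", threshold(trace, 135))
    print("smoothed:", threshold(moving_average(trace, 3), 135))
```

Thresholding the raw trace mistakes the isolated spike for contention, while the smoothed trace only flags the sustained slowdown. A real attack would then have to map such a sequence back to the victim's key-dependent operations, which this sketch does not attempt.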

The impact is significant because the attacker gains insight into the internal state of a process without reading its memory or code. This is most damaging where cryptographic keys are in use, such as secure online transactions or communication protocols.

To mitigate this threat, one recommendation is to turn off hyper-threading completely in sensitive settings, or at the very least to modify operating systems so that they can switch hyper-threading on and off for each core depending on which process is running there. Either way, an untrusted process would no longer share a core's execution ports with a sensitive one.
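As a small illustration of checking this on a machine, the sketch below reads the simultaneous multithreading (hyper-threading) state that recent Linux kernels expose in sysfs. It assumes the file `/sys/devices/system/cpu/smt/control` is present; older kernels and other operating systems will not have it, in which case the reader function returns `None`. The helper names are hypothetical.

```python
from pathlib import Path

# Assumption: a Linux kernel new enough to expose the SMT control file.
SMT_CONTROL = Path("/sys/devices/system/cpu/smt/control")


def smt_enabled(control_text):
    """Interpret the control file's contents; only "on" means SMT is enabled.

    Other values such as "off", "forceoff" or "notsupported" are all
    treated as "not enabled".
    """
    return control_text.strip() == "on"


def read_smt_state(path=SMT_CONTROL):
    """Return the raw SMT state string, or None if it cannot be read."""
    try:
        return path.read_text().strip()
    except OSError:
        return None


if __name__ == "__main__":
    state = read_smt_state()
    if state is None:
        print("SMT state not available on this system")
    else:
        print(f"SMT control: {state} (enabled: {smt_enabled(state)})")
```

This only reports the global setting. The per-core, per-process switching suggested in the video would need support from the operating system's scheduler and is not shown here.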

In conclusion, PortSmash is a powerful side-channel attack that can extract sensitive information from processes sharing a physical core on modern CPUs. By timing their own instructions and applying signal processing to the resulting sequences, attackers can gain insight into the internal state of other processes. As encryption continues to grow in importance, it is essential to develop countermeasures against PortSmash and other side-channel attacks.

The PortSmash Example: A CPU-Specific Technique

PortSmash exploits the execution ports that hyper-threads share in order to learn about other processes. The researchers' example code, which can be downloaded from GitHub, demonstrates this with a spy program that times its own instructions on specific ports of the CPU core.

To see how this works, consider an Intel Skylake or Kaby Lake CPU with hyper-threading enabled, the CPU families the example targets. Each physical core presents itself to the operating system as two logical cores, which share one set of execution units. This lets the core keep its execution units busier, but it also creates opportunities for side-channel attacks such as PortSmash.

The demo runs on Linux and involves two processes: a spy and a victim. The spy repeatedly executes instructions chosen to tie up a particular port and times how long they take. The victim is a separate process, OpenSSL in the demo, running on the other logical core of the same physical core.

The spy never observes the victim's instructions directly. It monitors only its own timings: when a measurement takes longer than it does on an otherwise idle core, the victim must have been competing for the same port. The pattern of fast and slow measurements over time lets the spy infer the type of instructions the victim is executing, and from that something about its internal state.
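The shape of such a measurement loop can be sketched as follows. This is an assumption-laden illustration only: Python's interpreter overhead is far larger than any port-level effect, so this code cannot detect port contention. It shows the structure described in the video (run a fixed block of about 64 additions, time it, repeat) and nothing more. The names `add_block` and `collect_trace` are hypothetical, and the researchers' spy program is not reproduced here.

```python
import time


def add_block(n=64):
    """Stand-in for the spy's fixed block of add instructions."""
    total = 0
    for _ in range(n):
        total += 1
    return total


def collect_trace(work, samples):
    """Time `work` repeatedly and return the elapsed nanoseconds per run."""
    trace = []
    for _ in range(samples):
        start = time.perf_counter_ns()
        work()
        trace.append(time.perf_counter_ns() - start)
    return trace


if __name__ == "__main__":
    baseline = collect_trace(add_block, 1000)
    print("fastest run (ns):", min(baseline))
    print("slowest run (ns):", max(baseline))
```

A spy along these lines would first record a baseline on an idle core and then compare it with traces taken while the victim runs on the sibling logical core.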

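The key insight behind PortSmash, and the reason for its name, is that several execution units sit behind the same scheduler port. In the simplified diagram used in the video, the multiplier shares a port with an ALU, so a micro-operation that uses the multiplier means the ALU on that port cannot be used at the same time. By choosing instructions known to tie up one specific port, or a group of ports, the spy can probe whether the other process is trying to use that port.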

Is PortSmash Practical for Hackers?

While PortSmash is a powerful side-channel attack, its practicality depends on various factors, including the target system's CPU and the attacker's goals.

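To demonstrate feasibility, the researchers published example code on GitHub that targets OpenSSL. The demo runs on Linux and is written for Intel's Skylake and Kaby Lake CPU families. According to the video, the attack itself is CPU-specific rather than operating-system-specific, and something similar could probably be done on other CPUs that implement hyper-threading once the spy is calibrated for them.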

However, running the demo requires a specific setup: a Linux machine with a Skylake or Kaby Lake CPU that has hyper-threading enabled. More fundamentally, the spy has to be running on the same physical core as the victim. Otherwise it will not have full access to the information.
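Because the spy and the victim must share a physical core, a first practical question is which logical CPUs are hyper-thread siblings. On Linux this is exposed in sysfs; the sketch below parses that file. It assumes the usual `thread_siblings_list` format (comma-separated CPU numbers and ranges such as `0,4` or `2-3`), and the helper names are hypothetical.

```python
from pathlib import Path


def parse_cpu_list(text):
    """Parse a sysfs CPU list such as "0,4" or "0-1,8-9" into a list of ints."""
    cpus = []
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            low, high = part.split("-")
            cpus.extend(range(int(low), int(high) + 1))
        else:
            cpus.append(int(part))
    return cpus


def thread_siblings(cpu):
    """Return the logical CPUs sharing a physical core with `cpu` (Linux only).

    Returns an empty list if the sysfs file cannot be read.
    """
    path = Path(f"/sys/devices/system/cpu/cpu{cpu}/topology/thread_siblings_list")
    try:
        return parse_cpu_list(path.read_text())
    except OSError:
        return []


if __name__ == "__main__":
    print("logical CPUs sharing a core with CPU 0:", thread_siblings(0))
```

On Linux, a process can then be pinned to one of those logical CPUs with `os.sched_setaffinity`. Whether an attacker can arrange this on someone else's machine is exactly the practical limitation the video raises.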

Once those conditions are met, the spy runs alongside OpenSSL and records how its own timings change while OpenSSL works. By analyzing these timing sequences, an attacker can extract the private key OpenSSL is using.

In conclusion, PortSmash has been shown to work with published proof-of-concept code, but a real attack still depends on the target: it needs a suitable CPU with hyper-threading enabled and a spy process scheduled on the same physical core as the victim.

There's been some noise over the past week about a paper that's come out and an exploit. The paper's called "Port Contention for Fun and Profit"; people have been referring to it as PortSmash. What it does is this: you've got OpenSSL running and using a private key, and you've got another program, which they call the spy program, which runs alongside it and is able to extract the private key from the OpenSSL program, even though it shouldn't be able to do that. So I thought it was interesting to have a little chat about the way it's exploiting the CPU. Again, like Spectre and Meltdown and quite a few of the exploits that have turned up over the past year, it's exploiting the fact that people have tried to make CPUs run faster and faster and squeeze every last ounce of speed out of the CPU technology that's there. What this is specifically targeting is something that's in most Intel CPUs, and AMD's, which is hyper-threading.

So what is hyper-threading? Well, normally when we think about a computer system, we have a CPU in there, and originally that CPU would execute one single stream of instructions and process data with them. You could have two CPUs in there, so you've got a multiprocessor system or a multi-core system, depending on how you wire them up, and then you could have two separate streams of instructions being executed. The way those CPUs are designed, each instruction has to go through three stages. In the CPU itself there are more, smaller stages, but we can think of three broad ones: we fetch the instruction from memory, then we decode it to work out what we actually want it to do, and then we execute it. To make the CPU run as fast as possible, you end up with various execution units in your CPU which do various things. There might be an arithmetic logic unit, which will do addition and subtraction and various logical operations.
There might be bits that can load and store values from memory. There might be bits that can do various other sorts of calculations: multiplications and so on, address calculations, floating-point operations, vector processing and so on. So you have lots of these execution units in your machine, and one of the things you got was a superscalar architecture, where you'd fetch two instructions and execute them at the same time, providing they were using different parts. You could fetch a value from memory while adding a value onto another register, as long as they're using separate registers and so on. So if we draw a picture, you've got some logic here which we'll call decode, and going into that you've got a stream of instructions coming from memory. You're feeding them in there, and this is actually breaking them up into a series of what we call micro-operations that do different things. One x86 instruction may get broken up into multiple micro-operations. For example, to load a value from memory, add that value onto a value in a register and store the result back into the same memory location is three operations, so it gets split into micro-operations which use different execution units.
Some have to happen sequentially; some can be done in parallel, depending on what you're doing. So we end up with a series of execution units. Let's say we've got an ALU, and we might have, say, a division unit in there. We might have another one with an ALU; it might have some things to do vector-type stuff. We've got another one which has got another ALU and a multiplication unit on there. And there are various ports that these are connected to. You've got port one here, which connects to this set of operations, and port two, we'll say, here (this is a generalized version), which is connected to these operations.

Q: Are these physical ports, like physical wires?

They'll be parts inside the CPU, the way that things are connected up. And this block is a sort of scheduler, which is getting the decoded micro-ops from this section and sending them to the right ports as they become available, to cause the right operations to happen in the best order to make most use of the system. You'd have a few more over here; say this one has got a load port, and so on. So what you can do is start pulling in multiple instructions here, and as long as they're not depending on values that previous instructions have created and haven't completed yet, you can schedule them on different parts of the unit. So if you had one instruction which adds the value one onto EAX, you could put it onto this port; if the next instruction is adding something onto EBX, you could put it onto that port (they're registers within the CPU), and they could execute at the same time.
But the problem you've got is that sometimes you get a sequence of instructions which is sequential: you add one to a value in a register, then you multiply that register by two, and so on. You've got to execute them in order, so you can't always make full use of your available execution units down here in the CPU. So the idea, which happened many, many years ago and fell out of favor and then was brought back with the Pentium 4 in the mid-2000s, and has existed since on various CPUs from both AMD and Intel, is hyper-threading. You say: OK, this is only a single core, but let's make it present itself as if it was two cores, two logical cores. We've got one physical core with one set of execution units, but we have it appear to the operating system as two logical cores, so the operating system can have, as far as it's concerned, two independent programs, threads, whatever, executing on there. So there'll be two streams of instructions executing, and we'd have another stream of instructions coming into the decode logic. Then the CPU's got a better chance of keeping things running at the same time, because it can run an instruction from here, but if it can't schedule that, it might be able to schedule an instruction from the other stream. You may get some interesting things. For example, on the one we've drawn, we've only got one multiplier and only one load and store unit. If both of these are trying to do a multiply, then one will have to wait for the other to complete, and the way the CPU might do that is a sort of round-robin: on the first clock cycle this one gets the multiply, on the second clock cycle that one gets the multiply, and so on.
So that's the basic idea behind hyper-threading: you've got two logical processors that are used by the operating system to schedule the jobs on your computer, but they're executed by one physical core on the CPU.

Q: So hyper-threading is different to multi-threading?

So multi-threading is the idea that you split your program, or your programs, into multiple threads of operation, and then they get scheduled by the operating system either onto different CPU cores, if you've got multiple ones, or onto one single core by executing a bit of thread one, then a bit of thread two, then a bit of thread three. Effectively, it's like you could watch multiple programs on YouTube at once by chopping between the different programs and watching bits one after the other. It'd be quite garbled, watching multiple Computerphiles in that sort of way.

Q: So that's a way of doing things in software and programming?

Yeah. Hyper-threading is a bit more hardware. So the idea is: OK, you've got these different threads of execution. If you've got multiple cores, multiple processing units, then you can schedule each of those threads onto each of the cores and have them executing at the same time, with a few limitations on access to memory and so on. With hyper-threading you say: OK, we've got two threads of execution happening at the same time, but we've actually only got one physical set of units to do it. So it's the hardware that's doing the scheduling, because it can do it at a finer grain than the operating system can.
The operating system is still scheduling across those two logical cores, but the hardware can then say: well, actually, this one is trying to multiply and this one is trying to add, so I can run them at the same time; whereas this one is trying to multiply and this one is trying to multiply, so I need to sequence them. So it can actually start to do a finer-grained sort of threading operation and knit them together.

Q: So where's the problem come in, then?

So the problem comes in like this. Let's say we've got a program where we want to find some information about what it's doing; let's say, for this program here, we want to know what sort of instructions it's executing. What we could do, for example, if we wanted to find out whether it was executing multiply instructions: on the example we've got here, we've only got one multiply unit, so if this one is trying to execute multiply instructions and this one is trying to execute multiply instructions, they're going to have to take turns to execute those multiply instructions, one after the other. And if the one we're trying to find out about isn't executing multiply instructions, then this one will be able to execute its multiply instructions one after the other. So what the PortSmash paper's authors have done is write a program that will execute certain types of instructions in a loop. They have a repetition of about 64, let's say (there are various different ones), so 64 add instructions to make use of all the ALUs. On the Intel CPU there are four of them that it can make use of, so four continuous adds should all execute at the same time if nothing else was running on that CPU, and it times how long they take to execute. It does that and gets an idea of how long they take, and then you run the same thing at the same time as the other program is running. If it takes more time to execute than it did on its own, then you know the other program must also be executing some add instructions. So by looking at which of these bits are being used, by running instructions, you can find out what type of instructions are being executed on the other side.

Now, the reason why it's called PortSmash is this. We've drawn this as having one multiplier, but that's also on the same port as an ALU, for example. These are all connected to one port of the scheduler within the CPU, so if we wanted to use the multiply bit of this CPU, we have to go out of port 2, which means the ALU on port 2 can't be used as well; you can only use one of the things in this column. Same here, for example: if we want to use a divide, we can't do any ALU processing or vector processing. So we could run instructions that we know will tie up one of these specific ports, or will tie up a group of them, and then we can see whether the other program (providing we can get it scheduled onto the same physical execution unit, which isn't impossible to do) is also trying to use parts of the system on that port. What the PortSmash example program does is cleverly use certain instructions which tie up a particular port on the CPU core, to see whether that port is being used by the other program, and by measuring the time we can see whether that has happened. So we've got this side channel where we can get insight into what the other process is doing as a black box: we say, OK, it must be trying to execute this type of instruction because it's interfering with our use of this port, or it isn't interfering with our use of this port.
So what they do is run this alongside OpenSSL while it does the encryption task it's been set, and it can measure what type of instructions OpenSSL is trying to execute. What it ends up with is a series of timing sequences that show how long things are taking at particular points: sometimes it'll be running at full speed, at some points it'll be running slower. That gives what they call a noisy signal, and with some signal processing applied to it, they can actually extract the private key that was being used by OpenSSL, purely from watching the timings. So what they've demonstrated is that by running a program they can monitor enough information, because they can see what the other CPU is doing by what their own program is doing. That is, if the other program is trying to multiply at the same time as they're trying to multiply, and there's only one multiply unit, it will slow both programs down, and you can detect that. They can start to work out what operations the other program must be doing, then work out what that would mean in terms of what that program is doing, and backtrack from that to extract information that ideally they shouldn't be able to access.

So the upshot of this is that one of the recommendations is that, in certain circumstances, you might want to turn off hyper-threading: either completely, and just go back to having four physical cores that only execute four separate threads rather than four physical cores executing eight logical threads, or at the very least modify things so that the operating system has the ability to turn hyper-threading on and off on each processor core, depending on what process is running on it. For some processes it doesn't matter, and extracting information from them wouldn't be that important, but for others, such as encryption programs, you really don't want this sort of side channel there.

Q: Is this operating-system specific, or what's the deal there?

It's not operating-system specific; it will be CPU specific. The example they've got is for the Intel Skylake and Kaby Lake CPU families. You could probably do something similar with other CPUs that implement hyper-threading. You would have to calibrate your system depending on that, but that's not a problem. It's not implementation specific; you just have to tailor it to the machine you're looking at.

Q: Is it a practical thing for hackers to do? Is it easy for them to do?

The example code's there: you can download it off GitHub and run the demo on a Linux machine. I don't have one with the right sort of CPU here to demo it, unfortunately. There is potential to do this, but there are limitations on what you can do with it. You need to have your spy program running on the same physical core as the other program, otherwise you won't have full access to the information. I'm sure in the right circumstances you could use it to get information out, if it hasn't already been done.