February 13, 2021

On the Mathematics behind Elliptic Curves over Finite Fields

A guided tour of elliptic-curve math—point addition, multiplication, finite fields, and why 256-bit ECC security is so hard to break.

CryptographyMathSecurity

An elliptic curve is comprised of all points satisfying the following equation in Weierstrass form:

y2=x3+ax+by^{2} = x^{3} + ax + b

with the condition:

4a3+27b204a^{3} + 27b^{2} \neq 0

The above condition exists to ensure that the curve doesn’t have a cusp, or a point in the curve where it must reverse direction. Some examples of elliptic curves:

Elliptic curve graphsElliptic curve graphs

An interesting property of these curves is that they’re all symmetrical about the x-axis. This is because yy is squared so there will always be two solutions to the equation given any xx value (except for y=0y = 0):

y=±x3+ax+by = \pm\sqrt{x^{3} + ax + b}

So why are we talking about these curves? Well, these curves are actually critical to modern day cryptography. Elliptic curves are used in Elliptic-curve cryptography (ECC) providing a more efficient way of doing public-key cryptography compared to the very popular RSA and classical Diffie-Hellman algorithms. To achieve good security with RSA, one needs to use a large key size, typically 4096 bits long. This however has a performance cost. ECC can achieve stronger security than 4096-bit RSA with just 256-bit keys! ECC can be used for encryption but you wouldn’t really do that since you’re restricted on the size of the plaintext (non-encrypted data). Where ECC is most applicable is in key agreement, most notably used in the Elliptic-Curve Diffie-Hellman (ECDH) algorithm. We’ll talk about ECDH and digital signatures with ECC in another post.

Point Addition

Today I want to focus on the mathematics behind ECC because it’ll give us insights into how we can generate secure private and public keys and thus perform key exchange to establish a secure connection between two parties.

We’ll start with addition, adding two points on the elliptic curve. This may seem odd at first, but stick with me, eventually it’ll make sense why we care about adding points on a curve. Let’s take the following curve that’s used in Bitcoin, Secp256k1:

Elliptic curve graph AdditionElliptic curve graph Addition

On an elliptic curve, we add points geometrically. Place PP and QQ on y2=x3x+1y^2=x^3-x+1, then draw the straight line through them; it meets the curve again at a third point, which we call R-R. Elliptic curves are symmetric about the xx-axis, so “negation” is just a vertical reflection—flip R-R over the xx-axis to land on RR. That reflected point RR is the sum: R=P+QR = P + Q.

Point Multiplication

With addition defined, let’s take it a bit further and look at multiplication. We can actually just define multiplication as point addition as follows:

n×P=P+P+P+...+Pn \times P = P + P + P + ... + P

We add PP to itself n1n-1 times. On the curve this proceeds as follows: we draw a tangent line from PP and find the point where this tangent line intersects the curve. We then reflect this point about the xx-axis and arrive at our new point, 2P2P. We can repeat this process to find 3P3P, by drawing a tangent line from 2P2P, finding the new intersection point with the curve and reflecting it about the xx-axis.

Elliptic curve graph multiplicationElliptic curve graph multiplication

Interlude 1 - Modular Arithmetic

Before diving deeper into these elliptic curves, I want to spend some time talking about a very important operator used almost everywhere in cryptography: the modulo operator. The modulo operator returns the remainder after dividing by a number. For example, 63mod1763 \mod 17. The question we ask here is how many times can 17 go into 63. Can it go in once? Yes, 17×1=1717\times1 = 17

How about twice? Sure. Three times? Yep, 17×3=5117\times3 = 51. Alright what about four times? 17×4=6817\times4 = 68. Alright so we can’t put 1717 into 6363 four times.

The most we can divide into 63 is 3 times. The operator returns the remainder after performing division. So recall 17×3=5117\times3 = 51. What’s our remainder? That’s 6351=1263-51 = 12. Thus, 63mod17=1263 \mod 17 = 12. Now, you can find calculators online that can calculate the modulos for you, I just wanted to walk you through the steps here.

What’s so special about this operator? Well, for one it’s actually not easily reversible. If I just told you what the remainder was but hid 63 from you, you wouldn’t be able to reverse this operation to find 63 without trial and error. That is, you’d have to perform a similar approach to what we did above, try numbers one by one until the equation is satisfied. Now, imagine using extremely large numbers and trying to reverse the modulo operator. Good luck.

Another reason why it’s important in our case is because it limits the domain of a particular group by a certain number called the modulus, in the above example 17 is the modulus. The answer to xmodnx \mod n where xx is any number, will always be less than nn. Why is this important? Well, for practical reasons, we can’t have immensely large numbers used in cryptography because these numbers take up actual memory in our computers. By using the modulus operator, we can solve this issue.

Interlude 2 - Finite Fields

A finite field is a field that contains a finite number of elements. A field is just a set or a collection of distinct elements on which addition, subtraction, multiplication and division are defined. What the hell does that mean? It just means that you have a set of distinct numbers and the four operations I mentioned above are all valid. For example, the set of all natural numbers would be a field (just not finite) {1,2,3,4,...}\{1, 2, 3, 4,...\}. I can add, subtract, multiply, and divide any number in that set by any other number in that set.

The set of all natural numbers is infinitely long (natural numbers never end) so it wouldn’t be a finite field. A common example of a finite field would be the integers modn\mod n where nn is a prime number. This is a set comprised of the set of all integers with the modulus operator applied on each element: {0,1,2,3,...,n1}\{0, 1, 2, 3,..., n-1\}. Because the four operations are all valid in this set, this is considered a finite field. You can define this field as FnF_n where nn is the modulus.

The Need for Primality

The modulus needs to be a prime number because this ensures that every non-zero element has a modular multiplicative inverse. I know we’re getting deep into the woods here but stick with me here. A modular multiplicative inverse of an integer aa is another integer xx such that the product of those two is equal to 1 under a particular modulus nn. This looks like (ax)modn=1(a \cdot x) \mod n = 1.

For example, with a=3a = 3, n=11n = 11 we can find the modular multiplicative inverse of aa by trying different values of xx such that the equation (3x)mod11=1(3 \cdot x) \mod 11 = 1 is true. Eventually, through trial and error we find that (34)mod11=1(3 \cdot 4) \mod 11 = 1 so 4 is thus the modular multiplicative inverse of 3.

Elliptic Curves over Finite Fields

I mentioned earlier that the modulo operator is important because it helps us restrict the size of a particular set. We need to do this because if we use our elliptic curve above to generate our private and public keys, then it’s possible these keys can’t be stored in 256 or 512 bits. This matters because if we start using larger keys then we start suffering performance loss without any gains in security. Don’t worry, I’ll explain how we generate keys shortly.

We want to restrict our elliptic curve to just the positive integers. We can do this by defining the curve with respect to the integers modulo some prime number. Mathematically, this looks like:

y2modn=(x3+ax+b)modny^{2} \bmod n = (x^{3} + a x + b) \bmod n

Thus, we now have the elliptic curve defined over a finite field, the set of all integers modulo nn. In Elliptic Curve Cryptography, the modulus is chosen as some very large prime number. How does this affect our graph though? Since we’re only interested in integers, this means that our graph will look discontinuous:

Elliptic curve over finite fieldElliptic curve over finite field

Fortunately for us, point addition and multiplication still follows the same rules as above. All we’re really doing here is just restricting the size of our points.

Private and Public Keys

The Elliptic Curve Discrete Logarithm Problem

I spent a lot of time going through the mathematics behind elliptic curves and we’re finally ready to see the connection to cryptography. Recall earlier we looked at point multiplication where we multiplied PP by itself kk times (I’m switching to kk to not get confused with nn the modulus we introduced earlier). We define PP as our base point and kk the number of times we multiply PP by itself. The result is Q=kPQ = kP. It turns out that if I just give you PP and QQ, and task you with finding kk, there’s no easier way of doing this than trial and error. This is known as the Elliptic Curve Discrete Logarithm Problem (ECDLP). This is the elliptic curve variant of the DLP that makes classical Diffie-Hellman work. We won’t go into the classical DLP here for the sake of brevity.

Now, that’s really all there is to it. All of ECC is based on the difficulty of figuring out how many times you’ve multiplied PP by itself. Since we want to keep kk secret, this would make a great private key. QQ would then be the public key since it’s okay to share this value.

Dependence on the Curve

Not all elliptic curves are great for ECC. Fortunately, there are standard curves that are battle tested so you should use those. For example, secp256k1 or Curve25519. These curves also provide a fixed point PP to use, such as P=9P = 9 for Curve25519. So we just need to define a random 256-bit integer to use as our secret key (kk) and we can generate the public key.

Security

For a key of size nn bits, we can be sure to have at least n/2n/2 bits of security. This means an attacker would need to perform 2(n/2)2^{( n/2 )} operations to break our keys. Refer to Security level - Wikipedia for how these numbers were derived. So for our key size of 256 bits, an attacker would need 21282^{128} operations to brute force search our key space. An unfathomable amount of time is required to perform such an attack. Here is a great video by 3Blue1Brown that discusses how secure 256-bit security is.

Where do we go from here?

With all this theory out of the way, we can start using ECC to perform some key agreement to establish a shared secret and thus encrypt data between two parties. I'll explore that in a later post.