CREATE2 - know your contract's address well...

Hail The Lord - Apr 18 - - Dev Community

Introduction

In this post, we will take an in-depth look at the CREATE2 opcode, exploring its functionality and potential vulnerabilities. Rather than offering a standard overview, this article is the culmination of extensive research I’ve conducted on CREATE2. It serves as a beginner’s guide to contract creation and also bridges the gap in understanding the subtle mechanics behind this opcode. So, grab a cup of coffee, and let's dive into the intricacies of this often overlooked, yet sneaky, opcode.

CREATE vs CREATE2


CREATE :

The CREATE opcode deploys a contract at a 20-byte address that, while predictable, cannot be chosen freely by the user. The address is the last 20 bytes of keccak256(rlp([sender_address, sender_nonce])), so it is determined solely by two factors. Users cannot change the sender_address, and the sender_nonce (commonly referred to as the nonce) only ever increments by one with each deployment. This rigid form of address generation limits flexibility: since the nonce cannot be set to an arbitrary value, the process is hard to adapt to custom requirements.

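sender_address : The address deploying the new contract (a contract, or an externally owned account sending a deployment transaction).
sender_nonce : The nonce of sender_address at the time of deployment. For a contract, it increases by one with every contract it creates; for an externally owned account, it increases with every transaction it sends.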

Questions That Should Arise :

If the nonce counts the contracts already deployed by an address, shouldn’t the sender_nonce of a contract initially be 0 ? Yet the first contract it deploys is derived with a nonce of 1 !

If you’ve asked yourself this question, well done! That’s exactly the kind of inquiry skilled security researchers make and then explore further.
- Before EIP-161, the nonce of a newly created contract started at 0. Since EIP-161, it starts at 1 (externally owned accounts still start at 0). While the reason behind this change is out of the scope of this blog, it’s an excellent habit to start researching these kinds of questions. I highly recommend reading the EIPs whenever you have the time.

What exactly is this keccak256 ?
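- keccak256 is a cryptographic hash function used throughout the Ethereum Virtual Machine (EVM). It takes an input of any length and deterministically produces a 32-byte hash: the same input always gives the same hash, and finding two inputs with the same hash is computationally infeasible. Even the slightest change in the input produces a completely different 32-byte hash.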

Since keccak256 outputs a 32-byte hash, but an Ethereum address is 20 bytes, how do we derive an address from the hash ?
- This is where [12:] comes into the picture. It means we skip the first 12 bytes of the 32-byte hash (indices 0 to 11) and take everything from index 12 to the end. This leaves the last 20 bytes of the hash, and those 20 bytes are the address.

NOTE : The [12:] notation illustrates the logic we want, but the implementation code is slightly different. Solidity cannot slice a bytes32 value the way some other languages can, so the 32-byte hash is instead cast down to 20 bytes: address(uint160(uint256(hash))). The uint160 conversion keeps only the low-order 20 bytes, which gives the same result as [12:].
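As a rough illustration of the CREATE formula, here is a minimal sketch of an address predictor. The contract and function names are hypothetical, and the sketch assumes the nonce lies between 1 and 127, the range in which the RLP encoding of the nonce is a single byte. Larger nonces need a longer RLP prefix that this sketch does not handle.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

/// Hypothetical helper for illustration only.
contract CreateAddressPredictor {
    /// Predicts the address CREATE would use for `deployer` at `nonce`.
    /// Assumption: 1 <= nonce <= 127, so the nonce is RLP-encoded as one byte.
    function predict(address deployer, uint8 nonce) external pure returns (address) {
        require(nonce >= 1 && nonce <= 0x7f, "nonce out of range for this sketch");

        bytes32 hash = keccak256(
            abi.encodePacked(
                bytes1(0xd6), // RLP list header: 0xc0 + 22 bytes of payload
                bytes1(0x94), // RLP string header: 0x80 + 20 bytes (the address)
                deployer,     // sender_address
                bytes1(nonce) // sender_nonce; a byte below 0x80 encodes as itself
            )
        );

        // Equivalent of hash[12:]: keep only the low-order 20 bytes.
        return address(uint160(uint256(hash)));
    }
}
```

Calling predict with a factory's address and its current nonce should return the address of the next contract that factory deploys with CREATE.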


CREATE2:

Deploying contracts using CREATE felt a bit rigid, didn’t it ? CREATE2 aims to solve that issue. It also deploys a contract at a predictable 20-byte address, but this time the user can influence it. The address is the last 20 bytes of keccak256(0xff ++ sender_address ++ salt ++ keccak256(deployed_bytecode)), where ++ denotes byte concatenation. It adds a constant hex value, 0xff, as a prefix. While the sender_address is still part of the equation, just as it is in CREATE, the key differentiator here is the salt, a 32-byte user-provided value. The final factor is the deployed_bytecode, which refers to the bytecode used to deploy the contract (its creation code); it is hashed with keccak256 before being included.

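0xff : A fixed 1-byte value.
sender_address : The address of the contract executing CREATE2 (for example, a factory contract).
salt : A 32-byte value provided by the user.
deployed_bytecode : The creation bytecode of the contract to be deployed, including any constructor arguments. Its keccak256 hash is what goes into the address calculation.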

Questions Time ❓

Why do we need 0xff ? It's a constant value and doesn’t contribute to address change, right ?
- Good catch! Yes, it’s fixed for every address calculation, but its purpose isn’t address differentiation. Instead, it ensures that the input hashed by CREATE2 can never be the same as the input hashed by CREATE. The RLP-encoded data hashed by CREATE never starts with 0xff in practice, so the two opcodes cannot end up hashing identical bytes and colliding on an address.

What exactly is this salt - Does it make my new address salty enough that no one can tamper with it ?
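- Haha, not quite! The salt is simply a 32-byte value chosen by the user. It does not have to be unique, but reusing the same salt with the same sender_address and bytecode yields the same address, and deploying to an address that already holds a contract fails. You might occasionally see users pass a uint256 as a salt. That works because a uint256 is already exactly 32 bytes wide, so it converts to bytes32 without any hashing. More on this is beyond the scope of this post, so cue research time !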
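To make the salt less abstract, here is a small sketch of two common ways a salt might be built. The contract and function names are hypothetical, and deriving a salt from a user address plus an id is just one possible convention, not something CREATE2 requires.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

/// Hypothetical examples of building a 32-byte salt.
contract SaltExamples {
    /// A uint256 is already 32 bytes wide, so this is a plain type conversion.
    function fromNumber(uint256 n) external pure returns (bytes32) {
        return bytes32(n);
    }

    /// Assumed convention: derive the salt from a user and an id, so that
    /// each (user, id) pair maps to its own deployment address.
    function fromUser(address user, uint256 id) external pure returns (bytes32) {
        return keccak256(abi.encode(user, id));
    }
}
```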

Does deploying a contract using CREATE2 require the bytecode of the contract before its deployment ?
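- This can indeed be confusing. The bytecode stored on-chain only exists after deployment, so how do we get it beforehand ? For now, just know that there are two forms of bytecode : the creation code, which is produced by the compiler and runs once to deploy the contract, and the runtime code, which is what ends up stored at the address. CREATE2 uses the creation code, which is available before deployment. We’ll cover this in greater detail in the second post of this series.

One can obtain the creation code of a contract using type(ContractName).creationCode. If the constructor takes arguments, their ABI encoding must be appended to it.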
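A minimal sketch of building the bytes that CREATE2 hashes might look like this. The Counter contract, its constructor argument, and the InitCodeBuilder helper are all made up for illustration.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

/// Hypothetical contract with one constructor argument.
contract Counter {
    uint256 public immutable start;

    constructor(uint256 _start) {
        start = _start;
    }
}

/// Hypothetical helper that assembles the creation code used by CREATE2.
contract InitCodeBuilder {
    /// Creation code followed by the ABI-encoded constructor arguments.
    function initCode(uint256 start) public pure returns (bytes memory) {
        return abi.encodePacked(type(Counter).creationCode, abi.encode(start));
    }

    /// The hash that enters the CREATE2 address calculation.
    function initCodeHash(uint256 start) external pure returns (bytes32) {
        return keccak256(initCode(start));
    }
}
```

Because the constructor arguments are part of the hashed bytes, changing them changes the resulting CREATE2 address even when the salt stays the same.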

NOTE: In Solidity, the address generation logic is written without [12:]. The concatenation is done with abi.encodePacked, and the truncation to 20 bytes is done with the address(uint160(uint256(...))) cast.
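A minimal sketch of the CREATE2 address calculation, under the assumption that the caller passes in the full creation code (including any constructor arguments). The contract and function names are hypothetical.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

/// Hypothetical helper for illustration only.
contract Create2AddressPredictor {
    /// Predicts where `deployer` would place a contract via CREATE2.
    function predict(
        address deployer,
        bytes32 salt,
        bytes memory creationCode
    ) external pure returns (address) {
        bytes32 hash = keccak256(
            abi.encodePacked(
                bytes1(0xff),           // fixed prefix
                deployer,               // sender_address
                salt,                   // user-provided 32-byte value
                keccak256(creationCode) // hash of the creation bytecode
            )
        );

        // Equivalent of hash[12:]: keep only the low-order 20 bytes.
        return address(uint160(uint256(hash)));
    }
}
```

Note that deployer must be the contract that will execute CREATE2, not the account that sends the transaction to it.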

Enough of the theory rant — how do you actually use CREATE2 in your code ?

Wait, hold on! - Didn’t we already discuss how to deploy a contract at a deterministic address using CREATE2 above ? 🤔

This is a point that confuses many, myself included. So far, we've only explored how these opcodes calculate the address of the contract, which, by the way, hasn’t even been deployed yet! Everything above is about predicting the address before the deployment happens using CREATE or CREATE2.

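Now, let’s dive into how to actually deploy a contract using CREATE2 in Solidity. Inside an assembly block, the deployUsing function invokes the opcode as create2(callvalue(), add(deployMeBytes, 0x20), mload(deployMeBytes), salt), where deployMeBytes holds type(DeployMe).creationCode :

type(DeployMe).creationCode : This returns the creation bytecode of the named contract. Yes, this is the same as what Remix displays when you click on the compiled Bytecode button.
callvalue() : Returns the amount of ETH sent with the call to deployUsing (msg.value). Passing it as the first argument of create2 forwards that ETH to the newly deployed contract.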
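Put together, a deployment along these lines could be sketched as follows. The names DeployMe, deployUsing and deployMeBytes come from the discussion above; the empty payable constructor, the salt parameter name, the Deployed event and the failure check are assumptions added for the sketch.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

/// Assumed shape of the contract being deployed. The constructor is payable
/// so that it can receive the ETH forwarded through callvalue().
contract DeployMe {
    constructor() payable {}
}

contract Deployer {
    event Deployed(address addr); // hypothetical event, for visibility only

    function deployUsing(bytes32 salt) external payable returns (address addr) {
        bytes memory deployMeBytes = type(DeployMe).creationCode;

        assembly {
            addr := create2(
                callvalue(),              // ETH to forward to the new contract
                add(deployMeBytes, 0x20), // pointer to the first byte of the code
                mload(deployMeBytes),     // length of the code in bytes
                salt                      // user-provided 32-byte salt
            )
        }

        // create2 returns the zero address on failure, for example when a
        // contract already exists at the target address.
        require(addr != address(0), "CREATE2 failed");
        emit Deployed(addr);
    }
}
```

Calling deployUsing twice with the same salt should fail the second time, since the target address is already occupied.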

Query Zone

What is the assembly keyword ?
- The assembly keyword opens an inline assembly block, which lets you write low-level EVM instructions directly inside Solidity. It gives you more fine-grained control and is particularly useful when you’re aiming to optimize gas usage or perform low-level operations that aren’t easily accessible through high-level Solidity. The deployment above uses the low-level form of CREATE2 because it is harder to understand than the high-level form, and therefore worth walking through.
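For comparison, the high-level form uses the new keyword with a salt option. This is a minimal sketch that assumes a DeployMe contract with a payable constructor; the HighLevelDeployer name is hypothetical.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

/// Assumed shape of the contract being deployed.
contract DeployMe {
    constructor() payable {}
}

/// Hypothetical deployer using the high-level syntax.
contract HighLevelDeployer {
    function deployUsing(bytes32 salt) external payable returns (address) {
        // Supplying `salt` makes the compiler emit CREATE2 instead of CREATE.
        // The call reverts automatically if the deployment fails.
        DeployMe instance = new DeployMe{salt: salt, value: msg.value}();
        return address(instance);
    }
}
```

The high-level form is shorter and handles failure for you, while the assembly form exposes exactly which arguments the opcode receives.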

What is add(deployMeBytes, 0x20) and mload(deployMeBytes) ?
- type(DeployMe).creationCode returns a dynamic bytes object, unlike fixed-size types such as bytes32 or uint256. In memory, Solidity stores the length of this dynamic byte array in the first 32 bytes (0x20 is the hex representation of the decimal value 32), followed by the actual bytecode itself. The variable deployMeBytes points to the start of that length word.
add(deployMeBytes, 0x20) adjusts the pointer to the start of the bytecode by skipping the initial 32 bytes (which hold the length).
mload(deployMeBytes) retrieves the length of the dynamic bytecode.
This is just the tip of the iceberg, and delving into the details of how dynamic data is stored in memory opens up a huge rabbit hole—one that definitely warrants further research.
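The memory layout described above can be observed with a small experiment. This sketch uses a hypothetical MemoryLayoutDemo contract that reads the length word and the first data word of any bytes value passed to it.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

/// Hypothetical contract for inspecting how `bytes` is laid out in memory.
contract MemoryLayoutDemo {
    function inspect(bytes memory data)
        external
        pure
        returns (uint256 length, bytes32 firstWord)
    {
        assembly {
            // `data` points at the length word.
            length := mload(data)
            // Skipping 0x20 (32) bytes lands on the first 32 bytes of content.
            // For inputs shorter than 32 bytes, the rest of this word is
            // whatever memory follows the data.
            firstWord := mload(add(data, 0x20))
        }
    }
}
```

The returned length should match data.length, which is the same value mload(deployMeBytes) supplies to create2.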

With that, we wrap up this post. I hope it has clarified the mechanics of CREATE2 and how contracts deploy to deterministic addresses. This serves as a solid foundation, and in the next post, I'll discuss how to leverage this knowledge to adopt a hacker's mindset and safeguard the space before bad actors strike.

Thanks for reading, and happy coding!
