Hi there.
Previously, I blogged about implementing the WebSocket protocol on my own in Go. Since then, I have loved building primitive technology from scratch.
Today we're going to dive into the world of the DNS protocol in Rust.
While I won't build a full-fledged DNS server and client (that's a big project!), we'll make a simple DNS client that focuses on A record lookups.
You may find resources like this DNS tutorial that explain DNS concepts more elegantly.
However, my goal is to explore DNS by building a client myself - even if it's a bit rough around the edges, it's a great learning experience!
You can find my code on GitHub
Let's get started!
Wireshark
Before reading the RFC for the DNS protocol, I highly recommend installing Wireshark on your machine and seeing what a simple DNS query for an A record looks like.
This network analysis tool will be our best friend!
I personally find it very difficult to understand how a protocol works only by reading its RFC.
If you are like me, Wireshark can help by showing what the actual byte stream looks like.
So go ahead and install the version for your machine.
After that, open the tool and start capturing network packets by selecting your active network interface. On my machine (macOS), it's en0.
Then open your terminal and run the following command:
dig @8.8.8.8 A example.com
This command sends a DNS query for the A record of example.com to Google's DNS server.
You should now see some network traffic in your Wireshark window.
You can stop capturing by clicking the stop button at the top left of the window (the red one).
Then go to the filter field at the top of the window and type "dns". This hides every packet in the current buffer except the DNS ones.
If your Wireshark window looks like mine, we are ready to check the RFC 🎉
RFC 1035
The next step is to read the protocol specification (It's here).
We don't have to read it from top to bottom.
Instead, focus on these key sections:
This section explains how DNS labels (parts of a domain like "example" and "com" in "example.com") are encoded.
Let's again consider the domain name example.com:
- The word example is 7 ASCII characters long, so we first put 0x07, then example as octets.
- The word com is 3 ASCII characters long, so we put 0x03, then com as octets.
- Finally, we put 0x00 at the end to indicate that this is the end of the domain name.
Here is how example.com ends up encoded:
0x07 0x65 0x78 0x61 0x6d 0x70 0x6c 0x65 0x03 0x63 0x6f 0x6d 0x00
You can also see it yourself in Wireshark.
(Domain Name System (query) > Queries > example.com: type A, class IN > Name)
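The label encoding above can be sketched in a few lines of Rust. This is only an illustration under my own assumptions: the helper name encode_name is hypothetical (it is not taken from the repository), and it simply rejects empty labels and labels longer than 63 bytes.

```rust
/// Encodes a domain name as length-prefixed labels terminated by 0x00.
/// `encode_name` is a hypothetical helper for illustration only.
fn encode_name(name: &str) -> Result<Vec<u8>, String> {
    let mut out = Vec::new();
    // A trailing dot ("example.com.") is ignored.
    for label in name.trim_end_matches('.').split('.') {
        // The length must fit in 6 bits, so a label is at most 63 bytes.
        if label.is_empty() || label.len() > 63 {
            return Err(format!("invalid label: {:?}", label));
        }
        out.push(label.len() as u8); // length octet
        out.extend_from_slice(label.as_bytes()); // the label itself
    }
    out.push(0x00); // end of the domain name
    Ok(out)
}
```

Calling encode_name("example.com") should give the same 13 bytes that Wireshark shows in the Name field.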
This section outlines the DNS message format (for both queries and responses). Each message always has a Header section, followed by zero or more entries in the Question, Answer, Authority, and Additional sections.
Typically, a DNS query has a Header and one Question (and no Answer section).
A DNS response has a Header, the Question(s) copied from the query, the Answer records (if any), and possibly entries in the other sections as well.
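To make the layout concrete, here is a rough sketch of how the bytes of a query with a Header and a single Question could be assembled. The function name build_a_query is hypothetical, and setting only the RD flag is my assumption; other clients may set additional flags.

```rust
/// Builds the bytes of a minimal query for an A record: Header + one Question.
/// `build_a_query` is a hypothetical helper, not code from the repository.
fn build_a_query(id: u16, name: &str) -> Vec<u8> {
    let mut msg = Vec::with_capacity(512);
    // Header (12 bytes)
    msg.extend_from_slice(&id.to_be_bytes()); // ID
    msg.extend_from_slice(&0x0100u16.to_be_bytes()); // flags: only RD set (assumption)
    msg.extend_from_slice(&1u16.to_be_bytes()); // QDCOUNT = 1
    msg.extend_from_slice(&[0u8; 6]); // ANCOUNT, NSCOUNT, ARCOUNT = 0
    // Question: QNAME as length-prefixed labels
    for label in name.split('.').filter(|l| !l.is_empty()) {
        msg.push(label.len() as u8); // assumes each label is at most 63 bytes
        msg.extend_from_slice(label.as_bytes());
    }
    msg.push(0x00); // end of QNAME
    msg.extend_from_slice(&1u16.to_be_bytes()); // QTYPE = 1 (A)
    msg.extend_from_slice(&1u16.to_be_bytes()); // QCLASS = 1 (IN)
    msg
}
```

For example.com this produces 29 bytes: 12 for the Header, 13 for the name, and 4 for the type and class.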
Since DNS messages sent over UDP are limited to 512 bytes, DNS has a nice compression mechanism for the domain names in a message.
If the same domain name appears repeatedly in the UDP data, we don't have to spend the same number of bytes on it every time.
Let's consider the following example.
When you query the A record for the domain name example.com, the response should contain the name 2 times.
Without compression, we need 13 bytes to encode each occurrence of the name.
In total, the domain name would consume 13 x 2 = 26 bytes of UDP data.
Again, example.com is encoded as:
0x07 0x65 0x78 0x61 0x6d 0x70 0x6c 0x65 0x03 0x63 0x6f 0x6d 0x00 (13 bytes!)
This is where message compression comes into play.
When we look at the first 2 bytes (16 bits) of a name and the 2 most significant bits are both 1, it means the name is compressed.
The remaining 14 bits, shown as OFFSET below, give the index in the same message where the domain name starts:
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
| 1  1|                OFFSET                   |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
Let's see what it looks like in our example.com response in Wireshark.
The response also has a Question section that includes a Name field with the value example.com.
It occupies 13 bytes, as this is the first time the domain name appears in the message.
Then, when you go to the Answer section, you can see a Name field with the same value example.com again.
But this time, Wireshark shows that the byte representation of the domain name is 0xc00c.
You can think of it as a message from the DNS protocol such as:
Hey, we saw this domain name somewhere in this message before! Let's see... the rest of the bits represent 0x0c. So the name must start at byte offset 12 of the message. Go ahead!
And if you look at offset 12 of the message (counting from 0, so right after the 12-byte Header), it points to the beginning of the binary representation of example.com, which is 0x07.
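Following a compression pointer can be sketched like this. It is a minimal illustration, not the get_name function from my repository: read_name, its return value (the name plus the offset just past it), and the limit of 16 jumps are all assumptions made for the example.

```rust
/// Reads a possibly compressed name starting at `start`.
/// Returns the name and the offset just past the name at `start`.
/// `read_name` is a hypothetical helper for illustration only.
fn read_name(msg: &[u8], start: usize) -> Result<(String, usize), String> {
    let mut labels: Vec<String> = Vec::new();
    let mut pos = start;
    // Where parsing continues after the name; fixed once the first pointer is followed.
    let mut next: Option<usize> = None;
    let mut jumps = 0;
    loop {
        let len = *msg.get(pos).ok_or("name runs past the end of the message")? as usize;
        if len & 0xC0 == 0xC0 {
            // Compression pointer: the lower 14 bits are an offset into the message.
            let low = *msg.get(pos + 1).ok_or("truncated pointer")? as usize;
            if next.is_none() {
                next = Some(pos + 2);
            }
            jumps += 1;
            if jumps > 16 {
                // Arbitrary limit (assumption) to avoid looping forever on bad input.
                return Err("too many compression pointers".to_string());
            }
            pos = ((len & 0x3F) << 8) | low;
        } else if len == 0 {
            // 0x00 terminates the name.
            return Ok((labels.join("."), next.unwrap_or(pos + 1)));
        } else if len > 63 {
            return Err("reserved label type".to_string());
        } else {
            let bytes = msg.get(pos + 1..pos + 1 + len).ok_or("truncated label")?;
            labels.push(String::from_utf8_lossy(bytes).into_owned());
            pos += 1 + len;
        }
    }
}
```

With a pointer, the second occurrence of example.com costs 2 bytes instead of 13, so the two names take 15 bytes rather than 26.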
Building Our Rust Client
Let's write down the important Rust structs we'll need:
#[derive(Debug)]
pub struct Query {
    header: Header,
    questions: Vec<Question>,
    // technically a query can carry the other sections too, but let's ignore them this time.
}

#[derive(Debug)]
pub struct Response {
    pub header: Header,
    pub questions: Vec<Question>,
    pub answers: Vec<ResourceRecord>,
    pub authorities: Vec<ResourceRecord>,
    pub additionals: Vec<ResourceRecord>,
}
Note: I'm omitting full code listings, but you can find them on GitHub.
Next, let's implement the TryFrom trait (or From, where decoding cannot fail) for each struct.
Here is an example for the Header struct:
impl From<&[u8; 512]> for Header {
    fn from(value: &[u8; 512]) -> Self {
        // Bytes 0-1: message ID (big-endian).
        let id = (value[0] as u16) << 8 | value[1] as u16;
        // Byte 2: QR (1 bit), OPCODE (4 bits), AA, TC, RD.
        let qr = (value[2] & 0x80) != 0;
        let opcode = (value[2] & 0x78) >> 3;
        let aa = (value[2] & 0x04) != 0;
        let tc = (value[2] & 0x02) != 0;
        let rd = (value[2] & 0x01) != 0;
        // Byte 3: RA (1 bit), Z (3 reserved bits), RCODE (4 bits).
        let ra = (value[3] & 0x80) != 0;
        let rcode = value[3] & 0x0F;
        // Bytes 4-11: number of entries in each of the four sections.
        let qdcount = (value[4] as u16) << 8 | value[5] as u16;
        let ancount = (value[6] as u16) << 8 | value[7] as u16;
        let nscount = (value[8] as u16) << 8 | value[9] as u16;
        let arcount = (value[10] as u16) << 8 | value[11] as u16;
        Header::new(
            id, qr, opcode, aa, tc, rd, ra, rcode, qdcount, ancount, nscount, arcount,
        )
    }
}
We can use it to decode the header of a DNS message as follows:
let header: Header = Header::from(bytes);
At this point I realized we need to track the current offset to decode the whole message.
Remember message compression? It gives us an index into the byte stream that points to the domain name we want.
So I decided to pass the offset (as a mutable reference) to each try_from function. This doesn't look elegant, but let me go this way 🙏.
Here's an example of my code for ResourceRecord:
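impl TryFrom<(&[u8; 512], &mut usize)> for ResourceRecord {
    type Error = String;

    fn try_from((bytes, offset): (&[u8; 512], &mut usize)) -> Result<Self, Self::Error> {
        // A compressed name (2 bytes) plus the fixed fields (10 bytes) is the minimum.
        if *offset + 12 >= bytes.len() {
            return Err("data is too short".to_string());
        }
        // The two most significant bits being set marks a compression pointer.
        let name = if bytes[*offset] & 0xC0 == 0xC0 {
            // message compression: the remaining 14 bits are the offset of the name
            let mut tmp_offset =
                (((bytes[*offset] & 0x3F) as usize) << 8) | bytes[*offset + 1] as usize;
            *offset += 2;
            get_name(bytes, &mut tmp_offset)?
        } else {
            get_name(bytes, offset)?
        };
        // TYPE, CLASS, TTL and RDLENGTH take 10 bytes after the name.
        if *offset + 10 > bytes.len() {
            return Err("data is too short".to_string());
        }
        let query_type = match u16::from_be_bytes([bytes[*offset], bytes[*offset + 1]]) {
            1 => QueryType::A,
            28 => QueryType::AAAA,
            other => return Err(format!("unsupported record type: {}", other)),
        };
        *offset += 2;
        // we only deal with the IN class, so the CLASS field is skipped
        let query_class = QueryClass::IN;
        *offset += 2;
        let ttl = u32::from_be_bytes([
            bytes[*offset],
            bytes[*offset + 1],
            bytes[*offset + 2],
            bytes[*offset + 3],
        ]);
        *offset += 4;
        let rdlength = u16::from_be_bytes([bytes[*offset], bytes[*offset + 1]]);
        *offset += 2;
        if *offset + rdlength as usize > bytes.len() {
            return Err("rdata is too short".to_string());
        }
        let rdata = match query_type {
            // an A record carries exactly 4 bytes: the IPv4 address
            QueryType::A if rdlength == 4 => RData::A([
                bytes[*offset],
                bytes[*offset + 1],
                bytes[*offset + 2],
                bytes[*offset + 3],
            ]),
            // we only consider A records this time
            _ => return Err("only A records are supported".to_string()),
        };
        *offset += rdlength as usize;
        Ok(ResourceRecord {
            name,
            query_type,
            query_class,
            ttl,
            rdlength,
            rdata,
        })
    }
}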
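The 4 bytes stored for an A record are simply the octets of an IPv4 address. As a small sketch, they can be turned into the familiar dotted form with the standard library's Ipv4Addr. The helper name format_a_rdata is hypothetical and not part of the code above.

```rust
use std::net::Ipv4Addr;

/// Formats the 4 RDATA bytes of an A record as a dotted-decimal IPv4 address.
/// `format_a_rdata` is a hypothetical helper for illustration only.
fn format_a_rdata(octets: [u8; 4]) -> String {
    // Ipv4Addr implements From<[u8; 4]> and Display.
    Ipv4Addr::from(octets).to_string()
}
```

For instance, the bytes 0x5d 0xb8 0xd8 0x22 would be printed as 93.184.216.34.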
Let's Test It!
Once you have a basic client, give it a try:
cargo run -- google.com
If everything works as expected, you should see an IP address for Google!
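For reference, the network side of such a client can be sketched with std::net::UdpSocket. This is an assumption-laden outline rather than the code in my repository: the function name exchange, the 5-second timeout, and binding to 0.0.0.0:0 are choices made just for this example.

```rust
use std::net::UdpSocket;
use std::time::Duration;

/// Sends `query` to `server` (for example "8.8.8.8:53") and waits for one reply.
/// Returns the 512-byte receive buffer and the number of bytes actually received.
/// `exchange` is a hypothetical helper for illustration only.
fn exchange(query: &[u8], server: &str) -> std::io::Result<([u8; 512], usize)> {
    // Port 0 lets the OS pick any free local port.
    let socket = UdpSocket::bind("0.0.0.0:0")?;
    // Do not wait forever if the reply gets lost (the timeout value is an assumption).
    socket.set_read_timeout(Some(Duration::from_secs(5)))?;
    socket.send_to(query, server)?;
    let mut buf = [0u8; 512];
    let (len, _from) = socket.recv_from(&mut buf)?;
    Ok((buf, len))
}
```

The returned buffer has the same &[u8; 512] shape that the decoding code above expects, so it can be handed to Header::from directly.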
This is the end of our journey to build a super simplified DNS client focused on A records.
Here are some challenges to take your learning further:
- More Record Types: Support lookups for records like AAAA (IPv6), MX (mail), etc.
- Error Handling: Handle potential errors and malformed responses.
- Recursion: A real-world client would likely make recursive queries if the initial server doesn't have the answer.
Thanks for reading ✌️