Alok Menghrajani
Previously: security engineer at Square, co-author of HackLang, put the 's' in https at Facebook. Maker of CTFs.
This blog does not use any tracking cookies and does not serve any ads. Enjoy your anonymity; I have no idea who you are, where you came from, and where you are headed to. Let's dream of an Internet from times past.
Home | Contact me | Github | RSS feed | Consulting services | Tools & games
The internet is made of thousands of inter-connected routers (at the time of writing, there are 40180 routers on the internet). Each of these routers is assigned a number called ASN (Autonomous System Number). The routers use a protocol called BGP (Border Gateway Protocol). BGP helps routers agree on how to talk to each other and how to find each other. BGP also automatically finds new routes when one or more routers goes away or when new routers are added to the network. Every public internet IP address is connected to one of these routers. It is therefore possible to map an IP address to its router’s ASN.
You might wonder what’s the value in having the ASN? If you are running a website with a login or a spam filtering system, knowning the ASN of your users can be used as part of a global risk analysis. The ASN number is pretty stable (e.g. it should be the same for an entire university campus, or for all the users of an ISP, even when the IP addresses are assigned dynamically).
You can use the ASN (as well as other signals) to prompt for a secondary password at login-time or throw a captcha at content submission time.
This article is going to show you how to build this mapping. The mapping only needs to be updated once in a while, and the lookup takes less than 1ms.
Getting the internet’s BGP table
Unless you run your own border router, you most likely don’t have direct access to the BGP data. You can however download the data from thyme.apnic.net, a website hosted by APNIC.
You are most likely going to want to download these two files:
Note: This data is for ipv4. There are many other routers exposing their BGP tables, some include the ipv6 network.
Choosing a data representation
The BGP table is fairly large, so we want to choose a data representation which will let us quickly find the mapping from IP to ASN. We could use various types of trees, but for now a simple hash table is fast enough.
We store the netmask -> ASN mapping in a hash table. Given an IP address, we generate all possible netmasks and stop as soon as we have a match.
The code
Here is some sample PHP code.
- <?php
- header('Content-type: application/json');
- $ip = $_GET['ip'];
- // Loading the asn table (netmask => asn).
- // You want to cache this across requests!
- $asn_table = array()
- $fh = fopen('data-raw-table', 'r');
- while ($line = fgets($fh)) {
- $match = array();
- if (preg_match("%([0-9.]+)/\d+\s+(\d+)%", $line, $match)) {
- $asn_table[ip2long($match[1])] = $match[2];
- }
- }
- // Loading the owners table (asn => owner)
- // You want to also cache this across requests!
- $asn_owners = array();
- $fh = fopen('data-used-autnums', 'r');
- while ($line = fgets($fh)) {
- $match = array();
- if (preg_match("%\s*(\d+)\s+(.*)%", $line, $match)) {
- $asn_owners[$match[1]] = $match[2];
- }
- }
- // The actual lookup
- $ip_value = ip2long($ip);
- $result = 'sorry, not found...';
- for ($i=0; $i<32; $i++) {
- $ip2 = ($ip_value >> $i) << $i;
- if (isset($asn_table[$ip2])) {
- $asn = $asn_table[$ip2];
- $result = $ip . ' maps to asn ' . $asn;
- if (isset($asn_owners[$asn])) {
- $result .= ' (' . $asn_owners[$asn] . ')';
- }
- break;
- }
- }
- // Returning the response as JSONP
- $arr = array('result' => $result);
- echo 'ip_to_asn(' . json_encode($arr) . ')';
Links
- http://en.wikipedia.org/wiki/Border_Gateway_Protocol
- http://en.wikipedia.org/wiki/Autonomous_System_Number
- http://thyme.apnic.net/
- https://www.team-cymru.org/ monitors BGP and provides various ASN related services
- http://xkcd.com/195/: map of the internet using a Hilbert Curve
- http://www.isi.edu/ant/address/: another map of the internet
- http://www.ripe.net/data-tools/stats/ris/ris-raw-data: another source for BGP data