Introduction

In this tutorial, we will look at how to obtain network statistics from the bitcoin network for the purpose of node analytics.

Don't trust, verify! These are the words commonly spoken whenever we talk about bitcoin. If you've ever wondered how sites like https://coin.dance/nodes and https://statoshi.info/dashboard/db/peers obtain their information, this tutorial is for you!

Background

A few months ago, I was curious to learn more about which client implementations of the bitcoin protocol were running on the network. Although this information is readily available from various resources, using them meant trusting those entities to report it honestly. I therefore set out to investigate how the data is collected so that I could verify and compare it for myself.

Let's do it!

We'll start off by having a look at the contrib/seeds folder in the bitcoin core repository.

https://github.com/bitcoin/bitcoin/tree/master/contrib/seeds

Here we notice that a list of seeds is provided as a convenience by bitcoin core developer Pieter Wuille (aka Sipa) https://github.com/sipa.

curl -s http://bitcoin.sipa.be/seeds.txt.gz | gzip -dc > seeds_main.txt

After downloading and decompressing this file, we are presented with the following information.

# address                                        good  lastSuccess    %(2h)   %(8h)   %(1d)   %(7d)  %(30d)  blocks      svcs  version
159.203.122.25:8333                                 1   1549113737  100.00% 100.00% 100.00% 100.00%  99.99%  561219  0000001d  70015 "/Satoshi:0.15.1/"
40.114.88.206:8333                                  1   1549113322  100.00% 100.00% 100.00% 100.00%  99.99%  561219  0000040d  70015 "/Satoshi:0.15.0.1/"
172.245.217.191:8333                                1   1549112849  100.00% 100.00% 100.00% 100.00%  99.99%  561219  0800040d  70015 "/Satoshi:0.15.1/"
94.130.222.201:9354                                 0   1549112931  100.00% 100.00% 100.00% 100.00%  99.99%  561219  0000002d  70015 "/Satoshi:0.15.0.1/"
45.55.234.179:8333                                  1   1549113505  100.00% 100.00% 100.00% 100.00%  99.99%  561219  0000001d  70015 "/Satoshi:0.15.1/"

The file format is broken down as follows:

  • address - The recorded public IP address and port of the node
  • good - A boolean value (1 or 0) indicating whether the seeder currently considers this node healthy
  • lastSuccess - A unix epoch timestamp of the last successful connection attempt to this node
  • %(2h)/%(8h)/%(1d)/%(7d)/%(30d) - Availability of the node over each of these time windows, as a percentage
  • blocks - The block height reported by the node
  • svcs - The services the node advertises, encoded as a hexadecimal bitmask
  • version - The protocol version, followed by the User Agent String / Sub Version
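
To make the column layout above concrete, here is a minimal Python sketch that turns one row of the file into a structured record. The names SeedEntry and parse_line are my own and are not part of bitcoin-seeder; the column order is assumed from the header line shown earlier.

```python
from dataclasses import dataclass
from datetime import datetime, timezone

# Labels for the five availability columns, in the order of the header line.
WINDOWS = ("2h", "8h", "1d", "7d", "30d")


@dataclass
class SeedEntry:  # hypothetical record type, for illustration only
    address: str
    good: bool
    last_success: datetime
    uptime: dict  # window label -> availability percentage
    blocks: int
    services: int  # raw service bitmask
    version: int
    user_agent: str


def parse_line(line):
    """Parse one data row; return None for comments and blank lines."""
    line = line.strip()
    if not line or line.startswith("#"):
        return None
    # The user agent may contain spaces, so split off only the first 11 columns.
    fields = line.split(None, 11)
    uptime = {w: float(p.rstrip("%")) for w, p in zip(WINDOWS, fields[3:8])}
    return SeedEntry(
        address=fields[0],
        good=fields[1] == "1",
        last_success=datetime.fromtimestamp(int(fields[2]), tz=timezone.utc),
        uptime=uptime,
        blocks=int(fields[8]),
        services=int(fields[9], 16),  # svcs is written as hexadecimal
        version=int(fields[10]),
        user_agent=fields[11].strip('"') if len(fields) > 11 else "",
    )


if __name__ == "__main__":
    row = ('159.203.122.25:8333  1  1549113737  100.00% 100.00% 100.00% '
           '100.00%  99.99%  561219  0000001d  70015 "/Satoshi:0.15.1/"')
    print(parse_line(row))
```

Splitting on whitespace only eleven times keeps user agents that contain spaces in one piece.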

This information can be used to do some neat analysis indeed! But we're still trusting sipa, right? So, how is this list generated? Well, lucky for us, sipa has been friendly enough to share the tool, along with its source code, that generates this data here https://github.com/sipa/bitcoin-seeder!

Bitcoin-seeder is a crawler for the Bitcoin network, which exposes a list of reliable nodes via a built-in DNS server.  

Features:

  • regularly revisits known nodes to check their availability
  • bans nodes after enough failures, or bad behaviour
  • accepts nodes down to v0.3.19 to request new IP addresses from, but only reports good post-v0.3.24 nodes.
  • keeps statistics over (exponential) windows of 2 hours, 8 hours, 1 day and 1 week, to base decisions on.
  • very low memory (a few tens of megabytes) and cpu requirements.
  • crawlers run in parallel (by default 24 threads simultaneously)

Here are the steps required to install the seeder and generate your own data on any Debian-based distribution.

gr0kchain@bitcoindev:~$ sudo apt-get install build-essential libboost-all-dev libssl-dev
gr0kchain@bitcoindev:~$ git clone https://github.com/sipa/bitcoin-seeder
gr0kchain@bitcoindev:~$ cd bitcoin-seeder
gr0kchain@bitcoindev:~/bitcoin-seeder$ make

This should build a binary called dnsseed. You can now generate the same kind of data as we saw before by running the following.

gr0kchain@bitcoindev:~/bitcoin-seeder$ ./dnsseed
Supporting whitelisted filters: 0x1,0x5,0x9,0xd
No nameserver set. Not starting DNS server.
Starting seeder...done
Starting 96 crawler threads...done
[19-02-02 14:38:59] 0/34 available (1 tried in 5s, 33 new, 0 active), 0 banned; 0 DNS requests, 0 db queries
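
The "whitelisted filters" in the output above, like the svcs column in the seed file, are bitmasks of service flags. Below is a small Python sketch for decoding them. The bit names are the ones I recall from the Bitcoin Core and Bitcoin Unlimited sources; treat the table as an assumption and as incomplete, since other implementations assign their own meanings to some bits. The function decode_services is a hypothetical helper, not part of bitcoin-seeder.

```python
# Bit position -> flag name. Assumed from memory of Bitcoin Core's protocol.h
# (NODE_XTHIN comes from Bitcoin Unlimited); not an exhaustive list.
SERVICE_FLAGS = {
    0: "NODE_NETWORK",
    1: "NODE_GETUTXO",
    2: "NODE_BLOOM",
    3: "NODE_WITNESS",
    4: "NODE_XTHIN",
    6: "NODE_COMPACT_FILTERS",
    10: "NODE_NETWORK_LIMITED",
}


def decode_services(svcs_hex):
    """Return the names of the bits set in a hexadecimal service bitmask."""
    mask = int(svcs_hex, 16)  # accepts both "0xd" and "0000040d"
    names = []
    bit = 0
    while mask >> bit:
        if mask & (1 << bit):
            # Bits missing from the table are reported by position.
            names.append(SERVICE_FLAGS.get(bit, f"BIT_{bit}"))
        bit += 1
    return names


if __name__ == "__main__":
    for value in ("0x1", "0x5", "0x9", "0xd", "0000040d"):
        print(value, decode_services(value))
```

Under these assumptions, the four whitelisted filters are every combination of NODE_NETWORK with or without NODE_BLOOM and NODE_WITNESS.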

A file called dnsseed.dump is generated which contains the statistics as previously seen in the download from sipa's link.

gr0kchain@bitcoindev:~/bitcoin-seeder$ head ./dnsseed.dump
# address                                        good  lastSuccess    %(2h)   %(8h)   %(1d)   %(7d)  %(30d)  blocks      svcs  version
161.0.121.250:8333                                  1   1549120453   12.97%   3.41%   1.15%   0.17%   0.04%  561227  00000009  99999 "/therealbitcoin.org:0.9.99.99/"
193.70.18.162:10303                                 0   1549120527   12.97%   3.41%   1.15%   0.17%   0.04%  530361  00000035  80003 "/BUCash:1.1.2(EB16; AD12)/"
64.71.74.75:8333                                    1   1549120450   12.97%   3.41%   1.15%   0.17%   0.04%  561227  0000001d  80002 "/BitcoinUnlimited:1.0.2(EB16; AD12)/"
76.173.161.44:8333                                  1   1549120483   12.97%   3.41%   1.15%   0.17%   0.04%  561227  0000043d  80002 "/BitcoinUnlimited:1.0.1.1(EB16; AD12)/"
46.4.89.67:8334                                     0   1549120768   12.97%   3.41%   1.15%   0.17%   0.04%  530554  0000002d  70017 "/SuperBitcoin:0.17.0.1/"
47.104.86.0:10000                                   0   1549120566   12.97%   3.41%   1.15%   0.17%   0.04%  558089  0000000d  70016 "/QuantumBitcoin:0.16.0.2/"
50.225.198.67:6562                                  0   1549120616   12.97%   3.41%   1.15%   0.17%   0.04%  531805  0000047d  70016 "/SuperBitcoin:0.16.0.2/"
52.25.94.127:8333                                   1   1549120642   12.97%   3.41%   1.15%   0.17%   0.04%  561228  0000040d  70015 "/Satoshi:0.16.3/"
35.177.81.84:8333                                   1   1549120646   12.97%   3.41%   1.15%   0.17%   0.04%  561228  0000040d  70015 "/Satoshi:0.17.1/"
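
Notice that every row in this fresh dump carries the same availability percentages. One plausible explanation is that each node has been checked exactly once so far. The Python sketch below models an exponentially decaying availability estimate of the kind the feature list describes. The update rule and the 1000 second interval for the first check are my assumptions about how the seeder behaves, so verify them against the bitcoin-seeder source before relying on them.

```python
import math

# Window label -> time constant in seconds (assumed to match the dump columns).
WINDOWS = {
    "2h": 2 * 3600,
    "8h": 8 * 3600,
    "1d": 24 * 3600,
    "7d": 7 * 24 * 3600,
    "30d": 30 * 24 * 3600,
}


def update(reliability, success, age, tau):
    """Decay the old estimate by exp(-age/tau) and blend in the new result."""
    decay = math.exp(-age / tau)
    return reliability * decay + ((1.0 - decay) if success else 0.0)


def simulate(checks):
    """checks: iterable of (success, seconds_since_previous_check) pairs."""
    stats = {name: 0.0 for name in WINDOWS}
    for success, age in checks:
        for name, tau in WINDOWS.items():
            stats[name] = update(stats[name], success, age, tau)
    # Report percentages rounded to two decimals, like the dump file.
    return {name: round(100 * value, 2) for name, value in stats.items()}


if __name__ == "__main__":
    # A single successful check, treated as 1000 seconds after the previous one.
    print(simulate([(True, 1000)]))
    # prints {'2h': 12.97, '8h': 3.41, '1d': 1.15, '7d': 0.17, '30d': 0.04}
```

Under these assumptions a single success reproduces the 12.97%, 3.41%, 1.15%, 0.17% and 0.04% figures in the dump, and the values would climb towards 100% as the crawler keeps revisiting a reachable node.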

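Here is a quick way to generate some stats by grouping the nodes by their user agent string. The first command writes the total number of lines in the dump, and the second appends the number of nodes per user agent. Note that awk splits on whitespace, so user agents containing a space are cut off at the first space, and the header line is counted as well.

gr0kchain@bitcoindev:~/bitcoin-seeder$ wc -l < ./dnsseed.dump > ./clients.txt && awk '{ print $12 }' ./dnsseed.dump | sort | uniq -c | sort -nr >> ./clients.txt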
gr0kchain@bitcoindev:~/bitcoin-seeder$ head ./clients.txt
14122
   2524 "/Satoshi:0.15.1/"
   2084 "/Satoshi:0.17.0/"
   1881 "/Satoshi:0.16.3/"
   1615 "/Satoshi:0.17.1/"
   1086 "/Satoshi:0.17.0.1/"
    859 "/Satoshi:0.16.0/"
    830 "/Satoshi:0.14.99/"
    499 "/Satoshi:0.16.2/"
    456 "/Satoshi:0.13.2/"
    158 "/Satoshi:0.17.0/"
    152 "/Satoshi:0.16.3/"
    151 "/Satoshi:0.17.1/"
     88 "/Satoshi:0.14.99/"
     66 "/Satoshi:0.17.0.1/"
     59 "/Satoshi:0.15.1/"
     46 "/Satoshi:0.16.0/"
     29 "/Satoshi:0.13.2/"
     23 "/Satoshi:0.16.1/"
     20 "/Satoshi:0.16.2/"
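
As an alternative to the shell pipeline, here is a minimal Python sketch of the same grouping. It skips the header line and keeps user agents that contain spaces intact. The function name count_user_agents and the SAMPLE rows (copied from the dump shown earlier) are for illustration only.

```python
from collections import Counter

# A few rows copied from the dump above, used as sample input.
SAMPLE = """\
# address  good  lastSuccess  %(2h)  %(8h)  %(1d)  %(7d)  %(30d)  blocks  svcs  version
193.70.18.162:10303  0  1549120527  12.97%  3.41%  1.15%  0.17%  0.04%  530361  00000035  80003 "/BUCash:1.1.2(EB16; AD12)/"
46.4.89.67:8334      0  1549120768  12.97%  3.41%  1.15%  0.17%  0.04%  530554  0000002d  70017 "/SuperBitcoin:0.17.0.1/"
52.25.94.127:8333    1  1549120642  12.97%  3.41%  1.15%  0.17%  0.04%  561228  0000040d  70015 "/Satoshi:0.16.3/"
35.177.81.84:8333    1  1549120646  12.97%  3.41%  1.15%  0.17%  0.04%  561228  0000040d  70015 "/Satoshi:0.17.1/"
"""


def count_user_agents(lines):
    """Count nodes per user agent, ignoring comment and blank lines."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # Split off the first 11 columns; the remainder is the quoted user agent.
        fields = line.split(None, 11)
        if len(fields) < 12:
            continue  # row without a user agent column
        counts[fields[11].strip('"')] += 1
    return counts


if __name__ == "__main__":
    for agent, total in count_user_agents(SAMPLE.splitlines()).most_common():
        print(f"{total:7d} {agent}")
```

To run it on real data, pass an open file instead of the sample, for example count_user_agents(open("dnsseed.dump")).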

Conclusion

In this tutorial we looked at how to generate our own node data, which we can use as a starting point for all kinds of analytics. If you enjoyed this tutorial and would like to see more like it, please leave a comment and suggest topics you'd like to see covered by the bitcoin developer network community!