Collect unconfirmed transaction hashes from the mempool

For one of our earlier experiments, we needed to collect “all” the transaction hashes from the Bitcoin memory pool (mempool) over a few days, that is, the transactions that are waiting to be picked up by miners but haven’t been yet. It is worth mentioning that, since Bitcoin is a peer-to-peer network, the mempool isn’t a strictly defined thing: at any given point in time, its contents vary from one node to another.

Nevertheless, all transactions that will be collected and verified by the network have first appeared in the mempool, so if you don’t run your own Bitcoin full node (you should, it’s fun, entertaining and helps the Bitcoin ecosystem), you need to find a good entry point to scrape data from. We soon realized that BTC.com has a very nice API that could be used in a script for the task.

We threw together the following creation in pure Bash. Since it doesn’t need to be fast, and will not run all the time, we figured that Python or any other fancy language would be overkill. Copy and paste this…

#!/bin/bash

### This script reads the mempool data and
### filters all unique unconfirmed transaction hashes
### into one file named "unconfirmed.txt" that is
### saved in the user's home folder.
### For automation, run this script every 5 minutes
### (not more often - don't hammer) via crontab

#File that collects the hashes ($HOME is also set when run from cron)
out="$HOME/unconfirmed.txt"

#Read all mempool data into a string
tx=$(curl -s https://chain.api.btc.com/v3/tx/unconfirmed)

#Filter out all transaction hashes and append them to the file
echo "$tx" | grep -oE '[0-9a-f]{64}' >> "$out"

#Sort the file in place, keeping only unique hashes
#(replaces the temporary file, delete and rename steps)
sort -u -o "$out" "$out"

… into a new file in your home folder, save it as “mempooltx” and issue the command

chmod +x mempooltx

so that we are allowed to run it.

The script needs a few explanations. Each time you run it, it downloads the entire mempool in JSON format. The mempool, which can be seen as the Bitcoin transfer waiting line, is typically several MB large. Once the script has grabbed all the data, it filters out the transaction hashes, since they were the only piece of information we were interested in for this project. The script appends every transaction hash to a file called “unconfirmed.txt”, which is created in your home folder if it doesn’t exist yet. (If it already exists, it will not be overwritten.) Since the mempool data contains many duplicates, and since you will probably download the same data more than once, the script also sorts the file and removes duplicates, so that in the end “unconfirmed.txt” is a sorted list containing only unique transaction hashes.
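The extract-and-deduplicate step described above can be tried offline, without touching the API. The following is only a sketch: the JSON fragment, the dummy hashes and the helper name merge_hashes are made up for illustration, and the real API response may be shaped differently.

```shell
#!/bin/bash
# Offline sketch of the extract-and-deduplicate step.
# The JSON below is invented for illustration; the real API
# response may look different.

# Read text on stdin, append every 64-character hex string to the
# file given as $1, then keep that file sorted and free of duplicates.
merge_hashes() {
    grep -oE '[0-9a-f]{64}' >> "$1"
    sort -u -o "$1" "$1"
}

demo_file=$(mktemp)

# Two dummy 64-character "hashes" (not real transactions).
hash_a=$(printf '%064d' 0 | tr 0 a)
hash_b=$(printf '%064d' 0 | tr 0 b)
sample="{\"list\":[{\"hash\":\"$hash_b\"},{\"hash\":\"$hash_a\"},{\"hash\":\"$hash_b\"}]}"

# Feed the same "download" twice, as two overlapping runs would.
printf '%s\n' "$sample" | merge_hashes "$demo_file"
printf '%s\n' "$sample" | merge_hashes "$demo_file"

# Number of unique hashes collected so far.
wc -l < "$demo_file"
```

Even though the sample contains one hash twice and is fed in twice, each hash ends up in the file only once.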

Running this script too often has no upsides and a couple of downsides: hammering a service provider’s API isn’t nice, and you will mostly grab data that you already have. A new Bitcoin block is created roughly every 10 minutes on average. By experimenting back and forth a little, we found it optimal to let this script run every 7 minutes, no more, no less. (In our case, on a virtual server that is on 24/7.)

The easiest way to accomplish this is to add a so-called cron job, which will execute the script for you as often as you tell it to. Simply issue the command

crontab -e

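and append this line exactly (it instructs your system to run the script every 7 minutes; strictly speaking, */7 fires at minutes 0, 7, 14 and so on up to 56 of each hour, so the last gap before the full hour is only 4 minutes)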

*/7 * * * * $HOME/mempooltx
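To see which minutes of the hour a */7 minute field matches, they can be listed with seq. This is just an illustration of the schedule under the usual cron step semantics, not something the script itself needs.

```shell
#!/bin/bash
# List the minutes of each hour that a "*/7" minute field matches:
# every value from 0 to 59 that is a multiple of 7.
minutes=$(seq 0 7 59 | tr '\n' ' ')
echo "$minutes"

# Gap between the last run of one hour and the first run of the next.
last=$(seq 0 7 59 | tail -n 1)
echo "wrap-around gap: $((60 - last)) minutes"
```

The shorter gap at the top of the hour is harmless here, since duplicate hashes are removed anyway.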

Now wait 10 minutes, then open “unconfirmed.txt” with nano (in the terminal) or gedit (in the GUI). Already after the first round, it should contain a long list of transaction hashes. Let it run overnight and see how much you can capture!

After the first round, I had a file with slightly less than 2000 entries. It began like this

head -n 5 unconfirmed.txt
00028427e1061c866879b0f307fde213960c009cf0dd8b09c8ae6f422e2ecc79
0004dae194e98966af11071749c5265e457dbb204dd770bbfdff8e8b30469787
000878eca72c6a671d6f4423aad898764cc206210e57aa1e840a3e52cfb9e7b1
000d4d0986af3e5561e952a59724dd1a75d770788ea5eb4ae3bb1fed1732b498
001040b2f6244e56a9c76d1c2b439670276dbdc309b7c9e2764006a85748bff4
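To check how much has been captured so far, and whether the file still looks as expected, a small helper along these lines may be handy. It is a sketch only: the function name check_hashes and the idea of saving it as its own script are assumptions, not part of the original setup.

```shell
#!/bin/bash
# Report how many hashes a file holds and whether it looks well-formed.
# Usage: check_hashes FILE
check_hashes() {
    file=$1
    total=$(wc -l < "$file" | tr -d ' ')
    # Count lines that are NOT exactly 64 lowercase hex characters.
    bad=$(grep -cvE '^[0-9a-f]{64}$' "$file")
    echo "entries: $total, malformed: $bad"
    # sort -c -u succeeds only if the file is sorted with no duplicates.
    if sort -c -u "$file" 2>/dev/null; then
        echo "sorted and unique: yes"
    else
        echo "sorted and unique: no"
    fi
}

# Only run when a file name is given on the command line,
# e.g. with "$HOME/unconfirmed.txt" as the argument.
if [ -n "$1" ]; then
    check_hashes "$1"
fi
```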

“What is all this good for?” is a valid question. It did a wonderful job for us, that much I can tell you. A large collection of transaction hashes can be truly useful. Use your imagination!

Comments or questions?

One Reply to “Collect unconfirmed transaction hashes from the mempool”

  1. why would anyone wanna save tx hashes from the mempool? whats it good for?
