How to Build Your Own Blockchain Part 4.1 — Bitcoin Proof of Work Difficulty Explained

If you’re wondering why this is part 4.1 instead of part 4, and why I’m not talking about continuing to build the local jbc, it’s because explaining Bitcoin’s Proof of Work difficulty at a somewhat lower level takes a lot of space. So unlike what this title says, this post in part 4 is not how to build a blockchain. It’s about how an existing blockchain is built.

My main goal of the part 4 post was to have one section on the Bitcoin PoW, the next on Ethereum’s PoW, and finally talk about how jbc is going to run and validate proof or work. After writing all of part 1 to explain how Bitcoin’s PoW difficulty, it wasn’t going to fit in a single section. People, me included, tend get bored in the middle reading a long post and don’t finish.

So part 4.1 will be going through Bitcoin’s PoW difficulty calculations. Part 4.2 will be going through Ethereum’s PoW calculations. And then part 4.3 will be me deciding how I want the jbc PoW to be as well as doing time calculations to see how long the mining will take.

The sections of this post are:

  1. Calculate Target from Bits
  2. Determining if a Hash is less than the Target
  3. Calculating Difficulty
  4. How and when block difficulty is updated
  5. Full code
  6. Final Questions

TL;DR

The overall term of difficulty refers to how much work has to be done for a node to find a hash that is smaller than the target. There is one value stored in a block that talks about difficulty — bits. In order to calculate the target value that the hash, when converted to a hex value has to be less than, we use the bits field and run it through an equation that returns the target. We then use the target to calculate difficulty, where difficulty is only a number for a human to understand how difficult the proof of work is for that block.

If you read on, I go through how the blockchain determines what target number the mined block’s hash needs to be less than to be valid, and how that target is calculated.

Other Posts in This Series

Calculate Target from Bits

In order to go through Bitcoin’s PoW, I need to use the values on actual blocks and explain the calculations, so a reader can verify all this code themselves. To start, I’m going to grab a random block number to work with and go through the calculations using that.

>>>import random
>>> random.randint(0, 493928)
111388

Block number 11138 it is! Back in time to March of 2011 we go.

Continue reading

How to Build Your Own Blockchain Part 3 — Writing Nodes that Mine and Talk

Hello all and welcome to Part 3 of building the JackBlockChain — JBC. Quick past intro, in Part 1 I coded and went over the top level math and requirements for a single node to mine its own blockchain; I create new blocks that have the valid information, save them to a folder, and then start mining a new block. Part 2 covered having multiple nodes and them having the ability to sync. If node 1 was doing the mining on its own and node 2 wanted to grab node 1’s blockchain, it can now do so.

For Part 3, read the TL;DR right below to see what we got going for us. And then read the rest of the post to get a (hopefully) great sense of how this happened.

Other Posts in This Series

TL;DR

Nodes will compete to see who gets credit for mining a block. It’s a race! To do this, we’re adjusting mine.py to check if we have a valid block by only checking a section of nonce values rather than all the nonces until a match. Then APScheduler will handle running the mining jobs with the different nonce ranges. We shift the mining to the background if we want node.py to mine as well as being a Flask web service. By the end, we can have different nodes that are competing for first mining and broadcasting their mined blocks!

Before we start, here’s the code on Github if you want to checkout the whole thing. There are code segments on here to illustrate about what I did, but if you want to see the entire code, look there. The code works for me, but I’m also working on cleaning everything up, and writing a usable README so people can clone and run it themselves. Twitter, and contact if you want to get in contact.

Mining with APScheduler and Mining Again

The first step here is to adjust mining to have the ability to stop if a different node has found the block with the index that it’s working on. From Part 1, the mining is a while loop which will only break whenz it finds a valid nonce. We need the ability to stop the mining if we’re notified of a different node’s success.

Continue reading

How to Build Your Own Blockchain Part 2 — Syncing Chains From Different Nodes

Welcome to part 2 of the JackBlockChain, where I write some code to introduce the ability for different nodes to communicate.

Initially my goal was to write about nodes syncing up and talking with each other, along with mining and broadcasting their winning blocks to other nodes.   In the end, I realized that the amount of code and explanation to accomplish all of that was way too big for one post. Because of this, I decided to make part 2 only about nodes beginning the process of talking for the future.

By reading this you’ll get a sense of what I did and how I did it. But you won’t be seeing all the code. There’s so much code involved that if you’re looking for my total implementation you should look at the entire code on the part-2 branch on Github.

Like all programming, I didn’t write the following code in order. I had different ideas, tried different tactics, deleted some code, wrote some more, deleted that code, and then ended up with the following.

This is totally fine! I wanted to mention my process so people reading this don’t always think that someone who writes about programming does it in the sequence they write about it. If it were easy to do, I’d really like to write about different things I tried, bugs I had that weren’t simple to fix, parts where I was stuck and didn’t easily know how to move forward.

It’s difficult to explain the full process and I assume most people reading this aren’t looking to know how people program, they want to see the code and implementation. Just keep in mind that programming is very rarely in a sequence.

Twitter, contact, and feel free to use comments below to yell at me, tell me what I did wrong, or tell me how helpful this was. Big fan of feedback.

Other Posts in This Series

TL;DR

If you’re looking to learn about how blockchain mining works, you’re not going to learn it here. For now, read part 1 where I talk about it initially, and wait for more parts of this project where I go into more advanced mining.

At the end, I’ll show the way to create nodes which, when running, will ask other nodes what blockchain they’re using, and be able to store it locally to be used when they start mining. That’s it. Why is this post so long? Because there is so much involved in building up the code to make it easier to work with for this application and for the future.

That being said, the syncing here isn’t super advanced. I go over improving the classes involved in the chain, testing the new features, creating other nodes simply, and finally a way for nodes to sync when they start running.

For these sections, I talk about the code and then paste the code, so get ready.

Expanding Block and adding Chain class

This project is a great example of the benefits of Object Oriented Programming. In this section, I’m going to start talking about the changes to the Block class, and then go in to the creation of the Chain class.

The big keys for Blocks are:

Continue reading

How to Build Your Own Blockchain Part 1 — Creating, Storing, Syncing, Displaying, Mining, and Proving Work

I can actually look up how long I have by logging into my Coinbase account, looking at the history of the Bitcoin wallet, and seeing this transaction I got back in 2012 after signing up for Coinbase. Bitcoin was trading at about $6.50 per. If I still had that 0.1 BTC, that’d be worth over $500 at the time of this writing. In case people are wondering, I ended up selling that when a Bitcoin was worth $2000. So I only made $200 out of it rather than the $550 now. Should have held on.

Thank you Brian.

Despite knowing about Bitcoin’s existence, I never got much involved. I saw the rises and falls of the $/BTC ratio. I’ve seen people talk about how much of the future it is, and seen a few articles about how pointless BTC is. I never had an opinion on that, only somewhat followed along.

Similarly, I have barely followed blockchains themselves. Recently, my dad has brought up multiple times how the CNBC and Bloomberg stations he watches in the mornings bring up blockchains often, and he doesn’t know what it means at all.

And then suddenly, I figured I should try to learn about the blockchain more than the top level information I had. I started by doing a lot of “research”, which means I would search all around the internet trying to find other articles explaining the blockchain. Some were good, some were bad, some were dense, some were super upper level.

Reading only goes so far, and if there’s one thing I know, it’s that reading to learn doesn’t get you even close to the knowledge you get from programming to learn. So I figured I should go through and try to write my own basic local blockchain.

A big thing to mention here is that there are differences in a basic blockchain like I’m describing here and a ‘professional’ blockchain. This chain will not create a crypto currency. Blockchains do not require producing coins that can be traded and exchanged for physical money. Blockchains are used to store and verify information. Coins help incentive nodes to participate in validation but don’t need to exist.

The reason I’m writing this post is 1) so people reading this can learn more about blockchains themselves, and 2) so I can try to learn more by explaining the code and not just writing it.

In this post, I’ll show the way I want to store the blockchain data and generate an initial block, how a node can sync up with the local blockchain data, how to display the blockchain (which will be used in the future to sync with other nodes), and then how to go through and mine and create valid new blocks. For this first post, there are no other nodes. There are no wallets, no peers, no important data. Information on those will come later.

Other Posts in This Series

TL;DR

If you don’t want to get into specifics and read the code, or if you came across this post while searching for an article that describes blockchains understandably, I’ll attempt to write a summary about how a blockchains work.

At a super high level, a blockchain is a database where everyone participating in the blockchain is able to store, view, confirm, and never delete the data.

On a somewhat lower level, the data in these blocks can be anything as long as that specific blockchain allows it. For example, the data in the Bitcoin blockchain is only transactions of Bitcoins between accounts. The Ethereum blockchain allows similar transactions of Ether’s, but also transactions that are used to run code.

Slightly more downward, before a block is created and linked into the blockchain, it is validated by a majority of people working on the blockchain, referred to as nodes. The true blockchain is the chain containing the greatest number of blocks that is correctly verified by the majority of the nodes. That means if a node attempts to change the data in a previous block, the newer blocks will not be valid and nodes will not trust the data from the incorrect block.

Don’t worry if this is all confusing. It took me a while to figure that out myself and a much longer time to be able to write this in a way that my sister (who has no background in anything blockchain) understands.

If you want to look at the code, check out the part 1 branch on Github. Anyone with questions, comments, corrections, or praise (if you feel like being super nice!), get in contact, or let me know on twitter.

Step 1 — Classes and Files

Step 1 for me is to write a class that handles the blocks when a node is running. I’ll call this class Block. Frankly, there isn’t much to do with this class. In the __init__ function, we’re going to trust that all the required information is provided in a dictionary. If I were writing a production blockchain, this wouldn’t be smart, but it’s fine for the example where I’m the only one writing all the code. I also want to write a method that spits out the important block information into a dict, and then have a nicer way to show block information if I print a block to the terminal.

class Block(object):
  def __init__(self, dictionary):
  '''
    We're looking for index, timestamp, data, prev_hash, nonce
  '''
  for k, v in dictionary.items():
    setattr(self, k, v)
  if not hasattr(self, 'hash'): #in creating the first block, needs to be removed in future
    self.hash = self.create_self_hash()

  def __dict__(self):
    info = {}
    info['index'] = str(self.index)
    info['timestamp'] = str(self.timestamp)
    info['prev_hash'] = str(self.prev_hash)
    info['hash'] = str(self.hash)
    info['data'] = str(self.data)
    return info

  def __str__(self):
    return "Block<prev_hash: %s,hash: %s>" % (self.prev_hash, self.hash)

When we’re looking to create a first block, we can run the simple code.

Continue reading

NPR Sunday Puzzle Solving, And Other Baby Name Questions

If you have a long drive and no bluetooth or aux cord to listen to podcasts, NPR is easily the best alternative. Truck drivers agree with this statement no matter their overall views. For me, this was the case when driving home to Milwaukee from Ann Arbor where I went to a college friend’s wedding.

While driving back I listened to NPR and heard Weekend Edition Sunday and their Sunday Puzzle pop up. If you haven’t heart of it before, at the end of every week’s episode they state a puzzle. Throughout the next week listeners can submit their answer and one random correct submitter is chosen to be recorded doing a mini puzzle on air.

The puzzle they stated for the week after the wedding was as follows:

Think of a familiar 6-letter boy’s name starting with a vowel. Change the first letter to a consonant to get another familiar boy’s name. Then change the first letter to another consonant to get another familiar boy’s name. What names are these?

They’ve already released the show for this question (I didn’t win of course) so I figure I can write about how I found out the answer!

Solving The Name Question

First step as always for these types of posts is gathering the required list of familiar boy’s names.  Searching on Google for lists will show that there are a ton of sites which exist try to SEO themselves for the money. When scraping, you should to poke around and make sure to choose the post that has the correct data as well as being the most simple to gather. I went with this one.

Since there’s only one page with the data, there’s no need to use the requests library to scrape the different pages. So clicking save html file to the folder you’re programming in is the best way to get the data.

The scraping code itself is pretty simple.

from bs4 import BeautifulSoup

filename = 'boy_names.html'
vowels = ('A', 'E', 'I', 'O', 'U')

vowel_starters = []
consonant_starters = []

with open(filename, 'r') as file:
  page = file.read()
  html = BeautifulSoup(page.replace('\n',''), 'html.parser')
  for name_link in html.find_all("li", class_="p1"):
    name = name_link.text
    first_letter = name[0]
    if len(name) == 6:
      if first_letter in vowels:
        vowel_starters.append(name)
      else:
        consonant_starters.append(name)

for vname in vowel_starters:
  cname_same = []
  for cname in consonant_starters:
    if vname[1:] == cname[1:]:
      cname_same.append(cname)
  if cname_same:
    print vname
    for match in cname_same:
      print match

And the results are…

Austin, Justin, Dustin

Justin and Dustin rhyme which makes it more simple to realize that they match, but Austin isn’t exactly on the same page. If I didn’t have the code, zero chance I’d have gotten this correct.

That’s it right? Nope, I have all the code, I figured I should check to see if there’s a match for girls names with that same rules. All there was to do is save the popular girl names to the same folder, change the filename to ‘girl_names.html’, run the code, and we’ll get Ariana and Briana. A and B are the starting letters, and if Criana was a popular name (at this moment), we’d be good to for the full 3 name answers.

By going through this part, I came up with some other fun questions that could be answered with this list of names, and the rest of the post is about those.

Continue reading

Guest Post – Learning R as an MBA Student

Hello!

If I’d wanted to really grab your attention, I should have title this article something like 13 Tips that Could Save You Years of Effort or A Guide to Becoming a Full Stack Developer in 2017. But I don’t know how to save you years of effort and I can’t write a comprehensive guide and, frankly, I am not interested in pretending like I do! So instead, you get a boring but honest title, and my thoughts as someone who is not an expert and who is not planning to make a career out of programming.

I’m Sara, Jack’s older sister, and I have the privilege of writing a guest post today! My background is not in anything related to programming – I am a CPA and spent over five years working as a public accountant, mostly in corporate tax.

So how does my background give me any right to be posting on this blog? I am now an MBA student and as part of my curriculum, I took a Business Statistics class which involved learning how to use R. I am not sure I can even say I had heard of R before taking this class, so there was steep learning curve for me.

I spent a lot of time thinking about what I could actually write in this post that might be useful to Jack’s readers, since you could all out-program me in your sleep. My first draft of this post involved a lot of specifics regarding the questions I was assigned and how I solved them, but that didn’t seem helpful to anyone. Plus, although I did well on my assignment, I’m not totally sure what I did right versus what I did wrong.

Instead, I’ve decided to share a few of the insights I gleaned from my brief experience as a programmer, and also some of the resources I used to teach myself as I completed my assignments. I’d like to note that just like I’m new to programming, I’m new to writing about programming, so I assume this post will be unlike most others on the subject.

If you’re like me and you are new to programming, hopefully this will help you get started. If I learned one thing, you can’t just be taught how to do this, but maybe this will give you some inspiration.

Play around, but make sure you have an idea / project in mind

Considering I’m such a rookie, the first thing I needed to learn how to do wasn’t writing code, but installing R.

https://www.youtube.com/watch?v=A56PD8BSS0A

We then spent an entire class session following along as our professor demonstrated the very basics of R. This proved to be invaluable – I wouldn’t have even felt comfortable knowing where to even begin otherwise.

I think it’s important to note that although we were just playing with various functions, we were doing that in relation to one of the problems we’d been assigned. With his guidance, we were able to see how we should begin to work the problem, and how various functions could help meet our goal. It seemed to me that learning various functions through experimentation is useful, but you’ve got to make sure you’re working towards a goal as you do so.

The example we went over in class involved determining whether a specific pitcher for the Oakland As influenced the team’s ticket sales.

Following along with my professor, we named our data “As”, and then my very first lines into R looked like this:

> names(As)
[1] "TICKET" "OPP"    "POS"    "GB"     "DOW"    "TEMP"   "PREC"   "TOG"   
[9] "TV"     "PROMO"  "NOBEL"  "WKEND"  "OD"     "DH"    
> As(1,)
Error in As(1, ) : could not find function "As"
> As[1,]
  TICKET OPP POS GB DOW TEMP PREC TOG TV PROMO NOBEL WKEND OD DH
1  24415   2   5  1   4   57    0   2  0     0     0     0  1  0

As you can see, I made my first mistake right off the bat, using parentheses instead of brackets. After that first time, I don’t think I made that mistake again. Only through making the mistake and getting the error message did I figure out what I’d done wrong.

This proved to be the case time and time again as I made my way through the assignment, and I quickly learned mistakes are a way of life in programming. Somewhat infuriatingly, R returned just a blanket “error” message without much direction as to where I went wrong. I do think this helped me with my critical thinking, as it was up to me to figure out exactly where I had errored.

Bottom line: the time I spent tinkering around in R was when I learned the most. Figuring out how to correct errors was more instructive than getting it right the first time.

Time estimates are useless

In my career as a CPA, I could more or less predict how long various tasks would take. In fact, I usually had the previous year’s time reports to consult to see how much time we’d taken to complete the task in the past. I quickly found this is not the case with programming.

I sat down to work on my assignment on a beautiful summer Saturday, expecting it to take me a few hours maximum. By the time all was said and done, it took me probably ten hours to complete the three questions, spread over a few days. I talked to my classmates after we’d finished the assignment, and there was a wide array of time spend. People reported getting hung up on various parts, and there’s no telling how long it might take to work through the rough patches.

Maybe with more experience you get better at estimating how long projects take, but I have a strong hunch that you’ll often find you don’t know what you’re in for until you’re in the thick of it. I’m sure glad I started working on my assignment a week before it was due.

Don’t be embarrassed to google everything

If you’re having trouble with a specific function, chances are someone else out there has as well, and has asked the internet about it. Sometimes I didn’t even know what exactly it was that I was googling, but that generally didn’t matter as long as I hit the right keywords.

I often found myself reading helpful hints on various websites, and would cross-reference the notes I had from class. It quickly became clear to me that there’s no right way to do something. The way my professor showed an example in class was not how the people posting on Stackify perform the same task, and that’s ok. I actually think I learned more by seeing alternative ways of performing the same function.

I may be totally off base in saying this, but from all the googling I did, it seems to me that the exact lines of code one person uses to complete a project is as distinctive as their handwriting. I was not expecting programming to be such an individualized process.

I’m also not embarrassed to say that that’s how I solved a lot of my errors. This actually reminded me a lot of my accounting career. For many research projects, I often began by consulting google, and our firm definitely encouraged that.

Sleep on it

As I neared the end of my assignment, I got began to get very frustrated when something that I thought I’d mastered was not working out. I couldn’t figure out what I was doing wrong. I had more or less finished the question, and only had to insert a line of best fit into my scatter plot.

No matter what I did, that dang line would not appear.

As much as I wanted to finish the assignment that day, Jack wisely advised me that it would do no good to keep typing the same code, hoping that the line would magically generate. After a few minutes of doing exactly that, I concluded that he was correct. So… I went to sleep.

The next morning, I woke up and found my last few lines of code:

> plot(TICEKT,fit0$residual+fit0$fitted.values,pch=20)

Error in plot(TICEKT, fit0$residual + fit0$fitted.values, pch = 20)

Can you spot my error? Yep, “TICEKT” is not a word. (Is now a good time to note that I was also an English major?)

I corrected my error and produced this beautiful result:

I’m sure not all errors are that easily solvable with a good night of sleep, but I got lucky. I think this applies to lots of areas of life; if you leave things to ruminate a bit, you might just find a better answer. Or in my case, a line!

Always more to learn

I spoke to Jack about this assignment a little, and explained what it was I was asked to do. Until our conversation, it did not occur to me that there had been a lot of work on the front-end to get the data for our project ready to go.

For example, here’s a screenshot of a small portion of the data our professor provided:

(etc. etc. etc. etc.)

How did this data get here? When I was completing the assignment, I honestly didn’t care, and didn’t even think to care. I just felt incredibly smart for “mastering” R and successfully completing the assignment and making the beautiful plot I posted above. I was thinking to myself that I probably should have skipped this class and put myself directly into the more advanced Regressions class.

I came to my senses. Upon further reflection, it sure seems like compiling all this information is a heck of a lot harder than actually analyzing it! Jack then explained to me that all those articles I proofread for him about scraping data are doing exactly that. And those articles go so far over my head that hearing that brought be back to Earth about my only adequate capabilities.

There’s actually a term for this (credit to Jack for introducing me to it.) It’s called the Dunning-Kruger effect, which Wikipedia summarizes as when “persons of low ability suffer from illusory superiority when they mistakenly assess their cognitive ability as greater than it is.”

Well here I am, a person of “low ability” who has seen the light. I know that I don’t know what I don’t know. I used to know nothing about programming, but I have now progressed to knowing next to nothing about programming. At the very least, Jack’s articles now make more sense to me!

Web Scraping with Python — Part Two — Library overview of requests, urllib2, BeautifulSoup, lxml, Scrapy, and more!

Welcome to part 2 of the Big-Ish Data general web scraping writeups! I wrote the first one a little bit ago, got some good feedback, and figured I should take some time to go through some of the many Python libraries that you can use for scraping, talk about them a little, and then give suggestions on how to use them.

If you want to check the code I used and not just copy and paste from the sections below, I pushed the code to github in my bigishdata repo. In that folder you’ll find a requirements.txt file with all the libraries you need to pip install, and I highly suggest using a virtualenv to install them. Gotta keep it all contained and easier to deploy if that’s the type of project you’re working on. On this front, also let me know if you’re running this and have any issues!

Overall, the goal of the scraping project in this post is to grab all the information – text, headings, code segments and image urls – from the first post on this subject. We want to get the headings (both h1 and h3), paragraphs, and code sections and print them into local files, one for each tag. This task is very simple overall which means it doesn’t require super advanced parts of the libraries. Some scraping tasks require authentication, remote JSON data loading, or scheduling the scraping tasks. I might write an article about other scraping projects that require this type of knowledge, but it does not apply here. The goal here is to show basics of all the libraries as an introduction.

In this article, there’ll be three sections. First, I’ll talk about libraries that execute http requests to obtain HTML. Second, I’ll talk about libraries that are great for parsing HTML to allow you to scrape the data. Third, I’ll write about libraries that perform both actions at once. And if you have more suggestions of libraries to show, let me know on twitter and I’ll throw them in here.

Finally, a couple notes:

Note 1: There are many different ways of web scraping. People like using different methods, different libraries, different code structures, etc. I understand that.  I recognize that there are other useful methods out there – this is what I’ve found to be successful over time.

Note 2: I’m not here to tell you that it’s legal to scrape every website. There are laws about what data is copyrighted, what data that is owned by the company, and whether or not public data is actually legal to scrape. You might have to check things like robots.txt, their Terms of Service, maybe a frequently asked questions page.

Note 3: If you’re looking for data, or any other data engineering task, get in contact and we’ll see what I can do!

Ok! That all being said, it’s time to get going!

Requesting the Page

The first section here is showing a few libraries that can hit web servers and ask nicely for the HTML.

For all the examples here, I request the page, and then save the HTML in a local file. The other note on this section is that if you’re going to use one of these libraries, this is part one of the scraping! I talked about that a lot in the first post of this series, how you need to make sure you split up getting the HTML, and then work on scraping the data from the HTML.

First library on the first section is the famous, popular, and very simple to use library, requests.

requests

Let’s see it in action.

import requests
url = "https://bigishdata.com/2017/05/11/general-tips-for-web-scraping-with-python/" 
params = {"limit": 48, 'p': 2} #used for query string (?) values
headers = {'user-agent' : 'Jack Schultz, bigishdata.com, contact@bigishdata.com'}
page = requests.get(url, headers=headers)
helpers.write_html('requests', page.text.encode('UTF-8'))

Like I said, incredibly simple.

Requests also has the ability to use the more advanced features like SSL, credentials, https, cookies, and more. Like I said, I’m not going to go into those features (but maybe later). Time for simple examples for an actual project.

Overall, even before talking about the other libraries below, requests is the way to go.

urllib / urllib2

Ok, time to ignore that last sentence in the requests section, and move on to another simple library, urllib2. If you’re using Python 2.X, then it’s very simple to request a single page. And by simple, I mean couple lines.

Continue reading