Getting External Data with Chainlink

In this article, I’m going to show you how to code a contract that has access to the conversion rate between ETH and USD using the Chainlink decentralized oracle.

Blockchains are deterministic systems. This means that there is no room for variability amongst the data in nodes; each node should have the same exact data.

In order to maintain a blockchain’s deterministic nature, smart contracts on the blockchain are unable to connect with external systems, data feeds, APIs, existing payment systems or any other off-chain resources on their own.

In order to get data from the real world, smart contracts need to use an oracle. An oracle is a trusted third-party that interacts with the off-chain world to provide external data or computation to smart contracts. In other words, oracles are the bridges between blockchains and the real world.

Oracles can provide blockchains with information about currency prices, the weather, political results, and much more.

An oracle cannot be centralized because then we would have the same problem: nodes in a blockchain could end up with different data. Instead, oracles must be decentralized.

Chainlink is the most popular decentralized oracle provider.

Writing the Code

I created my contract on Remix IDE and I am using Remix to test my contract. I won’t be explaining how to use Remix in this article.

In order to work with the Chainlink oracle, we need to import their code into our contract using this line:

import "@chainlink/contracts/src/v0.8/interfaces/AggregatorV3Interface.sol";

What exactly is this line doing, though? Basically, we are importing code from the chainlink/contracts npm module. If you go to the project’s Github (here), and follow the path in the import statement (here), you will get to the code where they define the AggregatorV3Interface interface. So, basically, we are importing the AggregatorV3Interface interface into our project so we can work with it.

Interfaces are similar to abstract contracts. As you can see, the interface defines a function but it cannot define its implementation.

Interfaces tell your Solidity contract how it can interact with another contract. So, since we will be accessing Chainlink contracts, this interface tells our contract what functions we can use from the Chainlink contract.

We made a contract call to another contract from our contrat using interfaces. Interfaces are a minimalistic view into another contract.

We work with interfaces and other contracts the same way that we work with variables or structs within our own contracts.

Make Function to Get Interface Version

Now, we will access some information from the Chainlink blockchain using the interface. We will create a function to get the version of the interface.

function getVersion() public view returns (uint256){
}

First, we create an instance of the interface:

function getVersion() public view returns (uint256){
    AggregatorV3Interface priceFeed = AggregatorV3Interface()
}

In order to find where the ETH / USD price feed contract is located on the Chainlink Rinkeby blockchain, we can go to the Ethereum Price Feeds documentation:

It has a bunch of different price feeds. Find the ETH / USD price feed contract address and put the address in the parameter of the interface:

AggregatorV3Interface priceFeed = AggregatorV3Interface(0x8A753747A1Fa494EC906cE90E9f37563A8AF630e)

So, basically this line is saying that we have a contract that has these functions defined in the interface located at the address 0x8A753747A1Fa494EC906cE90E9f37563A8AF630e.

If this is true, we should be able to call the version method on priceFeed:

AggregatorV3Interface priceFeed = AggregatorV3Interface(0x8A753747A1Fa494EC906cE90E9f37563A8AF630e);
return priceFeed.version();

The full function:

Now, you can deploy your contract and see if it works. In my case, using Remix IDE, I was able to access the getVersion function through the IDE, so it was successful!

Make Function to Get Latest Price

Now, we will create a function to get the current price of ETH in terms of USD from Chainlink.

We start by defining the function and creating another instance of the interface:

function getPrice() public view returns(uint256){
        AggregatorV3Interface priceFeed = AggregatorV3Interface(0x8A753747A1Fa494EC906cE90E9f37563A8AF630e);
}

We are trying to access the answer return variable from the latestRoundData method of the interface (see interface code above).

The latestRoundData method returns 5 items. Since we only need answer, we can ignore the other variables and just add commas for them:

(,int256 answer,,,) = priceFeed.latestRoundData();

Then, convert answer to uint256 and return it:

return uint256(answer);

The full function code:

Summary

Here is the final code:

In this tutorial, you learned what oracles are and why they are important. You learned how to use the Chainlink oracle in your own smart contract to access real-world data on the blockchain.

Blockchain Development: Solidity Crash Course

Solidity is the programming language used to write Ethereum smart contracts. If you want to be an Ethereum blockchain developer, learning it should be one of the first things you do.

In this article, I’m going to be showing you some basic Solidity syntax and fundamental features that every smart contract must have. This article is for beginners but it assumes you have some prior knowledge in another programming language.

Versioning

At the top of any new smart contract file must be the Solidity version. You have a few options for denoting the version of your contract:

  • If the contract is for a specific Solidity version. In this case, 0.6.0:
pragma solidity 0.6.0
  • If the contract is for a version within the 0.6.0 range (0.6.0 to 0.6.12):
pragma solidity ^0.6.0
  • If the contract is for a version within the range from 0.6.0 to 0.9.0 (exclusive):
pragma solidity >=0.6.0 <0.9.0

You might be wondering: which version should I use? When deploying contracts, you should use the latest released version of Solidity. Apart from exceptional cases, only the latest version receives security fixes

Contract Declaration

Declare a new contract using the contract keyword followed by the contract’s name:

contract Bank {

} 

As you can see, the syntax for defining a new contract in Solidity is similar to the syntax for defining a class in a language like Java or Python.

This is the most simple version of a valid contract.

Types and Declaring Variables

Solidity has pretty much all of the same data types as any other programing language.

There are multiple different sizes of integers that you can create. For example,

uint256 newNumber = 6;

There are also some data types that you may not be familiar with.

For example, there is an address type for storing account addresses on the blockchain, such as a MetaMask account address:

address accountAddress = 0x29D7d1dd5B6f9C864d9db560D72a247c178aE86B;

Comments

Comments are denoted using two slashes (//)

// This is a comment in Solidity

Defining Functions

Defining functions in Solidity is similar to other languages

function storeMoney(uint256 _amount) {
}

If a function returns a value, you can use this syntax:

function storeMoney(uint256 _amount) returns (uint256) {
   return _amount;
}

Visibility

Visibility refers to where functions and variables within a Solidity contract are available. There are four types of visibility:

  1. External: Functions/variables with external visibility must be called by another contract; they can’t be called within the same contract
  2. Public: Functions/variables with public visibility can be called by anybody, including any users of the blockchain
  3. Internal: Functions/variables with internal visibility can only be accessed internally (i.e. from within the current contract or contracts deriving from it)
  4. Private: Functions/variables with private visibility are only visible for the contract they are defined in and not in derived contracts

To specify the visibility of a function:

function storeMoney(uint256 _amount) public {
}

To specify the visibility of a variable:

bool private isCheckingAccount;

View and Pure Functions

If a function updates the value of a variable in a smart contract, it is changing the contract’s state. Any state change in a smart contract is considered a transaction and thus requires a gas fee to update the contract.

There are two types of functions that do not update the state of the contract: view and pure functions. In other words, these are functions you do not have to make a transaction on and they don’t cost gas.

A view function reads some state off of the blockchain. Public variables are already technically view functions.

function retrieveBalance(uint256 _balance) public view returns (unit256) {
   return _balance;
}

A pure function is a function that purely does some type of math, but does not store the output of that math. Thus, there is no change of the blockchain’s state since no variables are being saved or updated.

function addBalances(uint256 balance) public pure {
    return balance + balance;   // balance is not being updated
}

Structs

Structs are a way to define new type in Solidity. They are structures that contain one or more native Solidity variable types.

Define a new struct:

struct Account {
   uint256 balance;
   string name;
}

Create an instance/object of the struct:

Account public myAccount = Account({ balance: 100, name: "John Doe" });

Define an array of struct objects;

Account[] public accounts;    // dynamic array (can grow to any size)

Account[3] public accounts;   // fixed array (fixed size of 3)

Memory

In Solidity, there are two main ways to store information: you can store it in memory or in storage.

When you store an object in memory, the data will only be stored during the execution of the function/contract call. When an object is stored in storage, the data will persist even after the function executes; it will persist.

Strings are actually not a variable type in Solidity. In Solidity, strings are objects– arrays of bytes. Since it’s an object, the programmer has to decide where they want to store it: memory or storage.

function addAccount(string memory _name, uint256 _balance) {
    accounts.push(Account(_balance, _name));
}

Mappings

mapping is a Solidity variable type that is similar to a dictionary in other languages. It is an array of key-value pairs (with 1 value per key). If given a key, a mapping spits out whatever variable that that key is mapped to.

mapping(string => uint256) public nametoBalance;

function addAccount(string memory _name, uint256 _balance) {
    accounts.push(Account(_balance, _name));
    nametoBalance[_name] = _balance;
}

SPDX License

Many times, the compiler will raise an error if your source code file doesn’t include an SPDX License Identifier.

The Ethereum community believes that trust in smart contract can be better established if their source code is available. For legal reasons, the SPDX License Identifier makes it easier for your source code to be accessed by others.

You should add it to the very top of your file (above the version):

// SPDX-License-Identifier: MIT

pragma solidity ^0.8.0

Now, you have a basic knowledge of Solidity code. Thanks for reading!

Remix IDE Alert: This contract may… not invoke an inherited contact’s constructor correctly

The Alert

I just tried deploying a contract on Remix IDE, then got this alert:

This contract may be abstract, not implement an abstract parent’s methods completely or not invoke an inherited contract’s constructor correctly.

Here is a screenshot:

Solution

In my case, the problem was that I had the wrong contract selected (lol)! Make sure you have the right contract selected

Getting Started with MetaMask

MetaMask is a free-to-use browser extension and smartphone app that allows you to interact with the Ethereum blockchain. On MetaMask, you can send and receive coins from your cryptocurrency wallet and use any of the massive array of decentralized apps built on the Ethereum blockchain.

Download MetaMask

First, navigate to the MetaMask website.

I’m installing on Google Chrome, so I’ll click the “Install MetaMask for Chrome.” If you’re installing for iOS or Android, click the designated button.

Then click “Add to Chrome.

MetaMask can be used on Chrome, FireFox, Brave and Edge browsers. Sorry, Safari users but there is no support for Safari as yet.

Then, you’ll be navigated to a page where you’ll set up your MetaMask account. If you don’t already have a wallet, click “Create a Wallet.

It will ask you to create a password. Then, it will give you your Secret Recovery Phrase. Your Secret Recovery Phrase is a 12-word phrase that is the “master key” to your wallet and your funds.

It is very, very, very important and it’s crucial that you don’t lose or share the phrase. If you forget it, there is absolutely nothing that MetaMask can do to recover your account and thus your funds.

Never, ever share your Secret Recovery Phrase, not even with MetaMask. If someone asks for your recovery phrase they are likely trying to scam you and steal your wallet funds.

MetaMask Settings

On the home screen of your account, if you click the three dots in the right corner, the following window should pop up:

In this window, you can update the name of your account to be something more personal than “Account 1.” You can also see the address that represents your account, which is the long string of characters. People can use this address, which is specifically only to you, to send you money.

You can create more than one account and each account will have its own account address.

You can use a tool called Etherscan to see some of the details of a MetaMask account. Etherscan is a platform that has the details of every transaction and account on the Ethereum blockchain (obviously, this does not include private information about accounts and transactions). Platforms such as Etherscan are possible because of the complete transparency and public nature of blockchain. Anyone has access to the records of any transaction on the blockchain. If you plug an account address into Etherscan, it will show you information such as the balance of the account.

Test networks

When you are making real transactions and working on the actual Ethereum blockchain, you would use the Ethereum Mainnet, which should be the default network that you are on (you can check in the top right corner of your main account screen).

If you’re a developer and you want to test out code on a fake blockchain, you’ll want to use a test network. You can turn the “Show test networks” setting on in your Settings to have access to test networks.

Test networks are networks that resemble Ethereum and act in the same way that Ethereum does but don’t use real money and are just for testing your applications.

How to Get Free (Fake) Ether

I will show you how to get free fake Ether on a test network for testing and learning purposes.

First, choose a test network. For this tutorial, I’ll be using the Rinkeby Test Network. Then, I navigate to the the Rinkeby Authenticated Faucet, which is a platform that provides the fake crypto. Here is the link. The website should look something like this:

Then, you need to make a post on social media including your public MetaMask account address.

I’m using Twitter. If you’re using Twitter, your tweet might look like this with the blacked-out portion being your public address (there isn’t really a reason for me to black it out on this post, but I just did it anyway).

Then, copy the link to the post you just made and paste it into the Faucet:

If the transaction is accepted, a green message should pop up saying that the transfer is accepted and it will go through.

Note: The networks are not always up and working. If the transaction is unsuccessful, try doing the same process on another test network.

If it was successful, you should soon see some Ether in your test network account:

If you go to the Rinkeby Etherscan (or the Etherscan for whichever test network you used) and search your MetaMask account address, you should now see that the balance has been updated and the transaction details are publicly available.

Top 12 VS Code Extensions

Visual Studio Code Extensions make your workflow so much more efficient and enjoyable in VS Code. This article provides a list of some of my favorite VS Code extensions. There is no ordering from best to worst.

1. Auto Rename Tag

If you change an HTML/XML tag, this extension will automatically update the paired tag.

2. Bracket Pair Colorizer

This extensions’ intuitive colorizing of matching brackets (and parentheses) makes it easier to see where a block starts and ends

3. Live Server

This extension launches a local server with live reload. This extension is great for updating static HTML/CSS/JS files quickly without creating your own server.

4. Prettier

Have you been formatting your code manually? No more. This extension will format your code for you on save.

5. Live Share

Do you want to work on a project with a friend or teammate? This extension allows you to collaborate on a project in VS Code in real-time. Think Google Docs but for code!

6. Quokka.js

This extension allows you to test out JavaScript and TypeScript code on the fly in an in-editor playground.

7. Import Cost

When you import a package into a file in VS Code, this extension will show you how much memory that package takes up.

8. GitLens

This extension allows you to visualize code authorship via Git blame annotations and code lens. You can also navigate in the history of a file back and forth to see the changes that were made on it.

9. Path Intellisense

This extension will auto-complete filenames and file paths for you as you type them.

10. Snippets

This is not a single extension but a collection of extensions that provide code snippets for specific languages and frameworks. Using snippets allows you to not have to type as much. The example I’m showing here is the ES7 React snippets for React developers.

11. Better Comments

This extension augments your ability to create more human-friendly comments in your code. You can categorize your annotations with different colors.

12. VS Code Icons

This extension adds icons to your files in the sidebar so it is easier to tell which is which when there is a lot of files.

Go check out all of these great extensions!

How to Create 3-D Charts with Matplotlib in Jupyter Notebook

In this article, I will show you how to work with 3D plots in Matplotlib. Specifically, we will be making a 3D line plot and surface plot.

First, import matplotlib and numpy. %matplotlib inline just sets the backend of matplotlib to the ‘inline’ backend, so the output of plotting commands shows inline (directly below the code cell that produced it).

import matplotlib.pyplot as plt
import numpy as np

%matplotlib inline

Add this import statement to work with 3D axes in matplotlib:

from mpl_toolkits.mplot3d.axes3d import Axes3D

Now, let’s generate an empty 3D plot

fig = plt.figure()
ax = plt.axes(projection='3d')

plt.show()

3-D Line Plot

Now, it’s time to put a graph on the plot. We’ll start by making a 3D line plot.

We need to define values for all 3 axes:

z = np.linspace(0, 1, 100)
x = z * np.sin(25 * z)
y = z * np.cos(25 * z)

If we print out the shape of the arrays we just created, we’ll see that they are one-dimensional arrays.

>>> print('Z Array: ', z.shape)
Z Array:  (100,)

This is important because the plot3D function only accepts 1D arrays as inputs. Now, we can add the plot!

ax.plot3D(x, y, z, 'blue')

plt.show()

Final code:

import matplotlib.pyplot as plt
import numpy as np
from mpl_toolkits.mplot3d.axes3d import Axes3D
%matplotlib inline

fig = plt.figure()
ax = plt.axes(projection='3d')

z = np.linspace(0, 1, 100)
x = z * np.sin(25 * z)
y = z * np.cos(25 * z)

ax.plot3D(x, y, z, 'blue')

plt.show()

3-D Surface Plot

Now, we will make a 3D surface graph on the plot.

We need to define values for all 3 axes:

x = np.linspace(start=-2, stop=2, num=200)
y = x_4.copy().T
z = 3**(-x_4**2-y_4**2)

The x, y, z arrays we just created are all 1D arrays. The surface plot function requires 2D array inputs so we need to convert the numpy arrays to be 2D. We can use the reshape function for this.

x = np.reshape(x,(1, x.size))
y = np.reshape(y,(1, y.size))
z = np.reshape(z,(1, z.size))

So now if we print our arrays, we see that they’re 2D.

>>> print('X Array: ', x.shape)
X Array:  (200, 200)

Now, we can plot the surface graph!

ax.plot_surface(x, y, z)

plt.show()

Final code:

import matplotlib.pyplot as plt
import numpy as np
from mpl_toolkits.mplot3d.axes3d import Axes3D
%matplotlib inline

fig = plt.figure()
ax = plt.axes(projection='3d')

x = np.linspace(start=-2, stop=2, num=200)
y = x_4.copy().T
z = 3**(-x_4**2-y_4**2)

x = np.reshape(x,(1, x.size))
y = np.reshape(y,(1, y.size))
z = np.reshape(z,(1, z.size))

ax.plot_surface(x, y, z)

plt.show()

We can also add a title and axis labels to the graph:

ax.set_title('My Graph')

ax.set_xlabel('X', fontsize=20)
ax.set_ylabel('Y', fontsize=20)
ax.set_zlabel('Z', fontsize=20)

This is just a basic intro to 3D charting with Matplotlib. There are a variety of other types of plots and customizations that you can make. Happy graphing!

Calculate Derivative Functions in Python

In machine learning, derivatives are used for solving optimization problems. Optimization algorithms such as gradient descent use derivatives to decide whether to increase or decrease the weights in order to get closer and closer to the maximum or minimum of a function. This post covers how these functions are used in Python.

Symbolic differentiation manipulates a given equation, using various rules, to produce the derivative of that equation. If you know the equation that you want to take the derivative of, you can do symbolic differentiation in Python. Let’s use this equation as an example:

f(x) = 2x2+5

Import SymPy

In order to do symbolic differentiation, we’ll need a library called SymPy. SymPy is a Python library for symbolic mathematics. It aims to be a full-featured computer algebra system (CAS). First, import SymPy:

from sympy import *

Make a symbol

Variables are not defined automatically in SymPy. They need to be defined explicitly using symbols. symbols takes a string of variable names separated by spaces or commas, and creates Symbols out of them. Symbol is basically just the SymPy term for a variable.

Our example function f(x) = 2x2+5 has one variable x, so let’s create a variable for it:

x = symbols('x')

If the equation you’re working with has multiple variables, you can define them all in one line:

x, y, z = symbols('x y z')

Symbols can be used to create symbolic expressions in Python code.

>>> x**2 + y         
x2+y
>>> x**2 + sin(y)   
x2+sin(y)

Write symbolic expression

So, using our Symbol x that we just defined, let’s create a symbolic expression in Python for our example function f(x) = 2x2+5:

f = 2*x**2+5

Take the derivative

Now, we’ll finally take the derivative of the function. To compute derivates, use the diff function. The first parameter of the diff function should be the function you want to take the derivative of. The second parameter should be the variable you are taking the derivative with respect to.

x = symbols('x')
f = 2*x**2+5

df = diff(f, x)

The output for the f and df should look like this:

>>> f
2*x**2+5
>>> df
4*x

You can take the nth derivative by adding an optional third argument, the number of times you want to differentiate. For example, taking the 3rd derivative:

d3fd3x = diff(f, x, 3)

Substituting values into expressions

So we are able to make symbolic functions and compute their derivatives, but how do we use these functions? We need to be able to plug a value into these equations and get a solution.

We can substitute values into symbolic equations using the subs method. The subs method replaces all instances of a variable with a given value. subs returns a new expression; it doesn’t modify the original one. Here we substitute 4 for the variable x in df:

>>> df.subs(x, 4)
16

To evaluate a numerical expression into a floating point number, use evalf.

>>> df.subs(x, 4).evalf()
16.0

To perform multiple substitutions at once, pass a list of (old, new) pairs to subs. Here’s an example:

>>> expr = x**3 + 4*x*y - z
>>> expr.subs([(x, 2), (y, 4), (z, 0)])
40

The lambdify function

If you are only evaluating an expression at one or two points, subs and evalf work well. But if you intend to evaluate an expression at many points, you should convert your SymPy expression to a numerical expression, which gives you more options for evaluating expressions, including using other libraries like NumPy and SciPy.

The easiest way to convert a symbolic expression into a numerical expression is to use the lambdify function.

The lambdify function takes in a Symbol and the function you are converting. It returns the converted function. Once the function is converted to numeric, you can Let’s convert the function and derivative function from our example.

>>> f = lambdify(x, f)
>>> df = lambdify(x, df)
>>> f(3)
23
>>> df(3)
12

Here’s an example of using lambdify with NumPy:

>>> import numpy
>>> test_numbers = numpy.arange(10)   
>>> expr = sin(x)
>>> f(test_numbers)
[ 0.          0.84147098  0.90929743  0.14112001 -0.7568025  -0.95892427  -0.2794155   0.6569866   0.98935825  0.41211849]

The application of these functions in a data science solution will be covered in another post.

Introduction to Gradient Descent with Python

In this article, I’m going to talk about a popular optimization algorithm in machine learning: gradient descent. I’ll explain what gradient descent is, how it works, and then we’ll write the gradient descent algorithm from scratch in Python. This article assumes you are familiar with derivatives in Calculus.

What is Gradient Descent?

Gradient descent is an optimization algorithm for finding the minimum of a function. The algorithm does this by checking the steepness of the slope along the graph of a line and using that to slowly move towards the lowest point, which presumably has a slope of 0. Write an algorithm to find the lowest cost.

You can think of gradient descent as akin to a blind-folded hiker on top of a mountain trying to get to the bottom. The hiker must feel the incline, or the slope, of the mountain in order to get an idea of where she is going. If the slope is steep, the hiker is closer to the peak and can take bigger steps. If the slope is less steep, the hiker is closer to the bottom and takes smaller steps. If the hiker feels flat ground (a zero slope), she can assume she’s reached the bottom, or minimum.

So given a function with a convex graph, the gradient descent algorithm attempts to find the minimum of the function by using the derivative to check the steepness of points along the line and slowly move towards a slope of zero. After all, “gradient” is just another word for slope.

Implement Gradient Descent in Python

Before we start, import the SymPy library and create a “symbol” called x. We’ll be needing these lines later when we are working with math functions and derivatives.

from sympy import *
x = Symbol('x')

We create our gradient_descent function and give it two parameters: cost_fn, starting_point and learning_rate. The cost_fn is the math function that we want to find the minimum of. The initial_guess parameter is the integer that is our first guess for the x-value of the minimum of the function. We will update this variable to be our new guess after each learning iteration. The last parameter is the learning rate.

def gradient_descent(cost_fn, initial_guess, learning_rate):
    df = cost_fn.diff(x)
    df = lambdify(x, df)

    new_x = initial_guess

    for n in range(100):
        # Step 1: Predict (Make a guess)
        previous_x = new_x

        # Step 2: Calculate the error
        gradient = df(previous_x)

        # Step 3: Learn (Make adjustments)
        new_x = previous_x - learning_rate * gradient

Inside the function, we first get the derivative of the cost function that was inputted as a parameter using the diff function of the SymPy library. We store the derivative in the df variable. Then, we use the lambdify function because it allows us to plug our predictions into the derivative function. Read my article on calculating derivatives in Python for more info on this.

In the for loop, our gradient descent function is following the 3-step algorithm that is used to train many machine learning tools:

  1. Predict (Make a guess)
  2. Calculate the error
  3. Learn (Make adjustments)

You can learn more about this process in this article on how machines “learn.”

In the for loop, the first step is to make an arbitrary guess for the x-value of the minimum of the function. We do this by setting previous_x to new_x, which is the user’s initial guess. previous_value will help us keep track of the preceding prediction value as we make new guesses.

Next, we calculate the error or, in other words, we see how far our current guess is from the minimum of the function. We do this by calculating the derivative of the function at the point we guessed, which will give us the slope at that point. If the slope is large, the guess is far from the minimum. But if the slope is close to 0, the guess is getting closer to the minimum.

Next, we “learn” from the error. In the previous step, we calculated the slope at the x-value that we guessed. We multiply that slope by the learning_rate and subtract that from the current guess value stored in previous_x. Then, we store this new guess value back into new_x.

Then, we run these steps over and over in our for loop until the loop is over.

Before we run our gradient descent function, let’s add some print statements at the end so we can see the values of at the minimum of the function.

print('Minimum occurs at x-value:', min_x)
print('Slope at the minimum is: ', df(min_x))

Now, let’s run our gradient descent function and see what type of output we get with an example. In this example, the cost function is f(x) = x2. The initial guess for x is 3 and the learning rate is 0.1

my_fn = x**2
gradient_descent(my_fn, 3, 0.1)

Currently, we are running the learning loop an arbitrary amount of times. In this example, the loop runs 100 times. But maybe we don’t need to run the loop this many times. Oftentimes you already know ahead of time how precise a calculate you need. You can tell the loop to stop running once a certain level of precision is met. There are many ways you can implement this, but I’ll show you using that for loop we already have.

precision = 0.0001

for n in range(100):
    previous_x = new_x
    gradient = df(previous_x)
    new_x = previous_x - learning_rate * gradient
    
    step_size = abs(new_x - previous_x) 
    
    if step_size < precision:
        break

First, we define a precision value that the gradient descent algorithm should be within. You can also make this a parameter to the function if you choose.

Inside the loop, we create a new variable called step_size which is the distance between previous_x and new_x, which is the new guess that was just calculated in the “learning” step. We take the absolute value of this difference in case it’s negative.

If the step_size is less than the precision we specified, the loop will finish, even if it hasn’t reached 100 iterations yet.

Instead of solving a cost function analytically, the gradient descent algorithm converges on the minimum of a function by brute force. Like a blind-folded hiker, the algorithm goes down the valley (the cost function), following the slope of the graph until it reaches the minimum point.

How Do Machines Learn?

You’ve probably heard of machine learning models that can read human handwriting or understand speech. You might know that these models had to be trained in order to accomplish these tasks– they had to learn. But how exactly does a machine “learn”? What are the steps involved?

In this article, I’m going to be giving a high-level overview of how the “learning” in machine learning happens. I’m going to talk about fundamental ML concepts including cost functions, optimization, and linear regression. I’ll outline the basic framework used in most machine learning techniques.

Data is the foundational of any machine learning model. In a nutshell, the data scientist feeds a bunch of data into the ML model and, as it starts to “learn” from the data, the model will eventually develop a solution. What is the solution? The solution is typically a function that describes the relationship in the data. For a given input, the function should be able to provide the expected output.

In the case of linear regression, one of the most basic ML models, the regression model “learns” two parameters: the slope and the intercept. Once the model learns these parameters to the desired extent, the model can be used to compute the output y for a given input X (in the linear regression equation y = b0 + b1*X). If you’re unfamiliar with linear regression, take a look at my article on linear regression to understand this better.

So now that we know what the goal of machine learning is, we can talk about how exactly the learning happens. The machine learning model usually follows three core steps in order to “learn” the relationship in the data as described by the solution function:

  1. Predict
  2. Calculate the error
  3. Learn

The first step is for the model to make a prediction. To start, the model may make arbitrary guesses for the values that it is solving for in the solution function. In the case of linear regression, the ML model would make guesses for the values of the slope and intercept.

Next, the model would check its prediction against the actual test data and see how good/bad the prediction was. In other words, the model calculates the error in its prediction. In order to compare the prediction against the data, we need to find a way to measure how “good” our prediction was.

Finally, the model will “learn” from its error by adjusting its prediction to have a smaller error.

The model will repeat these 3 steps– predict, calculate error, and learn– a bunch of times and slowly come to the best coefficients for the solution. This simple 3-step algorithm is the basis for training most machine learning models.

When I talked about calculating error earlier, I didn’t talk about the ways in which we measure how “good” or “bad” our predictions are. That leads me to the next topic: cost functions. In machine learning, a cost function is a mechanism that returns the error between predicted outcomes and the actual outcomes. Cost functions measure the size of the error to help achieve the overall goal of optimizing for a solution with the lowest cost.

The objective of an ML model is to find the values of the parameters that minimize the cost function. Cost functions will be different depending on the use case but they all have this same goal.

The Residual Sum of Squares is an example of a cost function. In linear regression, the Residual Sum of Squares is used to calculate and measure the error in predicted coefficient values. It does this by finding the sum of the gaps between the predicted values on the linear regression line and the actual data point values (check out this article for more detail). The lowest sum indicates the most accurate solution.

Cost functions fall under the broader category of optimization. Optimization is a term used in a variety of fields, but in machine learning it is defined as the process of progressing towards the defined goal, or solution, of an ML model. This includes minimizing “bad things” or “costs”, as is done in cost functions, but it also includes maximizing “good things” in other types of functions.

In summary, machine learning is typically done with a fundamental 3-step process: make a prediction, calculate the error, and learn / make adjustments. The error in a prediction is calculated using a cost function. Once the error is minimized, the model is done “learning” and is left with a function that should provide the expected result for future data.

Introduction to Linear Regression

In this article, I will define what linear regression is in machine learning, delve into linear regression theory, and go through a real-world example of using linear regression in Python.

What is Linear Regression?

Linear regression is a machine learning algorithm used to measure the relationship between two variables. The algorithm attempts to model the relationship between the two variables by fitting a linear equation to the data.

In machine learning, these two variables are called the feature and the target. The feature, or independent variable, is the variable that the data scientist uses to make predictions. The target, or dependent variable, is the variable that the data scientist is trying to predict.

Before attempting to fit a linear regression to a set of data, you should first assess if the data appears to have a relationship. You can visually estimate the relationship between the feature and the target by plotting them on a scatterplot.

If you plot the data and suspect that there is a relationship between the variables, you can verify the nature of the association using linear regression.

Linear Regression Theory

Linear regression will try to represent the relationship between the feature and target as a straight line.

Do you remember the equation for a straight line that you learned in grade school?

y = mx + b, where m is the slope (the number describing the steepness of the line) and b is the y-intercept (the point at which the line crosses the vertical axis)

Equation of a Straight Line

Equations describing linear regression models follow this same format.

The slope m tells you how strong the relationship between x and y is. The slope tells us how much y will go up or down for a given increase or decrease in x, or, in this case, how much the target will change for a given change in the feature.

In theory, a slope of 0 would mean there is no relationship at all between the data. The weaker the relationship is, the closer the slope is to 0. But if there is a strong relationship, the slope will be a larger positive or negative number. The stronger the relationship is, the steeper the slope is.

Unlike in pure mathematics, in machine learning, the relationship denoted by the linear equation is an approximation. That’s why we refer to the slope and the intercept as parameters and we must estimate these parameters for our linear regression. We even use a different notation in which the intercept constant is written first and the variables are greek symbols:

Simple Linear Regression in Python (From Scratch) | by Aidan Wilson |  Towards Data Science

Even though the notation is different, it’s the exact same equation of a line y=mx+b. It is important to know this notation though because it may come up in other linear regression material.

But how do we know where to make the linear regression line when the points are not straight in a row? There are a whole bunch of lines that can be drawn through scattered data points. How do we know which one is the “best” line?

There will usually be a gap between the actual value and the line. In other words, there is a difference between the actual data point and the point on the line (fitted value/predicted value). These gaps are called residuals. The residuals can tell us something about how “good” of an estimate our line is making.

Look at the size of the residuals and choose the line with the smallest residuals. Now, we have a clear method for the hazy goal of representing the relationship as a straight line. The objective of the linear regression algorithm is to calculate the line that minimizes these residuals.

For each possible line (slope and intercept pair) for a set of data:

  1. Calculate the residuals
  2. Square them to prevent negatives
  3. Add the sum of the squared residuals

Then, choose the slope and intercept pair that minimizes the sum of the squared residuals, also known as Residual Sum of Squares.

Linear regression models can also be used to estimate the value of the dependent variable for a given independent variable value. Using the classic linear equation, you would simply substitute the value you want to test for x in y = mx + b; y would be the model’s prediction for the target for your given feature value x.

Linear Regression in Python

Now that we’ve discussed the theory around Linear Regression, let’s take a look at an example.

Let’s say we are running an ice cream shop. We have collected some data for daily ice cream sales and the temperature on those days. The data is stored in a file called temp_revenue_data.csv. We want to see how strong the correlation between the temperature and our ice cream sales is.

import pandas
from pandas import DataFrame 

data = pandas.read_csv('temp_revenue_data.csv')

X = DataFrame(data, columns=['daily_temperature'])
y = DataFrame(data, columns=['ice_cream_sales'])

First, import Linear Regression from the scikitlearn module (a machine learning module in Python). This will allow us to run linear regression models in just a few lines of code.

from sklearn.linear_model import LinearRegression

Next, create a LinearRegression() object and store it in a variable.

regression = LinearRegression()

Now that we’ve created our object we can tell it to do something:

The fit method runs the actual regression. It takes in two parameters, both of type DataFrame. The feature data is the first parameter and the target data is the second. We are using the X and y DataFrames defined above.

regression.fit(X, y)     

The slope and intercept that were calculated by the regression are available in the following properties of the regression object: coef_ and intercept_. The trailing underline is necessary.

# Slope Coefficient
regression.coef_

# Intercept
regression.intercept_

How can we quantify how “good” our model is? We need some kind of measure or statistic. One measure that we can use is called R2, also known as the goodness of fit.

regression.score(X, y)
output: 0.5496...

The above output number (in percentage) is the amount of variation in ice cream sales that is explained by the daily temperature.

Note: The model is very simplistic and should be taken with a grain of salt. It especially does not do well on the extremes.