Return the n largest/smallest values for column in DataFrame

Get n-largest values from a particular column in Pandas DataFrame

df.nlargest(5, 'Gross')

Return the first n rows with the smallest values for column in DataFrame

df.nsmallest(5, ['Age'])

To order by the smallest values in column “Age” and then “Salary”, we can specify multiple columns like in the next example.

df.nsmallest(5, ['Age', 'Salary'])
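Here is a minimal sketch of how the second column breaks ties. The names, ages, and salaries are made-up data for illustration only.

```python
import pandas as pd

# Hypothetical data: Ann and Ben are tied on Age
df = pd.DataFrame({
    "Name": ["Ann", "Ben", "Cal", "Dee"],
    "Age": [25, 25, 30, 22],
    "Salary": [50000, 40000, 35000, 60000],
})

# Ask for the 2 youngest rows; the tie at Age 25 is resolved by Salary
youngest = df.nsmallest(2, ["Age", "Salary"])
print(youngest)
```

With this data, Dee comes first (lowest Age), and Ben is picked over Ann because his Salary is smaller.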

There is also an optional keep parameter for the nlargest and nsmallest functions. keep has three possible values: {'first', 'last', 'all'}. The default is 'first'.

Where there are duplicate values:

  • first : take the first occurrence.
  • last : take the last occurrence.
  • all : do not drop any duplicates, even if it means selecting more than n items.
df.nlargest(5, 'Gross', keep='last')
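To see the three keep options side by side, here is a small sketch. The Title column and the Gross numbers are invented so that two rows tie on the cut-off value.

```python
import pandas as pd

# Hypothetical data: B and C tie at a Gross of 300
df = pd.DataFrame({
    "Title": ["A", "B", "C", "D"],
    "Gross": [500, 300, 300, 100],
})

top_first = df.nlargest(2, "Gross", keep="first")  # tie goes to the first occurrence
top_last = df.nlargest(2, "Gross", keep="last")    # tie goes to the last occurrence
top_all = df.nlargest(2, "Gross", keep="all")      # every tied row is kept

print(top_first["Title"].tolist())
print(top_last["Title"].tolist())
print(top_all["Title"].tolist())
```

Note that keep='all' returns three rows here even though n is 2, because both tied rows are kept.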

Working with a New Dataset / DataFrame

When you are working with a new Pandas DataFrame, these attributes and methods will give you insights into key aspects of the data.

The dir function lets you look at all of the attributes that a Python object has.

dir(df)

The shape attribute returns a tuple of integers indicating the number of elements that are stored along each dimension of an array. For a 2D-array with N rows and M columns, shape will be (N,M). 

df.shape

You may be working with a dataframe that has hundreds or thousands of rows. To get a glimpse of the data inside a dataframe without printing out all of the values you can use the head and tail methods.

Returns the first n rows in the dataframe

df.head() # returns rows 0-4
df.head(n) # returns the first n rows

Returns the last n rows in the dataframe

df.tail()
df.tail(n)
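A quick sketch of both methods on a throwaway ten-row DataFrame (the value column is just an assumption for the demo):

```python
import pandas as pd

# Hypothetical ten-row DataFrame with values 0..9
df = pd.DataFrame({"value": range(10)})

first_five = df.head()    # no argument: defaults to 5 rows
last_three = df.tail(3)   # explicit n

print(first_five)
print(last_three)
```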

The count method of a dataframe shows you the number of entries for each column

df.count()

Check if there are any missing values in any of the columns

pd.isnull(df).any()

The info method of the dataframe prints a concise summary. It tells you:

  1. The number of entries in the df
  2. The names of the columns
  3. The number of columns
  4. The number of entries in each column
  5. The dtype of each column
  6. If there are null values in a column
df.info()
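Here is a small sketch that puts shape, count, and the missing-value check together. The DataFrame is made up and deliberately has one missing Age.

```python
import pandas as pd

# Hypothetical data with one missing value in the Age column
df = pd.DataFrame({
    "Name": ["bob", "jane", "joe"],
    "Age": [20, None, 40],
})

rows, cols = df.shape                 # (number of rows, number of columns)
non_null = df.count()                 # non-missing entries per column
has_missing = pd.isnull(df).any()     # True for columns with any missing value

print(rows, cols)
print(non_null)
print(has_missing)
```

Notice that count reports 2 for Age while shape still says 3 rows, which is a quick hint that a value is missing.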

Different Ways to Create Pandas DataFrames

A Pandas DataFrame is a 2D labeled data structure with columns of potentially different types.

There are a variety of different methods and syntaxes that can be used to create a pd.DataFrame.

Firstly, make sure you import the pandas module:

import pandas as pd

Method #1: Creating DataFrame from list of lists

# initialize list of lists
data = [['bob', 20], ['jane', 30], ['joe', 40]]
 
# Create the pandas DataFrame
df = pd.DataFrame(data, columns = ['Name', 'Age'])
df

Output:

   Name  Age
0   bob   20
1  jane   30
2   joe   40

Method #2: Creating DataFrame from dictionary of lists

In this method, you define a dictionary which has the column name as the key which corresponds to an array of row values.

# initialize dictionary of lists
data = {'Name': ['Bob', 'Joe', 'Jane', 'Jack'],
        'Age': [30, 30, 21, 40]}
 
# Create DataFrame
df = pd.DataFrame(data)
df

Output:

   Name  Age
0   Bob   30
1   Joe   30
2  Jane   21
3  Jack   40

You can use custom index values for the DataFrame by adding a parameter to the pd.DataFrame function. Set the optional index parameter of the pd.DataFrame function to an array of strings for the index values.

df = pd.DataFrame(data, index=['first',
                                'second',
                                'third',
                                'fourth'])
df

Output:

        Name  Age
first    Bob   30
second   Joe   30
third   Jane   21
fourth  Jack   40

In the same way that we just defined the index values, you can also define the column names separately. Set the optional columns parameter of the pd.DataFrame function to an array of strings for the column values.

Notice that the row values are now defined as a list of lists rather than a dictionary of lists. This is because the column values are no longer being defined with them.

df = pd.DataFrame(
    [[4,5,6],
     [7,8,9],
     [10,11,12]],
    index = ['row_one','row_two','row_three'],
    columns=["a","b","c"]
    )

df

Output:

            a   b   c
row_one     4   5   6
row_two     7   8   9
row_three  10  11  12

Method #3: Creating DataFrame using zip() function.

The zip function returns an iterator of tuples where the corresponding items in each passed iterator are paired together. By calling the list function on the object returned from the zip function, we convert the object to a list which can be passed into the pd.DataFrame function.
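If you want to see what zip produces before handing it to pandas, this plain-Python sketch prints the intermediate list. The short list at the end is only there to show that zip stops at the shortest input.

```python
name = ["Bob", "Sam", "Sally", "Sue"]
age = [19, 17, 51, 49]

pairs = zip(name, age)   # lazy iterator; nothing is paired yet
rows = list(pairs)       # materialize into a list of (name, age) tuples
print(rows)

# zip stops as soon as the shortest input runs out
short = list(zip(name, [1, 2]))
print(short)
```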

name = ["Bob", "Sam", "Sally", "Sue"]
age = [19, 17, 51, 49]

data = list(zip(name, age))

df = pd.DataFrame(data,
                  columns = ['Name', 'Age'])

df

Output:

    Name  Age
0    Bob   19
1    Sam   17
2  Sally   51
3    Sue   49

Set Up Firebase in Next.js Project

In this tutorial, I will show you how to set up Firebase in a Next.js project. It’s actually pretty simple.

Set Up Next.js Project

First, create a new Next.js project:

npx create-next-app myproject

You can verify that everything is working okay by quickly running npm run dev.

Then, install the Firebase package into your new project:

npm install firebase

Set Up Firebase Project

Now, head over to the Firebase website to set up a new Firebase project.

Then, select the option to add Firebase to your web app.

Next, register your app with a name and copy the code that is created for you. This code will initialize Firebase in your project.

Add Firebase config to Next.js Project

In your project, create a new file named firebase_config (for example, firebase_config.js) to hold the Firebase configuration code:

import { initializeApp } from "firebase/app";

const firebaseConfig = {
  apiKey: "...",
  authDomain: "...",
  projectId: "...",
  storageBucket: "...",
  messagingSenderId: "...",
  appId: "..."
};

export const firebaseApp = initializeApp(firebaseConfig);

Now, you can import the Firebase app into other files within your project and thus access Firebase in your project.

Import the firebaseApp const from the Firebase config file you just created. It is a named export, so use curly braces:

import { firebaseApp } from '../firebase_config';

Working with Firestore

Firestore is a powerful cloud-hosted NoSQL database within Firebase. There is a good chance you will use it in your web application.

To set up Firestore within your project, add the following code to your Firebase config file:

First, import the getFirestore function at the top:

import { getFirestore } from "firebase/firestore";

Then, create a database using the function, passing in your firebaseApp from before:

const db = getFirestore(firebaseApp);

Lastly, export the database. The firebaseApp const is already exported where it is declared, so only db needs to be added:

export { db };

Now, you can work with Firestore in your project. Here’s an example:

import { db } from '../firebase_config';
import { useState } from 'react';

import { collection, QueryDocumentSnapshot, DocumentData } from "firebase/firestore";

// Reference to the 'todos' collection in Firestore
const todosCollection = collection(db, 'todos');

// Call useState inside a React component; hooks cannot run at module level
const [todos,setTodos] = useState<QueryDocumentSnapshot<DocumentData>[]>([]);

This was just a very simple, bare-bones introduction to using Firebase in Next.js. Hope it helped.

How to Split Training and Test Data in Python

In this article, I’ll be explaining why you should split your dataset into training and testing data and showing you how to split up your data using a function in the scikit-learn library.

If you are training a machine learning model using a limited dataset, you should split the dataset into 2 parts: training and testing data.

The training data will be the data that is used to train your model. Then, use the testing data to see how the algorithm performs on a dataset that it hasn’t seen yet.

If you use the entire dataset to train the model, then by the time you are testing the model, you will have to re-use the same data. This gives an overly optimistic result because the model has already seen the data it is being tested on.

We will be using the train_test_split function from the Python scikit-learn library to accomplish this task. Import the function using this statement:

from sklearn.model_selection import train_test_split

This is the function signature for the train_test_split function:

sklearn.model_selection.train_test_split(*arrays, test_size=None, train_size=None, random_state=None, shuffle=True, stratify=None)

The first parameters to the function are a sequence of arrays. The allowed inputs are lists, numpy arrays, scipy-sparse matrices or pandas dataframes.

So the first argument is gonna be our features variable and the second argument is gonna be our targets.

# X = the features array
# y = the targets array
train_test_split(X, y, ...)

The next parameter, test_size, sets the size of the test split. It can be a float, an int, or None. If it is a float, it should be between 0.0 and 1.0 and represents the fraction of the dataset to use for testing. If it is an int, it represents the absolute number of test samples. If it is None, the value is set to the complement of the train size; if train_size is also None, it defaults to 0.25.

This is saying that I want the test data set to be 20% of the total:

train_test_split(X, y, test_size=0.2)

train_size is the proportion of the dataset that is for training. Since test_size is already specified, there is no need to specify the train_size parameter because it is automatically set to the complement of the test_size parameter. That means the train_size will be set to 1 – test_size. Since the test_size is 0.2, train_size will be 0.8.

The function has a shuffle parameter, which is set to True by default. If shuffle is set to True, the function will shuffle the dataset before splitting it up.

What’s the point of shuffling the data before splitting it? If your dataset is formatted in an ordered way, it could affect the randomness of your training and testing datasets which could hurt the accuracy of your model. Thus, it is recommended that you shuffle your dataset before splitting it up.

We could leave the function like this or add another parameter called random_state.

random_state controls the shuffling applied to the data before applying the split. Pass an int for reproducible output across multiple function calls. We are using the arbitrary number 10. You can really use any number.

train_test_split(X, y, test_size=0.2, random_state=10)
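To get a feel for why a fixed seed makes the split reproducible, here is a plain-Python stand-in. It is not scikit-learn's implementation; split_indices is a hypothetical helper that just shuffles row indices with a seeded random generator and slices off the test portion.

```python
import random

def split_indices(n_rows, test_fraction, seed):
    """Shuffle row indices with a fixed seed, then cut off the test portion."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)       # same seed -> same shuffle order
    n_test = round(n_rows * test_fraction)
    return indices[n_test:], indices[:n_test]  # (train indices, test indices)

train_a, test_a = split_indices(10, 0.2, seed=10)
train_b, test_b = split_indices(10, 0.2, seed=10)

print(test_a == test_b)   # the two calls agree because the seed is the same
```

The idea carries over to random_state: the same integer gives you the same shuffle, and therefore the same training and testing rows, every time you run the code.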

The function will return four arrays to us: a training and testing dataset for the feature(s), and a training and testing dataset for the target.

We can use tuple unpacking to store the four values that the function returns:

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=10)

Now, you can verify that the splitting was successful.

The percent of the training set will be the number of rows in X_train divided by the total number of rows in the dataset as a whole:

len(X_train)/len(X)

The percent of the testing dataset will be the number of rows in X_test divided by the total number of rows in the dataset:

len(X_test)/len(X)

The numbers returned by these calculations will probably not be exact. For example, if you are using an 80/20 split on a dataset of 101 rows, the divisions give you about 0.792 instead of 0.80 and 0.208 instead of 0.20, because the row counts have to be whole numbers.
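The arithmetic behind those inexact ratios can be sketched in a few lines. This assumes the test count is rounded up and the training set gets the remaining rows, which is my understanding of how scikit-learn behaves when only test_size is given; split_sizes is a hypothetical helper, not part of the library.

```python
import math

def split_sizes(n_rows, test_fraction):
    """Whole-number row counts for a split (assumed rounding: test count rounds up)."""
    n_test = math.ceil(n_rows * test_fraction)
    n_train = n_rows - n_test
    return n_train, n_test

n_train, n_test = split_sizes(101, 0.2)
print(n_train, n_test)
print(n_train / 101, n_test / 101)   # close to, but not exactly, 0.8 and 0.2
```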

That’s it!

How to set Python default version to 3.x on macOS

In this tutorial, I’ll show you how to update your Mac’s default version of Python from 2.x to 3.x.

If Python 3.X is already installed on your system, even if it’s not the default version, then you should be able to run version 3.X using the python3 command in your Terminal. But in this tutorial, I’ll be showing you how you can run version 3.X using the default python command.

Step 0: Install Homebrew

Homebrew is a very popular package manager for macOS. If you don’t already have it installed, you can install it by running this command:

/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"

Step 1: Install Python with Homebrew

This command will install Python 3.

brew install python

Step 2: Check the Python symlinks

Next, we’ll check the symlinks for Python 3.x. A symlink, or symbolic link, is simply a shortcut to another file.

ls -l /usr/local/bin/python*

This should output something like the following:

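Look at the first line. It should show the default python symlinked to the brew-installed python3. (On Apple Silicon Macs, Homebrew installs into /opt/homebrew/bin rather than /usr/local/bin, so adjust the paths in this tutorial accordingly.) If it does not show this, set python3 as the default python symlink by running the following: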

ln -s -f /usr/local/bin/python3 /usr/local/bin/python

You should probably run this command just to be sure. Run ls -l /usr/local/bin/python* again and make sure you have the desired output.

Step 3: Verify Python 3.X install

Run the following command to see which executable Python binary will launch when you issue a python command to the shell:

which python

The output should be this: /usr/local/bin/python

Finally, check if the default Python version has changed:

python --version

The terminal should output the newest version of Python 3.x. For me, it is Python 3.9.7.
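You can also double-check from inside the interpreter itself. This is just a small sketch using the standard sys module; run it with the python command you just set up.

```python
import sys

# Path of the interpreter that is actually running this code
print(sys.executable)

# Major and minor version, e.g. 3 and 9 for Python 3.9.x
print(sys.version_info.major, sys.version_info.minor)

is_python3 = sys.version_info.major == 3
print(is_python3)
```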

That’s it! Hope this helps.

Getting External Data with Chainlink

In this article, I’m going to show you how to code a contract that has access to the conversion rate between ETH and USD using the Chainlink decentralized oracle.

Blockchains are deterministic systems. This means that there is no room for variability amongst the data in nodes; each node should have the same exact data.

In order to maintain a blockchain’s deterministic nature, smart contracts on the blockchain are unable to connect with external systems, data feeds, APIs, existing payment systems or any other off-chain resources on their own.

In order to get data from the real world, smart contracts need to use an oracle. An oracle is a trusted third-party that interacts with the off-chain world to provide external data or computation to smart contracts. In other words, oracles are the bridges between blockchains and the real world.

Oracles can provide blockchains with information about currency prices, the weather, political results, and much more.

An oracle cannot be centralized because then we would have the same problem: nodes in a blockchain could end up with different data. Instead, oracles must be decentralized.

Chainlink is the most popular decentralized oracle provider.

Writing the Code

I created my contract on Remix IDE and I am using Remix to test my contract. I won’t be explaining how to use Remix in this article.

In order to work with the Chainlink oracle, we need to import their code into our contract using this line:

import "@chainlink/contracts/src/v0.8/interfaces/AggregatorV3Interface.sol";

What exactly is this line doing, though? We are importing code from the @chainlink/contracts npm package. If you go to the project’s GitHub repository and follow the path in the import statement, you will get to the code that defines AggregatorV3Interface. In short, we are importing the AggregatorV3Interface interface into our project so we can work with it.

Interfaces are similar to abstract contracts. As you can see, the interface defines a function but it cannot define its implementation.

Interfaces tell your Solidity contract how it can interact with another contract. So, since we will be accessing Chainlink contracts, this interface tells our contract what functions we can use from the Chainlink contract.

Using an interface, our contract can make a call to another contract. Interfaces are a minimalistic view into another contract.

We work with interfaces and other contracts the same way that we work with variables or structs within our own contracts.

Make Function to Get Interface Version

Now, we will access some information from the Chainlink contract using the interface. We will create a function to get the version of the price feed contract.

function getVersion() public view returns (uint256){
}

First, we create an instance of the interface:

function getVersion() public view returns (uint256){
    AggregatorV3Interface priceFeed = AggregatorV3Interface()
}

In order to find where the ETH / USD price feed contract is located on the Rinkeby test network, we can go to the Ethereum Price Feeds documentation:

It has a bunch of different price feeds. Find the ETH / USD price feed contract address and put the address in the parameter of the interface:

AggregatorV3Interface priceFeed = AggregatorV3Interface(0x8A753747A1Fa494EC906cE90E9f37563A8AF630e);

This line says that a contract implementing the functions defined in the interface is located at the address 0x8A753747A1Fa494EC906cE90E9f37563A8AF630e.

If this is true, we should be able to call the version method on priceFeed:

AggregatorV3Interface priceFeed = AggregatorV3Interface(0x8A753747A1Fa494EC906cE90E9f37563A8AF630e);
return priceFeed.version();

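The full function:

function getVersion() public view returns (uint256){
    AggregatorV3Interface priceFeed = AggregatorV3Interface(0x8A753747A1Fa494EC906cE90E9f37563A8AF630e);
    return priceFeed.version();
}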

Now, you can deploy your contract and see if it works. In my case, using Remix IDE, I was able to access the getVersion function through the IDE, so it was successful!

Make Function to Get Latest Price

Now, we will create a function to get the current price of ETH in terms of USD from Chainlink.

We start by defining the function and creating another instance of the interface:

function getPrice() public view returns(uint256){
        AggregatorV3Interface priceFeed = AggregatorV3Interface(0x8A753747A1Fa494EC906cE90E9f37563A8AF630e);
}

We are trying to access the answer return variable from the latestRoundData method of the interface (see interface code above).

The latestRoundData method returns 5 items. Since we only need answer, we can ignore the other variables and just add commas for them:

(,int256 answer,,,) = priceFeed.latestRoundData();

Then, convert answer to uint256 and return it:

return uint256(answer);

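The full function code:

function getPrice() public view returns(uint256){
    AggregatorV3Interface priceFeed = AggregatorV3Interface(0x8A753747A1Fa494EC906cE90E9f37563A8AF630e);
    (,int256 answer,,,) = priceFeed.latestRoundData();
    return uint256(answer);
}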

Summary

Here is the final code:

In this tutorial, you learned what oracles are and why they are important. You learned how to use the Chainlink oracle in your own smart contract to access real-world data on the blockchain.

Blockchain Development: Solidity Crash Course

Solidity is the programming language used to write Ethereum smart contracts. If you want to be an Ethereum blockchain developer, learning it should be one of the first things you do.

In this article, I’m going to be showing you some basic Solidity syntax and fundamental features that every smart contract must have. This article is for beginners but it assumes you have some prior knowledge in another programming language.

Versioning

At the top of any new smart contract file must be the Solidity version. You have a few options for denoting the version of your contract:

  • If the contract is for a specific Solidity version. In this case, 0.6.0:
pragma solidity 0.6.0;
  • If the contract is for a version within the 0.6.x range (0.6.0 to 0.6.12, i.e. below 0.7.0):
pragma solidity ^0.6.0;
  • If the contract is for a version within the range from 0.6.0 to 0.9.0 (exclusive):
pragma solidity >=0.6.0 <0.9.0;
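If the range syntax feels abstract, here is a rough Python sketch of which compiler versions each of the three pragmas above would accept. The helper functions are hypothetical and only model these three examples; this is not a general implementation of version ranges.

```python
def parse(version):
    """Turn a version string like '0.6.12' into a comparable tuple (0, 6, 12)."""
    return tuple(int(part) for part in version.split("."))

def matches_exact(version):    # pragma solidity 0.6.0;
    return parse(version) == (0, 6, 0)

def matches_caret(version):    # pragma solidity ^0.6.0;
    return (0, 6, 0) <= parse(version) < (0, 7, 0)

def matches_range(version):    # pragma solidity >=0.6.0 <0.9.0;
    return (0, 6, 0) <= parse(version) < (0, 9, 0)

for v in ["0.6.0", "0.6.12", "0.7.0", "0.8.4", "0.9.0"]:
    print(v, matches_exact(v), matches_caret(v), matches_range(v))
```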

You might be wondering: which version should I use? When deploying contracts, you should use the latest released version of Solidity. Apart from exceptional cases, only the latest version receives security fixes.

Contract Declaration

Declare a new contract using the contract keyword followed by the contract’s name:

contract Bank {

} 

As you can see, the syntax for defining a new contract in Solidity is similar to the syntax for defining a class in a language like Java or Python.

This is the simplest version of a valid contract.

Types and Declaring Variables

Solidity has pretty much all of the same data types as any other programming language.

There are multiple different sizes of integers that you can create. For example,

uint256 newNumber = 6;

There are also some data types that you may not be familiar with.

For example, there is an address type for storing account addresses on the blockchain, such as a MetaMask account address:

address accountAddress = 0x29D7d1dd5B6f9C864d9db560D72a247c178aE86B;

Comments

Comments are denoted using two slashes (//)

// This is a comment in Solidity

Defining Functions

Defining functions in Solidity is similar to other languages. Every function needs a visibility specifier such as public (covered in the next section):

function storeMoney(uint256 _amount) public {
}

If a function returns a value, you can use this syntax:

function storeMoney(uint256 _amount) public returns (uint256) {
   return _amount;
}

Visibility

Visibility refers to where functions and variables within a Solidity contract are available. There are four types of visibility:

  1. External: Functions with external visibility are meant to be called from outside the contract (by other contracts or by transactions); within the same contract they can only be called as this.f(). State variables cannot be external
  2. Public: Functions/variables with public visibility can be called by anybody, including any users of the blockchain
  3. Internal: Functions/variables with internal visibility can only be accessed internally (i.e. from within the current contract or contracts deriving from it)
  4. Private: Functions/variables with private visibility are only visible for the contract they are defined in and not in derived contracts

To specify the visibility of a function:

function storeMoney(uint256 _amount) public {
}

To specify the visibility of a variable:

bool private isCheckingAccount;

View and Pure Functions

If a function updates the value of a variable in a smart contract, it is changing the contract’s state. Any state change in a smart contract is considered a transaction and thus requires a gas fee to update the contract.

There are two types of functions that do not update the state of the contract: view and pure functions. When you call them from outside the blockchain, you do not have to make a transaction and they don’t cost gas.

A view function reads some state off of the blockchain without modifying it. The compiler automatically generates a view getter function for every public variable.

uint256 balance;

function retrieveBalance() public view returns (uint256) {
   return balance;   // reads state, does not change it
}

A pure function neither reads nor modifies the contract’s state. It typically just does some type of math on its arguments and returns the result. Thus, there is no change of the blockchain’s state since no variables are being saved or updated.

function addBalances(uint256 balance) public pure returns (uint256) {
    return balance + balance;   // balance is not being updated
}

Structs

Structs are a way to define a new type in Solidity. They are structures that group one or more variables, which can be of different types.

Define a new struct:

struct Account {
   uint256 balance;
   string name;
}

Create an instance/object of the struct:

Account public myAccount = Account({ balance: 100, name: "John Doe" });

Define an array of struct objects:

Account[] public accounts;    // dynamic array (can grow to any size)

Account[3] public fixedAccounts;   // fixed array (fixed size of 3)

Memory

In Solidity, there are two main ways to store information: you can store it in memory or in storage.

When you store an object in memory, the data only exists during the execution of the function/contract call. When an object is stored in storage, the data persists after the function has finished executing.

Strings are not a value type in Solidity. They are dynamically sized arrays of bytes, which makes them a reference type. For a reference type, the programmer has to decide where to store it: memory or storage.

function addAccount(string memory _name, uint256 _balance) public {
    accounts.push(Account(_balance, _name));
}

Mappings

mapping is a Solidity variable type that is similar to a dictionary in other languages. It stores key-value pairs (with 1 value per key). Given a key, a mapping returns whatever value that key is mapped to.

mapping(string => uint256) public nametoBalance;

function addAccount(string memory _name, uint256 _balance) public {
    accounts.push(Account(_balance, _name));
    nametoBalance[_name] = _balance;
}
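If you come from Python, the closest analogy I can offer is a defaultdict. This is only a sketch of the idea, not Solidity code: name_to_balance and add_account are hypothetical Python stand-ins for the mapping and function above.

```python
from collections import defaultdict

# Python stand-in for: mapping(string => uint256) public nametoBalance;
# Keys that were never set read as 0, like a uint256 mapping in Solidity.
name_to_balance = defaultdict(int)

def add_account(name, balance):
    name_to_balance[name] = balance

add_account("John Doe", 100)

known = name_to_balance["John Doe"]
unknown = name_to_balance["Nobody"]   # never set, so it reads as 0
print(known, unknown)
```

The analogy is not perfect. A defaultdict stores the key the first time you read it and lets you loop over its keys, whereas a Solidity mapping does not keep a list of its keys at all.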

SPDX License

The compiler will raise a warning if your source code file doesn’t include an SPDX License Identifier.

The Ethereum community believes that trust in smart contracts can be better established if their source code is available. Since making source code available always touches on legal questions around copyright, the SPDX License Identifier states in a machine-readable way how others may use your code.

You should add it to the very top of your file (above the version):

// SPDX-License-Identifier: MIT

pragma solidity ^0.8.0;

Now, you have a basic knowledge of Solidity code. Thanks for reading!

Remix IDE Alert: This contract may… not invoke an inherited contract’s constructor correctly

The Alert

I just tried deploying a contract on Remix IDE, then got this alert:

This contract may be abstract, not implement an abstract parent’s methods completely or not invoke an inherited contract’s constructor correctly.

Solution

In my case, the problem was that I had the wrong contract selected (lol)! Make sure you have the right contract selected in the contract dropdown before deploying.

Getting Started with MetaMask

MetaMask is a free-to-use browser extension and smartphone app that allows you to interact with the Ethereum blockchain. On MetaMask, you can send and receive coins from your cryptocurrency wallet and use any of the massive array of decentralized apps built on the Ethereum blockchain.

Download MetaMask

First, navigate to the MetaMask website.

I’m installing on Google Chrome, so I’ll click “Install MetaMask for Chrome.” If you’re installing for iOS or Android, click the corresponding button.

Then click “Add to Chrome.”

MetaMask can be used on Chrome, Firefox, Brave and Edge browsers. Sorry, Safari users, but there is no support for Safari yet.

Then, you’ll be taken to a page where you’ll set up your MetaMask account. If you don’t already have a wallet, click “Create a Wallet.”

It will ask you to create a password. Then, it will give you your Secret Recovery Phrase. Your Secret Recovery Phrase is a 12-word phrase that is the “master key” to your wallet and your funds.

It is crucial that you don’t lose or share the phrase. If you lose it, there is absolutely nothing that MetaMask can do to recover your account and thus your funds.

Never, ever share your Secret Recovery Phrase, not even with MetaMask. If someone asks for your recovery phrase they are likely trying to scam you and steal your wallet funds.

MetaMask Settings

On the home screen of your account, if you click the three dots in the right corner, the following window should pop up:

In this window, you can update the name of your account to be something more personal than “Account 1.” You can also see the address that represents your account, which is the long string of characters. People can use this address, which is unique to you, to send you money.

You can create more than one account and each account will have its own account address.

You can use a tool called Etherscan to see some of the details of a MetaMask account. Etherscan is a platform that has the details of every transaction and account on the Ethereum blockchain (obviously, this does not include private information about accounts and transactions). Platforms such as Etherscan are possible because of the complete transparency and public nature of blockchain. Anyone has access to the records of any transaction on the blockchain. If you plug an account address into Etherscan, it will show you information such as the balance of the account.

Test networks

When you are making real transactions and working on the actual Ethereum blockchain, you would use the Ethereum Mainnet, which should be the default network that you are on (you can check in the top right corner of your main account screen).

If you’re a developer and you want to test out code on a fake blockchain, you’ll want to use a test network. You can turn the “Show test networks” setting on in your Settings to have access to test networks.

Test networks are networks that resemble Ethereum and act in the same way that Ethereum does but don’t use real money and are just for testing your applications.

How to Get Free (Fake) Ether

I will show you how to get free fake Ether on a test network for testing and learning purposes.

First, choose a test network. For this tutorial, I’ll be using the Rinkeby Test Network. Then, I navigate to the Rinkeby Authenticated Faucet, which is a platform that provides the fake crypto. The website should look something like this:

Then, you need to make a post on social media including your public MetaMask account address.

I’m using Twitter. If you’re using Twitter, your tweet might look like this with the blacked-out portion being your public address (there isn’t really a reason for me to black it out on this post, but I just did it anyway).

Then, copy the link to the post you just made and paste it into the Faucet:

If the transaction is accepted, a green message should pop up saying that the transfer is accepted and it will go through.

Note: The networks are not always up and working. If the transaction is unsuccessful, try doing the same process on another test network.

If it was successful, you should soon see some Ether in your test network account:

If you go to the Rinkeby Etherscan (or the Etherscan for whichever test network you used) and search your MetaMask account address, you should now see that the balance has been updated and the transaction details are publicly available.