Teach What I Learn Programming

Posts

Unsupervised Machine Learning (KMeans Clustering) with Scikit-Learn

- March 17, 2018

Machine learning can be divided into two main categories: supervised machine learning and unsupervised machine learning. In supervised machine learning, we first provide the data with its corresponding labels to train the model; the trained model can then predict the label for new data. In unsupervised machine learning, we give the model the data without any labels, and the model finds patterns in the data on its own. Unsupervised machine learning cannot tell us the class of a data point, but it can group data points that are similar; this process is known as clustering. There are a number of clustering algorithms; in this article I will talk about KMeans clustering. How does KMeans clustering work? Let's understand this step by step, with the image below. Step (a) - Unsupervised (unlabeled) initial data. Step (b) - Choose random initial centroids (centroids are the centers of the clusters). In this example we need to separate the data set into ...
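The KMeans steps described in this excerpt can be sketched in plain Python. This is only a rough illustration, not the post's Scikit-Learn code: the function name `kmeans`, the toy points, and the choice of two clusters are assumptions made for the example, and the initial centroids are fixed rather than random so that the run is reproducible.

```python
import math


def kmeans(points, centroids, max_iters=100):
    """Plain-Python sketch of the KMeans loop (hypothetical helper)."""
    centroids = [tuple(c) for c in centroids]
    for _ in range(max_iters):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda i: math.dist(p, centroids[i]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its assigned points.
        new_centroids = []
        for i, old in enumerate(centroids):
            members = [p for p, label in zip(points, labels) if label == i]
            if members:
                mean = tuple(sum(axis) / len(members) for axis in zip(*members))
                new_centroids.append(mean)
            else:
                new_centroids.append(old)  # an empty cluster keeps its centroid
        # Stop once the centroids no longer move.
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return labels, centroids


# Toy data (assumed): two visibly separate groups of 2-D points.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.5), (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]
# Fixed starting centroids (assumed) instead of random ones.
labels, centroids = kmeans(points, centroids=[points[0], points[3]])
print(labels)
print(centroids)
```

In practice the starting centroids would be picked at random, as the excerpt says, so different runs can converge to different groupings.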

Web Scraping with Python and Beautiful Soup

- March 04, 2018

For any machine learning task, the first thing we need to do is collect data. There are a number of different ways to collect data. Let's see some of them. 1. APIs: The preferred way to collect data is to consume APIs, because API responses are well structured and consuming an API is very simple. 2. Public data sets: This is the second best option; there are people who have collected data and made it available to the public. Sites like Kaggle and the University of California, Irvine (UCI) Machine Learning Repository are some places where you can find public data sets. If you cannot find data using these two methods, the last option is web scraping: choose a website that provides the data you need, scrape the content, and prepare your own data set. In this post I'm going to show you an example of scraping the Cricbuzz website's blog posts. Let's understand the scraping procedure. It's very simple, we have Python libraries that wil...
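The general scraping idea (parse the HTML, then pull out the elements you care about) can be sketched as follows. The post itself uses Beautiful Soup; to keep this sketch free of extra installs it swaps in the standard library's `html.parser` instead. The HTML snippet, the `post-title` class name, and the `TitleCollector` class are all made up for illustration and do not reflect the real Cricbuzz markup.

```python
from html.parser import HTMLParser


class TitleCollector(HTMLParser):
    """Collect the text of every <h2 class="post-title"> element (hypothetical markup)."""

    def __init__(self):
        super().__init__()
        self.in_title = False
        self.titles = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the opening tag.
        if tag == "h2" and ("class", "post-title") in attrs:
            self.in_title = True
            self.titles.append("")

    def handle_endtag(self, tag):
        if tag == "h2":
            self.in_title = False

    def handle_data(self, data):
        # Accumulate text only while inside a matching <h2>.
        if self.in_title:
            self.titles[-1] += data


# Stand-in for a downloaded page; a real scraper would fetch this over HTTP.
html = """
<div class="post"><h2 class="post-title">First post</h2><p>Body text</p></div>
<div class="post"><h2 class="post-title">Second post</h2><p>More text</p></div>
"""

collector = TitleCollector()
collector.feed(html)
titles = [title.strip() for title in collector.titles]
print(titles)
```

A library such as Beautiful Soup hides most of this bookkeeping behind search methods, which is why it is usually the more convenient choice for real pages.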

CSS Specificity? What is it???

- March 01, 2018

I am damn sure most of you find the word specificity difficult to pronounce; if you don't, then congratulations, you are someone special. Everyone else, don't be disappointed: the concept of specificity is not that hard to understand. So what is specificity? In CSS, we can style a DOM element with different CSS rules; for example, in the image below all four CSS rules can affect the h1 element. OK, now we know that we can style a DOM element with different CSS rules, which raises another question: among these rules, whose styles will be applied to the element? (This matters when several rules set the same property; in this case every rule sets the color property.) To answer this question, I would say the rule with the highest specificity gets priority, and its styles are applied to the DOM element. Now let's come back to the first question, what is specificity?...
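To make the idea of "highest specificity wins" concrete, here is a small sketch that scores selectors as (ids, classes, elements) and picks the strongest one. It is a simplification under stated assumptions: the helper `specificity` is hypothetical, it only understands element names, `.class` and `#id` parts (no attribute selectors, pseudo-classes, inline styles or `!important`), and the four selectors are invented for the example rather than taken from the post's image.

```python
import re


def specificity(selector):
    """Return (ids, classes, elements) for a simple CSS selector (hypothetical helper)."""
    ids = len(re.findall(r"#[\w-]+", selector))
    classes = len(re.findall(r"\.[\w-]+", selector))
    # Remove the id and class parts, then count the element names that remain.
    stripped = re.sub(r"[#.][\w-]+", "", selector)
    elements = len(re.findall(r"[a-zA-Z][\w-]*", stripped))
    return (ids, classes, elements)


# Four made-up selectors that could all match the same h1 element.
selectors = ["h1", "div h1", ".title", "#main-heading"]
for selector in selectors:
    print(selector, specificity(selector))

# Tuples compare left to right, so ids outrank classes, which outrank elements.
winner = max(selectors, key=specificity)
print("winner:", winner)
```

Comparing the tuples left to right mirrors how a single id outranks any number of classes, and a single class outranks any number of element names.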

Classification using K-Nearest Neighbor Classifier with Scikit Learn

- February 23, 2018

K-NN is a very simple machine learning algorithm that classifies a point based on its nearest neighboring points. Let's take an example using the image below. We have set k = 3, which means that we classify a point based on its three nearest points; in this case two of the three points are orange, so the unknown point (the blue point) will be classified as an orange point. OK, let's do a real classification task. In this example I am going to classify the popular Iris flower data set. This data set contains samples of three different types of Iris flower: Versicolor, Virginica and Setosa. I am going to build a model that is capable of classifying a new flower into one of these categories. Let's code. Step 1 - Import the necessary libraries and load the data (I am going to make use of Scikit-Learn's Iris data set). Let's understand the data set: it consists of 150 records (Every Iris type has 50 r...
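The k = 3 vote described in this excerpt can be sketched in a few lines of plain Python. This is an illustration only, not the post's Scikit-Learn code: the helper `knn_classify`, the coordinates, and the second colour ("green") are assumptions chosen so that two of the three nearest points are orange, as in the example.

```python
import math
from collections import Counter


def knn_classify(query, samples, k=3):
    """Label `query` by majority vote among its k nearest labelled samples."""
    # Sort the samples by Euclidean distance to the query and keep the closest k.
    nearest = sorted(samples, key=lambda sample: math.dist(query, sample[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Toy labelled points (assumed): (coordinates, colour).
samples = [
    ((1.0, 1.0), "orange"),
    ((1.5, 1.5), "orange"),
    ((2.0, 1.0), "green"),
    ((6.0, 6.0), "green"),
    ((7.0, 6.5), "green"),
]

# The unknown point; its three nearest neighbours are two orange and one green.
predicted = knn_classify((1.4, 1.2), samples, k=3)
print(predicted)
```

Changing k changes the vote: with k = 5 every sample takes part, and the three green points outvote the two orange ones.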
