Posts

Getting Started With Association Rule Algorithms in Machine Learning (Apriori) - with only 8 lines of code

Machine learning is all about finding patterns in past data and using them to make future decisions. There are many types of machine learning algorithms, such as classification, regression, and clustering, and we choose one based on our requirements and resources. Association rules are another type of machine learning algorithm, one that lets us find relationships between two or more elements. With association rules we try to find data items that are associated with other data items. To explain this, let's take a supermarket as an example. Arranging the products in a supermarket takes a lot of decision making, because the store needs to decide how to get customers to buy more products. If you keep a complementary product close to another product, such as tea next to sugar, there is a high chance that the customer will buy both. The association between tea and sugar is quite obvious, but did you know that baby Pampers and beer have ...
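To make the tea and sugar idea concrete, here is a minimal sketch of the two measures that association rule mining commonly uses to score a rule: support and confidence. These terms are not introduced in the excerpt above, and the shopping baskets and the function names `support` and `confidence` are invented purely for illustration. This is not the Apriori algorithm itself, only the scoring it builds on.

```python
# Hypothetical shopping baskets, invented purely for illustration.
transactions = [
    {"tea", "sugar", "milk"},
    {"tea", "sugar"},
    {"bread", "milk"},
    {"tea", "bread"},
    {"sugar", "milk"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    hits = sum(1 for basket in transactions if itemset <= basket)
    return hits / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Share of baskets with the antecedent that also hold the consequent."""
    both = set(antecedent) | set(consequent)
    return support(both, transactions) / support(antecedent, transactions)


# How often tea and sugar appear together, and how often tea buyers add sugar.
tea_sugar_support = support({"tea", "sugar"}, transactions)
tea_to_sugar_confidence = confidence({"tea"}, {"sugar"}, transactions)
print(tea_sugar_support, tea_to_sugar_confidence)
```

In this toy data, tea and sugar appear together in 2 of the 5 baskets, and 2 of the 3 baskets that contain tea also contain sugar.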

Unsupervised Machine Learning (KMeans Clustering) with Scikit-Learn

Machine learning can be divided into two main categories: supervised machine learning and unsupervised machine learning. In supervised machine learning, we first provide the data with its corresponding labels to train the model; with the trained model we can then predict the label for new data. In unsupervised machine learning, we give the model the data without any labels, and the model finds patterns in the data. In unsupervised machine learning we can't find the class of the data; instead, we can group data points that are similar, a process known as clustering. There are a number of clustering algorithms; in this article I will talk about KMeans Clustering. How does KMeans Clustering work? Let's understand it step by step, with the image below. Step (a) - Unsupervised Initial Data. Step (b) - Choose random initial centroids (centroids are the centers of the clusters). In this example we need to separate the data set into ...
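The steps above can be sketched in plain Python. The excerpt only covers choosing random initial centroids; the assign and update loop below is the standard K-means procedure, not text taken from the post, and the function name `kmeans`, its parameters, and the six sample points are assumptions made for illustration. It is a rough sketch, not Scikit-Learn's implementation.

```python
import random


def kmeans(points, k, iterations=10, seed=0):
    """Toy K-means for 2D points: returns (centroids, clusters)."""
    rng = random.Random(seed)
    # Choose k random data points as the initial centroids.
    centroids = rng.sample(points, k)
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        # Assign every point to its nearest centroid (squared distance).
        clusters = [[] for _ in range(k)]
        for x, y in points:
            distances = [(x - cx) ** 2 + (y - cy) ** 2 for cx, cy in centroids]
            clusters[distances.index(min(distances))].append((x, y))
        # Move each centroid to the mean of the points assigned to it.
        for i, cluster in enumerate(clusters):
            if cluster:
                mean_x = sum(x for x, _ in cluster) / len(cluster)
                mean_y = sum(y for _, y in cluster) / len(cluster)
                centroids[i] = (mean_x, mean_y)
    return centroids, clusters


# Hypothetical data: two well-separated groups of points.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.5), (8.0, 8.0), (9.0, 9.0), (8.5, 9.5)]
centroids, clusters = kmeans(points, k=2)
print(centroids)
print(clusters)
```

With data this cleanly separated, the loop settles on the two visible groups within a few iterations; on real data the result can depend on the initial centroids.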

Web Scraping with Python and Beautiful Soup

For any machine learning task, the first thing we need to do is collect data. There are a number of different ways to collect data. Let's see some of them. 1. APIs: The preferred way to collect data is to consume APIs, because APIs are well structured and consuming one is very simple. 2. Public data sets: This is the second-best option; there are people who have collected data and made it available to the public. Sites like Kaggle and the University of California, Irvine (UCI) Machine Learning Repository are some places where you can find public data sets. If you cannot find data through the above two methods, the last option is web scraping: choose a website that provides the data you need, scrape the content, and prepare your own data set. In this post I'm going to show you an example of scraping the blog posts on the cricbuzz website. Let's understand the scraping procedure. It's very simple, we have Python libraries that wil...

CSS Specificity? What is it???

I am damn sure most of you find it difficult to pronounce the word specificity; if you don't, then congratulations, you are someone special. Others, don't be disappointed: the concept of specificity is not that hard to understand. So what is specificity? In CSS, we can style a DOM element with different CSS declarations; for example, in the image below all four CSS declarations can affect the h1 element. OK, now we know that we can style a DOM element with different CSS declarations, which raises another question: among these declarations, which one's styles will be applied to the element? (This matters when several declarations set the same style property; in this case every declaration sets the color property.) To answer this question, I would say the declaration with the highest specificity gets priority, and its styles are applied to the DOM element. Now let's come back to the first question, what is specificity?...

Classification using K-Nearest Neighbor Classifier with Scikit Learn

K-NN is a very simple machine learning algorithm that classifies a point based on the points nearest to it. Let's take an example; see the image below. We have set k = 3, which means that we classify a point based on its three nearest points. In this case two of the three points are orange, so the unknown point (the blue point) is classified as an orange point. OK, let's do a real classification task: in this example I am going to classify the popular Iris flower data set. This data set contains samples of three different types of Iris flower: Versicolor, Virginica, and Setosa. I am going to build a model that is capable of classifying a new flower into one of these categories. Let's code. Step 1 - Import the necessary libraries and load the data (I am going to make use of Scikit Learn's Iris data set). Let's understand the data set: it consists of 150 records (Every Iris type has 50 r...
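The k = 3 vote described above can be sketched in a few lines of plain Python. The function name `knn_classify` and the five labeled points are hypothetical, made up to mirror the orange-versus-other-color picture; this is not the Scikit Learn classifier the post goes on to use, and it assumes Python 3.8 or newer for `math.dist`.

```python
import math
from collections import Counter


def knn_classify(query, labeled_points, k=3):
    """Return the majority label among the k points nearest to query."""
    # Sort the training points by Euclidean distance to the query point.
    nearest = sorted(labeled_points, key=lambda item: math.dist(query, item[0]))[:k]
    # Count the labels of the k nearest points and take the most common one.
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical labeled points, invented for illustration.
training = [
    ((1.0, 1.0), "orange"),
    ((1.5, 1.2), "orange"),
    ((2.0, 2.2), "green"),
    ((5.0, 5.0), "green"),
    ((6.0, 5.5), "green"),
]

predicted = knn_classify((1.4, 1.5), training, k=3)
print(predicted)
```

Here the three nearest neighbors of the query point are two orange points and one green point, so the vote goes to orange, just like the blue point in the example.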