Machine Learning (Supervised Learning)

Photo by Reiseuhu on Unsplash

In my previous blog, “Shades of Machine Learning”, we discussed the two main types of machine learning algorithms. Just to brush up, we have Supervised Learning (where the target is known, i.e. the data is labeled and the model works under this supervision) and Unsupervised Learning (where the target is not known, i.e. the data is unlabeled, so the algorithm doesn’t have any supervision).
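To make the distinction concrete, here is a minimal sketch in plain Python. The toy numbers, the labels "small" and "large", and the helper names `predict_nearest` and `two_means` are all made up for illustration; real projects would normally use a library rather than hand-written loops.

```python
# Supervised: every example comes with a known label (the target).
labeled = [(1.0, "small"), (1.5, "small"), (8.0, "large"), (9.0, "large")]

def predict_nearest(x, examples):
    """Return the label of the labeled example closest to x (1-nearest neighbor)."""
    _, label = min(examples, key=lambda pair: abs(pair[0] - x))
    return label

# Unsupervised: no labels, so the algorithm can only group similar values.
unlabeled = [1.0, 1.5, 8.0, 9.0]

def two_means(values, iterations=10):
    """Split 1-D values into two groups with a tiny k-means (k=2)."""
    centers = [min(values), max(values)]
    groups = [[], []]
    for _ in range(iterations):
        groups = [[], []]
        for v in values:
            # Assign each value to the nearer of the two centers.
            idx = 0 if abs(v - centers[0]) <= abs(v - centers[1]) else 1
            groups[idx].append(v)
        # Move each center to the mean of its group (keep it if the group is empty).
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return groups

prediction = predict_nearest(2.0, labeled)  # uses the labels
clusters = two_means(unlabeled)             # never sees a label
```

The supervised function can name its answer because the labels were given to it; the unsupervised one can only say which values belong together.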

In this blog, we’re just going to talk about Classification. We will be addressing some basic but important questions related to classification, such as: what does classification really mean? What kind of data can…

Supervised vs Unsupervised Learning

Photo by Iñaki del Olmo on Unsplash

In the last few blogs, we discussed various methods of cleaning and transforming data at different scales before applying any ML algorithm to it (you can find the links at the bottom of this article). This preprocessing is required because dirty/messy data would not make any sense to the model or to us. So now that we have learned how to clean the data, let’s look at the main picture: the different ML techniques used to solve the different types of problems around us.

Let’s start with the most basic ML techniques!

Supervised vs Unsupervised Learning

Supervised Learning

As the name…

Performing feature engineering on the dataset

Photo by William Felker on Unsplash

The next step after exploring the patterns in data is feature engineering. Any operation performed on the features/columns that could help us make a prediction from the data can be termed Feature Engineering. At a high level, this includes the following:

  1. adding new features
  2. eliminating some of the features which tell the same story
  3. combining several features together
  4. breaking down a feature into multiple features
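The four operations above can be sketched on a couple of toy records. This is only an illustration in plain Python: the column names (`date`, `price`, `units`, `temp_c`, `temp_f`), the values, the 25-degree threshold, and the helper `engineer` are all assumptions, not part of any real dataset.

```python
# Two made-up sales records; every column name here is hypothetical.
rows = [
    {"date": "2021-07-04", "price": 3.0, "units": 10, "temp_c": 30.0, "temp_f": 86.0},
    {"date": "2021-12-25", "price": 3.5, "units": 2, "temp_c": 5.0, "temp_f": 41.0},
]

def engineer(row):
    """Apply the four feature engineering operations to one record."""
    out = dict(row)  # work on a copy so the raw data stays untouched
    # 1. Add a new feature: flag hot days (the 25 degree cutoff is arbitrary).
    out["is_hot"] = out["temp_c"] >= 25.0
    # 2. Eliminate a feature that tells the same story: temp_f repeats temp_c.
    del out["temp_f"]
    # 3. Combine several features: price and units together give revenue.
    out["revenue"] = out["price"] * out["units"]
    # 4. Break one feature into several: split the date into year and month.
    year, month, _day = out.pop("date").split("-")
    out["year"], out["month"] = int(year), int(month)
    return out

features = [engineer(r) for r in rows]
```

Which of these operations actually helps depends on the problem, so each new or removed column is worth checking against the model's results.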

Adding new features

Suppose you want to predict sales of ice cream, gloves, or umbrellas. What do these items have in common? Their sales all depend on “weather” and “location”. Ice-creams sell more during…

The next step after importing the data

Photo by William Iven on Unsplash

In my previous blog, I explained how to clean the data, how to perform EDA (exploratory data analysis), and, in brief, what basic feature engineering is.

So let’s say you did a “read_csv” and imported the data. Now, what next? The next important thing to talk about is how we can understand and analyze the various patterns in our dataset. This will help us address a few things:

a. It will help us understand how many missing or duplicate values are currently present in the data (which would be a problem for our model later on), as explained in the previous post.
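As a rough sketch of that first check, the snippet below counts missing and duplicate values right after loading. It uses Python's built-in `csv` module as a stand-in for “read_csv”, and the tiny inline table (columns `name`, `city`, `sales`) is made up for illustration; an empty cell is treated as a missing value, which is an assumption that may not fit every dataset.

```python
import csv
import io

# A made-up CSV standing in for the imported file.
raw = """name,city,sales
Ann,Austin,10
Bob,,7
Ann,Austin,10
Cy,Dallas,
"""

rows = list(csv.DictReader(io.StringIO(raw)))

# Count empty cells per column (an empty string stands in for a missing value).
missing = {col: sum(1 for r in rows if r[col] == "") for col in rows[0]}

# Count rows that repeat an earlier row exactly.
seen = set()
duplicates = 0
for r in rows:
    key = tuple(r.items())
    if key in seen:
        duplicates += 1
    seen.add(key)

print("missing per column:", missing)
print("duplicate rows:", duplicates)
```

Knowing these two counts early tells you how much cleaning is needed before any pattern in the data can be trusted.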

Preprocessing, EDA & Feature Engineering

Photo by Franki Chamaki on Unsplash

Believe it or not, this part covers about 60–70% of the entire ML work! I will not talk about code here, because I believe that if you know the logic, finding the code isn’t super difficult; only the way of getting there varies from person to person. Some places to find code are Stack Overflow, Kaggle, GitHub, or anything on Google that helps make that logic work!

Instead, let’s talk about the general logic and concepts that we as data scientists keep in mind when doing the initial analysis. This includes tidying the data, understanding the different patterns behind it, etc.

You can go…

A practical starter kit to data science for non-programmers

Image by Author

I come across a lot of people from different backgrounds (Electronics, Mechanical, Undergraduates (1st/2nd year), Sales, Finance, etc.) who want to explore the Data Science field. Some of the questions they have are ones I wondered about myself when I started:

  • Can I enter the data science world?
  • Can I do it without coding?
  • Where to get the data from?
  • Or more importantly, where is the problem that I can solve?

Well, the answer to the first question is yes. Anyone can enter the field. It’s all about consistent learning and applying logic.

And for…

Understanding how to make the business and the data science teams work together

Image by author

Data Science is used to provide a “data-driven solution”. But before we can solve a problem, we need a problem! And problems come from the pain points of the different business teams. (I will be using the term “Business Teams” for the various teams that are part of every organization: Marketing and Sales, Customer Services, R&D, Production & Distribution, IT, HR, Finance, etc.)

In my previous post, I tried to bring together the important stages of a Machine Learning Pipeline, helpful for anyone new to this field. …

10 steps to help you become a data scientist in the real world!

I often hear people without a data background having doubts about Machine Learning. And honestly, I had similar doubts a few years back! So, this is my attempt to help you all understand Machine Learning one step at a time.

I am sure you all know that Machine Learning is a part of Artificial Intelligence, which is a part of Computer Science. Some of the places where it is used in our day-to-day lives are: People You May Know (Facebook), personal assistants (Google Home, Alexa, etc.), email spam detection, “Customers who viewed this item also viewed” (Amazon), etc.

Kriti Srivastava

Howdy! Data Scientist from Texas | Dog Mom!
