
4MViews. Bridging the gap between Data Science and Intuition. MLE@FB, Ex-WalmartLabs, Citi. Connect on Twitter @mlwhiz
Accessible Learning, Image by silviarita from Pixabay

An action plan from a Data Scientist

I am a Mechanical engineer by education. And I started my career with a core job in the steel industry.

I wore those heavy steel-reinforced gumboots and that plastic helmet while venturing around big blast furnaces and rolling mills. They were artificial safety measures, to say the least, as I knew that nothing would save me if something untoward happened. Maybe some running shoes would have helped. As for the helmet, I would just say that molten steel burns at 1370 degrees C.

As I realized based on my constant fear, that job was not for me, and so I made it my…

Photo by ThisisEngineering RAEng on Unsplash

Parallelism and concurrency aren’t the same thing. In some cases, concurrency is much more powerful. Here is a guide to help you make the most of concurrency with Asyncio.

Python is an easy language to pick up, but mastering it requires understanding a lot of concepts.

In my last post, I talked about using the multiprocessing module to do parallel processing. Python also offers users the power to work with concurrency using the Asyncio module from version 3.4 forward.

For people who come from a JavaScript background, this concept might not be new, but for people coming from Python 2.7 (yes, that’s me), Asyncio may prove to be hard to understand, as does the difference between concurrency and parallelism. …
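To make the idea concrete, here is a minimal sketch of concurrency with asyncio. It is not taken from the post itself: the `fetch` coroutine and its delays are hypothetical stand-ins for slow I/O, and `asyncio.run` assumes Python 3.7 or later.

```python
import asyncio
import time


async def fetch(name, delay):
    # asyncio.sleep stands in for a slow I/O call (hypothetical workload).
    await asyncio.sleep(delay)
    return f"{name} done"


async def main():
    # gather() runs the three coroutines on a single event loop, so their
    # waiting periods overlap instead of happening one after another.
    return await asyncio.gather(
        fetch("a", 0.2),
        fetch("b", 0.2),
        fetch("c", 0.2),
    )


start = time.perf_counter()
results = asyncio.run(main())
elapsed = time.perf_counter() - start
print(results, f"{elapsed:.2f}s")
```

All three tasks wait at the same time on one thread, so the total time should be close to the longest single delay rather than the sum of the delays. That is concurrency without parallelism.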

Photo by Marc-Olivier Jodoin on Unsplash

Parallel processing in Python using Multiprocessing and Joblib

Finally, my program is running! Should I go and get a coffee?

We data scientists have got powerful laptops. Laptops which have quad-core or octa-core processors and Turbo Boost technology. We routinely work with servers with even more cores and computing power. But do we really use the raw power we have at hand?

Instead of taking advantage of our resources, too often we sit around and wait for time-consuming processes to finish. Sometimes we wait for hours, even when urgent deliverables are approaching the deadline. Can we somehow do better?

In this post, I will explain how to use…

In one of my previous posts, I talked about how to become a data scientist using some awesome resources from Coursera. It is still one of the posts I ask my readers to read, as it defines the rigorous curriculum a data scientist should go through in order to succeed.

But I recently came to know from Coursera that they have made a few of their courses free for learners in India in these trying times, and I couldn’t stop myself from raving about them in this post.

So what’s stopping you now?

1. Code Yourself! An Introduction to Programming

This course is…

Image by Please Don’t sell My Artwork AS IS from Pixabay

Using BERT and Huggingface to create a Question Answer Model

In my last post on BERT, I talked in quite some detail about BERT transformers and how they work on a basic level. I went through the BERT architecture, training data and training tasks.

But, as I like to say, we don’t really understand something before we implement it ourselves. So, in this post, we will implement a Question Answering Neural Network using BERT and HuggingFace Library.

What is a Question Answering Task?

In this task, we give our BERT architecture a question and a paragraph in which the answer lies, and the objective is to determine the start and end span for the…
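As a rough illustration of what "determining the start and end span" can look like, here is a small sketch in plain Python. It assumes the model has already produced one start score and one end score per paragraph token. The scores below are made up for illustration and `best_span` is a hypothetical helper, not part of the HuggingFace library.

```python
def best_span(start_scores, end_scores, max_len=10):
    """Pick the (start, end) token pair with the highest combined score.

    Only spans with start <= end and at most max_len tokens are considered.
    """
    best, best_score = (0, 0), float("-inf")
    for s, s_score in enumerate(start_scores):
        for e in range(s, min(s + max_len, len(end_scores))):
            score = s_score + end_scores[e]
            if score > best_score:
                best, best_score = (s, e), score
    return best


# Hypothetical paragraph tokens and hand-written scores (not model output).
tokens = ["bert", "was", "introduced", "by", "google", "in", "2018"]
start_scores = [0.1, 0.0, 0.2, 0.1, 3.5, 0.3, 1.0]
end_scores = [0.0, 0.1, 0.1, 0.2, 3.0, 0.1, 1.2]

s, e = best_span(start_scores, end_scores)
answer = " ".join(tokens[s:e + 1])
print(answer)
```

In a real pipeline the two score lists would come from the model's output for a question such as "Who introduced BERT?", and the selected token indices would be mapped back to the original text.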

Binary Search: Image by Author

Algorithms Interviews

With this Simple Trick

Algorithms are an integral part of data science. While most of us data scientists don’t take a proper algorithms course while studying, they are important all the same. Many companies ask data structures and algorithms questions as part of their interview process for hiring data scientists.

Now the question that many people ask here is what is the use of asking a data scientist such questions. The way I like to describe it is that a data structure question may be thought of as a coding aptitude test.

We have all taken aptitude tests at various stages of our lives, and…

Image by Cedric Yong from Pixabay

BERT, introduced by Google in 2018, was one of the most influential papers for NLP. But it is still hard to understand.

In my last series of posts on Transformers, I talked about how a transformer works and how to implement one yourself for a translation task.

In this post, I will go a step further and try to explain BERT, one of the most popular NLP models, which uses a Transformer at its core and which achieved state-of-the-art performance on many NLP tasks including Classification, Question Answering, and NER Tagging when it was first introduced.

Specifically, unlike other posts on the same topic, I will try to go through the highly influential BERT paper, Pre-training of Deep…

Image by Joshua Woroniecki from Pixabay

Also called Magic Methods, Dunder Methods are necessary to understand Python

In my last post, I talked about Object-Oriented Programming (OOP). And I specifically talked about a single magic method, __init__, which is also called a constructor method in OOP terminology.

The magic part of __init__ is that it gets called automatically whenever an object is created. But it is by no means the only one. Python provides us with many other magic methods that we end up using without even knowing about them. Ever used len(), print() or the [] operator on a list? You have been using dunder methods.
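Here is a small sketch of how those built-ins map to dunder methods on your own class. The `Playlist` class is a hypothetical example, not one from the post.

```python
class Playlist:
    def __init__(self, songs):
        # Called automatically when Playlist(...) is created.
        self.songs = list(songs)

    def __len__(self):
        # Called by len(playlist).
        return len(self.songs)

    def __getitem__(self, index):
        # Called by playlist[index].
        return self.songs[index]

    def __str__(self):
        # Called by print(playlist) and str(playlist).
        return f"Playlist of {len(self.songs)} songs"


p = Playlist(["intro", "verse", "outro"])
print(len(p))
print(p[0])
print(p)
```

Each built-in call is simply forwarded by Python to the matching method on the object, which is why the same syntax works for lists, strings and user-defined classes alike.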

In this post, I will talk about five…

Image by Gerd Altmann from Pixabay

ROC curves are one of the most common evaluation metrics for checking a classification model’s performance. This guide will help you to truly understand how ROC curves and AUC work together.

ROC curves, or receiver operating characteristic curves, are one of the most common evaluation metrics for checking a classification model’s performance. Unfortunately, many data scientists just end up looking at the ROC curve and then quoting an AUC (short for the area under the ROC curve) value without really understanding what the AUC value means and how they can use it more effectively.

Other times, they don’t understand the various problems that ROC curves solve and the multiple properties of AUC like threshold invariance and scale invariance, which means that the AUC metric doesn’t depend on the chosen threshold…
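One way to see both properties is a minimal sketch that computes AUC through its ranking interpretation: the fraction of (positive, negative) pairs in which the positive example gets the higher score. This is an illustrative computation, not necessarily the one used in the post, and the labels and scores are made up.

```python
def auc(labels, scores):
    """AUC as the share of positive/negative pairs ranked correctly (ties count half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical labels and predicted scores.
labels = [0, 0, 1, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8, 0.7, 0.9]

# Squaring is an order-preserving transform on non-negative scores.
rescaled = [s ** 2 for s in scores]

print(auc(labels, scores))
print(auc(labels, rescaled))
```

No threshold appears anywhere in the computation, and the rescaled scores give the same value because only the ordering of the scores matters.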

Photo by Aaron Burden on Unsplash

TL;DR: Do let me know in the comments or at my mail ID if you want to contribute articles on ML or Data Science. I would add you as an author.

Full disclosure here: As part of the editing, I would try to provide you with comments and editing guidelines on your articles. I might also do some hyperlinking to other articles and add some links to my site and some courses. I might also add some blogs on my original site as well.

A note on minimal formatting requirements for a post, apart from good content:

Rahul Agarwal
