Detect Fraud In Credit Card Data

sehan farooqui
5 min read · Jul 29, 2019

Introduction:

My LinkedIn connections often ask me how to get started with data science. I think the better question is: how do I find out whether data science is really my passion, or just a market trend that makes me feel I should join the rat race?

“Your work is going to fill a large part of your life, and the only way to be truly satisfied is to do what you believe is great work. And the only way to do great work is to love what you do. If you haven’t found it yet, keep looking.” - Steve Jobs

I think the best way to find your passion is to dive in, and for that purpose I usually work on my personal portfolio in my spare time. In this blog I will share my recent work on fraud detection for a credit card dataset that I acquired from the UCI repository. Later I will publish a separate blog covering deployment of the model using Flask (a Python web framework) and a free Heroku account.

In this blog we will briefly discuss exploratory data analysis, followed by feature selection using mutual information, which is one of the filter methods. Lastly, we will train our model (Random Forest) and assess it with model evaluation metrics.
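To make that workflow concrete before going into the details, here is a minimal sketch of the feature selection, training, and evaluation steps using scikit-learn. It is only an illustration under stated assumptions: the data is a synthetic, imbalanced stand-in generated with `make_classification` rather than the actual credit card dataset, and the number of selected features (`k=8`), the forest size, and the choice of metrics are hypothetical values picked for the example.

```python
from functools import partial

from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectKBest, mutual_info_classif
from sklearn.metrics import classification_report, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline

# ASSUMPTION: synthetic, imbalanced stand-in for the credit card data
# (roughly 10% positive class); replace with the real features and labels.
X, y = make_classification(
    n_samples=2000,
    n_features=20,
    n_informative=5,
    n_redundant=2,
    weights=[0.9, 0.1],
    random_state=42,
)

# Stratify so the rare class keeps the same proportion in both splits.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=42
)

# Filter method: score each feature by its mutual information with the
# target and keep the k highest-scoring ones (k=8 is an arbitrary choice).
mi_score = partial(mutual_info_classif, random_state=42)

model = Pipeline(
    steps=[
        ("select", SelectKBest(score_func=mi_score, k=8)),
        (
            "forest",
            RandomForestClassifier(
                n_estimators=100, class_weight="balanced", random_state=42
            ),
        ),
    ]
)
model.fit(X_train, y_train)

# Evaluate with metrics that remain informative on imbalanced classes.
y_pred = model.predict(X_test)
y_score = model.predict_proba(X_test)[:, 1]
print(classification_report(y_test, y_pred, digits=3))
print("ROC AUC:", round(roc_auc_score(y_test, y_score), 3))
```

Wrapping the selector and the classifier in one `Pipeline` means the mutual information scores are computed on the training split only, so no information from the test split leaks into feature selection.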

Exploratory Data Analysis (EDA):

Exploratory data analysis can take up as much as about 70% of the overall effort in a data science project. The best way to get started with EDA is to start framing the questions…


Written by sehan farooqui

By profession I am a data scientist (I love to interpret hidden stories from data), a swimmer, and a software engineer, and I love to learn from reading books.