Book Review: The Manga Guide to Linear Algebra

As my data science class has been covering more machine learning models, I have found myself in need of a linear algebra refresher. Luckily, my local library has several books on the subject. I picked ‘The Manga Guide to Linear Algebra’ by Shin Takahashi and Iroha Inoue for the novelty of a math-based graphic novel. Since I often check out multiple books on a topic of interest, this seemed like a good starting point before moving on to something more serious. However, it turns out that this book covers the topic quite well…

A house’s zip code is one of the most significant factors in determining its price.

Project Overview

As part of the Flatiron School’s Data Science program, we were tasked with building a linear regression model using a dataset containing King County Housing Data from 2014–15. I decided to analyze this data from the perspective of a company that would like to build an app to help home buyers in their search for a house in the very competitive King County housing market.

The dataset contains just over 20,000 rows representing house sales in King County. The features include things like number of bedrooms, bathrooms, square feet, zip code, etc. Only 3 features contained…

Data scaling is a useful tool in machine learning. It puts the numeric features of a dataset on a common scale, so that features with larger values do not have an outsized influence on the machine learning model.
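To make the idea concrete, here is a minimal sketch of two common scaling formulas, min-max scaling and standardization, written in plain Python. The feature names and values are hypothetical and chosen only for illustration, and the helper functions `min_max_scale` and `standardize` are my own names, not part of any library. As far as I know, scikit-learn's `MinMaxScaler` and `StandardScaler` apply the same formulas column by column.

```python
from statistics import mean, pstdev

# Hypothetical toy features on very different scales (illustration only).
sqft = [1000.0, 1500.0, 2000.0, 3000.0]
bedrooms = [2.0, 3.0, 4.0, 5.0]


def min_max_scale(values):
    """Rescale values linearly onto [0, 1]. Assumes the values are not all equal."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


def standardize(values):
    """Shift to mean 0 and divide by the population standard deviation.

    Assumes the values are not all equal, so the standard deviation is nonzero.
    """
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]


sqft_minmax = min_max_scale(sqft)
bedrooms_minmax = min_max_scale(bedrooms)
sqft_std = standardize(sqft)
bedrooms_std = standardize(bedrooms)

# After scaling, both features occupy comparable ranges.
print(sqft_minmax)
print(bedrooms_minmax)
print(sqft_std)
print(bedrooms_std)
```

Before scaling, the square footage values are hundreds of times larger than the bedroom counts. After either transformation, the two features sit on comparable ranges, so neither dominates simply because of its units.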

There are many ways to scale data, each of which changes the data’s values and histogram in a different way. Here, we will look at several of the scaling methods available in scikit-learn. To examine how each method changes the data, I have created a dataset of three features with different distributions, as follows:

skewed = [2, 2, 4…

Welcome to my data science blog! I am writing this blog as a part of my data science coursework and job search. The first task is to explain “Why data science?” There are several aspects of data science that correspond with my skills and interests, including data, problem solving, and programming.

At its core, the answer to “Why data science?” is that I want to play with data. Data holds so much information just waiting to be discovered if we can look at it in the right way. In my past life as a cellular and molecular biologist, I had…

