Bhaskar Karambelkar's Blog

  • My Thoughts on Northwestern University's MSPA

    This is my review of Northwestern University’s Masters in Predictive Analytics (MSPA) online degree. I enrolled in MSPA in the Summer of 2013 and finished in the Summer of 2016. Normally it shouldn’t take this long to finish this program, but I took a break after Q1/2015 and resumed in Q1/2016. This blog post is a retrospective analysis of the program, what I got out of it, and what it meant to me.

  • Alternative to using legends in ggplot2

    Recently I got hold of some regional spending forecast data. I quickly plotted it using ggplot2, and here’s the first version of it. [Figure 1: First Attempt] The data is from 2014 and the values from 2015 to 2019 are the forecasted values. For now don’t worry about the validity of this data or the lack of margin of error in the forecasted values. Let’s just concentrate on the problems with the visual elements of this chart.

  • Re-plotting Russian AirStrikes In Syria

    My cartography mentor Bob Rudis pointed me to a blog post visualizing Russian Air Strikes in Syria and commanded me to redo the static maps as something more interactive and easier to explore. TL;DR version: an interactive map at RPubs, created using Leaflet after scraping data using RSelenium + PhantomJS + dplyr. You can use the LayerSelector at the top right to toggle various base tiles. Clicking on any marker will show details about that air strike.

  • Shiny in a SmartOS zone

    My last post showed you how to install R inside a SmartOS zone. This post is about installing the Shiny server in the said zone. While setting up R was relatively straightforward, for setting up Shiny server I had to patch some C++ code to make it work on Solaris. Which means you don’t have to; just follow along. First install R in a zone as shown in my earlier post.

  • Setting up R on a SmartOS Zone.

    Recently I converted a spare beefy laptop (8 cores, 16 GB RAM, 750 GB HD) to a SmartOS hypervisor. I wanted to play with some bare metal hypervisor / container stuff and ESXi was just not cutting it. I’m not a Solaris nerd, but I know enough Unix to find my way around in Linux/*BSDs/Solaris/HP-UX, so it was not a big pain. In fact ZFS is really nice. Anyway, this post is about setting up R in a zone.

  • Redoing some Bad Data Viz.

    I saw the above graph in my Twitter feed. This beauty comes from Business Insider and was part of this article describing the misery in the world. There are so many wrong visualization elements here. So let’s see what they are and if we can fix them. Stacked bar charts are not useful when you have to compare a category that doesn’t align on an axis. In this case you can’t really compare the inflation values of each country because they don’t have a common baseline.

  • Introduction to NoSQL Databases

    Recently I was asked to make a small presentation on NoSQL databases to a graduate-level course on databases. Here are the slides for the same. The slides give a high-level introduction to NoSQL databases: what they are, what some of their characteristics are, how they differ from traditional relational databases, their pros and cons, and finally some examples of different types of NoSQL DBs.

  • Video of my talk on Elasticsearch at Elastic{ON} 2015

    Back in March 2015 I gave a talk at Elastic{ON} 2015 on how to scale Elasticsearch for production scale data. Here’s a blog post on it and here’s the video of it. I got a lot of positive feedback from the community on the talk and it was personally a wonderful experience to share our story with the ever growing Elasticsearch community. The opportunity to speak at a large user conference was beneficial for me too, as it allowed me to sharpen my public speaking skills.

  • Book Review : Data Driven Security

    Disclosure: I work with the two authors of this book. In fact one of them is my manager. But a) I don’t like to suck up to my colleagues and b) I’m sure they don’t like being sucked up to either. Despite this, if you think my review will be biased then stop reading now. Go watch some cat videos. Data Driven Security is a first-of-its-kind book that aims to achieve the impossible: to be a book that integrates all 3 dimensions of ‘Data Science’, a) Math and Statistical Knowledge, b) Coding/Hacking skills, and c) Domain Knowledge.

  • The 10 commandments for hiring Data Scientists

    As a Data Scientist (whatever it means), I get a lot of job offers over LinkedIn and other channels. Although I’m not actively looking for a job, I still go through them: first because I’m curious to find out what exactly organizations look for in a Data Scientist, and second to amuse myself. This post is about the latter part. It amuses me to no end what some people want in a Data Scientist, and I’ve made a consolidated list for all the recruiters and organizations who are looking to hire one (or more).

© 2015 Bhaskar V. Karambelkar. All rights reserved.