Friday, 4 December 2015

Program For Linear Regression With Gradient Descent

I took a Python program that applies gradient descent to linear regression and converted it to Ruby. But first, a recap: we use linear regression to do numeric prediction.
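
As a rough illustration of the idea (not the program from the post), here is a minimal Python sketch that fits a straight line y = m*x + b by batch gradient descent on the mean squared error. The function name `fit_line`, the sample data, the learning rate and the iteration count are all assumptions chosen for the example.

```python
def fit_line(xs, ys, learning_rate=0.05, iterations=2000):
    """Fit y = m*x + b by batch gradient descent on mean squared error."""
    m, b = 0.0, 0.0
    n = len(xs)
    for _ in range(iterations):
        # Residuals of the current line on every data point.
        residuals = [y - (m * x + b) for x, y in zip(xs, ys)]
        # Partial derivatives of MSE = (1/n) * sum(residual^2).
        grad_m = (-2.0 / n) * sum(x * r for x, r in zip(xs, residuals))
        grad_b = (-2.0 / n) * sum(residuals)
        # Step against the gradient.
        m -= learning_rate * grad_m
        b -= learning_rate * grad_b
    return m, b


# Illustrative data generated from y = 2x + 1 (hypothetical values).
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [3.0, 5.0, 7.0, 9.0, 11.0]
m, b = fit_line(xs, ys)
print(round(m, 3), round(b, 3))
```

On this noise-free data the fitted slope and intercept should come out very close to 2 and 1. With real data the learning rate usually needs tuning: too large and the updates diverge, too small and convergence is slow.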

Friday, 27 November 2015

About Gradient Descent Algorithm

The gradient descent algorithm is a key tool in data science, used to find the minimum value of a function. You may have come across gradients in Microsoft PowerPoint or Adobe Photoshop when creating a slide or an image. But what we are talking about here is mathematical; to be specific, it draws on concepts from calculus.
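
To make "finding the minimum value of a function" concrete, here is a small Python sketch under stated assumptions: the function f(x) = (x - 3)^2, its derivative 2(x - 3), the starting point and the learning rate are all picked purely for illustration.

```python
def gradient_descent(gradient, start, learning_rate=0.1, iterations=100):
    """Repeatedly step against the gradient, starting from `start`."""
    x = start
    for _ in range(iterations):
        x -= learning_rate * gradient(x)
    return x


# f(x) = (x - 3)^2 has derivative 2 * (x - 3) and its minimum at x = 3.
minimum_x = gradient_descent(lambda x: 2 * (x - 3), start=0.0)
print(round(minimum_x, 6))
```

Each step moves x a little way downhill, so the distance to the minimum shrinks by a constant factor per iteration for this particular function.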

Sunday, 15 November 2015

Minor Update To gtdm-r

I have made a minor update to the gtdm-r repository on GitHub. gtdm-r has Ruby versions of the programs in "A Programmer's Guide To Data Mining".

Saturday, 7 November 2015


I am not able to keep up with my one-blog-post-per-week routine. I have a set of blog posts in various stages of writing, ranging from just a title to almost completed pieces. And I usually upload them on Fridays.

If I miss the Friday mood and deadline, I wait till the next Friday to post. But now the gap between posts is increasing. So I thought I would do a small emergency post.

Today is Saturday, and on Saturdays the road traffic is a bit lighter than on weekdays. The peak congestion hours in Hyderabad are 10 a.m. to 12 noon and 6:30 p.m. to around 9 p.m. During these peak hours, a trip that normally takes 30 to 40 minutes ends up taking 1 to 1.5 hours.

Friday, 9 October 2015

About Logistic Regression

Logistic Regression is one of those non-intuitive terms. For normal people (as opposed to data scientists), logistics is a familiar term. In general, people understand that it means movement of goods. For those with a software engineering background, regression means running previous test cases with new code to check none of them have failed.

Turns out logistic regression has nothing to do with these notions of logistics or regression.

Friday, 2 October 2015

3 Basic Analytics Algorithms : The long, the short and the applications

Data science, or analytics, is a combination of methodologies from statistics and machine learning. The three basic analytics algorithms that a beginner data scientist comes across are:
  • linear regression
  • k-nn
  • k-means

Friday, 18 September 2015

MultiSocial : Posting Tweets To Facebook

The three social networks -- Twitter, Facebook and LinkedIn -- are basically communication forums, each reaching a different set of people.

On Twitter you put out your information and ideas in a very succinct manner, and normally people don't hide their timelines. On Facebook you can choose to communicate with everyone who has internet access, with your friends, or with a subset of friends, as you can control the privacy settings on each individual update. LinkedIn is the means to communicate with all your professional contacts.

Friday, 11 September 2015

A Basic Data Analysis Example In Core Java

In this post, I discuss a basic data analysis example using core Java. Consider a bank that has a presence in 50 districts, with 10,000 accounts in each district. The balance in each of those 500,000 accounts is available in, say, a text file. The bank manager is interested in finding, "In every district, what are the three accounts with the largest balance?"
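
The post itself uses core Java; purely to illustrate the shape of the problem, here is a hedged Python sketch. The record layout `(district, account_id, balance)`, the function name and the sample values are assumptions, not taken from the post. It keeps a small heap of at most three entries per district, so the full list of accounts never has to be sorted.

```python
import heapq
from collections import defaultdict


def top_accounts_by_district(records, k=3):
    """Return the k largest (balance, account_id) pairs for each district."""
    heaps = defaultdict(list)
    for district, account_id, balance in records:
        heap = heaps[district]
        heapq.heappush(heap, (balance, account_id))
        if len(heap) > k:
            # Drop the smallest balance so the heap never exceeds k entries.
            heapq.heappop(heap)
    # Largest balance first within each district.
    return {d: sorted(h, reverse=True) for d, h in heaps.items()}


# Hypothetical sample records: (district, account_id, balance).
records = [
    ("D1", "A1", 500.0),
    ("D1", "A2", 1500.0),
    ("D1", "A3", 200.0),
    ("D1", "A4", 900.0),
    ("D2", "B1", 50.0),
    ("D2", "B2", 75.0),
]
top = top_accounts_by_district(records)
print(top["D1"])
```

In practice the records would be read line by line from the text file; because only three entries are held per district, memory use stays small even with 500,000 accounts.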

Friday, 4 September 2015

MyBankPortal : Starting with JAX-WS and Apache CXF

I was playing with Glen Mazza's DoubleIt example application, which he presented in his blog post "Creating a WSDL-first web service with Apache CXF or GlassFish Metro". I wanted to make just a couple of changes, re-run the example and be done with it. Then one thing led to another, and I converted it into a wholly different application, which I have named MyBankPortal.

Now I am thinking bigger. Using it as a base application, I would add more applications around it and build a sample integration scenario. Thus it will be a tutorial on more than one technology. MyBankPortal itself would gain more and more functionality to illustrate various technical features.

All this in due course of time.

Friday, 28 August 2015

Meteor And Rails

Rails, the darling of startups, could possibly give way to the new kid on the block: Meteor, which is built on JavaScript. In Berlin at least, MEAN, another set of JavaScript technologies, has overtaken Rails.[1]

Meteor has structural blocks and semantics that would resonate with Rails developers. In this blog post, I cover five such idioms common to both technologies: i) templates, ii) helpers, iii) partials, iv) fixtures, and v) routing. I also give relevant extracts from the books on Rails [2] and Meteor [3] on these aspects.

Friday, 21 August 2015

SOAP web services without JAXB

What if you wanted to develop a SOAP web service without using JAXB? Now, as a Java programmer, why in the world would you not use JAXB? After all, it makes it so easy to map classes to XML representations.

Friday, 14 August 2015

Parsing XML: XPath with JDOM2

About parsing XML, Horstmann & Cornell write: “To process an XML document, you need to parse it. A parser is a program that reads a file, confirms that the file has the correct format, breaks it up into the constituent elements, and lets a programmer access those elements. The Java library supplies two kinds of XML parsers:
  • Tree parsers, such as the Document Object Model (DOM) parser, that read an XML document into a tree structure.
  • Streaming parsers, such as the Simple API for XML (SAX) parser, that generate events as they read an XML document.
The DOM parser is easier to use for most purposes.”[1]

Friday, 7 August 2015

Generating xsds for XML interfaces

XSD is XML Schema Definition, one of those recursive acronyms that I like. In this post, I share my notes on generating an xsd file from a given XML file. But first a disclaimer: It’s been quite a while since I played with XML, so my methods may be a bit outdated; if you know of better tools and automation approaches, please let me know.

Friday, 3 July 2015

H2 Reading List

Half the year has gone by and I have not managed much reading so far. Most of the reading has been on the technical front, on subjects such as Data Science, Meteor, Ruby on Rails etc. On the fiction side, I managed just a few books, like A Bad Character by Deepti Kapoor and The Madras Mangler by Usha Narayanan.

Currently I am reading short stories by H.H. Munro and Chekhov. Reading, just like most other activities, is often more opportunistic than planned. As time passes, you suddenly realize that you have not read any of the stuff you wanted to read.

So I thought it fit to make a to-do reading list for the second half of this year. The list is given below. I may be reading other books based on necessity or recommendation, but these are the ones I am determined to complete just for the pleasure of reading.

Friday, 26 June 2015

A Meteor Introduction

I have been dabbling with Meteor during the past few weeks. Yet another webapp development framework!! As if we don’t have enough of those. Like the others, it promises rapid app development. But the bells and whistles it comes with are rather compelling.

The first time I got introduced to Meteor and looked at it, my reaction was : This is effing cool. I haven’t seen anything like this before.

Friday, 22 May 2015

Three Facebook Updates : Canon 5D Mark IV, Arijit Singh, Malupu

Some of my Facebook updates are like blog posts.

At what point do you say, hey this does not look like just a FB update, it ought to be on my blog? I don't know, I don't have any objective criteria.

Periodically, I review my FB updates, combine a few of them at random without applying any objective criteria, and put them here. This is one such post. It is a collection of three updates on the Canon 5D Mark IV, Arijit Singh and Malupu.

Friday, 8 May 2015

Hucking Cool Matrix Factorization

Matrix Factorization is hucking cool. Three times over. Firstly, because it is a frequently mentioned requirement in data science jobs. For example, take a look at this LinkedIn job description for Lead Machine Learning Scientist and this Apple job description for Data Scientist.

Friday, 1 May 2015

Mesmerizing Words

Once in a while I come across words, words that don't let my eyes go further, keeping them transfixed for a quantum of time, and in those moments I am lost in the addiction to the diction as if time itself has stopped. It is not just the lustre and the luminosity of those words that cast a spell; it is also the meaning and depth conveyed, sometimes with the truthfulness of a friend, sometimes with the mentorship of a teacher, and sometimes with the mysticism of a saint. I call them mesmerizing words, and I will be posting them on my blog whenever possible.

Friday, 24 April 2015


During my recent wanderings, I faced a mathematical problem which had to be solved by a Java program. The problem is:
You are given a number n. Find the smallest number N that is divisible by n, with the condition that the digits of N are only 0s and 1s.
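
One possible approach (not necessarily the one taken in the post, which targets Java) is a breadth-first search over remainders modulo n, sketched here in Python. It assumes n is a positive integer and N is written in decimal; the function name is made up for this example. Candidates are built digit by digit starting from "1", and each remainder is visited only once, so at most n states are explored.

```python
from collections import deque


def smallest_zero_one_multiple(n):
    """Smallest positive multiple of n whose decimal digits are only 0 and 1."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    start = 1 % n
    # parent[r] = (previous remainder, digit appended to reach r)
    parent = {start: (None, "1")}
    queue = deque([start])
    while queue:
        r = queue.popleft()
        if r == 0:
            # Walk back through the parents to rebuild the digits.
            digits = []
            while r is not None:
                r, digit = parent[r]
                digits.append(digit)
            return int("".join(reversed(digits)))
        # Try "0" before "1" so smaller numbers are reached first.
        for digit in ("0", "1"):
            nxt = (r * 10 + int(digit)) % n
            if nxt not in parent:
                parent[nxt] = (r, digit)
                queue.append(nxt)


print(smallest_zero_one_multiple(7))
```

Because the search goes level by level (shorter numbers first) and tries 0 before 1, the first number that reaches remainder 0 is the smallest valid N.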

Friday, 17 April 2015

NoSQL / MongoDB Basics

As a software developer or architect, you would have encountered and used NoSQL. Even if you haven't used it, it is good to have the basics of NoSQL and MongoDB at your fingertips. A couple of times, I faltered while discussing NoSQL / MongoDB. Hence this compilation of basic points, as a ready reference.

Friday, 27 March 2015

Ruby Version Of Programs In "A Programmer's Guide To Data Mining"

I have written the Ruby version of the programs in the online book "A Programmer's Guide To Data Mining". The book's website is This is my first small step into the world of analytics and now I feel I am a member of the exclusive Numerati club.

Though the book is on data mining, it is a good entry level resource for machine learning. It simplifies and de-mystifies data analytics for the programmer using Python examples. I translated the programs into Ruby for my own learning, as well as for further simplification and de-mystification with easy language idioms.

Thursday, 26 March 2015

Key Points From A Programmer's Guide To Data Mining

To me, the best programmers are empty cups, who constantly explore new technology (noSQL, node.js, whatever) with open minds. Mediocre programmers have surrounded their minds with cities of delusion -- C++ is good, Java is bad, PHP is the only way to do web programming. MySQL is the only database to consider. My hope is that you will find some of the ideas in this book valuable and I ask that you keep a beginner's mind when reading it. As Shunryu Suzuki says:

Wednesday, 25 March 2015

Learning From A Programmer's Guide To Data Mining

During the past weeks, I was reading up, and learning from the online book, A Programmer's Guide to Data Mining by Ron Zacharski. The website is

Sunday, 22 February 2015

Print Dendrogram In Ruby

After wading through a lot of Java code during the week, I decided to unwind over the weekend with Ruby code. And a favourite pastime is to convert Python code to Ruby code.

I took one program, Dendrogram drawing -- a Python recipe, from the site What the program does is to "Print dendrogram of a binary tree. Each tree node is represented by a length-2 tuple." When I first read it, I too did not fully understand what it does. In order to spare some of you the same agony, let's break down the quoted objective into key terms and get their meanings.

Thursday, 8 January 2015

More Ruby idioms

In this post, I discuss three Ruby idioms I wrote as equivalents for Python code snippets that I had come across.

Friday, 2 January 2015

Yet Another New Year

Welcome to the turn of the calendar and an increment to a number. Hope something exciting and not-done-in-the-past kind of initiative happens. In your lives and mine.

Unlike in last year's first post, I won't be specifying quantified targets for myself on the personal front. But one thing that I learnt during 2014 was that online presence is a key differentiator/value-add in the current times. Check Michael Peggs if you are not sure about it.