Posts

Showing posts from 2015

Program For Linear Regression With Gradient Descent

I took a Python program that applies gradient descent to linear regression and converted it to Ruby. But first, a recap: we use linear regression to do numeric prediction.
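As a rough illustration of the idea (not the program from the post, which is not shown here), a minimal Python sketch of gradient descent fitting a line y = m*x + b might look like the following. The function name, learning rate, iteration count, and sample points are all assumptions chosen for the example.

```python
def gradient_descent(points, learning_rate=0.05, iterations=2000):
    """Fit y = m*x + b by batch gradient descent on the mean squared error."""
    m, b = 0.0, 0.0
    n = len(points)
    for _ in range(iterations):
        # Partial derivatives of the mean squared error with respect to m and b.
        grad_m = sum(-2.0 * x * (y - (m * x + b)) for x, y in points) / n
        grad_b = sum(-2.0 * (y - (m * x + b)) for x, y in points) / n
        # Step against the gradient.
        m -= learning_rate * grad_m
        b -= learning_rate * grad_b
    return m, b


# Hypothetical sample data lying exactly on y = 2x + 1.
points = [(1, 3), (2, 5), (3, 7), (4, 9)]
slope, intercept = gradient_descent(points)
print(round(slope, 4), round(intercept, 4))
```

The learning rate has to be small enough for the updates to converge on the data at hand; the value used here suits this tiny data set and is not a general recommendation.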

About Gradient Descent Algorithm

The gradient descent algorithm is a key tool in data science, used to find the minimum value of a function. You may have come across gradients in Microsoft PowerPoint or Adobe Photoshop when creating a slide or an image. But what we are talking about here is mathematical: to be specific, concepts from calculus.

Minor Update To gtdm-r

I have made a minor update to the gtdm-r repository on GitHub. gtdm-r has Ruby versions of the programs in "A Programmer's Guide To Data Mining".

HaaS

I am not able to keep up with my one-blog-post-per-week routine. I have a set of blog posts in various stages of writing, ranging from just a title to almost completed pieces. I usually upload them on Fridays. If I miss the Friday mood and deadline, I wait till the next Friday to post. But now the gap between posts is increasing. So I thought I would do a small emergency post. Today is Saturday, and on Saturdays the road traffic is a bit lighter than on weekdays. The peak congestion hours in Hyderabad are 10 a.m. to 12 noon and 6:30 p.m. to around 9 p.m. During these peak hours, a trip that normally takes 30 to 40 minutes ends up taking 1 to 1.5 hours.

About Logistic Regression

Logistic Regression is one of those non-intuitive terms. For normal people (as opposed to data scientists), logistics is a familiar term. In general, people understand that it means the movement of goods. For those with a software engineering background, regression means running previous test cases against new code to check that none of them fails. It turns out logistic regression has nothing to do with these notions of logistics or regression.

3 Basic Analytics Algorithms : The long, the short and the applications

Data science, or analytics, is a combination of methodologies from statistics and machine learning. The three basic analytics algorithms that a beginner data scientist comes across are linear regression, k-NN, and k-means.

MultiSocial : Posting Tweets To Facebook

The three social networks -- Twitter, Facebook and LinkedIn -- are basically communication forums, each reaching a different set of people. On Twitter you put out your information and ideas in a very succinct manner, and people normally don't hide their timelines. On Facebook you can choose to communicate with everyone who has internet access, with your friends, or with a subset of friends, as you can control the privacy settings on each individual update. LinkedIn is the means to communicate with all your professional contacts.

A Basic Data Analysis Example In Core Java

In this post, I discuss a basic data analysis example using core Java. Consider a bank that has a presence in 50 districts, with 10,000 accounts in each district. The balance of each of those 500,000 accounts is available in, say, a text file. The bank manager is interested in finding, "In every district, what are the three accounts with the largest balance?"
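The post itself works in core Java, and its code is not shown here. Purely as an illustration of the question, here is a short Python sketch that keeps a small heap of the largest balances per district. The record layout (district, account id, balance) and all names are assumptions made for the example.

```python
import heapq
from collections import defaultdict


def top_accounts_by_district(records, k=3):
    """Return the k largest (balance, account_id) pairs for each district."""
    heaps = defaultdict(list)  # district -> min-heap holding at most k entries
    for district, account_id, balance in records:
        heap = heaps[district]
        entry = (balance, account_id)
        if len(heap) < k:
            heapq.heappush(heap, entry)
        else:
            # Push the new entry, then drop the smallest of the k + 1 entries.
            heapq.heappushpop(heap, entry)
    # Largest balance first within each district.
    return {d: sorted(h, reverse=True) for d, h in heaps.items()}


# Hypothetical sample records: (district, account id, balance).
records = [
    ("D1", "A1", 500.0),
    ("D1", "A2", 1500.0),
    ("D1", "A3", 200.0),
    ("D1", "A4", 900.0),
    ("D2", "B1", 50.0),
    ("D2", "B2", 75.0),
]
top = top_accounts_by_district(records)
print(top["D1"])
```

Keeping only k entries per district means memory grows with the number of districts rather than with the number of accounts, which matters if the records are streamed from a large file.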

MyBankPortal : Starting with JAX-WS and Apache CXF

I was playing with Glen Mazza's DoubleIt example application, which he presented in his blog post "Creating a WSDL-first web service with Apache CXF or GlassFish Metro". I wanted to make just a couple of changes, re-run the example and be done with it. Then one thing led to another, and I converted it into a wholly different application, which I have named MyBankPortal. Now I am thinking bigger. Using it as a base application, I would add more applications around it and build a sample integration scenario. Thus it will be a tutorial on more than one technology. MyBankPortal itself would gain more and more functionality to illustrate various technical features. All this in due course of time.

Meteor And Rails

Rails, the darling of startups, could possibly give way to the new kid on the block, Meteor, which is built on JavaScript. In Berlin at least, MEAN, another set of JavaScript technologies, has overtaken Rails.[1] Meteor has structural blocks and semantics that would resonate with Rails developers. In this blog post, I cover five such idioms common to both technologies: i) templates, ii) helpers, iii) partials, iv) fixtures, and v) routing. I also give relevant extracts on these aspects from the books on Rails [2] and Meteor [3].

SOAP web services without JAXB

What if you wanted to develop a SOAP web service without using JAXB? Now, as a Java programmer, why in the world would you not use JAXB? After all, it makes it so easy to map classes to XML representations.

Parsing XML: XPath with JDOM2

About parsing XML, Horstmann & Cornell write: “To process an XML document, you need to parse it. A parser is a program that reads a file, confirms that the file has the correct format, breaks it up into the constituent elements, and lets a programmer access those elements. The Java library supplies two kinds of XML parsers: Tree parsers, such as the Document Object Model (DOM) parser, that read an XML document into a tree structure. Streaming parsers, such as the Simple API for XML (SAX) parser, that generate events as they read an XML document. The DOM parser is easier to use for most purposes.”[1]

Generating xsds for XML interfaces

XSD is XML Schema Definition, one of those recursive acronyms that I like. In this post, I share my notes on generating an xsd file from a given XML file. But first, a disclaimer: it’s been quite a while since I played with XML, so my methods may be a bit outdated; if you know of better tools and automation approaches, please let me know.

H2 Reading List

Half the year has gone by and I have not managed much reading so far. Most of the reading has been on the technical front, on subjects such as Data Science, Meteor and Ruby on Rails. On the fiction side, I managed just a few books, like A Bad Character by Deepti Kapoor and The Madras Mangler by Usha Narayanan. Currently I am reading short stories by H.H. Munro and Chekhov. Reading, just like most other activities, is often more opportunistic than planned. As time passes, you suddenly realize that you have not read any of the stuff you wanted to read. So I thought it fit to make a to-do reading list for the second half of this year. The list is given below. I may read other books based on necessity or recommendation, but these are the ones I am determined to complete just for the pleasure of reading.

A Meteor Introduction

I have been dabbling with Meteor during the past few weeks. Yet another webapp development framework!! As if we don’t have enough of those. This one, too, aims to help you develop your app rapidly. But the bells and whistles it comes with are rather compelling. The first time I was introduced to Meteor and looked at it, my reaction was: This is effing cool. I haven’t seen anything like this before.

Three Facebook Updates : Canon 5D Mark IV, Arijit Singh, Malupu

Some of my Facebook updates are like blog posts. At what point do you say, hey, this does not look like just a FB update, it ought to be on my blog? I don't know; I don't have any objective criteria. Periodically, I review my FB updates, combine a few of them at random without applying any objective criteria, and put them here. This is one such post. It is a collection of three updates, on the Canon 5D Mark IV, Arijit Singh and Malupu.

Hucking Cool Matrix Factorization

Matrix Factorization is hucking cool. Three times over. Firstly, because it is a frequently mentioned requirement in data science jobs. For example, you can take a look at this LinkedIn job description for Lead Machine Learning Scientist and this Apple job description for Data Scientist.

Mesmerizing Words

Once in a while I come across words, words that don't let my eyes go further, keeping them transfixed for a quantum of time, and in those moments I am lost in the addiction to the diction, as if time itself has stopped. It is not just the lustre and the luminosity of those words that cast a spell; it is also the meaning and depth conveyed, sometimes with the truthfulness of a friend, sometimes with the mentorship of a teacher, and sometimes with the mysticism of a saint. I call them mesmerizing words, and I will be posting them on my blog whenever possible.

n_multiple_min_N_zeros_ones

During my recent wanderings, I faced a mathematical problem which had to be solved by a Java program. The problem is: You are given a number n. Find the smallest number N that is divisible by n, with the condition that N should consist only of the digits 0 and 1.
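The post solves this with a Java program that is not shown here. One possible approach, sketched below in Python as an assumption rather than the post's method, is a breadth-first search over remainders modulo n: start from the digit 1, append 0 or 1 at each step, and stop at the first number whose remainder is 0. The function name is hypothetical.

```python
from collections import deque


def smallest_zero_one_multiple(n):
    """Smallest positive multiple of n whose decimal digits are only 0 and 1."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    start = 1 % n
    # parent[r] = (previous remainder, digit appended to reach r)
    parent = {start: (None, "1")}
    queue = deque([start])
    while queue:
        r = queue.popleft()
        if r == 0:
            # Walk back through the parents to rebuild the digits.
            digits = []
            node = r
            while node is not None:
                prev, digit = parent[node]
                digits.append(digit)
                node = prev
            return int("".join(reversed(digits)))
        # Try "0" before "1" so candidates are visited in increasing order.
        for digit in "01":
            nxt = (r * 10 + int(digit)) % n
            if nxt not in parent:
                parent[nxt] = (r, digit)
                queue.append(nxt)


print(smallest_zero_one_multiple(7))  # prints 1001
```

Because each remainder is visited at most once, the search examines at most n states, even though N itself can have many digits.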

NoSQL / MongoDB Basics

As a software developer or architect, you would have encountered and used NoSQL. Even if you haven't used it, it is good to have the basics of NoSQL and MongoDB at your fingertips. A couple of times, I faltered while discussing NoSQL / MongoDB. Hence this compilation of basic points, as a ready reference.

Ruby Version Of Programs In "A Programmer's Guide To Data Mining"

I have written Ruby versions of the programs in the online book "A Programmer's Guide To Data Mining". The book's website is http://guidetodatamining.com/. This is my first small step into the world of analytics, and now I feel I am a member of the exclusive Numerati club. Though the book is on data mining, it is a good entry-level resource for machine learning. It simplifies and de-mystifies data analytics for the programmer using Python examples. I translated the programs into Ruby for my own learning, as well as for further simplification and de-mystification with easy language idioms.

Key Points From A Programmer's Guide To Data Mining

Preface: To me, the best programmers are empty cups, who constantly explore new technology (noSQL, node.js, whatever) with open minds. Mediocre programmers have surrounded their minds with cities of delusion -- C++ is good, Java is bad, PHP is the only way to do web programming. MySQL is the only database to consider. My hope is that you will find some of the ideas in this book valuable and I ask that you keep a beginner's mind when reading it. As Shunryu Suzuki says:

Learning From A Programmer's Guide To Data Mining

During the past few weeks, I have been reading and learning from the online book A Programmer's Guide to Data Mining by Ron Zacharski. The website is http://guidetodatamining.com/

Print Dendrogram In Ruby

After wading through a lot of Java code during the week, I decided to unwind over the weekend with Ruby code. A favourite pastime is converting Python code to Ruby code. I took one program, Dendrogram drawing -- a Python recipe, from the site code.activestate.com. What the program does is "Print dendrogram of a binary tree. Each tree node is represented by a length-2 tuple." When I first read it, I too didn't fully understand what it does. In order to spare some of you the same agony, let's break down the quoted objective into key terms and get their meanings.

More Ruby idioms

In this post, I discuss three Ruby idioms I wrote as equivalents for Python code snippets that I had come across.

Yet Another New Year

Welcome to the turn of the calendar and an increment to a number. Hope something exciting and not-done-in-the-past kind of initiative happens. In your lives and mine. Unlike last year's first post, I wouldn't be specifying quantified targets for myself on the personal front. But, one thing that I learnt during 2014 was that online presence is a key differentiator/value-add in the current times. Check Michael Peggs if you are not sure about it.