Posts

My First Infographic poster

R | Vectors vs List vs Array vs Matrix vs dataframe

For non-R programmers, some of the R data types might be a bit confusing.

In R, vectors, lists, and arrays are different and have different properties. In non-R terms, a list can be named or unnamed: unnamed lists are arrays and named lists are dictionaries. An array or dictionary can contain another array or dictionary, and it is still called an array or a dictionary.

Vector
- The simplest form of R object.
- All values in a vector must be of the same class.
- A vector cannot contain another vector, array, or list.
- (In non-R programmers' terms) it is like an array in which all values have the same data type.

List
- It is like a vector but can contain objects of any class.
- It can also have named attributes.
- It can contain a list within a list, and so on.
- (In non-R programmers' terms) it is like an array (if not named) or a dictionary (if named).

Array
- It is a vector with one, two, or more dimensions.
- Dimensions can be named.
- (non-R programmers te...
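For readers coming from Python, the analogy in the excerpt above can be sketched roughly as follows. Python is my choice of comparison language, not the post's, and the mapping is loose: the names `vector_like`, `unnamed_list`, `named_list`, and `at` are all made up for illustration.

```python
from array import array

# R vector ~ homogeneous array: every element has the same type.
vector_like = array("d", [1.5, 2.0, 3.25])

# A typed array refuses a value of another type. (R itself would
# coerce the whole vector to one class instead of raising an error.)
try:
    vector_like.append("text")
    mixed_allowed = True
except TypeError:
    mixed_allowed = False

# Unnamed R list ~ Python list: any types, nesting allowed.
unnamed_list = [1, "a", [2.5, "b"]]

# Named R list ~ Python dict: elements are reached by name.
named_list = {"id": 1, "tags": ["x", "y"], "meta": {"ok": True}}

# R array ~ flat values plus a dimension attribute. This sketch reads
# the values row by row; R fills arrays column by column.
values = [1, 2, 3, 4, 5, 6]
dims = (2, 3)  # 2 rows, 3 columns


def at(row, col):
    """Look up one cell of the flat values using the dims (0-based)."""
    return values[row * dims[1] + col]


print(mixed_allowed)
print(at(1, 0))
```

The point of the last part is only that an array is ordinary flat data with dimensions attached, which is why it behaves so differently from a nested list.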

R | Rcmdr ; Rattle on R3.1 and Mac OSX yosemite

Today I tried to install Rcmdr on my MBP (Yosemite) and found that the installation was not successful: it kept throwing a lot of errors, and some of the dependency packages were not getting installed. I also noticed that some of the dependencies are not available for R 3.1 on Yosemite. After spending a lot of time on googling and trial and error, I now have everything working.

Steps to do
1. Since Rcmdr uses the X Window System, we need an OS X version of it. So make sure you have installed XQuartz (X11) from this site - http://xquartz.macosforge.org/landing/
2. Install MacPorts - https://www.macports.org/install.php
3. Make sure you have already installed Xcode.
4. Since binaries for some of the dependencies are not available for the current R version, we might have to compile them from source. Download the RGtk2 source and the cairoDevice source from CRAN (download the "Package source") http://cran.r-project.org/web/packages/RGtk2/ http://cran.r-pro...

Python | Push To GeckoBoard

I'm building a Python framework that can push data to Geckoboard. The code handles Highcharts (different kinds) and the Geck-O-Meter, and it can be extended to other charts as well. This is just a template/idea. Here is a sample I made:

    #!env python -tt
    '''
    1) Gecko is used for pushing data to Gecko board
    ----------------------
    Modification History
    ----------------------
    -- 22-Jan-2015| vsubr | created
    '''
    import sys
    import datetime
    from collections import OrderedDict
    import pprint
    import json
    import requests
    import MySQLdb
    from contextlib import closing
    import mod.conf.cfg as cfg
    import mod.lib.logger as logger
    import mod.conf.settings as settings

    class Gecko(object):
        '''
        "Task" class takes Gecko id as the input and executes that Gecko-task.
        '''
        #-- use this site to check the colors http://html-color-codes.info/
        arr_gecko_colors =[ "#108ec5...
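Since the excerpt cuts off before any push logic, here is a minimal, separate sketch of what pushing a Geck-O-Meter value might look like. It is not the author's `Gecko` class. It assumes Geckoboard's legacy custom-widget push endpoint and payload shape (an `api_key` plus a `data` object with `item`, `min`, and `max`), and it uses the standard library instead of `requests` so that it is self-contained. The function names, the API key, and the widget key are placeholders.

```python
import json
import urllib.request

# Assumed legacy push endpoint; the widget key is appended to it.
PUSH_URL = "https://push.geckoboard.com/v1/send/"


def build_geckometer_payload(api_key, value, minimum, maximum):
    """Build the JSON body for a Geck-O-Meter widget (assumed shape)."""
    return {
        "api_key": api_key,
        "data": {
            "item": value,
            "min": {"value": minimum},
            "max": {"value": maximum},
        },
    }


def push_widget(widget_key, payload, timeout=10):
    """POST the payload to the widget's push URL and return the HTTP status."""
    request = urllib.request.Request(
        PUSH_URL + widget_key,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status


# Build and inspect a payload; nothing is sent over the network here.
payload = build_geckometer_payload("YOUR_API_KEY", 42, 0, 100)
print(json.dumps(payload, indent=2))
```

Keeping payload construction apart from the HTTP call makes it easy to add other widget types as extra builder functions, and to check the JSON before anything is pushed. To actually send it, call `push_widget("YOUR_WIDGET_KEY", payload)` with real keys.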

Kinesis | Troubleshooting

I'm currently working on AWS Kinesis and thought I would document all the issues I'm facing.

When running consumer.py, I got:

    SEVERE: Failed to start client executable
    java.io.IOException: Cannot run program "/bla/bla/bla/my_aws/mySample_kcl_app.py": error=2, No such file or directory

SOLUTION: This is what I found:
- You can use an absolute or a relative path; it doesn't matter.
- It is not necessary to set the PATH if you are using an absolute path.
- If you used the (consumer) sample.py as a template, then fix the shebang. The original sample.py had #!env python, and it is because of this that the KCL library threw the "No such file or directory" error: the interpreter named in a shebang is not looked up on the PATH, so a bare env cannot be found. Change it to the actual Python path, such as #!/usr/bin/env python. Then the "No such file or directory" error is gone.

DataStage | Runtime Column Propagation (RCP)

I actually love this feature in DataStage. I think it is one of DataStage's excellent features, yet it is not used effectively by everyone, so I thought I would document it. Things covered in this blog post:
- RCP features and highlights
- Manipulating just a few columns from the list of RCP'd columns
- Deleting a column from the RCP list when it is not required for the target
- Adding new columns at the bottom of the existing RCP list

RCP highlights
As you all know, when we enable the "Runtime column propagation" check box, there is no need to define any column definitions. The column name, type, nullability, and value get carried to the target stage.
- It saves a lot of time otherwise spent defining columns manually (using "load" table definition manually is not that great).
- When you make changes to your tables, you don't have to do anything in your DS job (you can also argue that this is a negative, but in most cases I think it is helpful)...

TABLEAU | My Résumé

Here is my first attempt at a visual CV. I thought, why not use Tableau to achieve this? I also challenged myself to use different data sources (i.e. Excel sheets) and join them to achieve what I want. My main objectives for this exercise are:
- Achieve a visual and interactive CV.
- Use Tableau's new "story board" feature and highlight areas of the charts based on the story selected.
- Gain a better understanding of joins with multiple data sources within Tableau.

Visual / Interactive CV in Tableau URL - https://public.tableausoftware.com/views/CV_11/CareerStory?:embed=y&:display_count=no