Posts

Prestodb | Configuring for accessing Mysql, SqlServer, Redshift and Hive

I'm trying to configure PrestoDB (a distributed query engine) for testing and to understand how it works. Here are the steps I followed to deploy PrestoDB in my environment. PrestoDB Setup: Download the latest PrestoDB server (presto-server-<version>.tar.gz) from https://prestodb.io/docs/current/installation/deployment.html Download the Presto CLI (command line interface) (presto-cli-<version>-executable.jar) from https://prestodb.io/docs/current/installation/cli.html Below is my master-worker setup. MASTER (or coordinator): IP 10.x.x.123, works as both coordinator and worker. WORKER: IP 10.x.x.122, works as a worker. The Presto CLI runs from this machine. DBs configured: MySQL, MSSQL Server, Redshift, Hive. Deployment steps on MASTER / COORDINATOR: Create a folder called "prestodb". I like to keep all my PrestoDB-related directories under one folder, but you can keep your folder anywhere you want. Now create a folder like...

reactjs | react-magic-move changes for React V15.1

React Magic Move is a great component by Ryan Florence (https://github.com/ryanflorence/react-magic-move). However, when I tried to use it, it did not work in React version 15.x. Some of the functions used in that code were deprecated and were throwing errors. So I decided to fix that, and here is the revised working code: https://github.com/venkat-vs-id/magic_move_kat This is not a production module; it's just a fun experiment I used to understand and play with ReactJS. Steps: Have a DIV with id="example". Copy the components to your components code. my_react_script.js is the code that implements the MagicMove examples. Either copy this code into your JS file or use this file and make the necessary changes. Make sure you copy the contents of my_CSS.css to your CSS file, or use this file and make the required changes. Below are the screenshots of the examples: 1) First screen 2) Basic Example 3) Buddy List Example

react-bootstrap-table | header column alignment fix

I am working with react-bootstrap-table. It's a great module with a few teething problems. For example, whenever I have a lot of columns in the table, the header columns are misaligned with the columns in the data area. Example (before the fix): After applying the fix (note that now you have a scroll bar too): Reason for the problem: In react-bootstrap-table the header is one table and the data is another table. Since they are two different tables, the header column widths sometimes get out of step with the column widths of the data table. Fix logic applied: The idea is simple. First, I allow the table to be rendered as it is. Then I widen the data table columns to the width of the respective header column, but only for the columns that are narrower than their header column. Finally, I read the width of every data column and set the respective header column to the same width. And tadaaaa, the table is in perfect alignment now. ...
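The two-pass width logic described above can be modelled in a few lines. This is only a sketch in Python over plain lists of pixel widths, not the actual fix (which runs in the browser against the rendered tables). The function name `align_widths` and the sample widths are hypothetical.

```python
def align_widths(header_widths, data_widths):
    """Model of the two-pass alignment over lists of column widths (px)."""
    if len(header_widths) != len(data_widths):
        raise ValueError("header and data tables must have the same column count")

    # Pass 1: widen only the data columns that are narrower than their header.
    data = [max(h, d) for h, d in zip(header_widths, data_widths)]

    # Pass 2: copy every data column width back onto the matching header column.
    header = list(data)
    return header, data


header, data = align_widths([120, 80, 200], [100, 150, 200])
print(header, data)
```

In a real page the second pass would read the widths the browser actually rendered after the first pass, which may differ from the requested values; the sketch assumes they are applied exactly.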

Rundeck | py library

I have a Rundeck setup running on Linux with DB = MySQL. All my code/scripts are in Python, so I thought it would be easier to do maintenance on Rundeck if I had a Rundeck library in Python. This library gives me easy access to the Rundeck details I need for my maintenance. USE CASE: The first maintenance task I wanted to do on Rundeck was to delete old executions, as the DB was getting bigger and the logs on the machine were also occupying a lot of space. Since I have lots of recurring jobs, I thought it would be better to also have a maintenance job that deletes old executions after a specific no_of_days. Please note that I use a lot of my own library code (like logger, Cquery for mysql_query, etc.). You will have to replace those with your own library functions. If you don't have an equivalent library, you need to change the respective areas in the code to make this work. If you run into any difficulty, please leave a comment and I will try to help you. """ Th...

Django 1.8 | email as username

As you all know, by default Django keeps username and email separate. But if, like me, you want to use the email as the username and override Django's default behavior, there are a few ways to do it: Change the <username> label to 'email' using a custom form. The downside is that the username field is only 30 characters long. Register a custom user auth model with the existing django-registration process. The downside is that you can only do this before the first migration. Install one of the few modules available that extend username. The downside is that I was not sure whether those modules are still supported. Build your own auth model. This requires a lot of time. Make simple tweaks to the existing Django auth model (User) and use the existing django-registration (I'm using django-registration-redux) to take care of everything. There are pros and cons to every approach, but I don't want to spend too much time on the login module ...

AWS Code Commit - How To setup

We use the AWS platform for our data warehouse, so we wanted to use AWS CodeCommit for our source control. It's a private Git repository on the AWS platform and it's very easy to use. Install Git on your local working machine. For Linux: sudo yum install git-all For Windows: download Git from https://git-scm.com/downloads and install it. When the installation is done, type the command below to check that it was successful: git help Install the AWS CLI. If you are already working with the AWS platform, you have probably installed it already; if not, install it. To check whether it is installed, type the "help" command given below. For Windows: download and install from http://docs.aws.amazon.com/cli/latest/userguide/installing.html#install-msi-on-windows For Linux: http://docs.aws.amazon.com/cli/latest/userguide/installing.html#install-bundle-other-os Finally, type this command to make sure you have installed the AWS CLI: aws hel...

Py | Rundeck Delete Executions

We were running a Rundeck instance using a file DB (not MySQL), and after a month we found that the Rundeck web pages were so slow that it was unusable. This was because we had a lot of jobs that run repeatedly at short intervals, which built up a very large execution history and slowed down the whole Rundeck web front end. I tried the normal bulk delete operation through the API, but it was still very slow to get all the execution IDs and then delete them. So I used a workaround: get the execution IDs from the log filenames instead of from the Rundeck API. It was still slow, but it reduced the overall time by roughly 60-70%, since getting the IDs was the time-consuming task. Below is the code #!/usr/bin/python -tt ''' Rundeck delete execution --------------- change-history --------------- 1.0|21-oct-2015|vsubr|created ''' __version__ = '1.0' import sys import time from datetime import datetime import re from os import listdir from os.path import isfile, join...
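The workaround of reading execution IDs from log filenames, rather than asking the API for them, can be sketched as follows. This is not the script from the post (which is cut off above). It assumes that each execution log is a file named `<execution_id>.rdlog`, and it adds an age filter in days as an illustration. The function name `old_execution_ids`, the filename pattern and the parameters are all assumptions to adapt to your own installation.

```python
import re
import time
from os import listdir
from os.path import getmtime, isfile, join

# Assumed naming convention: "<execution_id>.rdlog"; adjust to your log layout.
LOG_NAME = re.compile(r"^(\d+)\.rdlog$")


def old_execution_ids(log_dir, max_age_days, now=None):
    """Return sorted execution IDs whose log file is older than max_age_days."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # seconds per day

    ids = []
    for name in listdir(log_dir):
        match = LOG_NAME.match(name)
        path = join(log_dir, name)
        # Keep only regular files that match the pattern and predate the cutoff.
        if match and isfile(path) and getmtime(path) < cutoff:
            ids.append(int(match.group(1)))
    return sorted(ids)
```

The returned IDs would then be handed, in batches, to whatever delete mechanism you use; that step is left out here because it depends on the Rundeck version and API token setup.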

Py | Generic XML 2 Relational Data Convertor - Advanced Options

Please have a look at the basic options first: http://venkat-echo.blogspot.com.au/2015/12/py-generic-xml-2-relational-data.html 1. If you want to force a specific XML level into a separate table. In the above example, if you want all topping details in a separate table, with each topping as a separate row in that table, use the following option: #If you want to force a specific XML Level into a separate table. xml2rd.arr_predefined_xmlPath4Tables = ['/items/item/topping'] Result of the above parameter: 2. If you would like to add/merge/insert different XML paths into the same table, use the config below. Please note that you can only merge XML paths at the same DEPTH. In the above example, if you want to merge <batters> and <batterscost>, use this config: # to put multiple XMLPath into the same table. But make sure all XMLPATH are at the same level xml2rd.d_common_table_4XmlPaths = { '/items/item/batters':{'table':'common_tab...

Py | Generic XML 2 Relational Data Convertor - Basic

Why I need this generic XML parser: It's often required to load XML data into tables so that business users can access the XML data and use those tables to write reports. The XMLs I get are normally from third parties, and most of the time I don't get the XSD for them. Plus, we get different XMLs from time to time, and if we hard-coded the paths to read a specific XML, I would have to write code for every XML I received. So I thought it better to make a GENERIC XML PARSER that can take any XML file, convert it into RELATIONAL DATA style, and write that into CSVs (which can be used to load the tables). Logic applied for XML 2 Relational Data conversion: Every level in the XML is a table, i.e. items/item is a table and items/item/batters is a table. It cannot handle namespaces. All key columns and the reference columns created by XML2RD have the prefix '_' (e.g. _xid, _xpath). The <Element> name becomes the column name. In case of...
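The "every level is a table" idea can be illustrated with a small sketch built on the standard library's `xml.etree.ElementTree`. This is not the XML2RD code from the post, only a simplified model of the same idea: an element's leaf children become columns, and each child that has children of its own becomes a row in a table named after its path. `_xid` and `_xpath` follow the prefixes mentioned above; the function name `xml_to_tables` and the `_parent_xid` column are hypothetical, and namespaces and attributes are ignored.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def xml_to_tables(xml_text):
    """Map each XML path to a list of row dicts (one table per level)."""
    root = ET.fromstring(xml_text)
    tables = defaultdict(list)
    counter = 0

    def walk(elem, path, parent_id):
        nonlocal counter
        counter += 1
        row_id = counter
        # Generated key/reference columns carry the '_' prefix.
        row = {"_xid": row_id, "_parent_xid": parent_id, "_xpath": path}
        for child in elem:
            if len(child) == 0:
                # Leaf element: its tag is the column name, its text the value.
                # A repeated leaf tag overwrites the earlier value in this sketch.
                row[child.tag] = (child.text or "").strip()
            else:
                # Nested level: becomes a row in its own table.
                walk(child, f"{path}/{child.tag}", row_id)
        tables[path].append(row)

    walk(root, "/" + root.tag, None)
    return dict(tables)


sample = (
    "<items><item><id>1</id><name>Cake</name>"
    "<batters><batter>Regular</batter></batters></item></items>"
)
for table_path, rows in xml_to_tables(sample).items():
    print(table_path, rows)
```

Each list of row dicts could then be written out with `csv.DictWriter` to produce one CSV per table.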

My First Infographic poster

Image

R | Vectors vs List vs Array vs Matrix vs dataframe

For non-R programmers, some of the R data types can be a bit confusing. R: In R, vectors, lists and arrays are different and have different properties. Non-R: A list can be named or unnamed. Unnamed lists are arrays. Named lists are dictionaries. However, an array or a dictionary can contain another array or dictionary, and it is still called an array or a dictionary. Vector: The simplest form of R object. All values in a vector should be of the same obj/class. However, they can't contain another vector, array or list. (In non-R programmers' terms) it's like an array, but with all values in the array having the same data type. List: It is like a vector but can contain any obj/class. A class can also have named attributes. It can contain a list within a list, etc. (In non-R programmers' terms) it's like an array (if not named) or a dictionary (if named). Array: It's a list with one, two or more dimensions. Dimensions can be named. (non-R programmers te...
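For readers coming from Python, the analogies above can be made concrete. This is only a loose mapping to illustrate the comparison, not a statement that the types behave identically. In particular, R coerces mixed values in a vector to a common type, whereas the typed Python array used here raises an error.

```python
from array import array

# R vector: every value shares one type -> closest analogue is a typed array.
vector_like = array("d", [1.5, 2.0, 3.25])

# Unnamed R list: ordered, mixed types, may nest -> Python list.
unnamed_list_like = [1, "two", [3.0, 4.0]]

# Named R list: values addressed by name, may nest -> Python dict.
named_list_like = {"id": 1, "tags": ["a", "b"], "meta": {"ok": True}}

# A typed array rejects a value of another type.
try:
    vector_like.append("text")
except TypeError:
    rejected = True
else:
    rejected = False

print(rejected, len(vector_like), named_list_like["meta"]["ok"])
```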

R | Rcmdr ; Rattle on R3.1 and Mac OSX yosemite

Today I tried to install Rcmdr on my MBP (Yosemite) and found that the installation was not successful: it kept throwing a lot of errors, and some of the dependency packages were not getting installed. I also noticed that some of the dependencies are not available for R 3.1 and Yosemite. After a lot of time spent googling and on trial and error, I now have everything working. Steps to do: Since Rcmdr uses the X Window System, we need an OS X version of the same. So make sure you have installed XQuartz (X11) from this site: http://xquartz.macosforge.org/landing/ Install MacPorts: https://www.macports.org/install.php Make sure you have already installed Xcode. Since binaries for some of the dependencies are not available for the current R version, we might have to compile them from source. Download the RGtk2 source and the cairoDevice source from CRAN (download the "packaged source"): http://cran.r-project.org/web/packages/RGtk2/ http://cran.r-pro...

Python | Push To GeckoBoard

I'm building a Python framework that can push data to Geckoboard. Here is a sample of the code; it handles "highcharts" (different kinds) and the gecko-o-meter. It can be extended to other charts as well. This is just a template/idea. Here is a sample I made: #!env python -tt ''' 1) Gecko is used for pushing data to Gecko board ---------------------- Modification History ---------------------- -- 22-Jan-2015| vsubr | created ''' import sys import datetime from collections import OrderedDict import pprint import json import requests import MySQLdb from contextlib import closing import mod.conf.cfg as cfg import mod.lib.logger as logger import mod.conf.settings as settings class Gecko(object): ''' "Task" class takes Gecko id as the input and executes that Gecko-task. ''' #-- use this site to check the colors http://html-color-codes.info/ arr_gecko_colors =[ "#108ec5...