Thoughts, Journal, Tech

A software developer, freelancer and hobbyist street photographer, originally from India. I have been in the software industry since 2008 and have worked in the startup, online media, hospitality and healthcare sectors.


Read this first

Data mining for associated entities

Data mining–the practice of examining large pre-existing databases in order to generate new information.
– by google

In a not-so-recent project for an e-commerce platform, I developed a workflow in which raw data is collected and mined to find relationships within it. The resulting information is stored in a database to be used later.

I used the Apriori algorithm to derive relationships between single entities or combinations of entities. Once the data is associated, it can be used to generate recommendations for end users based on their activity. For instance, if a user has added bread to the shopping basket, the system can recommend milk, because milk and bread appeared together quite often in historical transactions and are therefore treated as associated entities.
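To make the bread and milk example concrete, here is a minimal sketch of the idea, limited to pairs of items. It is not the workflow from the project: the function names, the sample transactions and the 0.5 support threshold are all made up for illustration. It only shows the Apriori-style pruning step (count single items first, then count only pairs built from frequent items) and how the confidence of a rule such as bread -> milk could be computed.

```python
from collections import Counter
from itertools import combinations


def frequent_pairs(transactions, min_support):
    """Return {pair: support} for item pairs that meet min_support."""
    total = len(transactions)

    # Pass 1: count single items and keep only the frequent ones.
    item_counts = Counter(item for t in transactions for item in set(t))
    frequent_items = {
        item for item, count in item_counts.items()
        if count / total >= min_support
    }

    # Pass 2: count only pairs made of frequent items. A pair cannot be
    # frequent if one of its items is not (the Apriori pruning idea).
    pair_counts = Counter()
    for t in transactions:
        kept = sorted(set(t) & frequent_items)
        pair_counts.update(combinations(kept, 2))

    return {
        pair: count / total
        for pair, count in pair_counts.items()
        if count / total >= min_support
    }


def confidence(transactions, antecedent, consequent):
    """Share of transactions containing antecedent that also contain consequent."""
    with_antecedent = [t for t in transactions if antecedent in t]
    if not with_antecedent:
        return 0.0
    hits = sum(1 for t in with_antecedent if consequent in t)
    return hits / len(with_antecedent)


# Hypothetical historical transactions.
transactions = [
    ["bread", "milk"],
    ["bread", "milk", "eggs"],
    ["bread", "butter"],
    ["milk", "eggs"],
]

pairs = frequent_pairs(transactions, min_support=0.5)
bread_to_milk = confidence(transactions, "bread", "milk")
print(pairs)
print(bread_to_milk)
```

A recommender could then suggest the consequent of any rule whose confidence is above a chosen threshold when the antecedent is in the basket.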

The Apriori algorithm is used to generate rules for the data items that appear in transactions called...

Continue reading →

Code Quality

Often when developing software, the initial goal is to solve the domain-specific problem as soon as possible, and this is how it should be. However, rushing to get a feature done may compromise code readability, modifiability and testability, and ultimately deliver poorly coded, hard-to-maintain software that may not perform as expected when it goes live.

A well-designed software architecture should address all of these issues in advance and deliver a better experience for stakeholders as well as maintainers.

In Python, there are a few tools that come in handy for measuring code complexity, finding logical issues and code smells, and ultimately improving code quality.

pylint is the first on the list. It can help find syntax errors, wrong indentation, invalid variable names and so on. It also gives a rating at the end of the check. Below is a sample output of pylint.

[bash] >> pylint

Continue reading →

Multi-processing vs Multi-threading

In Python, multiprocessing and multithreading are common terms when it comes to improving application performance through concurrent and parallel programming.

Let’s look at each term individually.

Multithreading: the ability to run code in several threads inside a single Python process. This is a common approach to spreading the workload across available CPUs, but there is a big problem with it. If you have spent time in the Python world, you have probably come across the GIL (Global Interpreter Lock), a lock that a thread must acquire before it can execute Python code. So no matter how many CPUs there are, only one thread executes Python code at a time and the threads take turns. This means that if you try to scale a CPU-bound application by adding more threads, you will always be limited by the GIL.
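A small experiment can make this visible. The sketch below is only an illustration: the function names and the job sizes are made up, and the timings depend on the machine and the interpreter build, so no expected numbers are given. It runs the same CPU-bound loop serially and in a thread pool, and records which process handled each job. On a standard CPython build the threaded run is typically not noticeably faster, and every job reports the same process id.

```python
import os
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def count_down(n):
    """CPU-bound busy loop; returns the pid and thread name that ran it."""
    while n > 0:
        n -= 1
    return os.getpid(), threading.current_thread().name


def run_serial(jobs):
    """Run every job one after another in the calling thread."""
    return [count_down(n) for n in jobs]


def run_threaded(jobs, workers=4):
    """Run the jobs in a pool of threads inside the same process."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(count_down, jobs))


def timed(fn, *args):
    """Return (result, elapsed seconds) for a call."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start


jobs = [200_000] * 4  # hypothetical workload: four identical CPU-bound jobs

serial_result, serial_secs = timed(run_serial, jobs)
threaded_result, threaded_secs = timed(run_threaded, jobs)

print("serial:   %.3fs" % serial_secs)
print("threaded: %.3fs" % threaded_secs)
print("process ids seen:", {pid for pid, _ in threaded_result})
```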

Each time the processor runs a different thread, it switches context. Context switching happens so fast that it looks like...

Continue reading →

Javascript promises

Many operations in JavaScript are asynchronous, meaning they are handled independently of the main program flow and run in the background. If you have coded an AJAX call, you have probably used promises as well.

Promise: an object that acts as a proxy for a result that is initially unknown because its value is still being computed. The same idea is also called a future, and supplying its value is known as resolving, fulfilling or binding. Promises are one of the best ways to handle asynchronous and otherwise convoluted program flow.
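The proxy idea is not specific to JavaScript. As a rough, language-neutral illustration, here is the same concept using the Future class from Python's standard library. This is not how AngularJS promises are implemented, and creating a Future by hand is normally something only executors and tests do; it is done here just to show the object being created first and fulfilled later. The callback and the value 42 are made up for the example.

```python
from concurrent.futures import Future

received = []  # values delivered to the callback

# A placeholder for a result that is not known yet.
future = Future()

# Register what should happen once the value arrives,
# similar in spirit to attaching a "then" handler to a promise.
future.add_done_callback(lambda f: received.append(f.result()))

done_before = future.done()  # nothing has been computed yet

# "Resolving" / "fulfilling": supply the value; callbacks run now.
future.set_result(42)

done_after = future.done()
print(done_before, done_after, received)
```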

Many recently developed JavaScript libraries use promises under the hood. One of them is AngularJS, which is why I am using AngularJS in the code examples.

    var project = angular.module('project', []);
    project.controller('app', function($scope, $http){
        var url = ''
        var params = {

Continue reading →

Workflow with airflow

Airflow is an open source project started at Airbnb. It is a tool for orchestrating the desired flow of your application dynamically, and it scales out readily because of its modular architecture and message queuing mechanism.

It can also be understood as an advanced cron application that executes tasks once their dependencies are fulfilled. It can even retry a failed task up to a configured number of times.
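The two ideas in that description, running tasks only after their dependencies and retrying failures, can be sketched in a few lines of plain Python. To be clear, this is not Airflow's API and says nothing about how Airflow schedules work internally. The run_pipeline function, the task names and the retry count are all hypothetical, and the sketch needs Python 3.9 or newer for graphlib.

```python
from graphlib import TopologicalSorter


def run_pipeline(tasks, dependencies, max_retries=2):
    """Run callables in dependency order, retrying each one on failure.

    tasks:        {name: callable}
    dependencies: {name: set of task names that must finish first}
    Returns {name: number of attempts it took}.
    """
    # Make sure every task appears in the graph, even without dependencies.
    graph = {name: dependencies.get(name, set()) for name in tasks}
    attempts = {}
    for name in TopologicalSorter(graph).static_order():
        for attempt in range(1, max_retries + 2):
            attempts[name] = attempt
            try:
                tasks[name]()
                break
            except Exception:
                # Give up only after the last allowed retry.
                if attempt > max_retries:
                    raise
    return attempts


log = []
calls = {"fetch": 0}


def fetch():
    """Fails on the first call to simulate a temporary error."""
    calls["fetch"] += 1
    if calls["fetch"] == 1:
        raise RuntimeError("temporary failure")
    log.append("fetch")


def report():
    log.append("report")


def send():
    log.append("send")


attempts = run_pipeline(
    {"fetch": fetch, "report": report, "send": send},
    {"report": {"fetch"}, "send": {"report"}},
)
print(log)
print(attempts)
```

A real scheduler adds a lot on top of this, such as running independent tasks in parallel, waiting between retries and persisting state.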

This is what an Airflow task pipeline looks like:


In the above example, each block represents a task, and some tasks are connected to other tasks, reflecting their dependencies and relationships.

Let’s say you need to develop an application that helps your customers find common products available online on selected e-commerce platforms, generates a report and sends it to them. For this purpose you can design a workflow where one task is...

Continue reading →

Python class & static method

In this post I will try to explain Python’s class and static methods and how to use them.


In order to understand class and static methods, especially class methods, one should know the difference between class and instance variables.

  • Class variables: variables defined in the class but outside of its methods. They can be accessed directly using the class name, e.g.
    class AClass:
        a_class_variable = ''
  • Instance variables: variables set on self inside the class’s methods. They can be accessed using an instance of the class, e.g.
    class AClass:
        def a_method(self):
            self.a_instance_variable = ''
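The difference becomes clearer when both kinds of variable are used together. The following is a small sketch with a made-up Visitor class: the class variable is shared by every instance, while the instance variable belongs to one object. It also shows a common surprise, namely that assigning through an instance creates a new instance attribute instead of changing the class variable.

```python
class Visitor:
    count = 0  # class variable: a single value stored on the class

    def __init__(self, name):
        self.name = name    # instance variable: one value per object
        Visitor.count += 1  # update the shared value through the class


alice = Visitor("alice")
bob = Visitor("bob")

# The class and both instances read the same class variable.
shared = (Visitor.count, alice.count, bob.count)

# Assigning via the instance shadows the class variable for alice only.
alice.count = 10

print(shared)
print(Visitor.count, alice.count, bob.count)
```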

You may have seen class and static methods while reading or writing a class in Python. Though they look very similar, they serve different purposes, which can be quite puzzling for beginners.
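As a quick, hedged illustration of the difference, consider the sketch below. The Circle and Wheel classes are invented for this purpose. A class method receives the class itself as its first argument, so it works well as an alternative constructor and respects subclasses. A static method receives neither the class nor the instance and is just a function that lives in the class namespace.

```python
class Circle:
    unit = "cm"  # class variable, can be overridden by subclasses

    def __init__(self, radius):
        self.radius = radius

    @classmethod
    def from_diameter(cls, diameter):
        """Alternative constructor; cls is whichever class it was called on."""
        return cls(diameter / 2)

    @staticmethod
    def is_valid_radius(value):
        """Plain helper; needs neither the class nor an instance."""
        return value > 0

    def describe(self):
        return f"radius {self.radius} {self.unit}"


class Wheel(Circle):
    unit = "in"


wheel = Wheel.from_diameter(10)  # builds a Wheel, not a Circle
print(type(wheel).__name__, wheel.describe())
print(Circle.is_valid_radius(-1))
```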


    class Salad:

        restaurant_name = 'XY-Z'

Continue reading →

Kudos system in general

Recently, after writing a post, my mouse cursor happened to rest over the kudos circle at the end of the post for about 2 seconds, and that gave the post 1 kudos. This really bothered me because, as the author of the post, I did not intend to give it a kudos.

I think not allowing users to undo their kudos is still OK, but allowing authors to give kudos to their own posts, though it’s not that big a deal, seems broken to me.

There are two possible scenarios here. First, it could be a well-known defect that is ignored for the sake of the fancy kudos functionality. Second, and most likely, it does not annoy the majority of users.

In my opinion, liking things on the internet is a trivial feature; adding an animation and a delay of a few seconds is a bit over-designed.

View →

Python Generators: yield

In this post I will try to explain the basics of Python generators. I will write another post to cover advanced use and some debugging techniques.

In Python, generators are often associated with functional programming. A generator is an object that returns a value each time next() is called on it, until it raises a StopIteration exception.

Generators were introduced in PEP 255 and offer a really easy way to implement the iterator protocol.

Basic Syntax:

    def a_generator():
        yield 1

Yes, that’s all it takes to write a Python generator function. It’s just like a normal Python function, except that it contains a yield statement instead of return. Python detects the yield statement and marks the function as a generator.
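To see the behaviour described so far in action, here is a short sketch with a made-up countdown generator. Calling the function does not run its body; it returns a generator object. Each next() call runs the body up to the following yield, and once the body finishes, next() raises StopIteration.

```python
def countdown(n):
    """Yield n, n-1, ..., 1."""
    while n > 0:
        yield n
        n -= 1


gen = countdown(2)   # nothing in the body has run yet
first = next(gen)    # runs up to the first yield
second = next(gen)   # resumes after the first yield

try:
    next(gen)        # the loop ends, so the generator is exhausted
    exhausted = False
except StopIteration:
    exhausted = True

# A for loop or list() calls next() and handles StopIteration for you.
values = list(countdown(3))
print(first, second, exhausted, values)
```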

When a function’s execution reaches a yield statement, it returns a value just as it would with a return statement, but in the case of a generator the Python interpreter also saves a stack reference, which will be used to resume...

Continue reading →