Calculation of Pi

After my initial encounter with OpenCL, I wanted to explore further, so I wrote a small program to calculate Pi.
It uses a very simple Monte Carlo algorithm:

  1. Inscribe a circle in a square
  2. Randomly generate points in the square
  3. Determine the number of points in the square that are also in the circle
  4. Let r be the number of points in the circle divided by the number of points in the square
  5. Then Pi ≈ 4r
  6. Note that the more points generated, the better the approximation

This gave the value of Pi as 3.141172. Pretty decent, I suppose 🙂
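
For comparison, the same estimate takes only a few lines of plain NumPy on the CPU. This is a minimal sketch of the algorithm above (the function name and defaults are my own choices):

import numpy

def monte_carlo_pi(n=1000000, radius=100.0):
    # random points in the square [-radius, radius] x [-radius, radius]
    x = numpy.random.uniform(-radius, radius, n)
    y = numpy.random.uniform(-radius, radius, n)
    # fraction of points that fall inside the inscribed circle, times 4
    inside = (x**2 + y**2 <= radius**2).sum()
    return 4.0 * inside / n

print monte_carlo_pi()

With 1,000,000 points the standard error of the estimate is roughly 0.0016, so a value like 3.141172 is about as good as one should expect.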
 

The code

 

import pyopencl as cl 
import numpy
import random

# CPU reference (not used by the OpenCL path below): checks whether
# a point lies inside the circle of radius r
class circle(object):
    def __init__(self, r=100):
        self.r = r

    def inside_the_circle(self, x, y):
        if x**2 + y**2 <= self.r**2:
            return 1
        else:
            return 0

class CL:
    def __init__(self, size=1000):
        self.size = size
        self.ctx = cl.create_some_context()
        self.queue = cl.CommandQueue(self.ctx)

    def load_program(self):
        # OpenCL kernel: c[i] = 1 if point (a[i], b[i]) falls inside the
        # circle of radius 100, else 0
        f = """
        __kernel void picalc(__global float* a, __global float* b, __global float* c)
        {
            unsigned int i = get_global_id(0);

            if (a[i]*a[i] + b[i]*b[i] <= 100*100)
            {
                c[i] = 1;
            }
            else
            {
                c[i] = 0;
            }
        }
        """
        self.program = cl.Program(self.ctx, f).build()

    def popcorn(self):
        mf = cl.mem_flags
        # random points in the square [-100, 100] x [-100, 100]
        x = [random.uniform(-100, 100) for i in range(self.size)]
        y = [random.uniform(-100, 100) for i in range(self.size)]

        self.a = numpy.array(x, dtype=numpy.float32)
        self.b = numpy.array(y, dtype=numpy.float32)

        self.a_buf = cl.Buffer(self.ctx, mf.READ_ONLY | mf.COPY_HOST_PTR, hostbuf=self.a)
        self.b_buf = cl.Buffer(self.ctx, mf.READ_ONLY | mf.COPY_HOST_PTR, hostbuf=self.b)
        self.dest_buf = cl.Buffer(self.ctx, mf.WRITE_ONLY, self.b.nbytes)

    def execute(self):
        self.program.picalc(self.queue, self.a.shape, None,
                            self.a_buf, self.b_buf, self.dest_buf)
        # read the per-point inside/outside flags back from the device
        self.c = numpy.empty_like(self.a)
        cl.enqueue_read_buffer(self.queue, self.dest_buf, self.c).wait()
        
    def calculate_pi(self):
        # fraction of points inside the circle, times 4
        number_in_circle = self.c.sum()
        pi = number_in_circle * 4.0 / self.size
        print 'pi = ', pi



if __name__ == '__main__':
    ex = CL(1000000)
    ex.load_program()
    ex.popcorn()
    ex.execute()
    ex.calculate_pi()

 

CPU vs GPU performance comparison with OpenCL

I recently had the opportunity to explore an awesome library called OpenCL (Open Computing Language), which lets me write programs that use the computational power of my graphics card. I wanted to find out how much faster a simple program (element-wise addition of two arrays) would run if I parallelized it with OpenCL.

Source
Using OpenCL

import pyopencl as cl
import numpy

class CL(object):
    def __init__(self, size=10):
        self.size = size
        self.ctx = cl.create_some_context()
        self.queue = cl.CommandQueue(self.ctx)

    def load_program(self):
        fstr = """
        __kernel void part1(__global float* a, __global float* b, __global float* c)
        {
            unsigned int i = get_global_id(0);

            c[i] = a[i] + b[i];
        }
        """
        self.program = cl.Program(self.ctx, fstr).build()

    def popCorn(self):
        mf = cl.mem_flags

        # must be float32: the OpenCL kernel operates on 32-bit floats
        self.a = numpy.array(range(self.size), dtype=numpy.float32)
        self.b = numpy.array(range(self.size), dtype=numpy.float32)

        self.a_buf = cl.Buffer(self.ctx, mf.READ_ONLY | mf.COPY_HOST_PTR,
                               hostbuf=self.a)
        self.b_buf = cl.Buffer(self.ctx, mf.READ_ONLY | mf.COPY_HOST_PTR,
                               hostbuf=self.b)
        self.dest_buf = cl.Buffer(self.ctx, mf.WRITE_ONLY, self.b.nbytes)

    def execute(self):
        self.program.part1(self.queue, self.a.shape, None, self.a_buf, self.b_buf, self.dest_buf)
        c = numpy.empty_like(self.a)
        cl.enqueue_read_buffer(self.queue, self.dest_buf, c).wait()
        print "a", self.a
        print "b", self.b
        print "c", c

if __name__ == '__main__':
    example = CL(10000000)
    example.load_program()
    example.popCorn()
    example.execute()

Normal program without OpenCL

def add(size=10):
    a = tuple([float(i) for i in range(size)])
    b = tuple([float(j) for j in range(size)])
    c = [None for i in range(size)]
    for i in range(size):
        c[i] = a[i]+b[i]

    #print "a", a
    #print "b", b
    print "c", c[:1000]

add(1000000)
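
As an aside, the same comparison can also be timed from inside Python with a crude wall-clock timer. This is just a sketch (the helper below is my own, not part of either program); the measurements in the table were taken with the Linux "time" tool instead:

import time

def timed(fn, *args):
    # crude wall-clock timing around a single call
    start = time.time()
    fn(*args)
    print '%s took %.3fs' % (fn.__name__, time.time() - start)

timed(add, 1000000)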

I compared the performance of both programs using the Linux “time” tool and noted down the “sys” time.
Here's the comparison:

 

Size        Using GPU    Without GPU (i.e. CPU)
100         0.130s       0.030s
1000        0.100s       0.010s
10000       0.130s       0.010s
100000      0.150s       0.050s
1000000     0.170s       0.150s
10000000    0.600s       1.150s

 
Clearly the GPU outperforms the CPU at larger sizes, since the program can make use of the many parallel threads the GPU provides. At smaller sizes the fixed overhead of transferring data to and from the GPU dominates, so the CPU is faster.

Simple query request with the Wolfram API

Wolfram Alpha is a really good website if you want to perform some computation or ask a basic query like “distance between Paris and London”. So I wanted to test out the Wolfram API (alpha).
The following program gets a query from the user and retrieves a result. Of course, you will need an app id, which can be obtained from the wolfram.com website.

import sys
import urllib2
import urllib
from xml.etree import ElementTree as etree

class wolfram(object):
    def __init__(self, appid):
        self.appid = appid
        self.base_url = 'http://api.wolframalpha.com/v2/query?'
        self.headers = {'User-Agent':None}

    def _get_xml(self, ip):
        url_params = {'input':ip, 'appid':self.appid}
        data = urllib.urlencode(url_params)
        req = urllib2.Request(self.base_url, data, self.headers)
        xml = urllib2.urlopen(req).read()
        return xml

    def _xmlparser(self, xml):
        data_dics = {}
        tree = etree.fromstring(xml)
        # collect the text of every 'plaintext' tag, keyed by pod title
        for pod in tree.findall('pod'):
            for subpod in pod.findall('subpod'):
                for plaintext in subpod.findall('plaintext'):
                    data_dics[pod.get('title')] = plaintext.text
        return data_dics

    def search(self, ip):
        xml = self._get_xml(ip)
        result_dics = self._xmlparser(xml)
        print result_dics['Result']

if __name__ == "__main__":
    appid = sys.argv[1]
    query = sys.argv[2]
    w = wolfram(appid)
    w.search(query)
      

My experience so far

Having never contributed to Open Source before (except for a couple of patches to the PSF), this was a wonderful opportunity that helped me learn a lot about Open Source and about Python as a programming language. I realized that Software Engineering is better understood practically than theoretically (at school). With my mentors' support I was able to learn a lot about how to program better, and also how to program in a Pythonic way. There have been a couple of hiccups on my side, but they were always overcome with my mentors' help. So participating in GSoC has been really amazing. With mid-term evaluations coming in three days, I will be a little busy finishing up my pending work.

Work achieved so far (by Boris and me):
1) A VManager to handle VM functionality
2) A Diskhandler to transfer data to and from the virtual hard disk
3) Manager functionality which synchronizes all the operations: downloading the packages, calculating the dependencies, transferring data onto the disk, calling the VM, executing the tests and getting the results back (needs to be enhanced a little)
4) A communication channel between master and slave (needs to be enhanced a little)
5) Creating tasks for execution
6) Creating simple recipes
7) A Task Manager for execution of tasks

VManager

This module, to be included in PyTI, is one of the most important parts of the architecture. Even though the main goal of vmanager is to manage the VMs, there are secondary goals like getting data from the virtual hard disk onto the host and vice versa. Even though it is designed with PyTI in mind, I guess it can work pretty well for other projects (except the ones which require networking) that plan to manage virtual machines. Of course, some tinkering has to be done to cater to each project's needs.

I will illustrate with a simple example that starts a VM, saves its state (a snapshot) and rolls back to it.

from vms import VirtualMachine

config = {'name': 'test123',
          'memory': '123',
          'disk_location': 'dsl-4.4.10.iso',
          'hd_location': 'disk.vdi'}
a = VirtualMachine('hey', "vbox:///session", config=config) 
a.start()
a.createSnapshot('hello', 'blah')
a.rollback('hello')

Each of the operations performed should be clear from the above example. Since vmanager uses the libvirt library, it won't be difficult to migrate to other hypervisors if the need ever arises.

For reading the virtual hard disk, I wrote a diskhandler module, which uses the libguestfs library.

A small illustration of mounting the disk and uploading/downloading data to and from the host machine:

from diskhandler import DiskOperations
d = DiskOperations('/home/yeswanth/a.vdi')
d.mount()
d.upload("/home/yeswanth/a.txt", "/root/a.txt")
d.download("/root/b.txt", "/home/yeswanth/b.txt")
d.close_connection()

For an extensive read on vmanager or PyTI, please do read our documentation.

Research over master slave architecture

The environment part can be divided into two parts:
1) Master-Slave architecture
2) Raw data API (which depends on the execution project)

I spent my last week researching master-slave architectures.

I looked at two projects, Bitten and Condor, for their master-slave architecture, and thought I should jot down some points which might be helpful while designing PyTI.

Bitten

Bitten is a Python-based CI tool for collecting various software metrics. It builds on Trac. It uses a distributed build model with a master-slave architecture, wherein one or more slaves run the actual tests and a master gathers the results.

Bitten uses a build recipe: a configuration file which determines what commands to execute, and where certain artifacts and reports can be found after a command is executed. Build recipes are used for communication between the master and the slave, and are written in XML to give commands to the slave.
Bitten's master-slave protocol is a simple peer-to-peer communication protocol wherein either the master or the slave can initiate exchanges.

Condor

Condor is a project aimed at utilizing idle CPU cycles, and it works across distributed systems. Any job which needs to be done is given to Condor; Condor finds an idle machine and executes the job there, and if the owner of the machine wants it back, Condor can preempt the job and move it to another machine.

A job is given to Condor by submitting a job ad with parameters and preferences for the job and the tasks to be executed. Machines which volunteer their resources also communicate their preferences to Condor. Condor then sets up the job by matching these preferences against each other. The machines communicate with Condor by providing a configuration file.

PyTI (PyPI Testing Infrastructure) – My Gsoc Proposal

Project Overview

The goal of the project is to test distributions from the PyPI repository to assess
their quality and also to check whether a distribution is malicious. In order to
achieve that, we create a testing infrastructure for the PyPI repository. There will
be a mechanism to get newly uploaded distributions from PyPI, install them in
an isolated VM environment, run tests on them (quality checks, unittests) and
also determine whether they have harmful (malicious) components. The
project can be divided into two parts: one (environment) to subscribe to
uploaded packages and set up the environment, and the other (execution) to run
the tests and report the results back (to the environment part).

Detailed work

This project can be divided into two components: an execution part and an
environment part. Since each of these two parts is comprehensible
enough on its own, each will be handled by a single student. The execution part
takes care of installing the distributions (to be tested) along with their
dependencies, running tests on these distributions and assessing different quality
parameters. Tests may include unittests, quality tests (like pep8, pylint) or
custom tests to check whether the program is malicious.

Environment part of the project

This proposal concerns the environment part. The environment part of the
project is responsible for creating an abstraction for the execution part. It
handles delivery of distributions (and their dependencies) to the execution part
(to run tests on them). It handles all the protocols required to communicate
with the PyPI repository and with the different architecture used in the
project. It subscribes to uploaded packages from PyPI for testing (testing
done by the execution part). It is also responsible for setting up the
environment required for testing and for delivering the packages to the
execution part.

Terminology

  • Raw data: the data generated by task execution.
  • Report: evaluation of the different features/attributes of the data.
  • Task: an execution which produces raw data and "output", e.g. build, install,
    unittest, pylint…

Architecture

  • Master-Slave architecture, where the master dispatches jobs to the slave and
    the slave executes them.
  • The communication between master and slave happens through an API called the
    command API.
  • The slave communicates with the VM, sends the distributions required for testing
    and receives raw data (after installing the distributions and conducting tests)
    using another API called the raw data API.
  • Tests are run on VMs, and each VM is handled by a slave.

Raw Data API

  • The task is to build a raw data API for the communication between the VM and
    the slave.
  • The raw data API handles sending the data into the corresponding VMs.
  • The raw data API also handles sending the raw data (after the execution part
    has finished) from the VM to the slave.

Command API

  • The task is to build a command API to communicate between the master and the
    slave.
  • The command API handles the task requests issued by the master and assigns
    them to the slave.
  • Task requests can involve different configurations to be made on a VM, which
    distributions are to be tested, etc.

Implementation

  • For both APIs I propose the use of XML or JSON for the messages, as I
    think both can be used easily and both have good support; a sketch follows.
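
For illustration only, here is what a command message and a raw-data reply might look like in JSON. All field names below are hypothetical sketches, not a fixed specification:

import json

# hypothetical command-API message from the master to a slave
command = {
    'task': 'test_distribution',
    'distribution': 'example-package-1.0.tar.gz',
    'vm_config': {'name': 'pyti-vm-1', 'memory': '256'},
    'tests': ['install', 'unittest', 'pep8', 'pylint'],
}

# hypothetical raw-data-API reply carrying results out of the VM
raw_data = {
    'distribution': 'example-package-1.0.tar.gz',
    'results': {'install': 'ok', 'unittest': '3 passed, 1 failed'},
}

print json.dumps(command, indent=4)
print json.dumps(raw_data, indent=4)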

Slave

The slave performs the following tasks:

  • Initializes an isolated VM and configures it using the configuration
    provided by the API call to it.
  • It should be able to communicate with the PyPI repository and get the
    distributions to be tested.
  • Gets the distribution to be tested from the repository, computes its
    dependencies and also gets those dependencies from the repository.
  • Passes all the packages to the VM.
  • Receives the raw data from the VM.

Implementation

  • The slave is required to differentiate between the different VMs and also
    keep track of the activities happening in each of them.
  • The slave initializes and configures the VM by making an API call to it.
  • When the packages are sent into the VM, they can be stored in a folder, and
    the execution part can keep polling that folder to see if any package has
    been received, in order to start testing; a sketch of this follows the list.
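
A minimal sketch of that polling loop, as it might run inside the VM (the drop folder path and function names are illustrative, not a fixed convention):

import os
import time

DROP_FOLDER = '/incoming'

def poll_for_packages(handle_package, interval=5):
    # watch the drop folder and hand each new package to the
    # execution part exactly once
    seen = set()
    while True:
        for name in os.listdir(DROP_FOLDER):
            if name not in seen:
                seen.add(name)
                handle_package(os.path.join(DROP_FOLDER, name))
        time.sleep(interval)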

Master

  • The master subscribes to uploaded packages on PyPI.
  • It dispatches jobs to the slave using the command API.
  • It receives the test results from the slave.

Implementation

  • In order to subscribe to packages from PyPI, we can use the PubSubHubbub
    protocol to get a real-time feed as and when a package is uploaded; a
    sketch follows.
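
A subscription request in the classic PubSubHubbub style might look like the sketch below; the hub, topic and callback URLs are placeholders, since PyPI's actual endpoints would have to be substituted:

import urllib
import urllib2

params = urllib.urlencode({
    'hub.mode': 'subscribe',
    'hub.topic': 'http://pypi.example.org/packages/feed',
    'hub.callback': 'http://master.example.org/push-callback',
    'hub.verify': 'async',
})
# subscribing is a plain HTTP POST to the hub
urllib2.urlopen('http://hub.example.org/', params)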

Example scenario

  1. A developer uploads his distribution to PyPI. (External)
  2. PyPI notifies PyTI (here the master gets notified). (Environment)
  3. The master asks a slave (a local or remote machine) to test a distribution,
    using the command API. (Environment)
  4. The slave computes the dependencies for the distribution (to be tested) and
    downloads the distribution along with the dependencies from
    the repository. (Execution)
  5. The slave starts a VM with settings as instructed by the master. (Environment)
  6. When the VM has started, the slave sends the distribution (along with the
    dependencies) into the VM. (Environment)
  7. Inside the VM, the distribution is installed and different tests are
    conducted on it (unittests, quality checks, etc.). (Execution)
  8. At the end, the raw data (data obtained by testing) is sent to the slave. (Execution)
  9. The slave sends the raw data to the master. (Environment)
  10. The slave then shuts down the VM and cleans it. (Environment)