
Calculation of Pi

After my first encounter with OpenCL, I wanted to explore more, so I wrote a small program to calculate Pi.
It uses a very simple Monte Carlo algorithm:

  1. Inscribe a circle in a square
  2. Randomly generate points in the square
  3. Determine the number of points in the square that are also in the circle
  4. Let r be the number of points in the circle divided by the number of points in the square
  5. PI ~ 4 r
  6. Note that the more points generated, the better the approximation
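
Why step 5 works can be written out as a short derivation. The symbol R for the circle's radius is my own notation (the steps above use r for the ratio of point counts), and the approximation assumes the points are spread uniformly over the square:

```latex
r = \frac{\text{points in circle}}{\text{points in square}}
  \approx \frac{\text{area of circle}}{\text{area of square}}
  = \frac{\pi R^2}{(2R)^2}
  = \frac{\pi}{4}
\quad\Longrightarrow\quad
\pi \approx 4r
```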

This gave the value of Pi as 3.141172. Pretty decent, I suppose :)
 

The code

 

import pyopencl as cl 
import numpy
import random

class circle(object):
    def __init__(self, r=100):
        self.r = r
    def inside_the_circle(self, x, y):
        if x**2+y**2 <= self.r**2:
           return 1
        else:
           return 0

class CL:
    def __init__(self, size=1000):
        self.size = size
        self.ctx = cl.create_some_context()
        self.queue = cl.CommandQueue(self.ctx)

    def load_program(self):
        f = """
        __kernel void picalc(__global float* a, __global float* b, __global float* c)
        {
         unsigned int i = get_global_id(0);

        if (a[i]*a[i]+ b[i]*b[i] < 100*100)
           {
            c[i] = 1;
           }
        else
           {
            c[i]=0;
           }
        }
        """
        self.program = cl.Program(self.ctx, f).build()

    def popcorn(self):
        mf = cl.mem_flags
        x = [random.uniform(-100,100) for i in range(self.size)]

        y = [random.uniform(-100,100) for i in range(self.size)]

        self.a = numpy.array(x, dtype=numpy.float32)
        self.b = numpy.array(y, dtype=numpy.float32)

        self.a_buf = cl.Buffer(self.ctx, mf.READ_ONLY | mf.COPY_HOST_PTR, hostbuf=self.a)
        self.b_buf = cl.Buffer(self.ctx, mf.READ_ONLY | mf.COPY_HOST_PTR, hostbuf=self.b)
        self.dest_buf = cl.Buffer(self.ctx, mf.WRITE_ONLY, self.b.nbytes)

    def execute(self):
        self.program.picalc(self.queue, self.a.shape, None, self.a_buf, 
                            self.b_buf, self.dest_buf)
        self.c = numpy.empty_like(self.a)
        cl.enqueue_read_buffer(self.queue, self.dest_buf, self.c).wait()
        
    def calculate_pi(self):
        number_in_circle = 0
        for i in self.c:
            number_in_circle = number_in_circle + i
        pi = number_in_circle*4 / self.size
        print 'pi = ', pi



if __name__ == '__main__':
    ex = CL(1000000)
    ex.load_program()
    ex.popcorn()
    ex.execute()
    ex.calculate_pi()
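
For comparison, the same estimate can be sketched on the CPU in plain Python, which is handy for sanity-checking what the kernel returns. This is only a rough sketch, not part of the program above: the function name `estimate_pi` and the `seed` parameter are my own, and it is written to run on Python 3 as well as Python 2.

```python
import random

def estimate_pi(samples, radius=100.0, seed=None):
    """Estimate Pi by sampling points in the square [-radius, radius]^2."""
    rng = random.Random(seed)  # private generator, so a seed gives repeatable runs
    inside = 0
    for _ in range(samples):
        x = rng.uniform(-radius, radius)
        y = rng.uniform(-radius, radius)
        # Same test as the kernel: does the point fall inside the inscribed circle?
        if x * x + y * y < radius * radius:
            inside += 1
    # The fraction inside approximates (circle area) / (square area) = Pi / 4
    return 4.0 * inside / samples

print(estimate_pi(100000, seed=1))
```

The loop does one point at a time, which is exactly the work the OpenCL kernel spreads across many work-items.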

 

I recently had the opportunity to explore OpenCL (Open Computing Language), an awesome framework for writing programs that use the computational power of my graphics card. I wanted to see how much faster an ordinary program (element-wise addition of two arrays) would run if I parallelized it with OpenCL.

Source
Using OpenCL

import pyopencl as cl
import numpy
import sys

class CL(object):
    def __init__(self, size=10):
        self.size = size
        self.ctx = cl.create_some_context()
        self.queue = cl.CommandQueue(self.ctx)

    def load_program(self):
        fstr = """
        __kernel void part1(__global float* a, __global float* b, __global float* c)
        {
            unsigned int i = get_global_id(0);

            c[i] = a[i] + b[i];
        }
        """
        self.program = cl.Program(self.ctx, fstr).build()

    def popCorn(self):
        mf = cl.mem_flags

        # The kernel takes float* (32-bit) arguments, so the host arrays must be float32
        self.a = numpy.array(range(self.size), dtype=numpy.float32)
        self.b = numpy.array(range(self.size), dtype=numpy.float32)

        self.a_buf = cl.Buffer(self.ctx, mf.READ_ONLY | mf.COPY_HOST_PTR,
                               hostbuf=self.a)
        self.b_buf = cl.Buffer(self.ctx, mf.READ_ONLY | mf.COPY_HOST_PTR,
                               hostbuf=self.b)
        self.dest_buf = cl.Buffer(self.ctx, mf.WRITE_ONLY, self.b.nbytes)

    def execute(self):
        self.program.part1(self.queue, self.a.shape, None, self.a_buf, self.b_buf, self.dest_buf)
        c = numpy.empty_like(self.a)
        cl.enqueue_read_buffer(self.queue, self.dest_buf, c).wait()
        print "a", self.a
        print "b", self.b
        print "c", c

if __name__ == '__main__':
    matrixmul = CL(10000000)
    matrixmul.load_program()
    matrixmul.popCorn()
    matrixmul.execute()

Normal program without Optimization

def add(size=10):
    a = tuple([float(i) for i in range(size)])
    b = tuple([float(j) for j in range(size)])
    c = [None for i in range(size)]
    for i in range(size):
        c[i] = a[i]+b[i]

    #print "a", a
    #print "b", b
    print "c", c[:1000]

add(1000000)
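
The addition loop can also be timed from inside Python, which leaves out interpreter start-up and printing. The sketch below is an alternative way of measuring, not the method used for the numbers in this post; `add_lists` and `time_call` are names I made up, and `timeit.default_timer` is the standard-library wall-clock timer.

```python
from timeit import default_timer

def add_lists(size):
    """Element-wise sum of two float lists, like add() above but returning c."""
    a = [float(i) for i in range(size)]
    b = [float(j) for j in range(size)]
    return [x + y for x, y in zip(a, b)]

def time_call(func, *args):
    """Return (elapsed wall-clock seconds, result) for a single call of func."""
    start = default_timer()
    result = func(*args)
    return default_timer() - start, result

for size in (100, 1000, 10000, 100000):
    elapsed, c = time_call(add_lists, size)
    print("%8d elements: %.6f s" % (size, elapsed))
```

The timings depend on the machine, so no expected output is shown.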

I compared the performance of the two programs using the Linux “time” tool and noted down the “sys” time.
Here's the comparison:

 

Size        Using GPU   Without GPU (i.e. CPU)
100         0.130s      0.030s
1000        0.100s      0.010s
10000       0.130s      0.010s
100000      0.150s      0.050s
1000000     0.170s      0.150s
10000000    0.600s      1.150s

 
Clearly, the GPU outperforms the CPU at larger sizes, as the program can use the many threads the GPU provides; in these measurements it only pulls ahead at 10,000,000 elements. At smaller sizes the fixed overhead of accessing the GPU is appreciable, so the CPU is faster.
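
To make the crossover explicit, the measurements from the table can be fed through a few lines of Python. The numbers are copied from the table; the helper name `first_gpu_win` is just something I picked for this sketch.

```python
# (size, GPU seconds, CPU seconds), copied from the table above
timings = [
    (100, 0.130, 0.030),
    (1000, 0.100, 0.010),
    (10000, 0.130, 0.010),
    (100000, 0.150, 0.050),
    (1000000, 0.170, 0.150),
    (10000000, 0.600, 1.150),
]

def first_gpu_win(rows):
    """Return the smallest size at which the GPU time is below the CPU time."""
    for size, gpu, cpu in rows:
        if gpu < cpu:
            return size
    return None

for size, gpu, cpu in timings:
    # A ratio above 1 means the GPU run was faster
    print("%9d  CPU/GPU time ratio: %.2f" % (size, cpu / gpu))
print("GPU first wins at size %d" % first_gpu_win(timings))
```

The GPU time barely moves between 100 and 1,000,000 elements, which fits the idea of a roughly constant start-up cost dominating the small runs.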

Simple query request with Wolfram API

Wolfram Alpha is a really good website if you want to perform some computation or ask a basic query like “distance between Paris and London”, so I wanted to test out the Wolfram Alpha API.
The following program takes a query from the user and retrieves the result. Of course, you will need an app ID, which you can get from the wolfram.com website.

import sys
import urllib2
import urllib
import httplib
from xml.etree import ElementTree as etree

class wolfram(object):
    def __init__(self, appid):
        self.appid = appid
        self.base_url = 'http://api.wolframalpha.com/v2/query?'
        self.headers = {'User-Agent':None}

    def _get_xml(self, ip):
        url_params = {'input':ip, 'appid':self.appid}
        data = urllib.urlencode(url_params)
        req = urllib2.Request(self.base_url, data, self.headers)
        xml = urllib2.urlopen(req).read()
        return xml

    def _xmlparser(self, xml):
        data_dics = {}
        tree = etree.fromstring(xml)
        #retrieving every tag with label 'plaintext'
        for e in tree.findall('pod'):
            for item in [ef for ef in list(e) if ef.tag=='subpod']:
                for it in [i for i in list(item) if i.tag=='plaintext']:
                    if it.tag=='plaintext':
                        data_dics[e.get('title')] = it.text
        return data_dics

    def search(self, ip):
        xml = self._get_xml(ip)
        result_dics = self._xmlparser(xml)
        #return result_dics 
        #print result_dics
        print result_dics['Result']

if __name__ == "__main__":
    # sys.argv[0] is the script name; the real arguments start at index 1
    appid = sys.argv[1]
    query = sys.argv[2]
    w = wolfram(appid)
    w.search(query)
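
The parsing step can be tried without an app ID by running it on a canned response. The XML below is a hypothetical, hand-written sample that only mimics the pod/subpod/plaintext nesting the parser above expects (a real reply carries many more pods and attributes), and `pods_to_dict` is my own name for a slightly shorter variant of `_xmlparser`.

```python
from xml.etree import ElementTree as etree

# Hypothetical sample response, written by hand for illustration only
SAMPLE_XML = """
<queryresult success="true">
  <pod title="Input interpretation">
    <subpod><plaintext>2 + 2</plaintext></subpod>
  </pod>
  <pod title="Result">
    <subpod><plaintext>4</plaintext></subpod>
  </pod>
</queryresult>
"""

def pods_to_dict(xml):
    """Map each pod title to the plaintext of its last subpod."""
    tree = etree.fromstring(xml.strip())
    result = {}
    for pod in tree.findall('pod'):
        # 'subpod/plaintext' selects plaintext children of subpod children
        for plaintext in pod.findall('subpod/plaintext'):
            result[pod.get('title')] = plaintext.text
    return result

pods = pods_to_dict(SAMPLE_XML)
# .get() avoids a KeyError when a query has no 'Result' pod
print(pods.get('Result', 'no Result pod'))
```

Using `.get()` instead of indexing matters because not every query comes back with a pod titled 'Result'.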
      

PyCon India

My first visit to any PyCon conference (this time in Pune, India) happened last week. It was really great (as expected!). Incidentally, I gave a talk there too, on PyTI (Python Testing Infrastructure). I made a few more friends and learnt some interesting stuff. One notable talk to watch is definitely the keynote by Raymond Hettinger (again no surprise, as the keynote is the soul of any conference). I had a great time in Pune and also in Mumbai, which I visited on Sunday :)

Just a week back, we decided to replace GuestFS with VDFuse. GuestFS has certain advantages over VDFuse: it supports several file formats, which makes it easy to extend to others, and it has Python bindings. Even so, we faced a lot of problems working with it. Some of them are:

  • Installing GuestFS is a big headache: there are a lot of dependencies to install, and even then it is not guaranteed to work. This could be one reason for developers not to try out PyTI.
  • GuestFS is not very stable. There were many times it didn't work, and a couple of times it hung for no apparent reason.
  • It takes a long time to initialize GuestFS for reading from or writing to the disk.

So, as I mentioned, we have moved to VDFuse for now. It avoids the difficulties we had with GuestFS and is a decent solution for reading from or writing to the disk.

Lessons learnt

Whenever I used good software before, I never thought about the difficulties it faced during its development. Now, having worked on PyTI for over a month, I have come to realize that there is a lot of sweat behind any good project. This prompted me to write a post about the difficulties we faced with PyTI.

If you have been following my blog, you already know that PyTI uses libvirt for virtualization and libguestfs for mounting the file system on the host. We faced a couple of problems using these libraries, and I will describe how we tackled them, along with the lessons I learnt along the way.

1) Snapshot error:
PyTI interfaces with VirtualBox through libvirt. At one point I couldn't take snapshots of any disks, except for VMs attached to live boot images. After some searching here and there, I realized that the disk I was trying to snapshot was already attached to more than one VM. Detaching the disk from one of the VMs solved the problem.

Lesson learnt: It's always easy to blame others for mistakes (in this case, the library, libvirt).

2) Support for shared folders:
One of the first solutions we considered for transferring data between host and guest was shared folders. As you might be aware, VirtualBox has a “shared folders” feature which lets you access a host folder from the guest machine. Since the VMs used in PyTI have no network access, we thought this was the most viable solution, and we were relying on it heavily. Soon we realized that libvirt did not support shared folders (at that time). So we decided to scrap the idea and go for Flux based solutions. Now we use the libguestfs library.

Lesson learnt: It's easy to have a solution and assume that it will always work out.

3) Downloading files from the virtual hard disk:
We had a problem downloading files from the virtual hard disk. After some searching, we realized that VirtualBox creates a diff image when a snapshot is taken, and libguestfs tries to read from the diff image rather than the original disk. Having two disks solved our problem: one for IO, which we don't snapshot, and another for testing, which we do snapshot (but whose data we don't need to access from the host).

Lesson learnt: Read the documentation first.

My experience so far

Having never contributed to open source before (except for a couple of patches to PSF), this was a wonderful opportunity that helped me learn a lot about open source and about Python as a programming language. I realized that software engineering is better understood in practice than in theory (at school). With my mentors' support, I learned a lot about how to program better, and how to program in a Pythonic way. There have been a couple of hiccups on my side, but they were always overcome with my mentors' help. So participating in GSoC has been really amazing. With mid-term evaluations coming in three days, I will be a little busy finishing my pending work.

Work achieved so far (me and Boris):
1) A VManager to handle VM functionality
2) A Diskhandler to transfer data to and from the virtual hard disk
3) A Manager which synchronizes all the operations: downloading the packages, calculating the dependencies, transferring data onto the disk, calling the VM, executing the tests and getting the results back (needs a little enhancement)
4) A communication channel between master and slave (needs a little enhancement)
5) Creating tasks for execution
6) Creating simple recipes
7) A Task Manager for execution of tasks

VManager

This module is one of the most important parts of the PyTI architecture. Although the main goal of vmanager is to manage the VMs, it has secondary goals too, like getting data from the virtual hard disk onto the host and vice versa. Even though it is designed with PyTI in mind, I guess it can work pretty well for other projects that need to manage virtual machines (except ones which require networking). Of course, some tinkering will be needed to cater to each project's needs.

Here is a simple example of starting a VirtualBox VM, saving its state (snapshot), rolling back and stopping it.

from vms import *
config = {'name': 'test123', 'memory':'123', \
              'disk_location': 'dsl-4.4.10.iso', \
              'hd_location': 'disk.vdi'}
a = VirtualMachine('hey', "vbox:///session", config=config) 
a.start()
a.createSnapshot('hello', 'blah')
a.rollback('hello')

Each of the operations in the example above should be pretty clear. Since vmanager uses the libvirt library, it won't be difficult to migrate to other hypervisors if the need ever arises.
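
As a rough idea of what a snapshot call has to hand over to libvirt, here is a sketch that builds the small XML description libvirt uses for domain snapshots. How vmanager builds it internally is not shown in this post, so the function name `snapshot_xml` and the choice of ElementTree are my assumptions.

```python
from xml.etree import ElementTree as etree

def snapshot_xml(name, description):
    """Build a libvirt <domainsnapshot> description as a string."""
    root = etree.Element('domainsnapshot')
    etree.SubElement(root, 'name').text = name
    etree.SubElement(root, 'description').text = description
    # tostring() returns bytes on Python 3, so decode to get text
    return etree.tostring(root).decode('ascii')

print(snapshot_xml('hello', 'blah'))
```

With the libvirt Python bindings, a string like this is what a domain object's `snapshotCreateXML` method takes; the name and description match the `createSnapshot('hello', 'blah')` call in the example.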

For reading the virtual hard disk, I wrote a diskhandler module, which uses the libguestfs library.

Here is a small illustration of mounting the disk, then uploading and downloading data from and to the host machine:

from diskhandler import *
d = DiskOperations('/home/yeswanth/a.vdi')
d.mount()
d.upload("/home/yeswanth/a.txt", "/root/a.txt")
d.download("/root/b.txt", "/home/yeswanth/b.txt")
d.close_connection()

For an extensive read on vmanager or PyTI, please do read our documentation.

Over the last week, all I did was play with virtual machines. The tests on the distributions will run inside a virtual machine, so it's very important that we can control these VMs with a script.

Candidate : VirtualBox
Configuration:
Operating System : Damn Small Linux
Virtual Hard disk : A VDI image of 2.5 GB
RAM: 256 MB
Library: used libvirt to control the virtual machines
Features tested: Start, Stop, Snapshot, Rollback

Using libvirt was nice. It lets a range of hypervisors be controlled with the same library, though I had my fair share of difficulties, especially with the documentation.

Another feature I worked on this week is mounting the virtual hard disk image on the host. I used the libguestfs library to achieve this. The features I have added to PyTI for now are uploading and downloading files between the virtual hard disk and the host machine.

I would really like to thank Alexis, one of my mentors, for giving me feedback, refactoring my code and testing it.

Let me illustrate my findings after searching for a good VM to use.

Our requirements:
1) Supported by libvirt
2) Open source and free (as in freedom)
3) Has a good community with active developers
4) Fast

How did I evaluate?
1) Read blogs
2) Google searches (e.g. “which is better, VBox or KVM?”)
3) Look for references
4) Benchmarks on performance (again subject to other people’s findings and reporting)

Hurdles
1) Some of the websites were biased towards their own solution
2) I got caught between community wars (one saying VM ‘X’ is the best, another VM ‘Y’, and both were convincing enough)
3) Libvirt lacks good documentation about the features available for the different VMs

A comparison table:

Criteria            Xen                     Qemu+KQemu              KVM                     VirtualBox
Host OS             Linux, Solaris, BSD     Most                    Linux
Guest OS            Most (kernel needs to   Most                    Most                    Most
                    be modified)
Speed               Performance loss on     Qemu is usually slow,   Good performance        Good performance
                    disk-intensive          but KQemu has good
                    operations              performance
Virtual hard disk   VMDK, VHD               VMDK, VHD, QCOW2, RAW   VMDK, VHD, QCOW2, RAW   VDI, VMDK, VHD
image format
Community support   Community not very      Support dropped for     Red Hat is currently    Oracle is currently
                    active in the past      KQemu                   supporting it           supporting it
                    few years

Considering this table, I shortlisted two candidates: KVM and VirtualBox.

Reasons for choosing VBox over KVM
1) VBox's support for multiple platforms
2) KVM’s native disk image format, QCOW2, is very slow. Of course I could use other disk image formats, but that would require external tools.
3) KVM’s dependency on the Linux kernel

And the winner is VirtualBox
