Archive for January, 2012


One of my friends asked me what x86 and x86_64 meant! That was the inspiration, so I decided to write a blog post explaining what actually happens inside the CPU.

Processors can be broadly classified as 32-bit processors and 64-bit processors. Most processors on the market today fall into the 64-bit category.

Examples:

32-bit processors: Pentium 4 (early models)

64-bit processors: Core 2 Duo

Basically the difference lies in the width of the registers, the memory addresses and the data the processor handles natively. A 32-bit processor has general-purpose registers that hold 32-bit values, so it works on 32-bit integers and 32-bit memory addresses in a single instruction. 64-bit refers to a processor with registers that can store 64-bit numbers, so it can work on 64-bit integers and addresses in a single instruction. The bit width does not say how many instructions complete per clock cycle, but a 64-bit processor can move and process larger values in one step than a 32-bit processor.

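Difference in RAM addressing:

A 64-bit address can in theory reach 2^64 bytes of memory (16 exabytes, or 2^34 GB), although actual x86_64 processors implement fewer address bits than that. In a 32-bit system we can address only 2^32 bytes = 4 GB, and this address space has to cover the physical memory as well as devices such as the graphics card's memory in modern systems.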
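The numbers above are just powers of two, so they are easy to check. The sketch below is plain arithmetic and does not query any hardware. The function name addressable_bytes is my own, and the 48-bit row is an extra I added because many x86_64 processors use 48-bit virtual addresses.

```python
# Address-space sizes implied by the address width (pure arithmetic).
GIB = 2 ** 30  # bytes in one GiB


def addressable_bytes(address_bits: int) -> int:
    """Number of distinct byte addresses an address of this width can name."""
    return 2 ** address_bits


# 48 is an illustrative extra row, not something from the post itself.
for bits in (32, 48, 64):
    size = addressable_bytes(bits)
    print(f"{bits}-bit addresses: {size} bytes = {size // GIB} GiB")
```

The 32-bit row comes out at exactly 4 GiB, which is where the well-known 4 GB limit comes from.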

When making the transition from 32-bit to 64-bit PCs, users won't actually see a change in browsing and word processing programs. The benefits of 64-bit processors show up in more demanding applications such as video encoding, scientific research and searching massive databases: tasks where being able to load massive amounts of data into the system's memory is required.

So what makes 64-bit different (and mostly better):

  • Pointers in a 64-bit system take 8 bytes instead of 4 bytes (32-bit). This is a cost rather than a benefit: the effect on RAM usage is usually small, but in the worst case a large part of the CPU cache is taken up by the bigger pointers.
  • There are many more general-purpose CPU registers in 64-bit mode. Registers are the fastest memory in your entire system. There are only 8 general-purpose registers in 32-bit mode and 16 in 64-bit mode. In practice this can make applications faster; figures of around 30% are quoted, but the gain depends heavily on the workload.
  • A 32-bit application can run on a 64-bit processor, but the reverse does not hold. Only 64-bit applications can use the extra registers and the larger address space, which is why they can be faster.
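To see the pointer-size point on your own machine, here is a minimal sketch. struct.calcsize("P") reports the size of a C pointer for the Python build that is running. The helper pointer_table_bytes and the figure of one million pointers are only illustrative assumptions.

```python
import struct

# Size in bytes of a C pointer (void *) in the running Python build.
pointer_bytes = struct.calcsize("P")
print(f"This Python build uses {pointer_bytes * 8}-bit pointers")


def pointer_table_bytes(count: int, pointer_size: int) -> int:
    """Memory taken by `count` pointers of `pointer_size` bytes each."""
    return count * pointer_size


# Hypothetical workload: a table of one million pointers.
n = 1_000_000
print("32-bit build:", pointer_table_bytes(n, 4), "bytes")
print("64-bit build:", pointer_table_bytes(n, 8), "bytes")
```

The same table of pointers takes twice the space in a 64-bit build, which is the cache pressure mentioned in the first bullet.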

Problems with a 64-bit OS:

I have come across a lot of driver-related issues on 64-bit machines, though they have been addressed by recent patches and fixes.

HTH

De-duplication articles

I was going through de-duplication techniques: what companies do, and what the pros and cons of de-duplication are. This is one article that says it all. Wonderful post.

Courtesy: http://nsrd.info/blog/2011/08/07/7-common-problems-with-deduplication/

De-Duplication

Since I worked on de-duplication as a part of my undergraduate thesis, I found the articles from NetApp and EMC very interesting.

De-duplication is the process of removing replicas of a file; the file may be of any type (for example .jpg, .txt, .doc). There are two major approaches to data de-duplication: one is file level and the other is block level.

A few companies employ file-level de-duplication, while the majority employ block-level de-duplication.

Block Level De-duplication:

De-duplication at the data block level compares blocks of data with other subsequent blocks. Block-level de-duplication allows you to de-duplicate data within a given object. If an object (file, database, etc.) contains blocks of data that are identical to each other, then block-level de-duplication avoids storing the redundant data and reduces the size of the object in storage.
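The block-level idea can be sketched in a few lines. This is only an illustration under my own assumptions, not how any particular vendor does it: it uses fixed-size 4096-byte blocks and SHA-1 as the block fingerprint, and the names dedupe_blocks and rebuild are made up for the example.

```python
import hashlib


def dedupe_blocks(data: bytes, block_size: int = 4096):
    """Split data into fixed-size blocks and keep one copy of each unique block."""
    store = {}   # SHA-1 digest -> block bytes, each unique block stored once
    recipe = []  # ordered digests needed to rebuild the original data
    for offset in range(0, len(data), block_size):
        block = data[offset:offset + block_size]
        digest = hashlib.sha1(block).hexdigest()
        store.setdefault(digest, block)  # store only the first copy
        recipe.append(digest)
    return store, recipe


def rebuild(store, recipe) -> bytes:
    """Reassemble the original data from the block store and the recipe."""
    return b"".join(store[digest] for digest in recipe)


# Three identical blocks followed by one different block.
data = b"A" * 4096 * 3 + b"B" * 4096
store, recipe = dedupe_blocks(data)
stored_bytes = sum(len(block) for block in store.values())
print(f"{len(recipe)} blocks referenced, {len(store)} blocks stored")
print(f"{len(data)} bytes before, {stored_bytes} bytes after")
```

The recipe plays the role of the references: the object is described by a list of block fingerprints, and each distinct block is kept only once.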

In de-duplication a single copy of the file is maintained and the other copies are turned into references to that copy, which drastically reduces the storage used when a file has many redundant copies. The references may be similar to the symbolic (soft) link approach used in Linux.

For example, if an image file of 3 MB is stored in 5 different locations, the total space occupied by that file is 15 MB. With data de-duplication a single copy of the file is maintained and the rest of the copies become references to that single file location. So the space used after de-duplication is far less than before, maybe just slightly more than 3 MB.

Identical files may be named differently, which poses a great challenge; hence the MD5/SHA-1 hash of each file is calculated and checked for duplicates. Links are then established between files with the same hash. For my project I use Amazon S3 for storing data on the cloud. I found it to be an easy and efficient way of storing and accessing my data. Amazon AWS provides support for various languages like C#, Java and PHP. The how-tos are provided under the Developer section of the Amazon AWS website.
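Checking hashes instead of names can be sketched like this. It is a minimal local example, not the code from my project: it uses SHA-1 from Python's hashlib, and the helper names file_digest and find_duplicates as well as the sample file names are hypothetical.

```python
import hashlib
import os
import tempfile
from collections import defaultdict


def file_digest(path: str, chunk_size: int = 65536) -> str:
    """SHA-1 of a file's contents, read in chunks so large files fit in memory."""
    sha1 = hashlib.sha1()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            sha1.update(chunk)
    return sha1.hexdigest()


def find_duplicates(directory: str):
    """Group files by content hash and return the groups with more than one file."""
    groups = defaultdict(list)
    for root, _dirs, files in os.walk(directory):
        for name in sorted(files):
            path = os.path.join(root, name)
            groups[file_digest(path)].append(path)
    return [paths for paths in groups.values() if len(paths) > 1]


# Demo with throwaway files: two share content but not a name.
samples = [
    ("photo.jpg", b"same bytes"),
    ("copy_of_photo.jpg", b"same bytes"),
    ("notes.txt", b"different bytes"),
]
with tempfile.TemporaryDirectory() as tmp:
    for name, content in samples:
        with open(os.path.join(tmp, name), "wb") as handle:
            handle.write(content)
    duplicates = find_duplicates(tmp)
    duplicate_names = [sorted(os.path.basename(p) for p in group) for group in duplicates]
    print(duplicate_names)
```

The sketch only reports the duplicate groups. A real tool would then keep one file per group and replace the others with references, for instance with os.symlink.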

The links given below provide some useful resources regarding de-duplication.

http://www.informationweek.com/blog/229205878

http://www.backupcentral.com/mr-backup-blog-mainmenu-47/13-mr-backup-blog/134-inline-or-post-process.html/

http://www.evaluatorgroup.com/document/data-de-duplication-%E2%80%93why-when-where-and-how-infostor-article-by-russ-fellows/

And of course Wiki

http://en.wikipedia.org/wiki/Data_deduplication

A lot of research is being carried out on how to decrease storage costs, and de-duplication proves to be an effective tool in this regard.

It's very annoying when a Linux box warns you that you are the super user and may harm your computer by doing this, blah blah blah. I was just trying to install Google Chrome on my Fedora 16. I got the rpm from the Google website, but unfortunately Chrome didn't open; the reason given was "cannot run as root". To overcome this there's a small workaround.

1. Go to terminal and type

xhost +

The above command disables access control for the X11 display, so clients from any host are allowed to connect to it.

2. Next open the Google Chrome launcher script located at /usr/bin/google-chrome and add this to the end of the exec line: --user-data-dir

So the end of the file will look something like this.

export LD_LIBRARY_PATH

export CHROME_VERSION_EXTRA="stable"

# We don't want bug-buddy intercepting our crashes. http://crbug.com/24120
export GNOME_DISABLE_CRASH_DIALOG=SET_BY_GOOGLE_CHROME

exec -a "$0" "$HERE/chrome" "$@" --user-data-dir

3. Save the file and quit; you should be up and running.

HTH


Yum

Yum is a package management utility that is used to install packages. The greatest benefit of using yum is that it automatically resolves the dependencies a package needs and installs them as well. YUM stands for Yellowdog Updater, Modified. It is used in rpm-based operating systems like Red Hat, Fedora and CentOS. It makes use of XML for storing repository metadata.

Configuring yum

Yum repository definitions are stored under /etc in the yum.repos.d directory, in files ending in .repo.

The basic syntax for yum repo is

[name of repository]

baseurl=ftp://mywebsite.com

enabled=0 (or) 1

gpgcheck=0 (or) 1

baseurl defines the URL from which the packages are to be retrieved. enabled says whether the repo is switched on (1) or off (0). gpgcheck can be 0 or 1, depending on whether a GPG signature check needs to be performed on the packages or not.

The baseurl may be an HTTP or FTP URL, and a few more schemes I assume. The baseurl can be configured to use a local file system as well: just use file:///(path)

For example: baseurl=file:///root
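Since a .repo file is an INI-style file, a stanza like the one above can be generated and checked with Python's configparser. This is just a sketch: the repository id local-repo and the name= description are made up for illustration, and nothing is written under /etc/yum.repos.d.

```python
import configparser
import io

# Build one repository stanza in memory.
repo = configparser.ConfigParser()
repo["local-repo"] = {  # hypothetical repository id
    "name": "Local packages (hypothetical)",  # description line, not in the post
    "baseurl": "file:///root",
    "enabled": "1",
    "gpgcheck": "0",
}

# Serialise it in key=value form, without spaces around "=".
buffer = io.StringIO()
repo.write(buffer, space_around_delimiters=False)
repo_text = buffer.getvalue()
print(repo_text)

# Read it back to confirm the stanza parses as expected.
check = configparser.ConfigParser()
check.read_string(repo_text)
print(check["local-repo"]["baseurl"])
```

To try it for real you would save repo_text into a file ending in .repo inside /etc/yum.repos.d, as root.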

HTH

A few tips on configuring your yum repository.

1. First update your installed packages from the configured repositories.

yum update

2. Make sure you clean the yum cache so that there isn't much of a mess.

yum clean all

3. Once the yum cache is cleaned, check for the packages.

yum list

The above command shows the installed and available packages.

Every time one installs packages, make sure to do the above steps.

HTH

Installing Broadcom drivers on Fedora 16 is quite easy, but sometimes configuring the repositories proves to be a hindrance. To install the Broadcom drivers, first make sure you add the RPM Fusion repositories to your yum repositories.

1. Go to rpmfusion.org/configuration and select the desired rpm from the list. In my case I chose RPM Fusion Non Free for Fedora 14, 15, 16.

2. Run the rpm file.

3. Make sure your Broadcom drivers are present.

yum install b43*

4. Now install kmod-wl

yum install kmod-wl

5. Restart your machine

6. Make sure your kernel header files are up to date (use kernel-devel instead of kernel-PAE-devel if you are not running the PAE kernel).

yum install kernel-PAE-devel kernel-headers

HTH