Tutorial on Hadoop with VMware Player


Map Reduce (Source: Google)

Functional Programming
According to Wikipedia, in computer science functional programming is a programming paradigm that treats computation as the evaluation of mathematical functions and avoids state and mutable data. It emphasizes the application of functions, in contrast to the imperative programming style, which emphasizes changes in state. Since there is no hidden dependency (via shared state), functions in the DAG can run anywhere in parallel as long as one is not an ancestor of the other. In other words, analyzing the parallelism is much easier when there is no hidden dependency from shared state. Map/reduce is a special form of such a directed acyclic graph that is applicable to a wide range of use cases. It is organized as a “map” function, which transforms a piece of data into some number of key/value pairs. Each of these elements is then sorted by its key and reaches the same node, where a “reduce” function is used to merge the values of the same key into a single result.
Map Reduce

MapReduce is a way to take a big task and divide it into discrete tasks that can be done in parallel. Map/Reduce is essentially just a pair of functions operating over a list of data.

MapReduce is a patented software framework introduced by Google to support distributed computing on large data sets on clusters of computers.

The framework is inspired by map and reduce functions commonly used in functional programming,[3] although their purpose in the MapReduce framework is not the same as their original forms.
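The same idea can be sketched with ordinary Unix tools; this is only an analogy (not Hadoop itself), using a hypothetical input.txt: tr plays the role of the “map” step (emitting one word per line), sort plays the shuffle that groups identical keys, and uniq -c plays the “reduce” step that aggregates a count per key.

# word count as a map -> shuffle -> reduce pipeline (illustration only)
cat input.txt | tr -s ' ' '\n' | sort | uniq -c | sort -rn | head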
Hadoop
A large-scale batch data processing system.

It uses MAP-REDUCE for computation and HDFS for storage.

Apache Hadoop is a software framework that supports data-intensive distributed applications under a free license. It enables applications to work with thousands of nodes and petabytes of data. Hadoop was inspired by Google’s MapReduce and Google File System (GFS) papers.

It is a framework written in Java for running applications on large clusters of commodity hardware and incorporates features similar to those of the Google File System and of MapReduce. HDFS is a highly fault-tolerant distributed file system and like Hadoop designed to be deployed on low-cost hardware. It provides high throughput access to application data and is suitable for applications that have large data sets.

Hadoop is an open source Java implementation of Google’s MapReduce algorithm along with an infrastructure to support distributing it over multiple machines. This includes its own filesystem (HDFS, the Hadoop Distributed File System, based on the Google File System), which is specifically tailored for dealing with large files. When thinking about Hadoop it’s important to keep in mind that the infrastructure is a huge part of it. Implementing MapReduce is simple. Implementing a system that can intelligently manage the distribution of processing and your files, and break those files down into more manageable chunks for processing in an efficient way, is not.

HDFS breaks files down into blocks which can be replicated across its network (how many times a block is replicated is determined by your application and can be specified on a per-file basis). This is one of the most important performance features and, according to the docs, “…is a feature that needs a lot of tuning and experience.” You really don’t want 50 machines all trying to pull from a 1TB file on a single data node at the same time, but you also don’t want to replicate a 1TB file out to 50 machines. So it’s a balancing act.

A Hadoop installation is broken into three types of node:

  • The NameNode acts as the HDFS master, managing all decisions regarding data replication.

  • The JobTracker manages the MapReduce work. It “…is the central location for submitting and tracking MR jobs in a network environment.”

  • The TaskTracker and DataNode, which do the grunt work.

Hadoop – NameNode, DataNode, JobTracker, TaskTracker

The JobTracker first determines the number of splits (each split is configurable, ~16-64MB) from the input path and selects some TaskTrackers based on their network proximity to the data sources; the JobTracker then sends the task requests to those selected TaskTrackers.

Each TaskTracker starts the map phase by extracting the input data from its splits. For each record parsed by the “InputFormat”, it invokes the user-provided “map” function, which emits a number of key/value pairs into a memory buffer. A periodic wakeup process sorts the memory buffer, partitions it by reducer node, and invokes the optional “combine” function. The key/value pairs are written into one of the R local files (assuming there are R reducer nodes).

When the map task completes (all splits are done), the TaskTracker will notify the JobTracker. When all the TaskTrackers are done, the JobTracker will notify the selected TaskTrackers for the reduce phase.

Each reduce-phase TaskTracker reads the region files remotely, sorts the key/value pairs and, for each key, invokes the “reduce” function, which collects the key/aggregatedValue into the output file (one per reducer node).

The Map/Reduce framework is resilient to crashes of any component. The JobTracker keeps track of the progress of each phase and periodically pings the TaskTrackers for their health status. When a map-phase TaskTracker crashes, the JobTracker reassigns the map task to a different TaskTracker node, which reruns all the assigned splits. If a reduce-phase TaskTracker crashes, the JobTracker reruns the reduce on a different TaskTracker.
Let’s try Hands on Hadoop
The objective of this tutorial is to set up a multi-node Hadoop cluster using the Hadoop Distributed File System (HDFS) on Ubuntu Linux with the help of VMware Player.

Hadoop and VMware Player

Installations / Configurations Needed:

Physical Machine

Laptop with 60 GB HDD, 2 GB RAM, 32-bit support, OS – Ubuntu 10.04 LTS – the Lucid Lynx

IP Address: 192.168.1.3 [used in the configuration files]

Virtual Machine

See VMware Player sub section

Download Ubuntu ISO file

The Ubuntu 10.04 LTS (Lucid Lynx) ISO file is needed to install the OS on the virtual machine created by VMware Player, in order to set up the multi-node Hadoop cluster.

Download Ubuntu Desktop Edition

http://www.ubuntu.com/desktop/get-ubuntu/download

Note: log in as the user “root” to avoid any kind of permission issues (on both your machine and the virtual machine).

Update the Ubuntu packages: sudo apt-get update

VMware Player [Freeware]

Download it from http://downloads.vmware.com/d/info/desktop_downloads/vmware_player/3_0

Download VMware Player

Select VMware Player to Download

VMware Player Free Product Download

Install VMware Player on your physical machine with the use of the downloaded bundle.

VMware Player – Ready to install

VMware Player – installing

Now create a virtual machine with VMware Player, install Ubuntu 10.04 LTS on it from the ISO file, and apply the appropriate configuration to the virtual machine.

Browse Ubuntu ISO

Proceed with the instructions and let the setup finish.

Virtual Machine in VMware Player

Once the installation has finished successfully, select “Play virtual machine”.

Start Virtual Machine in VMware Player

Open Terminal (Command prompt in Ubuntu) and check the IP address of the Virtual Machine.

NOTE: the IP address may change, so if the virtual machine cannot be reached via SSH from the physical machine, check its IP address first.

Ubuntu Virtual Machine – ifconfig

Apply the following configuration (Java 6 and Hadoop installation) on both the physical and the virtual machine.

Installing Java 6

sudo apt-get install sun-java6-jdk

sudo update-java-alternatives -s java-6-sun [Verify Java Version]

Setting up Hadoop 0.20.2

Download Hadoop from http://www.apache.org/dyn/closer.cgi/hadoop/core and place it under /usr/local/hadoop.
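For example, assuming the hadoop-0.20.2.tar.gz archive has been downloaded to the current directory (file name and target path as used in this tutorial; adjust if yours differ):

tar xzf hadoop-0.20.2.tar.gz

sudo mv hadoop-0.20.2 /usr/local/hadoop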

HADOOP Configurations

Hadoop requires SSH access to manage its nodes, i.e. remote machines [In our case virtual Machine] plus your local machine if you want to use Hadoop on it.

On Physical Machine

Generate an SSH key

Enable SSH access to your local machine with this newly created key.

Or you can copy it from $HOME/.ssh/id_rsa.pub to $HOME/.ssh/authorized_keys manually.
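A minimal sketch of both steps (the empty passphrase keeps the login password-less; run this as the user that will start Hadoop, i.e. root in this tutorial):

ssh-keygen -t rsa -P ""

cat $HOME/.ssh/id_rsa.pub >> $HOME/.ssh/authorized_keys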

Test the SSH setup by connecting to your local machine as the root user.

Test the SSH setup

Use ssh 192.168.1.3 from the physical machine as well; it will give the same result.

On Virtual Machine

The root user account on the slave (Virtual Machine) should be able to access physical machine via a password-less SSH login.

Add the physical machine’s public SSH key (which should be in $HOME/.ssh/id_rsa.pub) to the authorized_keys file of the virtual machine (in that user’s $HOME/.ssh/authorized_keys). You can do this manually:

(Physical Machine)$HOME/.ssh/id_rsa.pub -> (VM)$HOME/.ssh/authorized_keys
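or let ssh-copy-id append the key to the VM’s authorized_keys for you; the command below assumes the VM’s IP address is 192.168.28.136 as in this tutorial:

ssh-copy-id -i $HOME/.ssh/id_rsa.pub root@192.168.28.136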

The SSH key may look like this (yours will be different, of course):

ssh-rsa AAAAB3NzaC1yc2EAAAABIwAAAQEAwjhqJ7MyXGnn5Ly+0iOwnHETAR6Y3Lh3UUKb

aCIP2/0FsVOWhBvcSLMEgT1ewrRPKk9IGoegMCMdHDGDfabzO4tUsfCdfvvb9KFRcB

U3pKdq+yVvCVxXtoD7lNnMtckUwSz5F1d04Z+MDPbDixn6IAu/GeX9aE2mrJRBq1Pz

n3iB4GpjnSPoLwQvEO835EMchq4AI92+glrySptpx2MGporxs5LvDaX87yMsPyF5tutu

Q+WwRiLfAW34OfrYsZ/Iqdak5agE51vlV/SESYJ7OqdD3+aTQghlmPYE4ILivCsqc7w

xT+XtPwR1B9jpOSkpvjOknPgZ0wNi8LD5zyEQ3w== root@mitesh-laptop

Use ssh 192.168.1.3 from the virtual machine to verify SSH access and to get a feel for how SSH works.

To verify connectivity, ping 192.168.1.3 and 192.168.28.136 from each other’s machine.

For detailed information on network settings, see http://www.vmware.com/support/ws55/doc/ws_net_configurations_common.html (it describes VMware Workstation, but VMware Player uses similar concepts).

Using 0.0.0.0 for the various networking-related Hadoop configuration options will result in Hadoop binding to the IPv6 addresses of the Ubuntu box.

To disable IPv6 on Ubuntu 10.04 LTS, open /etc/sysctl.conf in the editor of your choice and add the following lines to the end of the file:

#disable ipv6

net.ipv6.conf.all.disable_ipv6 = 1

net.ipv6.conf.default.disable_ipv6 = 1

net.ipv6.conf.lo.disable_ipv6 = 1
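Reload the settings and check that IPv6 is really off (a value of 1 means disabled); a reboot works as well:

sudo sysctl -p

cat /proc/sys/net/ipv6/conf/all/disable_ipv6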

Ubuntu – Disable IPv6

 <HADOOP_INSTALL>/conf/hadoop-env.sh -> set the JAVA_HOME environment variable to the Sun JDK/JRE 6 directory.

 

# The java implementation to use.  Required.

export JAVA_HOME=/usr/lib/jvm/java-6-sun-1.6.0.20

 

<HADOOP_INSTALL>/conf/core-site.xml ->

 

Configure the directory where Hadoop will store its data files, the network ports it listens on, etc. Our setup will use Hadoop’s Distributed File System, HDFS, even though our little “cluster” only contains our two machines.

Hadoop – core-site.xml

<property>

  <name>hadoop.tmp.dir</name>

  <value>/usr/local/hadoop/tmp/dir/hadoop-${user.name}</value>

</property>

You also need fs.default.name to point at the master’s HDFS namenode (192.168.1.3 in this tutorial; 54310 is the port conventionally used alongside the JobTracker’s 54311), so that every node talks to the same filesystem:

<property>

  <name>fs.default.name</name>

  <value>hdfs://192.168.1.3:54310</value>

</property>

 <HADOOP_INSTALL>/conf/mapred-site.xml ->

<property>

  <name>mapred.job.tracker</name>

  <value>192.168.1.3:54311</value>

</property>

Hadoop – mapred-site.xml

 <HADOOP_INSTALL>/conf/hdfs-site.xml

 

<property>

  <name>dfs.replication</name>

  <value>2</value>

</property>

Master/Slave Settings (on the Physical Machine, the master, only)

<HADOOP_INSTALL>/conf/masters

The conf/masters file defines the namenodes of our multi-node cluster. In our case, this is just the master machine.

192.168.1.3

<HADOOP_INSTALL>/conf/slaves

The conf/slaves file lists the hosts, one per line, where the Hadoop slave daemons (DataNodes and TaskTrackers) will run. We want both the master box and the slave box to act as Hadoop slaves because we want both of them to store and process data.

192.168.1.3

192.168.28.136

NOTE: here 192.168.1.3 and 192.168.28.136 are the IP addresses of the physical machine and the virtual machine respectively, which may differ in your case. Just enter the IP addresses in the files and you are done!!!
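As a sketch, both files can be written from the shell on the master (the path assumes Hadoop lives in /usr/local/hadoop as above; substitute your own IP addresses):

echo "192.168.1.3" > /usr/local/hadoop/conf/masters

printf "192.168.1.3\n192.168.28.136\n" > /usr/local/hadoop/conf/slaves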

Let’s enjoy the ride with Hadoop:

All Set for having “HANDS ON HADOOP”.

Formatting the name node

ON Physical Machine and Virtual Machine

The first step to starting up your Hadoop installation is formatting the Hadoop filesystem, which is implemented on top of the local filesystem of your “cluster” (here, the physical machine and the VM). You need to do this the first time you set up a Hadoop cluster. Do not format a running Hadoop filesystem; this will erase all your data.

hadoop namenode -format

Starting the multi-node cluster

1.    Start HDFS daemons

Run the command bin/start-dfs.sh on the machine you want the (primary) NameNode to run on. This will bring up HDFS with the NameNode running on the machine you ran the command on, and DataNodes on the machines listed in the conf/slaves file.

Physical Machine

Hadoop – start-dfs.sh

VM

Hadoop – DataNode on Slave Machine

2.    Start MapReduce daemons

Run the command bin/start-mapred.sh on the machine you want the JobTracker to run on. This will bring up the MapReduce cluster with the JobTracker running on the machine you ran the command on, and TaskTrackers on the machines listed in the conf/slaves file.

Physical Machine

Hadoop – Start MapReduce daemons

VM

TaskTracker in Hadoop
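Before submitting a job, you can confirm which Java daemons are running on each box with the jps tool that ships with the Sun JDK:

jps

# the master should list NameNode, SecondaryNameNode, DataNode, JobTracker and TaskTracker

# the slave (VM) should list DataNode and TaskTracker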

Running a MapReduce job

Here’s the example input data I have used for the multi-node cluster setup described in this tutorial.

All ebooks should be in plain text us-ascii encoding.

http://www.gutenberg.org/etext/20417

http://www.gutenberg.org/etext/5000

http://www.gutenberg.org/etext/4300

http://www.gutenberg.org/etext/132

http://www.gutenberg.org/etext/1661

http://www.gutenberg.org/etext/972

http://www.gutenberg.org/etext/19699

Download the above ebooks and store them in a temporary directory on the local file system.

Copy local example data to HDFS

Hadoop – Copy local example data to HDFS
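Assuming the ebooks were saved under /tmp/examples on the master (a path chosen just for this example), something like the following copies them into HDFS and lists the result (run from the Hadoop install directory):

bin/hadoop dfs -copyFromLocal /tmp/examples examples

bin/hadoop dfs -ls examples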

Run the MapReduce job

hadoop-0.20.2/bin/hadoop jar hadoop-0.20.2-examples.jar wordcount examples example-output

Failed Hadoop Job

Retrieve the job result from HDFS

You can read the file directly from HDFS without copying it to the local file system; in this tutorial, however, we will copy the results to the local file system.

mkdir /tmp/example-output-final

bin/hadoop dfs -getmerge example-output /tmp/example-output-final
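Since /tmp/example-output-final already exists as a directory, getmerge writes a single merged file into it, named after the HDFS source directory; a quick look at the word counts might then be:

head /tmp/example-output-final/example-output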

Hadoop – Word count example

Hadoop – MapReduce Administration

Hadoop – Running and Completed Job

Task Tracker Web Interface

Hadoop – Task Tracker Web Interface

Hadoop – NameNode Cluster Summary

References

http://www.michael-noll.com/wiki/Running_Hadoop_On_Ubuntu_Linux_(Multi-Node_Cluster)

http://www.michael-noll.com/wiki/Writing_An_Hadoop_MapReduce_Program_In_Python

http://java.dzone.com/articles/how-hadoop-mapreduce-works

http://ayende.com/Blog/archive/2010/03/14/map-reduce-ndash-a-visual-explanation.aspx

http://www.youtube.com/watch?v=Aq0x2z69syM

http://www.gridgainsystems.com/wiki/display/GG15UG/MapReduce+Overview

http://map-reduce.wikispaces.asu.edu/

http://blogs.sun.com/fifors/entry/map_reduce

http://www.vmware.com/support/ws55/doc/ws_net_configurations_common.html

http://www.ibm.com/developerworks/aix/library/au-cloud_apache/

 


How to Configure CloudAnalyst in Eclipse


Create New Java Project

New Java Project in Eclipse

Create Java Project in Eclipse

New Java Project in Eclipse: Java Settings


Go to File->Import

New Java Project in Eclipse: Import Source Code from Existing Project

New Java Project in Eclipse: Import resources from Local File System

Run CloudAnalyst in Eclipse

Done!!!

CloudAnalyst GUI


How to Run and Configure CloudAnalyst


  1. Download CloudAnalyst

http://www.cloudbus.org/cloudsim/CloudAnalyst.zip

2. Extract the files from the Zip file, which will give the following folder structure:

CloudAnalyst Folder Structure

If you want to run it from the command line:

Run CloudAnalyst from Command Line

3. Or simply click on the run.bat file.

4. Done!!!

Run CloudAnalyst from Command Line: Dashboard

5. Click on Show Region Boundaries:

Run CloudAnalyst from Command Line: Dashboard – Regions


6. Click on Configure Simulation.

CloudAnalyst – Configure Simulation

Here you can configure:

·              Data Center Configuration

o   Physical Hardware Details of Data center

CloudAnalyst – Physical Hardware Details of Data center

You will get:

CloudAnalyst – Data Center Configurations

CloudAnalyst – Data Center Configurations details

CloudAnalyst – Data Center Configurations – Copy

·         Advanced

CloudAnalyst – Advanced Configurations

·         User Base: models a group of users and generates traffic representing the users.

·         Application Deployment Configurations (Cloudlets)

o   Service Broker Policy

CloudAnalyst – Service Broker Configurations

CloudAnalyst – Main Configurations

7. You can also save this configuration in case you want to use it later; it is stored as a .sim file (the XML data is generated and saved in the .sim file).

CloudAnalyst – Save Configurations

8. A saved configuration can easily be loaded back into CloudAnalyst at any time.

CloudAnalyst – Load Configurations

So you do not need to re-enter the data each time you want to run a simulation.

9. I have created 2 data centers in different regions, each with 20 physical hosts.

CloudAnalyst – Configurations – Data Centers in different Regions

CloudAnalyst – Configurations – Round Robin Load Balancing Policy

There are 6 user bases, each in a different region; 25 VMs are allocated across the two data centers to serve requests from all the user bases.

CloudAnalyst – Configurations – User Base and Regions

10. Once you are done with the configuration, click on Done!!!

CloudAnalyst – Configurations – Internet Characteristics

11. Click on Internet Characteristics to review or adjust them:

CloudAnalyst – Configure Internet Characteristics

12. Run the simulation.

CloudAnalyst – Run Simulation

13. The Simulation Results window will open.

CloudAnalyst – Simulation Results

Close it.

14. The main window will show all the statistics.

CloudAnalyst – Simulation Statistics

15. If you try to run the simulation again, it will give an error.

CloudAnalyst – Simulation Error

CloudAnalyst – Restart Simulation Error

16. Click on Exit.


Sample Ant Build


Your first ANT build

<?xml version="1.0"?>

<!-- A simple Ant build file -->

<project name="Ant test project" default="build" basedir=".">

<target name="build" >

<javac srcdir="src" destdir="build/src" debug="true" includes="**/*.java"/>

</target>

</project>

The first line of the build.xml file is the XML declaration.

The second line is a comment entry.

The third line is the project tag. Each buildfile contains one project tag, and all the instructions are written inside it.

The project tag:

<project name="Ant test project" default="build" basedir=".">  requires three attributes namely name, default and basedir.

Here is the description of the attributes:

Attribute             Description

name                    Represents the name of the project.

default                  Name of the default target to use when no target is supplied.

basedir                  Name of the base directory from which all path calculations are done.

All the attributes are required.

One project may contain one or more targets. In this example there is only one target.

<target name="build" >

<javac srcdir="src" destdir="build/src" debug="true" includes="**/*.java"/>

</target>

This target uses the javac task to compile the Java files.

Here is the code of our AntTestExample.java file which is to be compiled by the Ant utility.

class AntTestExample{

    public static void main (String args[]){

        System.out.println("This is Ant Test Example ");

    }

}

Run the build file with the ant command.

The above process compiles the file and places the class file in the build/src directory.
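To try it end to end, assuming build.xml and src/AntTestExample.java sit in the current directory (note that the javac task expects the destination directory to exist, so create it first):

mkdir -p build/src

ant

java -cp build/src AntTestExample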

Refer to examples


Writing Custom Task in Ant


Writing Own Task

  • Create a Java class that extends org.apache.tools.ant.Task
  • For each attribute, write a setter method that is public void and takes a single argument
  • Write a public void execute() method, with no arguments, that throws a BuildException — this method implements the task itself

Refer to the Ant documentation on writing your own task for an example.


Tutorial- Application Development on Force.com from 30 day Free Trial


Force.com is a cloud computing platform as a service offering from Salesforce, the first of its kind allowing developers to build multi-tenant applications that are hosted on their servers as a service.

Features of force.com

The multitenant architecture of Force.com consists of the following features:

•Shared infrastructure. Every customer (or tenant) of Force.com shares the same infrastructure. You are assigned a logical environment within the Force.com infrastructure.

•Single version. There is only one version of the Force.com platform in production. The same platform is used to deliver applications of all sizes and shapes, used by 1 to 100,000 users.

•Continuous, zero-cost improvements. When Force.com is upgraded to include new features or bug fixes, the upgrade is enabled in every customer’s logical environment with zero to minimal effort required.

•No direct infrastructure exposure. Force.com is targeted toward corporate application developers and independent software vendors. Unlike the other PaaS offerings, it does not expose developers directly to its own infrastructure.

•Integration with other technologies. Force.com integrates with other technologies using open standards such as SOAP and REST; however, the programming languages and metadata representations used to build applications are proprietary to Force.com.

•Relational Database
–To store and manage the business data; data is stored in objects.
•Application Services
–Logging, transaction processing, validation
•Declarative Metadata
–Customizations are configured as simple XML against a documented schema
•Programming Languages
–Apex
force.com – Infrastructure, Application and Operational Services

The layers of technologies and services make up the platform.

force.com – Application Architecture

force.com – How it works?


Note:
The 30-day free trial doesn’t provide Workflow support; otherwise we could create a full-featured application. In the trial, we can create a Visualforce page but cannot enable Sites for our organization, nor register our Force.com domain name and expose the Visualforce page we created as a public product catalog on the Web.

Workflow support is available in Force.com One App: start with one custom app, for your organization only.

force.com – 30 day Free Trial


Open Source ILS: Installation Guide for Koha on Ubuntu 11.10


Open Source ILS: Installation Guide for Koha on Ubuntu 11.10 or Ubuntu 11.04 or Ubuntu 10.10 or Ubuntu 10.04 LTS with MySQL 5

Open Source ILS Installation Guide for Koha on Ubuntu 11.10

According to Wikipedia, an integrated library system (ILS), also known as a library management system (LMS), is an enterprise resource planning system for a library, used to track items owned, orders made, bills paid, and patrons who have borrowed.

An ILS usually comprises a relational database, software to interact with that database, and two graphical user interfaces (one for patrons, one for staff). Most ILSes separate software functions into discrete programs called modules, each of them integrated with a unified interface. Examples of modules might include:

  • acquisitions (ordering, receiving, and invoicing materials)
  • cataloging (classifying and indexing materials)
  • circulation (lending materials to patrons and receiving them back)
  • serials (tracking magazine and newspaper holdings)
  • the OPAC (public interface for users)

Each patron and item has a unique ID in the database that allows the ILS to track its activity.
Examples
Open-source

  • Koha
  • Evergreen

Proprietary

  • Ex Libris Group, Aleph and Voyager (latter acquired from Endeavor Information Systems in 2006)
  • Millennium, former Innopac, from Innovative Interfaces, Inc.
  • SirsiDynix, Symphony — current version and Unicorn — a legacy system.
  • Library•Solution, Library•Solution for Schools, and CARL•X from The Library Corporation
  • LibraryWorld
  • Insignia Library System (ILS), Insignia Software Corporation
  • NOSA
  • EOS International
  • SydneyPLUS International
  • Talis (UK and Ireland)
  • Horizon from former company Dynix, later absorbed by SirsiDynix
  • SLIM21, former SLIM++, from Algorhythms Consultants Pvt Ltd.
  • Virtua, former VTLS, from VTLS Inc.
  • Voyager from former company Endeavor Information Systems, later acquired by Ex Libris
  • (Polish) MOL, Patron and MOLIK – interface created for children

Why Open Source ILS?

  • Open Source
  • Easy customization
  • No restrictions on use
  • User driven
  • No vendor lock-in

Installation Guide for Koha on Ubuntu 10.04 LTS with MySQL 5
Koha was created in 1999 by Katipo Communications for the Horowhenua Library Trust in New Zealand, and the first installation went live in January 2000.

Koha is a free software library automation package. In use worldwide, its development is steered by a growing community of users collaborating to achieve their technology goals.

Features

  • Full-featured ILS. In use worldwide in libraries of all sizes, Koha is a true enterprise-class ILS with comprehensive functionality including basic or advanced options. Koha includes modules for circulation, cataloging, acquisitions, serials, reserves, patron management, branch relationships, and more.
  • Full-text searching. Koha uses an RDBMS coupled with an external search engine to provide powerful searching that is truly scalable.
  • Library Standards Compliant. Koha is built using library standards and protocols that ensure interoperability between Koha and other systems and technologies, while supporting existing workflows and tools.
  • Web-based Interfaces. Koha’s OPAC, circ, management and self-checkout interfaces are all based on standards-compliant World Wide Web technologies–XHTML, CSS and Javascript–making Koha a truly platform-independent solution.
  • Free Software / Open Source. Koha is distributed under the Free Software General Public License (GPL) version 2 or later.
  • No Vendor Lock-in. It is an important part of the free software promise that there is no vendor lock-in: libraries are free to install and use Koha themselves if they have the in-house expertise, or to purchase support or development services from the best available sources. Libraries should be free to change support company and export their data at any time; make sure your support company allows this.
  • Simple, clear interface for librarians and members (patrons)
  • Various Web 2.0 facilities like tagging and RSS feeds
  • Union catalog facility
  • Customizable search
  • Circulation and borrower management
  • Full acquisitions system including budgets and pricing information (including supplier and currency conversion)
  • Simple acquisitions system for the smaller library
  • Ability to cope with any number of branches, patrons, patron categories, item categories, items, currencies and other data
  • Serials system for magazines or newspapers
  • Reading lists for members

Software Required
The current release is 3.6; Download it from http://download.koha-community.org/koha-latest.tar.gz

  • Ubuntu Server- Ubuntu 10.04 LTS – the Lucid Lynx
  • Apache
  • MySQL
  • Perl

Koha is free software and is licensed under the GNU General Public License, either version 2 of the License, or (at your option) any later version.
All commands can be performed as a system user with sudo privileges, as indicated or by running the command directly as root.

1. Prepare System and Install Dependencies

1.1 Install Ubuntu via CD

Your locale should be set to UTF-8, as should Apache2 and MySQL 5.

This step is VERY IMPORTANT for a UNICODE compliant system. Please read over the following document carefully:

http://wiki.koha-community.org/wiki/Encoding_and_Character_Sets_in_Koha

You can verify your system locale by typing the following command:

$ locale

1.2 Install the Yaz and Zebra packages

$ sudo apt-get update

$ sudo apt-get install yaz idzebra-2.0 idzebra-2.0-doc

Install the Yaz and Zebra packages

1.3 Get Koha

1.3.1 Download Koha

http://koha-community.org/download-koha/

Installation Instructions
Once you have downloaded Koha, please unpack it and find the installation and upgrade instructions in the INSTALL file for your system, or the general INSTALL file.

install.Ubuntu in Koha Distribution Directory

1.4 Install additional Ubuntu dependencies

IMPORTANT:  You should only use CPAN for Perl dependencies which are NOT available from the package maintainer. You have been warned!

Using the ubuntu.packages file included in the Koha source tree, run the following:

$ sudo dpkg --set-selections < install_misc/ubuntu.packages

Install additional Ubuntu dependencies

Now start dselect (you may need to ‘sudo apt-get install dselect’):

Install dselect

$ sudo dselect

Choose [I]nstall and accept packages to be installed (hit return) (may take a while)

Debian dselect package – Choose [I]nstall and accept packages to be installed

Configuring nullmailer – Mailname of System

Koha – Configuring nullmailer – List of Remote Servers

Choose [C]onfigure, [R]emove and [Q]uit until dselect has completed.

Koha – Debian dselect package – Configure Packages

Koha – Debian dselect package – Remove unwanted Packages

Koha – Debian dselect package – Quit dselect

1.5 Install Perl dependencies that aren’t packaged into Ubuntu sources

$ sudo cpan MARC::Crosswalk::DublinCore GD GD::Barcode::UPCE Email::Date HTML::Scrubber Algorithm::CheckDigits::M43_001 Biblio::EndnoteStyle Locale::Currency::Format

Note: you may need to run the CPAN initialization if you have not run cpan before (you will see “/etc/perl/CPAN/Config.pm initialized.”).

CPAN is the world-wide archive of perl resources. It consists of about 100 sites that all replicate the same contents all around the globe. Many countries have at least one CPAN site already. The resources found on CPAN are easily accessible with the CPAN.pm module. If you want to use CPAN.pm, you have to configure it properly. If you do not want to enter a dialog now, you can answer ‘no’ to this question and I’ll try to autoconfigure. (Note: you can revisit this dialog anytime later by typing ‘o conf init’ at the cpan prompt.)

Are you ready for manual configuration? [yes]

When the configuration is completed CPAN will install the Perl modules.

Koha – Install Perl dependencies

2. Configuration of dependencies

2.1 Update root MySQL password (if dselect didn’t do it for you already)

$ sudo mysqladmin password <new-password>

2.2 Create the Koha database

Create the database and user with associated privileges:

$ mysqladmin -uroot -p create <kohadatabasename>

$ mysql -uroot -p

Welcome to the MySQL monitor.  Commands end with ; or g.

Your MySQL connection id is 22

Server version: 5.0.32-Debian_7etch3-log Debian etch distribution. Type ‘help;’ or ‘h’ for help. Type ‘c’ to clear the buffer.

mysql> grant all on <kohadatabasename>.* to '<kohauser>'@'localhost' identified by '<password>';

Query OK, 0 rows affected (0.00 sec)

mysql> flush privileges;

Query OK, 0 rows affected (0.00 sec)

mysql> quit

Koha – Create MySQL Database and Grant Permissions

2.3 Test your SAX Parser and correct where necessary

You must be sure you’re using the XML::LibXML SAX parser, not Expat or PurePerl, both of which have outstanding bugs with pre-composed characters. You can test your SAX parser by running:

$ cd koha

$ misc/sax_parser_print.pl

You should see something like:

XML::LibXML::SAX::Parser=HASH(0x81fe220)

If you’re using PurePerl or Expat, you’ll need to edit your ini file, typically located at:

/etc/perl/XML/SAX/ParserDetails.ini

You will need to move the entire section for ‘[XML::LibXML::SAX::Parser]’ to the bottom of the ini file.

Koha – Test your SAX Parser

2.4 Install DBD::mysql Perl module

In order to handle UTF-8 correctly, Koha requires at least version 4.004 of the DBD::mysql Perl module. However, Debian Etch has a stable package only for version 3.0008, so it is necessary to install the module  from CPAN. DBD::mysql’s test suite needs to use a MySQL ‘test’ DB which doesn’t exist anymore. So there are two options to install DBD::mysql:

2.4.1 Install without test suite

Force install DBD::mysql:

$ sudo cpan

cpan> force install DBD::mysql

Koha – Install DBD::mysql Perl module

2.4.2 Create test database in order to install DBD::mysql

Because of DBD::mysql’s test suite, it is necessary to temporarily create a test database and user:

$ mysql -uroot -p

Create the database and user with associated privileges:

Welcome to the MySQL monitor.  Commands end with ; or g.

Your MySQL connection id is 22

Server version: 5.0.32-Debian_7etch3-log Debian etch distribution. Type ‘help;’ or ‘h’ for help. Type ‘c’ to clear the buffer.

mysql> create database test;

Query OK, 1 row affected (0.00 sec)

mysql> grant all on test.* to ‘test’@’localhost’ identified by ‘test’;

Query OK, 0 rows affected (0.00 sec)

(test database, user, and password can be different if need be)

mysql> flush privileges;

Query OK, 0 rows affected (0.00 sec)

mysql> quit

Next install DBD::mysql:

$ sudo cpan

cpan> o conf makepl_arg

(get current value of this CPAN parameter)

cpan> o conf makepl_arg "--testdb=test --testuser=test --testpass=test"

cpan> install DBD::mysql

cpan> o conf makepl_arg ''

(restore this setting so as to not interfere with future CPAN installs).

Finally, remove the test database:

$ mysql -uroot -p

mysql> drop database test;

Query OK, 1 row affected (0.00 sec)

mysql> exit

Bye

3. Run the Koha installer

sudo perl -MCPAN -e 'install Class::Accessor'

sudo aptitude install libxml2-utils

sudo aptitude install xml-core

sudo apt-get install libxml++2.6-dev

wget http://search.cpan.org/CPAN/authors/id/E/ES/ESUMMERS/MARC-Charset-0.98.tar.gz

tar -xzf MARC-Charset-0.98.tar.gz

cd MARC-Charset-0.98

perl Makefile.PL (took a long time)

Run Koha Installer – perl Makefile.PL

 

Koha Installation Mode – Dev, Standard, Single

Koha User Account and Groups

Koha – Database Configurations

Koha – Zebra Configuration Files

Koha SRU Port

Koha – Install PazPar2 Configuration files

install Koha

make

make test

sudo make install

Koha sudo make install

wget http://search.cpan.org/CPAN/authors/id/K/KA/KADOS/MARC-XML-0.88.tar.gz

tar -xzf MARC-XML-0.88.tar.gz

cd MARC-XML-0.88

perl Makefile.PL

make

make test

sudo make install

cd ..

wget ftp://xmlsoft.org/libxml2/libxml2-2.6.31.tar.gz

tar -xzf libxml2-2.6.31.tar.gz

cd libxml2-2.6.31/

./configure

make (took a long time)

sudo make install

cd ..

wget http://search.cpan.org/CPAN/authors/id/P/PH/PHISH/XML-LibXML-Common-0.13.tar.gz

tar -xzf XML-LibXML-Common-0.13.tar.gz

cd XML-LibXML-Common-0.13/

perl Makefile.PL

make

sudo make install

cd ..

wget http://search.cpan.org/CPAN/authors/id/P/PA/PAJAS/XML-LibXML-1.65.tar.gz

tar -xzf XML-LibXML-1.65.tar.gz

cd XML-LibXML-1.65

perl Makefile.PL

make

sudo make install

4. Configure and start Apache

$ sudo ln -s /etc/koha/koha-httpd.conf /etc/apache2/sites-available/koha

(note that the path to koha-httpd.conf may be different depending on your installation choices)

Koha – Configure and start Apache

Add the following lines to /etc/apache2/ports.conf:

Listen 80

Listen 8080

If not running named virtual hosts (The default koha installation does not use named virtual hosts.), comment out the following line:

NameVirtualHost *:80

Run the following commands:

$ sudo a2enmod rewrite deflate

Koha – sudo a2enmod rewrite deflate

If Apache complains that it “could not reliably determine the server’s fully qualified domain name”, see http://aslamnajeebdeen.com/blog/how-to-fix-apache-could-not-reliably-determine-the-servers-fully-qualified-domain-name-using-127011-for-servername-error-on-ubuntu

sudo /etc/init.d/apache2 start

$ sudo a2ensite koha

$ sudo apache2ctl restart

Note: you may still see the usual Apache default site if your VirtualHost configuration isn’t correct.  The command “sudo a2dissite default” may be a quick fix, but may have side-effects.  See the Apache HTTPD manual section on virtual hosts for full instructions.

5. Configure and start Zebra

Note: it’s recommended that you daemonize the Zebra process and add it to your startup profile. For a non-production test/development installation, running Zebra from the command line can be useful. Pick from the two available options below, or roll your own 🙂

Note: it’s also recommended that you create a Koha system user, which you will have specified during the install process. Alternatively, Zebra can be configured to run as the root user.

To add a user do:

$ sudo adduser koha

Option 1: run the Zebra processes from the command line:

1.1 Zebra Search Server

This process sends responses to search requests sent by Koha or Z39.50/SRU/SRW clients.

$ sudo -u ${KOHA_USER} zebrasrv -f /etc/koha/koha-conf.xml

(note that the path to koha-conf.xml may be different depending on your installation choices)

Note: the user you run Zebra as will be the only user with write permission on the Zebra index; in development mode, you may wish to use your system user.

1.2 Zebra Indexer

Added/updated/deleted records in Koha MySQL database must be indexed into Zebra. A specific script must be launched each time a bibliographic or an authority record is edited.

$ sudo -u ${KOHA_USER} misc/migration_tools/rebuild_zebra.pl -z -b -a

NOTE: This script should be run as the kohauser (the default is ‘koha’).

Option 2: run the Zebra process as a daemon, and add to startup process:

Note that references to $SCRIPT_DIR refer to the directory where Koha’s command-line scripts are installed, e.g., /usr/share/koha/bin.

1.1 Zebra Search Server

$ sudo ln -s ${SCRIPT_DIR}/koha-zebra-ctl.sh  /etc/init.d/koha-zebra-daemon

(Note: ${SCRIPT_DIR} is /usr/share/koha/bin/ by default in a standard install)

$ sudo update-rc.d koha-zebra-daemon defaults

( Note: see man chkconfig(8) on other distros )

$ sudo ${SCRIPT_DIR}/koha-zebra-ctl.sh start

Koha – install Zebra

1.2 Zebra Indexer

Add an entry to the koha user’s crontab to schedule indexing of added/updated/deleted records by Zebra with this command:

<koha-install-dir>/misc/migration_tools/rebuild_zebra.pl -z -b -a
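For instance, a crontab entry for the koha user might look like the line below; the 10-minute interval and the default script path (/usr/share/koha/bin, the ${SCRIPT_DIR} mentioned above) are assumptions to adapt to your installation, and misc/cronjobs/crontab.example has more examples:

*/10 * * * * /usr/share/koha/bin/migration_tools/rebuild_zebra.pl -b -a -z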

Koha Install Perl and Zebra Indexer

Koha – libbusiness-isbn-perl

Koha – rebuild Zebra

Koha – XML-LibXML

Rebuilding Zebra can be done after completing step 6.

See misc/cronjobs/crontab.example for usage examples.

NOTE: This job should be setup under the kohauser (the default is ‘koha’).

6. Run the Web Installer, populate the database, initial configuration of settings

Point your browser to http://<servername>:8080/

Koha Web Installer

Koha Web Installer – missing modules

Koha – install HTTP::OAI

Koha – install PDF::API2::Simple

Koha – install Text::CSV::Encoded

Koha Web Installer – Dependencies Installed

Koha Web Installer – Database Settings

Koha Web Installer – Database Tables and Default Data

Koha Web Installer – Database Tables Created

Koha Web Installer – Install Basic Configuration Settings

Koha Web Installer – Select your MARC Flavour

Koha Web Installer – Select your MARC Flavour – Selecting Default Settings

Koha Web Installer – Select your MARC Flavour – MySQL Data

Koha – login

Koha – homepage

Koha – Set Library

Koha Administration

Koha – Add Library Patron

Koha Advanced Search

Koha Tools

Koha Reports

About Koha – Perl Modules

About Koha – Translation

About Koha – Koha Licenses

About Koha – Koha Team

About Koha – Server Information

Done.