Wednesday, June 18, 2008

Persistence, Concurrency and RTTI


I got a request to write about Java tenets: Persistence, Concurrency and RTTI. My two cents ....

Persistence:
Object persistence (a.k.a. Serialization in Java) is the ability to read and write the state of an object on a stream. Persistence is required when an object's state needs to be retained across invocations of a program, or when an object needs to be synchronized across JVMs. Examples of persistence are:
  1. Storing object state to a file - File input and output streams
  2. Storing object state to a database
  3. Sending/Receiving/Synchronizing object state across JVMs - typically over sockets

Quoting and carting applications generally use persistence. A Quote object contains items that you wish to purchase from the online portal. Multiple Quote objects taken together make up a Cart object. The 'Save Cart' option serializes the entire Cart object into a Blob column inside a database table. The Cart object can be reloaded from the db later to view the complete order.

This brings up two important aspects of Persistence:

  1. The entire object state is serialized to the stream (except for transient and static variables). An object's properties can be primitives or references to other objects, so each referenced object also needs to be serialized. In a nutshell, serializing an object serializes every object reachable from it, down to the nth level.
  2. The Reflection api is used to create an object by reading its state from the stream.

An object can be serialized only if it implements the marker interface 'Serializable'.
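
The Quote and Cart example above can be sketched roughly as follows. This is only an illustration, not the code of any real carting application: the class names, fields and the save/load helpers are my own assumptions, and a byte array stands in for the Blob column.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

class CartPersistenceSketch {

    // Hypothetical classes, named after the Quote/Cart example in the text.
    static class Quote implements Serializable {
        private static final long serialVersionUID = 1L;
        final String item;
        final int quantity;

        Quote(String item, int quantity) {
            this.item = item;
            this.quantity = quantity;
        }
    }

    static class Cart implements Serializable {
        private static final long serialVersionUID = 1L;
        final List<Quote> quotes = new ArrayList<Quote>();
        transient String sessionId; // transient: not written to the stream
    }

    // Writes the whole object graph (the Cart plus every Quote) to a byte array.
    static byte[] save(Cart cart) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(cart);
        }
        return bytes.toByteArray();
    }

    // Rebuilds a Cart from bytes produced by save().
    static Cart load(byte[] data) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return (Cart) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Cart cart = new Cart();
        cart.sessionId = "abc123";
        cart.quotes.add(new Quote("book", 2));

        Cart copy = load(save(cart));
        System.out.println(copy.quotes.get(0).item); // prints book
        System.out.println(copy.sessionId);          // prints null
    }
}
```

The nested Quote comes back along with the Cart, while the transient sessionId is lost, which matches the first aspect listed above.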

Concurrency:
Java supports multithreading. A Thread is a lightweight process, and multiple Threads execute concurrently in the JVM. This brings up a potential problem: data can be corrupted when multiple Threads access and mutate it simultaneously. The synchronized keyword and the wait() and notify() methods help us solve this problem. I will not go deep into implementation details of wait(), notify() and synchronized as these will be addressed in the Multi Threading blog.

Each object has a monitor associated with it. When a Thread comes across a synchronized method or block during its execution, it obtains the monitor of the object before entering the method or block. Only one Thread can hold the monitor of an object at a time. Thus if a second Thread wants to execute the same or another synchronized method or block of the same object, it blocks until the first Thread releases the monitor.
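
A minimal sketch of the monitor at work, assuming a hypothetical counter shared by a few threads (the class and method names are mine, chosen only for illustration):

```java
class SynchronizedCounterSketch {

    private int count = 0;

    // Only the thread holding this object's monitor can be inside this method.
    public synchronized void increment() {
        count++;
    }

    public synchronized int get() {
        return count;
    }

    // Starts the given number of threads, each incrementing one shared counter.
    static int runThreads(int threads, final int incrementsPerThread) throws InterruptedException {
        final SynchronizedCounterSketch counter = new SynchronizedCounterSketch();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(new Runnable() {
                public void run() {
                    for (int i = 0; i < incrementsPerThread; i++) {
                        counter.increment();
                    }
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            worker.join(); // wait for every worker to finish
        }
        return counter.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(runThreads(4, 10000)); // prints 40000
    }
}
```

Without the synchronized keyword on increment(), two threads could read the same old value of count and one of the updates would be lost.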

RTTI (RunTime Type Identification):
RTTI helps us identify the object type at runtime, as there may be a need to check the object type prior to executing a chunk of business logic. Implementing Polymorphism with interfaces to achieve extensibility means that objects hold base type references when they communicate with each other by sending messages. For example, consider the following hierarchy: the Animal interface has implementation classes Wild and Tame, which are further extended by Lion, Tiger, Elephant, Deer, Snake, Dog, Cat etc.

With appropriate OO design we will introduce api's like getNumberOfLegs in the Animal interface, which will be implemented in the concrete derived classes. While passing a list of animals, each element in the list will be an Animal reference. But what if we wanted to increment a counter for all animals that were Wild? We can add an isWild() api to the Animal interface and implement it in the Wild and Tame classes. This will automatically be inherited by all the subclasses. But the basic purpose of introducing the Wild and Tame subclasses gets defeated, because within Tame we are implementing isWild() just to return false. Not a good design.

Instead, while iterating over the Animal list, we can use the instanceof operator, which is Java's built-in support for RTTI. We can check if the current element is an instance of Wild and then conditionally increment the counter. Having said this, instanceof should be used sparingly: the check is not free, and code with a number of if .. else if .. else if ... statements involving the instanceof operator suggests bad design.
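
The counting loop could look roughly like this. The hierarchy is cut down to three animals and the countWild helper is a name I made up for the sketch:

```java
import java.util.Arrays;
import java.util.List;

class WildCounterSketch {

    interface Animal {
        int getNumberOfLegs();
    }

    // Wild and Tame only classify; the concrete animals supply the behaviour.
    static abstract class Wild implements Animal { }
    static abstract class Tame implements Animal { }

    static class Lion extends Wild {
        public int getNumberOfLegs() { return 4; }
    }

    static class Snake extends Wild {
        public int getNumberOfLegs() { return 0; }
    }

    static class Dog extends Tame {
        public int getNumberOfLegs() { return 4; }
    }

    // Counts the elements whose runtime type is Wild or one of its subclasses.
    static int countWild(List<? extends Animal> animals) {
        int wild = 0;
        for (Animal animal : animals) {
            if (animal instanceof Wild) {
                wild++;
            }
        }
        return wild;
    }

    public static void main(String[] args) {
        List<Animal> zoo = Arrays.<Animal>asList(new Lion(), new Dog(), new Snake());
        System.out.println(countWild(zoo)); // prints 2
    }
}
```

Note that a single instanceof check like this is quite different from a long if .. else if chain that switches on every concrete type.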

Tuesday, June 17, 2008

Parameter Passing


Objects communicate with each other by sending messages. Sending a message basically means invoking one of the methods from the published contract of other object. Most of the methods will involve passing arguments back and forth between objects.

A method, in C/C++ or Java, can have only one return type, but the signature of the method can have a number of arguments in it. For example, the following method declaration is valid:

public String getName( List myList, int count ) {

...
...
}

The above declaration says that the getName method accepts two parameters and returns a String. There is no syntactically legal construct that allows us to return two String objects.

C/C++ developers use a typical coding practice. The return from a method will always be an int. A return of zero indicates successful execution of the method whereas a non-zero return value indicates some error in method execution. This error code is then passed to another api to get the complete error report. A value that needs to be returned from the method is accepted as input argument itself which is subsequently modified in the method. A typical C/C++ api signature is as follows which clearly mentions input and output parameters.

/**
* This function will return all the views,
* for a given Item.
*/
extern C_API int ITEM_list_all_view_revs(

obj_o item, /** < (I) Tag of the item */
int* count, /** < (O) Number of views attached to the above Item */
obj_o** views /**< (OF) count objects of views for the Item */ );


This is where we have to be careful while passing arguments to the ITEM_list_all_view_revs api. The obj_o item is an input which will be read by the method to list the views; it is passed by value. The int* count and obj_o** views will be mutated by the method, so these have to be passed by reference (as pointers).

Is it the same when we invoke methods in Java? Yes and no. Yes, we can pass arguments to methods and mutate them inside the method body. No, there is no such thing as passing by reference in Java. All parameters/arguments in Java are always passed by value. Surprised?

Java does not have the concept of pointers. In case of primitives, the memory location at which the variable is defined holds the value of the variable. For example, writing int i = 5 creates a variable i at say memory location 100 which holds value 5. In case of objects, the object is created on the heap and its memory location address is stored as a value at the memory location where the reference to the object is created. For example, writing List myList = new ArrayList() creates an object of type ArrayList on the heap at say memory location 1000. A reference myList is created at a different memory location say 200 and the value stored at this memory location is 1000.

Suppose we were to call the Java method declared above; the code fragment would be something like:


List myList = new ArrayList();
int count = 5;

ParameterPassing pp = new ParameterPassing();
pp.getName( myList, count );


Essentially what is getting passed is the value at the memory location of the variables myList and count. myList is a reference to an object and hence holds the address of the object created on the heap, whereas count is a primitive and directly holds the value. Thus if we call myList.add("xyz") in the getName method, it will go and modify the object state on the heap. But if we do count++ within the getName method, the value is modified within the method scope but not written back to the memory location where it originated.

The crux of parameter passing in Java is as follows:
  1. Parameters/Arguments are always passed by value
  2. Objects are always mutated through a reference (the copied value still points to the same object)


Another implementation detail to be kept in mind is that method invocation results in the creation of local variables for the parameters passed to the method. Thus passing myList to the getName method creates a local variable/reference to the ArrayList object created on the heap. Suppose in the method we assign a different reference to the myList variable, myList = new ArrayList(); this is then valid only in the current method scope, because only the locally created myList reference now points to the new object created on the heap.

A small program to explain the above concept:

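import java.util.ArrayList;
import java.util.List;

public class ParameterPassing {

    public String getName( List<String> myList, int count ) {

        myList.add("XYZ");   // mutates the caller's ArrayList on the heap
        count++;             // increments only the local copy of count

        System.out.println(myList.get(0));
        System.out.println(count);

        myList = new ArrayList<String>();   // rebinds only the local reference

        return "";
    }

    public static void main( String args[]) {

        List<String> myList = new ArrayList<String>();
        int count = 5;

        ParameterPassing pp = new ParameterPassing();
        pp.getName( myList, count );

        // The caller still sees its own list (now holding "XYZ") and its own count.
        System.out.println(myList.get(0));
        System.out.println(count);
    }
}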

Following is the output:
XYZ
6
XYZ
5

The count in the main method holds the old value 5 because the count++ in getName method actually increments the local variable. Similarly myList = new ArrayList() points the local reference to the new object created on the heap.
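
The classic swap test is another way to convince yourself. This is a small sketch of my own (the SwapSketch class and its two helpers are hypothetical names), using StringBuilder because it is mutable:

```java
class SwapSketch {

    // Reassigning the parameters only changes the local copies of the references.
    static void swap(StringBuilder a, StringBuilder b) {
        StringBuilder tmp = a;
        a = b;
        b = tmp;
    }

    // Mutating through the copied reference changes the shared object on the heap.
    static void append(StringBuilder target, String suffix) {
        target.append(suffix);
    }

    public static void main(String[] args) {
        StringBuilder first = new StringBuilder("first");
        StringBuilder second = new StringBuilder("second");

        swap(first, second);
        System.out.println(first + " " + second); // prints first second

        append(first, "!");
        System.out.println(first); // prints first!
    }
}
```

If Java passed references by reference, the swap would have been visible in main. It is not, while the append is.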

Monday, June 16, 2008

OOPs ! I defined it again ....


All the interviews that I have given and conducted till date always had (and will continue to have) the usual suspect: 'Define concepts/principles of Object Oriented Programming'. Most interviewers arrive at a fair enough conclusion about the interviewee from the answer to this question. Before I dive into crisp definitions of each concept, I want to point out three important entities that are often misunderstood, misquoted or misconceived while reading books.

C&C (common and crap) answers:
"An Object is an instance of a class ..... Class is from which objects are created/derived". Practically i.e. looking from coding point of view it is a correct definition, but unfortunately doesn't explain much about the exact use of object and class. To put it in layman terms "X equals Y and Y equals X". When one gets to designing an application, this definition will not help.

"An interface is Java's solution to support multiple inheritance". Agreed, but that is use of interface and not the definition.

C&C answers again (this time crisp and correct) :
Object: An object is an entity that has the following four characteristics, viz: Identity, State, Behaviour and Properties

Class: A class provides an Identity to the object and models the state, behaviour and properties of the object. A class defines a skeleton for objects and acts as a blueprint from which object can be created.

Interface: Interface is a contract between a class and the outside world. The behaviour of an object i.e. the methods form the object's interface to outside world. By implementing an interface, a class promises to provide the behaviour published by the Interface. Interfaces help in extensibility.

The four Object Oriented Principles:

Inheritance: Mechanism to structure and organize our software. Provides an ability to a sub-class to inherit the properties, state and behaviour of the base class. Inheritance enables us to define a more general class and derive specialized classes from it. This helps in better data analysis.

Encapsulation: Separating object's interface from its implementation. Hiding object data and allowing modification of object data by publishing method. Controlling the access to data and methods that manipulate the data using access modifiers.

Polymorphism: The ability of a reference of a base class to denote an object of its own class or any of its subclasses at runtime. This is a definition that many of you might be reading for the first time. The common definitions are "one name many forms" or "method overloading and overriding". But on deeper introspection, these are possible only because a reference of a base class can denote different objects at runtime. The only exception is method overloading, as this is compile time polymorphism, i.e. which method will be executed is determined at compile time itself.
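
A small sketch of the difference between the two, with a made-up Shape/Circle pair (none of these names come from any particular library):

```java
class PolymorphismSketch {

    static class Shape {
        String name() { return "shape"; }
    }

    static class Circle extends Shape {
        @Override
        String name() { return "circle"; }
    }

    // Overloads: the compiler picks one from the declared type of the argument.
    static String describe(Shape s) { return "describe(Shape)"; }
    static String describe(Circle c) { return "describe(Circle)"; }

    public static void main(String[] args) {
        Shape ref = new Circle(); // base class reference denoting a subclass object

        // Overriding is resolved at runtime from the actual object.
        System.out.println(ref.name());    // prints circle

        // Overloading is resolved at compile time from the reference type.
        System.out.println(describe(ref)); // prints describe(Shape)
    }
}
```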

Abstraction: Simplifying complexities by segregating/breaking them into smaller units and modelling a structure/hierarchy chain by abstracting common behaviour into classes.

All the OOP principles work hand-in-hand. It is very difficult to write code that implements just the Abstraction principle, and so forth.

I hope you have a better perspective of the principles now. I will write more in the blogs-to-come. Till then ... happy digestion.

Friday, June 13, 2008

Hello Primitives


Java is not 100% object oriented because of two main constraints:

  1. int, float, char etc are primitives and not objects
  2. Multiple inheritance not supported.

Java has an answer/solution to the above mentioned constraints. Wrapper classes are provided which allow us to create an object even when dealing with simple numbers. Thus instead of using int, we can create an object of type Integer, or a Float object instead of a float primitive. Java's answer to multiple inheritance is Interfaces; a class can implement more than one interface.

Even with these solutions/workarounds, the main reason behind Java not being purely object oriented is that we can still write a Java program with the use of primitives. This introduces use of an entity that is not an object, which violates the principle of object oriented program. Java allows the use of primitives for performance reasons. If it weren't for primitives, our mundane loops would have caused a catastrophe. A simple program to illustrate the Power of Primitives:


import java.util.Calendar;

public class HelloPrimitives {

public static void main(String args[]) {

//Nested loops using primitives
long lTimeInMillis = Calendar.getInstance().getTimeInMillis();
int i = 0;
int j = 0;

for( int counter = 0; counter < 1000; counter++ ) {

for( int innerCounter = 0; innerCounter < 3000; innerCounter++ ) {

j = innerCounter;
}

i = counter;
}

lTimeInMillis = Calendar.getInstance().getTimeInMillis() - lTimeInMillis;

//Nested loops using wrapper classes
Long oTimeInMillis = new Long( Calendar.getInstance().getTimeInMillis( ) );
Integer objI = null;
Integer objJ = null;
Integer objCounter = new Integer( 0 );

for( ; objCounter.intValue() < 1000; ) {

Integer objInnerCounter = new Integer( 0 );

for( ; objInnerCounter.intValue() < 3000; ) {

objJ = new Integer( objInnerCounter.intValue() );
objInnerCounter = new Integer( objInnerCounter.intValue() + 1 );
}

objI = new Integer( objCounter.intValue() );
objCounter = new Integer( objCounter.intValue() + 1 );
}

oTimeInMillis = new Long( Calendar.getInstance().getTimeInMillis() - oTimeInMillis.longValue() );

System.out.println( "Time taken by primitives = " + lTimeInMillis );
System.out.println( "Time taken by wrapper classes = " + oTimeInMillis.longValue() );

}
}

The following output is observed on my machine (numbers are machine specific):
D:\amey\blogs\java\progs\hello_primitives>java HelloPrimitives
Time taken by primitives = 15
Time taken by wrapper classes = 1219


It can be seen that use of primitives improves the performance significantly. Stark reasons for this are

  1. Memory allocation for a primitive and an object is very different. For a primitive datatype, the value is stored directly at the memory address of the variable, whereas objects are created on the heap and the memory location of the reference holds the memory location of the object on the heap.
  2. The values can be directly accessed in case of primitives, while for objects we need to invoke a method using the dot '.' operator. This calls for pushing the current method on the stack and popping it back at a later instant, which is time consuming.

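One more tip for increasing performance is controlling the number of times a primitive or object gets created. In the above program, innerCounter and objInnerCounter are created afresh on every pass of the outer loop. Try initializing them just once above the outermost for loop and see the difference in timings. Keep in mind that a counter hoisted this way has to be reset to zero on every outer pass; if it is not, the inner loop runs to completion only once and the timings cover far less work.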

Also increase the inner loop to 30,000.

int innerCounter = 0; //initialize only once
for( int counter = 0; counter < 1000; counter++ ) {
for( ; innerCounter < 30000; ) {
.....
.....
}


.....
.....
Integer objCounter = new Integer( 0 );
Integer objInnerCounter = new Integer( 0 ); //initialize only once
for( ; objCounter.intValue() < 1000; ) {



for( ; objInnerCounter.intValue() < 30000; ) {
.....
.....
}
}
The output on my machine is:
D:\amey\blogs\java\progs\hello_primitives>java HelloPrimitives
Time taken by primitives = 0
Time taken by wrapper classes = 16

Wow !
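
Before celebrating, it is worth counting how much work each variant really does. The sketch below is my own reconstruction, assuming the elided loop bodies simply increment the counters as in the first program; it counts inner iterations instead of measuring time.

```java
class LoopCountSketch {

    // Inner counter created on every outer pass, as in the first program.
    static long resetEachTime(int outer, int inner) {
        long iterations = 0;
        for (int counter = 0; counter < outer; counter++) {
            for (int innerCounter = 0; innerCounter < inner; innerCounter++) {
                iterations++;
            }
        }
        return iterations;
    }

    // Inner counter initialized only once above the outer loop and never reset.
    static long initializedOnce(int outer, int inner) {
        long iterations = 0;
        int innerCounter = 0;
        for (int counter = 0; counter < outer; counter++) {
            for (; innerCounter < inner; innerCounter++) {
                iterations++;
            }
        }
        return iterations;
    }

    public static void main(String[] args) {
        System.out.println(resetEachTime(1000, 30000));   // prints 30000000
        System.out.println(initializedOnce(1000, 30000)); // prints 30000
    }
}
```

Under that assumption the hoisted version does a thousandth of the work, so a good part of the drop in the timings comes from the skipped iterations rather than from the saved allocations.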

Cheers,
Amey

Wednesday, June 11, 2008

Hello World


The first ever program written by a newbie is HelloWorld, be it in C, C++, Java or any other programming language. It gives a sense of satisfaction to see the output on the screen. But few get there at the first shot. As for me, it took me five goddamn hours to understand what was going wrong.

Most of the time, the first ever HelloWorld program is copied down to the last whitespace character, so there is seldom any chance of making mistakes there. The first problem that hits us is:


C:\>javac HelloWorld.java
'javac' is not recognized as an internal or external command,
operable program or batch file.

C:\>

javac.exe is a file that gets installed on our machine as part of the JDK installation. This executable is required for generating the byte code, i.e. the .class files, from our .java files. javac.exe is present in the JDK's bin folder. This bin folder should be available in the 'Path' variable for the command prompt to locate javac.exe.

Copy the path up to /bin and set it into the 'Path' variable within the command prompt, then try the 'javac' command. Don't forget to append %PATH% at the end, otherwise you will lose the existing path entries in the current window. Sample from my environment:


C:\>set PATH=D:\apps\JDK_1.5.0.11\bin;%PATH%
C:\>javac
Usage: javac <options> <source files>
where possible options include:

-g Generate all debugging info
-g:none Generate no debugging info
-g:{lines,vars,source} Generate only some debugging info
-nowarn Generate no warnings
-verbose Output messages about what the compiler is doing
-deprecation Output source locations where deprecated APIs are used
-classpath Specify where to find user class files
-cp Specify where to find user class files
-sourcepath Specify where to find input source files
-bootclasspath Override location of bootstrap class files
-extdirs Override location of installed extensions
-endorseddirs Override location of endorsed standards path
-d Specify where to place generated class files
-encoding Specify character encoding used by source files
-source Provide source compatibility with specified release
-target Generate class files for specific VM version
-version Version information
-help Print a synopsis of standard options
-X Print a synopsis of nonstandard options
-J Pass directly to the runtime system
C:\>

Similar problems are observed with the 'jar' command. jar.exe is an executable which helps us create java archive (jar), web archive (war) and enterprise archive (ear) files. In case the jar -uvf xyz.jar com/* command gives you an error, the solution is the same as above.

Many a time people ask me why they get the 'javac not recognised command ....' error even though a bin folder is in the Path. The usual cause is that it is the JRE's bin folder. JRE is the abbreviation for Java Runtime Environment. The bin folder under jre holds the set of executables and binaries required by the JVM (Java Virtual Machine). Thus development time executables, like javac, jar, javadoc and rmic to name a few, will not be present in this bin folder. Runtime executables like java, rmiregistry etc. will be available in jre/bin.
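
If you can run java but not javac, a few lines of Java can tell you which folders on the Path actually contain the executable. This is just a diagnostic sketch; PathLookupSketch and findOnPath are names I made up, not a JDK utility.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

class PathLookupSketch {

    // Returns every directory in pathValue that contains a file called exeName.
    static List<String> findOnPath(String pathValue, String exeName) {
        List<String> hits = new ArrayList<String>();
        if (pathValue == null) {
            return hits;
        }
        // Entries are separated by ';' on Windows and ':' on Unix-like systems.
        for (String dir : pathValue.split(Pattern.quote(File.pathSeparator))) {
            if (dir.length() > 0 && new File(dir, exeName).isFile()) {
                hits.add(dir);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        String exe = File.separatorChar == '\\' ? "javac.exe" : "javac";
        // java.home shows which Java installation is running this program.
        System.out.println("java.home = " + System.getProperty("java.home"));
        System.out.println(exe + " found in: " + findOnPath(System.getenv("PATH"), exe));
    }
}
```

An empty list in the second line of output means that none of the folders on the Path holds javac.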

Hope this blog serves as a good appetizer.

Cheers,
Amey

Tuesday, June 10, 2008

Java Programmers aren't BORN !!!


In most of the interviews, which I have conducted till date, the candidates frequently ask me: “How can I become a good Java programmer?” or “How can I improve my Java skills” or “How did you learn Java”. I thought of sharing some of my methodologies, which might prove useful to some of you.

There is really no short cut to mastering Java, so in this column I will try to talk about the pedagogy. Let us picture for a moment, that a fan is asking Sachin Tendulkar a similar question about cricket:
Fan: "Good morning Sachin, my son would like to start a career in cricket. What advice can you give him?"
Sachin: "Is he playing in Ranji?"
Fan: "Oh no, he does not play cricket, but he follows it religiously on television. He has all the statistics at his fingertips"
Sachin: "Oooooh. What other sports does he play?"
Fan: "He plays soccer …… on his computer, you heard about Fifa98/99 …2004. Other than that, we once played table tennis together."

The point I am trying to make here is, you cannot learn Java just by reading fat books by international authors. You need to dirty your hands by writing programs, go through the pain of writing and compiling programs without IDEs and code assist.

Where and how does it all start?
Becoming a programmer starts early in life. You would have been at a great advantage if you were good in mathematics and physics. Here I am not looking for the tools that you learn doing maths, but rather the interest in "thinking" subjects. There are two types of java programmers; ones who know the legal constructs by heart and others who can think in Java.

Then the next step is to want to program for relaxation. Yes, you heard me right. It is not always necessary that the code you write will be used in the project that one is working on. You can code utility methods for some one working on a different project or just practice what you learnt, but all of this apart from the regular deliverables, chat, orkut and mailers. When I embarked on my programming career, all I did during free time was to download code samples from the Internet and study the behavior. It really did not make any sense at first, but as I read through tutorials and code samples, things started correlating.

I believe that being a programmer is not a job; it is a life. In order to learn it, you need to eat, breathe and sleep Java. There should be very few waking moments where you are not thinking about Java.

Let's say, for example, that your official work hours are 9:00am to 6:00pm. It is what happens after those hours that will determine your future as a programmer, i.e. what are you doing between 6:00pm and 9:00am. Before you set off for work, you could be programming in Java. And what about the time from 6:00pm (when you get home) until 23:59:59? If we take into account a few responsibilities, you have an additional 1.5 hours in the morning, and 3 hours in the evening.

I am not making this up. If you want to be a good Java programmer, you need to have the dedication to further your own knowledge "after hours". Instead of orkut, surf for Java concepts, Object orientation, Design Patterns, etc.

The key is the drive behind what you do. If your motivation to be a Java programmer is just to do a job, and to thereby earn a salary, you will never be good. Becoming an excellent Java programmer is neither difficult nor too easy.

Follow these steps:
  1. Read Java Complete Reference thrice. The first time for legal constructs and sample codes, the second time for understanding the concepts and the third for understanding how to use the concepts programmatically.
  2. Program for pleasure, not for money. Spend at least a few hours of your own time per day on learning more about programming in Java.
  3. Never stop learning. The half-life of our IT knowledge is 18 months. We cannot afford to stand still; otherwise we will be obsolete in a very short period of time.
  4. Don't read books - STUDY them. When I read a new Java book, I open it next to my workstation and then I type in the code as I progress. This is a bit slower, but you learn faster.

There are very few people who have the potential to become good programmers. It is a tiny percentage of the programmers on the apex of the pyramid that can do it. Our goal is to reach the apex. These are some thoughts on how to become an excellent Java programmer. They are tough jobs to do, and by no means complete, but are meant to get you thinking.

Sunday, June 8, 2008

My First Address To Freshers


Hello Everyone,

I would have personally liked to address you before the training kicks off, but as I had to fly to Japan, I have requested Mr. K to do the honors.

I underwent a corporate training for Java and J2EE back in 2004 when I was associated with a different employer. That’s when I met Mr. J, who was the trainer. For four days we talked at length from OOP concepts to Reflections and that was indeed an enlightening experience. I was on the interview panel and I found some common misunderstandings of concepts amongst the candidates that I interviewed. That is why I want to take this opportunity to share some of my experiences, specifically with Mr. J and generally with “How to make the most from the training session”. These are some of my cheat sheets, and you are under no obligation to follow these steps in case our views differ.

Concepts: The most important and fundamental key to being a good developer is to have your concepts clear. This training course is not similar to the commercial ones available in the market, where they teach you the syntactically legal constructs of writing Java/C++ programs. We have tailored the course contents to put more emphasis on topics that we feel should be discussed at length. Ask as many questions as arise in your mind and follow a simple principle: “No question is stupid, no question is pathetic”. Once the concepts are clear, writing code will be very simple. I would like to emphasize that you all put in enough effort to get the OOP concepts crystal clear by the time you finish the training. This was the one thing in common that I found most of you lacked.

Implementation: The exercises given during the training need to be essentially carried out individually rather than in groups. Try to relate what you are typing. Don’t feel shy to ask Mr. J when you get confused over what really has to be done. I bet Mr. J will never spoon feed any of you; so try to give your best shot. Remember, once you get on with the projects, you will be all alone. I feel this is the best time when you all can start cranking up your debugging skills, imagination as to how the code might be working internally and so on. Try to squeeze as much information from Mr. J as possible. This includes best implementation practices, code reuse, how to decide the best fit java classes for a particular functionality when we need to make a choice from couple of types available at hand etc.

Post training: It will be good if you all can read some more literature after every session. Google.com is one unlimited source of information and “Complete Reference by Herbert Schildt” is another ready reference. Read this book in parallel to the training and get your doubts cleared if you find a discrepancy between what is written and what was taught. It often happens that the wording in the book misleads us to believe in something that is actually not true. Be careful.

Mr. J: He is a well read and a no nonsense guy. I bet you won’t feel sleepy or bored during any of his lectures as he makes the discourses humorous enough. Try to understand the similarities and differences when he compares two technologies or the implementations in different technologies. Make ample notes from his presentations, specifically the definitions or different terms, java tenets, java constraints and so forth. They have proved very helpful to me in the long run. Take his assignments seriously as the feedback he provides on the implementation is genuine enough, it will train your mind while coding and also improve writing good performance code in the first shot. Trust me, you will be able to make a lot of use of these assignments when you come aboard on live projects. If time permits try to go the extra mile to implement additional features that might be discussed in the class.

Last but not least, we are all here to help you out in case you are stuck with some implementation and Pankaj is not around. But don’t just get someone to fix the code and leave; try to understand the logic behind the fix. The bottom line to successful coding is having the right logic in place, as after that it’s only the semantics that have to be followed.