Techiella: April 2015

Tuesday, April 28, 2015

Processing XML with Java – A Performance Benchmark

The use of structured documents in XML has a wide area of application in different types of fields. In many cases it is necessary to process documents of a considerable size where runtime is relevant and the execution window is clearly limited. As we saw, there are two types of XML APIs: memory- based APIs and streaming-based APIs. Memory- based XML APIs maintain a long lived structural data in memory and only when the parsing process is finished modifications are allowed, while streaming-based APIs use small memory footprint, allocating and freeing memory constantly, allowing the process of infinite size XML documents (in theory).

Generally, for XML handling, dom4j, and DOM are good choices, with the preference between them determined by Java-specific features or cross-language compatibility, depending on project requirements. Although less flexible in XML transformations, OJXQI is a very good choice when you need to do standard modifications with good performance. VTD array of integers’ structure proves to be the best model in almost all tests. It is a model that consumes less memory (compared to other memory-based APIs), the processing time is very fast and even their ability to update a document, maintaining its structure in memory, proved being far superior in relation to the other memory-based APIs (for tested scenario). The use of VTD API is more complex in comparison to other memory-based APIs, where it is necessary an additional effort to dominate the API’s features.

For streaming-based APIs, StAX has proved to be an API with better overall performance compared to SAX and XOM. This kind of APIs do not maintain long-lived structural data in memory, so there are no advantages in using this type of API when you need to perform a set of transformations that somehow change the order of elements in the XML hierarchy. Typically, these types of APIs are used only for forward-only applications or simple modifications using XSLT language.

Memory-based APIs maintain the structure of the whole document in memory, resulting in some overhead, however, for updates that somehow change the document structure, this type of APIs lead to some advantages over the streaming-based APIs since those need to perform increased I/O operations to do same transformation. Manipulating a document using memory-based APIs is much more accessible and quick, since for streaming-based APIs we need to constantly use temporary buffers to keep information in memory. In summary, we can conclude that choosing from the two approaches studied for processing XML documents depends mostly on project’s requirements.

Wednesday, April 22, 2015

Can you override Static Methods in Java?

Well... the answer is NO if you think from the perspective of how an overriden method should behave in Java. But, you don't get any compiler error if you try to override a static method. That means, if you try to override, Java doesn't stop you doing that; but you certainly don't get the same effect as you get for non-static methods. Overriding in Java simply means that the particular method would be called based on the run time type of the object and not on the compile time type of it (which is the case with overriden static methods). Okay... any guesses for the reason why do they behave strangely? Because they are class methods and hence access to them is always resolved during compile time only using the compile time type information. Accessing them using object references is just an extra liberty given by the designers of Java and we should certainly not think of stopping that practice only when they restrict it :-)

Example: let's try to see what happens if we try overriding a static method:-

class SuperClass{

......

public static void staticMethod(){

System.out.println("SuperClass: inside staticMethod");

}

......

}

public class SubClass extends SuperClass{

......

//overriding the static method

public static void staticMethod(){

System.out.println("SubClass: inside staticMethod");

}

......

public static void main(String []args){

......

SuperClass superClassWithSuperCons = new SuperClass();

SuperClass superClassWithSubCons = new SubClass();

SubClass subClassWithSubCons = new SubClass();

superClassWithSuperCons.staticMethod();

superClassWithSubCons.staticMethod();

subClassWithSubCons.staticMethod();

...

}

Output:-

SuperClass: inside staticMethod

SubClass: inside staticMethod

Notice the second line of the output. Had the staticMethod been overriden this line should have been identical to the third line as we're invoking the 'staticMethod()' on an object of Runtime Type as 'SubClass' and not as 'SuperClass'. This confirms that the static methods are always resolved using their compile time type information only.

Sunday, April 19, 2015

How to find Race Conditions in Java

Race condition in Java with Example tutorial

Finding Race conditions in any language is most difficult job and Java is no different, though since readability of Java code is very good and synchronized constructs are well defined heaps to find race conditions by code review. finding race conditions by unit testing is not reliable due to random nature of race conditions. since race conditions surfaces only some time your unit test may passed without facing any race condition. only sure shot way to find race condition is reviewing code manually or using code review tools which can alert you on potential race conditions based on code pattern and use of synchronization in Java. I solely rely on code review and yet to find a suitable tool for exposing race condition in java.

Code Example of Race Condition in Java

Based on my experience in Java synchronization and where we use synchronized keyword I found that two code patterns namely "check and act" and "read modify write" can suffer race condition if not synchronized properly. both cases rely on natural assumption that a single line of code will be atomic and execute in one shot which is wrong e.g. ++ is not atomic.

"Check and Act" race condition pattern

classical example of "check and act" race condition in Java is getInstance() method of Singleton Class, remember that was one questions which we have discussed on 10 Interview questions on Singleton pattern in Java as "How to write thread-safe Singleton in Java". getInstace() method first check for whether instance is null and than initialized the instance and return to caller. Whole purpose of Singleton is that getInstance should always return same instance of Singleton. if you call getInstance() method from two thread simultaneously its possible that while one thread is initializing singleton after null check, another thread sees value of _instance reference variable as null (quite possible in java) especially if your object takes longer time to initialize and enters into critical section which eventually results in getInstance() returning two separate instance of Singleton. This may not happen always because a fraction of delay may result in value of _instance updated in main memory. here is a code example

public Singleton getInstance(){

if(_instance == null){ //race condition if two threads sees _instance= null

_instance = new Singleton();

}

an easy way to fix "check and act" race conditions is to synchronized keyword and enforce locking which will make this operation atomic and guarantees that block or method will only be executed by one thread and result of operation will be visible to all threads once synchronized blocks completed or thread exited form synchronized block.

read-modify-update race conditions

This is another code pattern in Java which cause race condition, classical example is the non thread safe counter we discussed in how to write thread safe class in Java. this is also a very popular multi-threading question where they ask you to find bugs on concurrent code. read-modify-update pattern also comes due to improper synchronization of non-atomic operations or combination of two individual atomic operations which is not atomic together e.g. put if absent scenario. consider below code

if(!hashtable.contains(key)){

hashtable.put(key,value);

}

here we only insert object into hashtable if its not already there. point is both contains() and put() are atomic but still this code can result in race condition since both operation together is not atomic. consider thread T1 checks for conditions and goes inside if block now CPU is switched from T1 to thread T2 which also checks condition and goes inside if block. now we have two thread inside if block which result in either T1 overwriting T2 value or vice-versa based on which thread has CPU for execution. In order to fix this race condition in Java you need to wrap this code inside synchronized block which makes them atomic together because no thread can go inside synchronized block if one thread is already there.

These are just some of examples of race conditions in Java, there will be numerous based on your business logic and code. best approach to find Race conditions is code review but its hard because thinking concurrently is not natural and we still assume code to run sequentially. Problem can become worse if JVM reorders code in absent of proper synchronization to gain performance benefit and this usually happens on production under heavily load, which is worst. I also suggest doing load testing in production like environment which many time helps to expose race conditions in java. Please share if you have faced any race condition in java projects.

Struts 1 vs Struts 2

Struts 1	Struts 2
Action Classes: Struts 1 requires Action classes to extend an abstract base class. A common problem in Struts 1 is programming to abstract classes instead of interfaces.	Action Classes: An Struts 2 Action may implement an Action interface, along with other interfaces to enable optional and custom services. Struts 2 provides a base ActionSupport class to implement commonly used interfaces. Albeit, the Action interface is not required. Any POJO object with a execute signature can be used as an Struts 2 Action object.
Threading Model: Struts 1 Actions are singletons and must be thread-safe since there will only be one instance of a class to handle all requests for that Action. The singleton strategy places restrictions on what can be done with Struts 1 Actions and requires extra care to develop. Action resources must be thread-safe or synchronized.	Threading Model: Struts 2 Action objects are instantiated for each request, so there are no thread-safety issues. (In practice, servlet containers generate many throw-away objects per request, and one more object does not impose a performance penalty or impact garbage collection.)
Servlet Dependency: Struts 1 Actions have dependencies on the servlet API since the HttpServletRequest and HttpServletResponse is passed to the execute method when an Action is invoked.	Servlet Dependency: Struts 2 Actions are not coupled to a container. Most often the servlet contexts are represented as simple Maps, allowing Actions to be tested in isolation. Struts 2 Actions can still access the original request and response, if required. However, other architectural elements reduce or eliminate the need to access the HttpServetRequest or HttpServletResponse directly.
Testability: A major hurdle to testing Struts 1 Actions is that the execute method exposes the Servlet API. A third-party extension, Struts TestCase, offers a set of mock object for Struts 1.	Testability: Struts 2 Actions can be tested by instantiating the Action, setting properties, and invoking methods. Dependency Injection support also makes testing simpler.
Harvesting Input: Struts 1 uses an ActionForm object to capture input. Like Actions, all ActionForms must extend a base class. Since other JavaBeans cannot be used as ActionForms, developers often create redundant classes to capture input. DynaBeans can used as an alternative to creating conventional ActionForm classes, but, here too, developers may be redescribing existing JavaBeans.	Harvesting Input: Struts 2 uses Action properties as input properties, eliminating the need for a second input object. Input properties may be rich object types which may have their own properties. The Action properties can be accessed from the web page via the taglibs. Struts 2 also supports the ActionForm pattern, as well as POJO form objects and POJO Actions. Rich object types, including business or domain objects, can be used as input/output objects. The ModelDriven feature simplifies taglb references to POJO input objects.
Expression Language: Struts 1 integrates with JSTL, so it uses the JSTL EL. The EL has basic object graph traversal, but relatively weak collection and indexed property support.	Expression Language: Struts 2 can use JSTL, but the framework also supports a more powerful and flexible expression language called "Object Graph Notation Language" (OGNL).
Binding values into views: Struts 1 uses the standard JSP mechanism for binding objects into the page context for access.	Binding values into views: Struts 2 uses a "ValueStack" technology so that the taglibs can access values without coupling your view to the object type it is rendering. The ValueStack strategy allows reuse of views across a range of types which may have the same property name but different property types.
Type Conversion: Struts 1 ActionForm properties are usually all Strings. Struts 1 uses Commons-Beanutils for type conversion. Converters are per-class, and not configurable per instance.	Type Conversion: Struts 2 uses OGNL for type conversion. The framework includes converters for basic and common object types and primitives.
Validation: Struts 1 supports manual validation via a validate method on the ActionForm, or through an extension to the Commons Validator. Classes can have different validation contexts for the same class, but cannot chain to validations on sub-objects.	Validation: Struts 2 supports manual validation via the validate method and the XWork Validation framework. The Xwork Validation Framework supports chaining validation into sub-properties using the validations defined for the properties class type and the validation context.
Control of Action Execution: Struts 1 supports separate Request Processors (lifecycles) for each module, but all the Actions in the module must share the same lifecycle.	Control of Action Execution: Struts 2 supports creating different lifecycles on a per Action basis via Interceptor Stacks. Custom stacks can be created and used with different Actions, as needed.

Java 7 vs Java 8

Java 7	Java 8
Features Added: -Support for dynamically-typed languages (InvokeDynamic): Extensions to the JVM, the Java language, and the Java SE API to support the implementation of dynamically-typed languages at performance levels near to that of the Java language itself - Strict class-file checking: Class files of version 51 (SE 7) or later must be verified with the typechecking verifier; the VM must not fail over to the old inferencing verifier. - Small language enhancements (Project Coin): A set of small language changes intended to simplify common, day-to-day programming tasks: Strings in switch statements, try-with-resources statements, improved type inference for generic instance creation (\"diamond\"), simplified varargs method invocation, better integral literals, and improved exception handling (multi-catch) - Upgrade class-loader architecture: A method that frees the underlying resources, such as open files, held by a URLClassLoader - Concurrency and collections updates: A lightweight fork/join framework, flexible and reusable synchronization barriers, transfer queues, concurrent linked double-ended queues, and thread-local pseudo-random number generators. - Internationalization Upgrade: Upgrade on Unicode 6.0, Locale enhancement and Separate user locale and user-interface locale. - More new I/O APIs for the Java platform (NIO.2), NIO.2 filesystem provider for zip/jar archives, SCTP, SDP, TLS 1.2 support. - Security & Cryptography implemented Elliptic-curve cryptography (ECC). - Upgrade to JDBC 4.1 and Rowset 1.1. - XRender pipeline for Java 2D, Create new platform APIs for 6u10 graphics features, Nimbus look-and-feel for Swing, Swing JLayer component, Gervill sound synthesizer. - Upgrade the components of the XML stack to the most recent stable versions: JAXP 1.4, JAXB 2.2a, and JAX-WS 2.2. - Enhanced Managed Beans.	Code name is Spider. Features Added: - JSR 335, JEP 126: Language-level support for lambda expressions. - JSR 223, JEP 174: Project Nashorn, a JavaScript runtime which allows developers to embed JavaScript code within applications. - JSR 308, JEP 104: Annotation on Java Types. - Unsigned Integer Arithmetic. - JSR 337, JEP 120: Repeating annotations. - JSR 310, JEP 150: Date and Time API. - JEP 178: Statically-linked JNI libraries. - JEP 153: Launch JavaFX applications (direct launching of JavaFX application JARs). - JEP 122: Remove the permanent generation. - Java 8 is not supported on Windows XP. But as of JDK 8 update 5, it still can run under Windows XP after forced installation by directly unzipping from the installation executable.

Java 7

Java 8

Features Added:

-Support for dynamically-typed languages (InvokeDynamic): Extensions to the JVM, the Java language, and the Java SE API to support the implementation of dynamically-typed languages at performance levels near to that of the Java language itself

- Strict class-file checking: Class files of version 51 (SE 7) or later must be verified with the typechecking verifier; the VM must not fail over to the old inferencing verifier.

- Small language enhancements (Project Coin): A set of small language changes intended to simplify common, day-to-day programming tasks: Strings in switch statements, try-with-resources statements, improved type inference for generic instance creation (\"diamond\"), simplified varargs method invocation, better integral literals, and improved exception handling (multi-catch)

- Upgrade class-loader architecture: A method that frees the underlying resources, such as open files, held by a URLClassLoader

- Concurrency and collections updates: A lightweight fork/join framework, flexible and reusable synchronization barriers, transfer queues, concurrent linked double-ended queues, and thread-local pseudo-random number generators.

- Internationalization Upgrade: Upgrade on Unicode 6.0, Locale enhancement and Separate user locale and user-interface locale.

- More new I/O APIs for the Java platform (NIO.2), NIO.2 filesystem provider for zip/jar archives, SCTP, SDP, TLS 1.2 support.

- Security & Cryptography implemented Elliptic-curve cryptography (ECC).

- Upgrade to JDBC 4.1 and Rowset 1.1.

- XRender pipeline for Java 2D, Create new platform APIs for 6u10 graphics features, Nimbus look-and-feel for Swing, Swing JLayer component, Gervill sound synthesizer.

- Upgrade the components of the XML stack to the most recent stable versions: JAXP 1.4, JAXB 2.2a, and JAX-WS 2.2.

- Enhanced Managed Beans.

Code name is Spider. Features Added:

- JSR 335, JEP 126: Language-level support for lambda expressions.

- JSR 223, JEP 174: Project Nashorn, a JavaScript runtime which allows developers to embed JavaScript code within applications.

- JSR 308, JEP 104: Annotation on Java Types.

- Unsigned Integer Arithmetic.

- JSR 337, JEP 120: Repeating annotations.

- JSR 310, JEP 150: Date and Time API.

- JEP 178: Statically-linked JNI libraries.

- JEP 153: Launch JavaFX applications (direct launching of JavaFX application JARs).

- JEP 122: Remove the permanent generation.

- Java 8 is not supported on Windows XP. But as of JDK 8 update 5, it still can run under Windows XP after forced installation by directly unzipping from the installation executable.

Java 6 vs Java 7

Java 6	Java 7
Features added: - Support for older win9x versions dropped. - Scripting lang support: Generic API for integration with scripting languages, & built-in mozilla javascript rhino integration - Dramatic performance improvements for the core platform, and swing. - Improved web service support through JAX-WS JDBC 4.0 support - Java compiler API: an API allowing a java program to select and invoke a java compiler programmatically. - Upgrade of JAXB to version 2.0: including integration of a stax parser. - Support for pluggable annotations - Many GUI improvements, such as integration of swingworker in the API, table sorting and filtering, and true swing double-buffering (eliminating the gray-area effect).	Features Added: - Upgrade class-loader architecture: A method that frees the underlying resources, such as open files, held by a URLClassLoader - Concurrency and collections updates: A lightweight fork/join framework, flexible and reusable synchronization barriers, transfer queues, concurrent linked double-ended queues, and thread-local pseudo-random number generators. - Internationalization Upgrade: Upgrade on Unicode 6.0, Locale enhancement and Separate user locale and user-interface locale. - More new I/O APIs for the Java platform (NIO.2), NIO.2 filesystem provider for zip/jar archives, SCTP, SDP, TLS 1.2 support. - Security & Cryptography implemented Elliptic-curve cryptography (ECC). - Upgrade to JDBC 4.1 and Rowset 1.1. - XRender pipeline for Java 2D, Create new platform APIs for 6u10 graphics features, Nimbus look-and-feel for Swing, Swing JLayer component, Gervill sound synthesizer. - Upgrade the components of the XML stack to the most recent stable versions: JAXP 1.4, JAXB 2.2a, and JAX-WS 2.2. - Enhanced MBeans." Support for dynamically-typed languages (InvokeDynamic): Extensions to the JVM, the Java language, and the Java SE API to support the implementation of dynamically-typed languages at performance levels near to that of the Java language itself - Strict class-file checking: Class files of version 51 (SE 7) or later must be verified with the typechecking verifier; the VM must not fail over to the old inferencing verifier. - Small language enhancements (Project Coin): A set of small language changes intended to simplify common, day-to-day programming tasks: Strings in switch statements, try-with-resources statements, improved type inference for generic instance creation ("diamond"), simplified varargs method invocation, better integral literals, and improved exception handling (multi-catch).

Java 6

Java 7

Features added:

- Support for older win9x versions dropped.

- Scripting lang support: Generic API for integration with scripting languages, & built-in mozilla javascript rhino integration

- Dramatic performance improvements for the core platform, and swing.

- Improved web service support through JAX-WS JDBC 4.0 support

- Java compiler API: an API allowing a java program to select and invoke a java compiler programmatically.

- Upgrade of JAXB to version 2.0: including integration of a stax parser.

- Support for pluggable annotations

- Many GUI improvements, such as integration of swingworker in the API, table sorting and filtering, and true swing double-buffering (eliminating the gray-area effect).

Features Added:

- Upgrade class-loader architecture: A method that frees the underlying resources, such as open files, held by a URLClassLoader

- Internationalization Upgrade: Upgrade on Unicode 6.0, Locale enhancement and Separate user locale and user-interface locale.

- More new I/O APIs for the Java platform (NIO.2), NIO.2 filesystem provider for zip/jar archives, SCTP, SDP, TLS 1.2 support.

- Security & Cryptography implemented Elliptic-curve cryptography (ECC).

- Upgrade to JDBC 4.1 and Rowset 1.1.

- XRender pipeline for Java 2D, Create new platform APIs for 6u10 graphics features, Nimbus look-and-feel for Swing, Swing JLayer component, Gervill sound synthesizer.

- Upgrade the components of the XML stack to the most recent stable versions: JAXP 1.4, JAXB 2.2a, and JAX-WS 2.2.

- Enhanced MBeans." Support for dynamically-typed languages (InvokeDynamic): Extensions to the JVM, the Java language, and the Java SE API to support the implementation of dynamically-typed languages at performance levels near to that of the Java language itself

- Strict class-file checking: Class files of version 51 (SE 7) or later must be verified with the typechecking verifier; the VM must not fail over to the old inferencing verifier.

- Small language enhancements (Project Coin): A set of small language changes intended to simplify common, day-to-day programming tasks: Strings in switch statements, try-with-resources statements, improved type inference for generic instance creation ("diamond"), simplified varargs method invocation, better integral literals, and improved exception handling (multi-catch).

Java 5 VS Java 6

Java 5	Java 6
Features added: - Generics: provides compile-time (static) type safety for collections and eliminates the need for most typecasts (type conversion). - Metadata: also called annotations; allows language constructs such as classes and methods to be tagged with additional data, which can then be processed by metadata-aware utilities. - Autoboxing/unboxing: automatic conversions between primitive types (such as int) and primitive wrapper classes (such as integer). - Enumerations: the enum keyword creates a typesafe, ordered list of values (such as day.monday, day.tuesday, etc.). Previously this could only be achieved by non-typesafe constant integers or manually constructed classes (typesafe enum pattern). - Swing: new skinnable look and feel, called synth. - Var args: the last parameter of a method can now be declared using a type name followed by three dots (e.g. Void drawtext(string... Lines)). In the calling code any number of parameters of that type can be used and they are then placed in an array to be passed to the method, or alternatively the calling code can pass an array of that type. - Enhanced for each loop: the for loop syntax is extended with special syntax for iterating over each member of either an array or any iterable, such as the standard collection classesfix the previously broken semantics of the java memory model, which defines how threads interact through memory. - Automatic stub generation for rmi objects. - Static imports concurrency utilities in package java.util.concurrent. - Scanner class for parsing data from various input streams and buffers. - Assertions - StringBuilder class (in java.lang package) - Annotations	Features added: - Support for older win9x versions dropped. - Scripting lang support: Generic API for integration with scripting languages, & built-in mozilla javascript rhino integration - Dramatic performance improvements for the core platform, and swing. - Improved web service support through JAX-WS JDBC 4.0 support - Java compiler API: an API allowing a java program to select and invoke a java compiler programmatically. - Upgrade of JAXB to version 2.0: including integration of a stax parser. - Support for pluggable annotations - Many GUI improvements, such as integration of swingworker in the API, table sorting and filtering, and true swing double-buffering (eliminating the gray-area effect).

Java 5

Java 6

Features added:

- Generics: provides compile-time (static) type safety for collections and eliminates the need for most typecasts (type conversion).

- Metadata: also called annotations; allows language constructs such as classes and methods to be tagged with additional data, which can then be processed by metadata-aware utilities.

- Autoboxing/unboxing: automatic conversions between primitive types (such as int) and primitive wrapper classes (such as integer).

- Enumerations: the enum keyword creates a typesafe, ordered list of values (such as day.monday, day.tuesday, etc.). Previously this could only be achieved by non-typesafe constant integers or manually constructed classes (typesafe enum pattern).

- Swing: new skinnable look and feel, called synth.

- Var args: the last parameter of a method can now be declared using a type name followed by three dots (e.g. Void drawtext(string... Lines)). In the calling code any number of parameters of that type can be used and they are then placed in an array to be passed to the method, or alternatively the calling code can pass an array of that type.

- Enhanced for each loop: the for loop syntax is extended with special syntax for iterating over each member of either an array or any iterable, such as the standard collection classesfix the previously broken semantics of the java memory model, which defines how threads interact through memory.

- Automatic stub generation for rmi objects.

- Static imports concurrency utilities in package java.util.concurrent.

- Scanner class for parsing data from various input streams and buffers.

- Assertions

- StringBuilder class (in java.lang package)

- Annotations

Features added:

- Support for older win9x versions dropped.

- Scripting lang support: Generic API for integration with scripting languages, & built-in mozilla javascript rhino integration

- Dramatic performance improvements for the core platform, and swing.

- Improved web service support through JAX-WS JDBC 4.0 support

- Java compiler API: an API allowing a java program to select and invoke a java compiler programmatically.

- Upgrade of JAXB to version 2.0: including integration of a stax parser.

- Support for pluggable annotations

- Many GUI improvements, such as integration of swingworker in the API, table sorting and filtering, and true swing double-buffering (eliminating the gray-area effect).

Saturday, April 18, 2015

Find the nth highest salary in Oracle

The answer and explanation to finding the nth highest salary in SQL

The SQL below will give you the correct answer – but you will have to plug in an actual value for N of course. This SQL to find the Nth highest salary should work in SQL Server, MySQL, DB2, Oracle, Teradata, and almost any other RDBMS:

SELECT * /*This is the outer query part */
FROM Employee Emp1
WHERE (N-1) = ( /* Subquery starts here */
SELECT COUNT(DISTINCT(Emp2.Salary))
FROM Employee Emp2
WHERE Emp2.Salary > Emp1.Salary)

How does the query above work?

The query above can be quite confusing if you have not seen anything like it before – pay special attention to the fact that “Emp1″ appears in both the subquery (also known as an inner query) and the “outer” query. The outer query is just the part of the query that is not the subquery/inner query – both parts of the query are clearly labeled in the comments.

Find the nth highest salary in Oracle using RANK

Oracle also provides a RANK function that just assigns a ranking numeric value (with 1 being the highest) for some sorted values. So, we can use this SQL in Oracle to find the nth highest salary using the RANK function:

select * FROM (
select EmployeeID, Salary
,rank() over (order by Salary DESC) ranking
from Employee
)
WHERE ranking = N;

SQL Tuning/SQL Optimization Techniques

1) The sql query becomes faster if you use the actual columns names in SELECT statement instead of than '*'.

For Example: Write the query as

SELECT id, first_name, last_name, age, subject FROM student_details;

Instead of:

SELECT * FROM student_details;

2) HAVING clause is used to filter the rows after all the rows are selected. It is just like a filter. Do not use HAVING clause for any other purposes.
For Example: Write the query as

SELECT subject, count(subject) 
FROM student_details 
WHERE subject != 'Science' 
AND subject != 'Maths' 
GROUP BY subject;

Instead of:

SELECT subject, count(subject) 
FROM student_details 
GROUP BY subject 
HAVING subject!= 'Vancouver' AND subject!= 'Toronto';

3) Sometimes you may have more than one subqueries in your main query. Try to minimize the number of subquery block in your query.
For Example: Write the query as

SELECT name 
FROM employee 
WHERE (salary, age ) = (SELECT MAX (salary), MAX (age) 
FROM employee_details) 
AND dept = 'Electronics';

Instead of:

SELECT name 
FROM employee
WHERE salary = (SELECT MAX(salary) FROM employee_details) 
AND age = (SELECT MAX(age) FROM employee_details) 
AND emp_dept = 'Electronics';

4) Use operator EXISTS, IN and table joins appropriately in your query.
a) Usually IN has the slowest performance.
b) IN is efficient when most of the filter criteria is in the sub-query.
c) EXISTS is efficient when most of the filter criteria is in the main query.

For Example: Write the query as

Select * from product p 
where EXISTS (select * from order_items o 
where o.product_id = p.product_id)

Instead of:

Select * from product p 
where product_id IN 
(select product_id from order_items

5) Use EXISTS instead of DISTINCT when using joins which involves tables having one-to-many relationship.
For Example: Write the query as

SELECT d.dept_id, d.dept 
FROM dept d 
WHERE EXISTS ( SELECT 'X' FROM employee e WHERE e.dept = d.dept);

Instead of:

SELECT DISTINCT d.dept_id, d.dept 
FROM dept d,employee e 
WHERE e.dept = e.dept;

6) Try to use UNION ALL in place of UNION.
For Example: Write the query as

SELECT id, first_name 
FROM student_details_class10 
UNION ALL 
SELECT id, first_name 
FROM sports_team;

Instead of:

SELECT id, first_name, subject 
FROM student_details_class10 
UNION 
SELECT id, first_name 
FROM sports_team;

7) Be careful while using conditions in WHERE clause.
For Example: Write the query as

SELECT id, first_name, age FROM student_details WHERE age > 10;

Instead of:

SELECT id, first_name, age FROM student_details WHERE age != 10;

Write the query as

SELECT id, first_name, age 
FROM student_details 
WHERE first_name LIKE 'Chan%';

Instead of:

SELECT id, first_name, age 
FROM student_details 
WHERE SUBSTR(first_name,1,3) = 'Cha';

Write the query as

SELECT id, first_name, age 
FROM student_details 
WHERE first_name LIKE NVL ( :name, '%');

Instead of:

SELECT id, first_name, age 
FROM student_details 
WHERE first_name = NVL ( :name, first_name);

Write the query as

SELECT product_id, product_name 
FROM product 
WHERE unit_price BETWEEN MAX(unit_price) and MIN(unit_price)

Instead of:

SELECT product_id, product_name 
FROM product 
WHERE unit_price >= MAX(unit_price) 
and unit_price <= MIN(unit_price)

Write the query as

SELECT id, name, salary 
FROM employee 
WHERE dept = 'Electronics' 
AND location = 'Bangalore';

Instead of:

SELECT id, name, salary 
FROM employee 
WHERE dept || location= 'ElectronicsBangalore';

Use non-column expression on one side of the query because it will be processed earlier.

Write the query as

SELECT id, name, salary 
FROM employee 
WHERE salary < 25000;

Instead of:

SELECT id, name, salary 
FROM employee 
WHERE salary + 10000 < 35000;

Write the query as

SELECT id, first_name, age 
FROM student_details 
WHERE age > 10;

Instead of:

SELECT id, first_name, age 
FROM student_details 
WHERE age NOT = 10;

8) Use DECODE to avoid the scanning of same rows or joining the same table repetitively. DECODE can also be made used in place of GROUP BY or ORDER BY clause.
For Example: Write the query as

SELECT id FROM employee 
WHERE name LIKE 'Ramesh%' 
and location = 'Bangalore';

Instead of:

SELECT DECODE(location,'Bangalore',id,NULL) id FROM employee 
WHERE name LIKE 'Ramesh%';

9) To store large binary objects, first place them in the file system and add the file path in the database.

10) To write queries which provide efficient performance follow the general SQL standard rules.

a) Use single case for all SQL verbs
b) Begin all SQL verbs on a new line
c) Separate all words with a single space
d) Right or left aligning verbs within the initial SQL verb

Performance of the SQL queries of an application often play a big role in the overall performance of the underlying application. The response time may at times be really irritating for the end users if the application doesn't have fine-tuned SQL queries. There are sevaral ways of tuning SQl statements, few of which are:-

Understanding of the Data, Business, and Application - it's almost impossible to fine-tune the SQl statements without having a proper understanding of the data managed by the application and the business handled by the application. The understanding of the application is of course of utmost importance. By knowing these things better, we may identify several instances where the data retrieval/modification by many SQL queries can simply be avoided as the same data might be available somewhere else, may be in the session of some other integrating application, and we can simply use that data in such cases. The better understanding will help you identify the queries which could be written better either by changing the tables involved or by establishing relationships among available tables.
Using realistic test data - if the application is not being tested in the development/testing environments with the volume and type of data, which the application will eventually face in the production environment, then we can't be very sure about how the SQL queries of the application will really perform in actual business scenarios. Therefore, it's important to have the realistic data for development/testing purposes as well.
Using Bind Variables, Stored Procs, and Packages - Using identical SQL statements (of course wherever applicable) will greatly improve the performance as the parsing step will get eliminated in such cases. So, we should use bind variables, stored procedures, and packages wherever possible to re-use the same parsed SQL statements.
Using the indexes carefully - Having indexes on columns is the most common method of enhancing performance, but having too many of them may degrade the performance as well. So, it's very critical to decide wisely about which all columns of a table we should create indexes on. Few common guidelines are:- creating indexes on the columns which are frequently used either in WHERE clause or to join tables, avoid creating indexes on columns which are used only by functions or operators, avoid creating indexes on the columns which are required to changed quite frequently, etc.
Making available the access path - the optimizer will not use an access path that uses an index only because we have created that index. We need to explicitly make that access path available to the optimizer. We may use SQL hints to do that.
Using EXPLAIN PLAN and TKPROF - these tools can be used to fine tune SQL queries to a great extent. EXPLAIN PLAN explains the complete access path which will be used by the particular SQL statement during execution and the second tool TKPROF displays the actual performance statistics. Both these tools in combination can be really useful to see, change, and in turn fine-tune the SQL statements.
Optimizing the WHERE clause - there are many cases where index access path of a column of the WHERE clause is not used even if the index on that column has already been created. Avoid such cases to make best use of the indexes, which will ultimately improve the performance. Some of these cases are: COLUMN_NAME IS NOT NULL (ROWID for a null is not stored by an index), COLUMN_NAME NOT IN (value1, value2, value3, ...), COLUMN_NAME != expression, COLUMN_NAME LIKE'%pattern' (whereas COLUMN_NAME LIKE 'pattern%' uses the index access path), etc. Usage of expressions or functions on indexed columns will prevent the index access path to be used. So, use them wisely!
Using WHERE instead of HAVING - usage of WHERE clause may take advantage of the index defined on the column(s) used in the WHERE clause.
Using the leading index columns in WHERE clause - the WHERE clause may use the complex index access path in case we specify the leading index column(s) of a complex index otherwise the WHERE clause won't use the indexed access path.
Indexed Scan vs Full Table Scan - Indexed scan is faster only if we are selcting only a few rows of a table otherwise full table scan should be preferred. It's estimated that an indexed scan is slower than a full table scan if the SQL statement is selecting more than 15% of the rows of the table. So, in all such cases use the SQL hints to force full table scan and suppress the use of pre-defined indexes.Okay... any guesses why full table scan is faster when a large percentage of rows are accessed? Because an indexed scan causes multiple reads per row accessed whereas a full table scan can read all rows contained in a block in a single logical read operation.
Using ORDER BY for an indexed scan - the optimizer uses the indexed scan if the column specified in the ORDER BY clause has an index defined on it. It'll use indexed scan even if the WHERE doesn't contain that column (or even if the WHERE clause itself is missing). So, analyze if you really want an indexed scan or a full table scan and if the latter is preferred in a particular scenario then use 'FULL' SQL hint to force the full table scan.
Minimizing table passes - it normally results in a better performance for obvious reasons.
Joining tables in the proper order - the order in which tables are joined normally affects the number of rows processed by that JOIN operation and hence proper ordering of tables in a JOIN operation may result in the processing of fewer rows, which will in turn improve the performance. The key to decide the proper order is to have the most restrictive filtering condition in the early phases of a multiple table JOIN. For example, in case we are using a master table and a details table then it's better to connect to the master table first to connecting to the details table first may result in more number of rows getting joined.
Simple is usually faster - yeah... instead of writing a very complex SQL statement, if we break it into multiple simple SQL statements then the chances are quite high that the performance will improve. Make use of the EXPLAIN PLAN and TKPROF tools to analyze both the conditions and stick to the complex SQL only if you're very sure about its performance.
Using ROWID and ROWNUM wherever possible - these special columns can be used to improve the performance of many SQL queries. The ROWID search is the fastest for Oracle database and this luxury must be enjoyed wherever possible. ROWNUM comes really handy in the cases where we want to limit the number of rows returned.
Usage of explicit cursors is better - explicit cursors perform better as the implicit cursors result in an extra fetch operation. Implicit cursosrs are opened the Oracle Server for INSERT, UPDATE, DELETE, and SELECT statements whereas the explicit cursors are opened by the writers of the query by explicitly using DECLARE, OPEN, FETCH, and CLOSE statements.
Reducing network traffic - Arrays and PL/SQL blocks can be used effectively to reduce the network traffic especially in the scenarios where a huge amount of data requires processing. For example, a single INSERT statement can insert thousands of rows if arrays are used. This will obviously result into fewer DB passes and it'll in turn improve performance by reducing the network traffic. Similarly, if we can club multiple SQL statements in a single PL/SQL block then the entire block can be sent to Oracle Server involving a single network communication only, which will eventually improve performance by reducing the network traffic.
Using Oracle parallel query option - Since Oracle 8, even the queries based on indexed range scans can use this parallel query option if the index is partitioned. This feature can result in an improved performance in certain scenarios.

Difference between WeakReference vs SoftReference vs PhantomReference vs Strong reference in Java

WeakReference and SoftReference were added into Java API from long time but not every Java programmer is familiar with it. Which means there is a gap between where and how to use WeakReference and SoftReference in Java. Reference classes are particularly important in context of How Garbage collection works. As we all know that Garbage Collector reclaims memory from objects which are eligible for garbage collection, but not many programmer knows that this eligibility is decided based upon which kind of references are pointing to that object. This is also main difference between WeakReference and SoftReference in Java. Garbage collector can collect an object if only weak references are pointing towards it and they are eagerly collected, on the other hand Objects withSoftReference are collected when JVM absolutely needs memory. These special behaviour of SoftReference and WeakReference makes them useful in certain cases e.g. SoftReference looks perfect for implementing caches, so when JVM needs memory it removes object which have only SoftReference pointing towards them. On the other hand WeakReference is great for storing meta data e.g. storing ClassLoader reference. If no class is loaded then no point in keeping reference of ClassLoader, a WeakReference makes ClassLoader eligible for Garbage collection as soon as last strong reference removed. In this article we will explore some more about various reference in Java e.g. Strong reference and Phantom reference.

WeakReference vs SoftReference in Java

For those who don't know there are four kind of reference in Java :

Strong reference
Weak Reference
Soft Reference
Phantom Reference

Strong Reference is most simple as we use it in our day to day programming life e.g. in the code, String s = "abc" , reference variable s has strong reference to String object "abc". Any object which has Strong reference attached to it is not eligible for garbage collection. Obviously these are objects which is needed by Java program. Weak Reference are represented usingjava.lang.ref.WeakReference class and you can create Weak Reference by using following code :



Counter counter = new Counter(); // strong reference - line 1
WeakReference<Counter> weakCounter = new WeakReference<Counter>(counter); //weak reference
counter = null; // now Counter object is eligible for garbage collection

Now as soon as you make strong reference counter = null, counter object created on line 1 becomes eligible for garbage collection; because it doesn't have any more Strong reference and Weak reference by reference variable weakCounter can not prevent Counter object from being garbage collected. On the other hand, had this been Soft Reference, Counter object is not garbage collected until JVMabsolutely needs memory. Soft reference in Java is represented using java.lang.ref.SoftReference class. You can use following code to create a SoftReference in Java

Counter prime = new Counter();  // prime holds a strong reference - line 2
SoftReference<Counter> soft = new SoftReference<Counter>(prime) ; //soft reference variable has SoftReference to Counter Object created at line 2

prime = null;  // now Counter object is eligible for garbage collection but only be collected when JVM absolutely needs memory

After making strong reference null, Counter object created on line 2 only has one soft reference which can not prevent it from being garbage collected but it can delay collection, which is eager in case of WeakReference. Due to this major difference between SoftReference and WeakReference, SoftReference are more suitable for caches and WeakReference are more suitable for storing meta data. One convenient example of WeakReference is WeakHashMap, which is another implementation of Map interface like HashMap or TreeMap but with one unique feature. WeakHashMap wraps keys as WeakReference which means once strong reference to actual object removed, WeakReference present internally on WeakHashMap doesn't prevent them from being Garbage collected.

Phantom reference is third kind of reference type available in java.lang.ref package. Phantom reference is represented by java.lang.ref.PhantomReference class. Object which only has Phantom reference pointing them can be collected whenever Garbage Collector likes it. Similar to WeakReference and SoftReference you can create PhantomReference by using following code :

DigitalCounter digit = new DigitalCounter(); // digit reference variable has strong reference - line 3
PhantomReference<DigitalCounter> phantom = new PhantomReference<DigitalCounter>(digit); // phantom reference to object created at line 3

digit = null;

As soon as you remove Strong reference, DigitalCounter object created at line 3 can be garbage collected at any time as it only has one more PhantomReference pointing towards it, which can not prevent it from GC'd.

Apart from knowing about WeakReference, SoftReference, PhantomReference and WeakHashMap there is one more class called ReferenceQueue which is worth knowing. You can supply a ReferenceQueue instance while creating any WeakReference, SoftReference or PhantomReference as shown in following code :

ReferenceQueue refQueue = new ReferenceQueue(); //reference will be stored in this queue for cleanup

DigitalCounter digit = new DigitalCounter();
PhantomReference<DigitalCounter> phantom = new PhantomReference<DigitalCounter>(digit, refQueue);

Reference of instance will be appended to ReferenceQueue and you can use it to perform any clean-up by polling ReferenceQueue. An Object's life-cycle is nicely summed up by this diagram.

WeakReference vs SoftReference vs Phantom vs Strong Java

That's all on Difference between WeakReference and SoftReference in Java. We also learned basics of reference classes e.g. Weak, soft and phantom reference in Java and WeakHashMap and ReferenceQueue. Careful use of reference can assist Garbage Collection and result in better memory management in Java.

Tuesday, April 28, 2015

Wednesday, April 22, 2015

Sunday, April 19, 2015

How to find Race Conditions in Java

Code Example of Race Condition in Java

Saturday, April 18, 2015

The answer and explanation to finding the nth highest salary in SQL

How does the query above work?

Find the nth highest salary in Oracle using RANK

WeakReference vs SoftReference in Java

Blog Archive