imma

Just my thoughts and glimpses of me

Simple Java Persistence API (JPA) Demo: JPA Query

JPA Query Language

I don’t know if this series still qualifies as simple, but I thought I would add some basic information about queries in JPA. In part II we looked at the EntityManager and, more specifically, the simple operations that it enables: persist, find, merge, remove and refresh. In addition to these operations, JPA comes with its own query language that allows you to create custom queries over your data set.

JPA abstracts the developer and the application away from the details of how data is represented in the data store (most likely a relational database), and this abstraction effectively marries the relational and OO paradigms. However, one of the cornerstones of the relational paradigm is its query capability, which has so far been unmatched by any other software paradigm. The query facilities in the OO model are limited when it comes to handling large amounts of data. While there have been attempts at developing ORDBMS (Object Relational Database Management Systems) data stores, these have never truly caught on in the enterprise, so the bulk of enterprise data remains stored in relational databases. With nearly every application built on top of a relational database, it becomes important to build query capabilities into abstraction layers such as JPA.

The default query language in the relational paradigm is the Structured Query Language, or simply SQL. SQL has a number of defined standards, which every relational database vendor implements in a slightly different manner. This makes it a tricky language to adopt as the basis of an abstraction layer like JPA, which is expected to work across multiple relational database products without resorting to expensive and complex workarounds.

The Java Persistence API Query Language (JPA QL) is the result of attempts to abstract the query facilities of the relational paradigm. It borrows from EJB QL but also fixes the weaknesses that have plagued EJB QL. The specifics of what was borrowed from EJB QL and what was fixed are beyond the scope of this post. JPA QL provides the ability to retrieve JPA mapped entities, as well as to sort and filter them. If you are familiar with SQL, then you already have some degree of familiarity with JPA QL, as its syntax is closely modeled on SQL’s syntax.

Specifying a Query

There are three main ways of specifying JPA queries:

  • createQuery Method of the EntityManager: with this option you compose the query at run time and execute it there and then. The most immediate consequence of this approach is that your queries are not checked/parsed at deployment time, which means that even obvious errors are only discovered when the code is executed.
  • Named Queries: named queries are defined along with the corresponding entity beans. Several named queries can be defined for each entity, thus enabling filtering and sorting using various properties of the entity. Unlike runtime queries, these queries are parsed at deployment time, which means that any errors are discovered before the code that depends on your named queries is executed.
  • Native Queries: this gives you the ability to define queries using SQL instead of JPA QL. You can create Native Named Queries as well.

Querying

Retrieving data

The most common query operation is the select operation, which returns all or a subset of the records in the database. With JPA QL, the select operation returns a collection of zero or more mapped entities. The operation can also return properties of a mapped entity. A simple select query looks as follows.

SELECT h FROM Hotel h

 

SELECT h.name FROM Hotel h

Notice how you select from the entity and not from a table as you would in SQL, but the syntax of the query is otherwise no different from what you would write using SQL. The first query returns zero or more Hotel entities from the database. The Hotel entity was defined in the first installment of this demo series. The second query in the above sample selects a single property, the name, of the Hotel entity.

Lazy vs Eager Loading: FETCH JOIN

When you design your entity classes with associations and relationships, loading and accessing these relationships at run time becomes important. For example, a hotel has rooms, and you can decide whether you want the rooms associated with each hotel to be loaded when the hotel entity is retrieved (eager loading) or only when you explicitly access the associated rooms (lazy loading). When defining the association between entities you can declare whether you want lazy or eager loading, but JPA QL also allows you to load the objects in an association as part of a query.

SELECT h FROM Hotel h JOIN FETCH h.rooms

With the above query, all the hotel objects returned will have their associated rooms loaded as well. This gives you eager loading without specifying it in the relationship between the Hotel and Room entities.

Filtering & Sorting

It is not always the aim of a data retrieval operation to return every last record in a database; sometimes we are interested in only the few records that meet particular criteria for the operation at hand. Within the context of the simple app set up for this series, we may just be interested in hotels that are in a particular town. The name of the town in question would form our filtering criteria. The sample below gives JPA QL queries that retrieve a collection of Hotel entities matching a particular property value.

//Filtering

SELECT h FROM Hotel h WHERE h.name = 'Nairobi'

//Sorting

SELECT h FROM Hotel h ORDER BY h.name

//Filter and Sort

SELECT h FROM Hotel h WHERE h.name = 'Nairobi' ORDER BY h.name

Once again, notice the similarity to an SQL statement that would return rows that meet the provided sort and filtering parameters. So far these are just simple queries that don’t show much of JPA QL’s capabilities, but they are a necessary step in appreciating how JPA QL queries are written.

Of greater importance is showing how these queries can possibly be composed within the context of Java code.

Query q = em.createQuery("SELECT h FROM Hotel h ORDER BY h.name");

List<Hotel> results = q.getResultList();

A further example uses a named parameter to filter:

Query q = em.createQuery("SELECT h FROM Hotel h WHERE h.name = :hotelName");

q.setParameter("hotelName", hotelName);

List<Hotel> results = q.getResultList();

Something that may be a bit tricky for first time users of JPA is composing queries that use the LIKE operator to filter:

Query q = em.createQuery("SELECT h FROM Hotel h WHERE h.name LIKE :name");

// Wrap the search term in % wildcards so that it matches anywhere in the name

StringBuilder sb = new StringBuilder();

sb.append("%");

sb.append(name);

sb.append("%");

q.setParameter("name", sb.toString());

List<Hotel> results = q.getResultList();
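One thing the wildcard wrapping does not handle is a search term that itself contains % or _, since LIKE would treat those as wildcards too. The following is a minimal sketch of a helper that escapes them; the class name LikePatterns, the method name contains and the choice of '!' as the escape character are all my own assumptions for illustration and not part of JPA.

```java
// Hypothetical helper (not part of JPA): builds a "contains" pattern for LIKE
// and escapes any wildcard characters found in the user-supplied term.
final class LikePatterns {

    // Assumed escape character; the query must declare the same one.
    static final char ESCAPE = '!';

    static String contains(String term) {
        StringBuilder sb = new StringBuilder("%");
        for (char c : term.toCharArray()) {
            // Prefix LIKE wildcards (and the escape character itself) with ESCAPE
            if (c == '%' || c == '_' || c == ESCAPE) {
                sb.append(ESCAPE);
            }
            sb.append(c);
        }
        return sb.append('%').toString();
    }
}
```

To use it, the query would have to declare the same escape character, for example h.name LIKE :name ESCAPE '!', and the parameter would be set with q.setParameter("name", LikePatterns.contains(name)).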

Assume for a moment that you want a list of all hotels with a particular number of rooms (say more than 20 rooms for example) … here is how you go about formulating such a query:

Query q = em.createQuery("SELECT h FROM Hotel h WHERE size(h.rooms) > 20 ORDER BY h.name");

List<Hotel> results = q.getResultList();
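To make the meaning of that last query concrete, here is a rough in-memory equivalent written with plain Java collections (assuming Java 8 or later for streams). It is only a sketch of what the query asks for, not how a JPA provider executes it: the Hotel class below is a hypothetical stand-in and not the entity from this series, and a real provider would translate the query to SQL and let the database do the filtering and sorting.

```java
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

// In-memory illustration of: WHERE size(h.rooms) > :minRooms ORDER BY h.name
final class HotelQuerySketch {

    // Hypothetical stand-in for the Hotel entity; rooms are just labels here.
    static final class Hotel {
        final String name;
        final List<String> rooms;

        Hotel(String name, List<String> rooms) {
            this.name = name;
            this.rooms = rooms;
        }
    }

    // Returns the names of hotels with more than minRooms rooms, sorted by name.
    static List<String> moreRoomsThan(List<Hotel> hotels, int minRooms) {
        return hotels.stream()
                .filter(h -> h.rooms.size() > minRooms)          // size(h.rooms) > minRooms
                .sorted(Comparator.comparing((Hotel h) -> h.name)) // ORDER BY h.name
                .map(h -> h.name)
                .collect(Collectors.toList());
    }
}
```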

This concludes this look at JPA QL. This is not a complete examination of the power of JPA QL but a glimpse at what is possible.

October 7, 2009 Posted by imma | Development | | No Comments Yet

Rising Functional Programming

The expected shift of computer processing to an even greater degree of parallelism has sparked interest in new ways of developing software that will take full advantage of the horizontal increase in processing power. The key area that has received the bulk of attention is programming languages and tools. In a many-core world (as opposed to what is now called multi-core), shared state becomes very tricky, so most of the mainstream programming languages would be difficult to use in producing software. While almost all the mainstream imperative languages do have a library to enable the development of code capable of parallelism, most of these methods are not baked into the language, and sometimes the initial design of the language itself gets in the way. In the design of most of the mainstream imperative programming languages, immutable data types are rare or sometimes non-existent altogether.
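To illustrate the point about shared state, here is a minimal sketch in Java (assuming Java 8 or later for streams). The names Money and ParallelSum are hypothetical and exist only for this example. Because Money has only final fields and every operation returns a new instance, the values can be combined on several cores at once without any locking.

```java
import java.util.stream.IntStream;

// Hypothetical immutable value type: the state is fixed at construction time.
final class Money {
    private final long cents;

    Money(long cents) {
        this.cents = cents;
    }

    // Returns a new instance instead of modifying this one.
    Money plus(Money other) {
        return new Money(cents + other.cents);
    }

    long cents() {
        return cents;
    }
}

final class ParallelSum {
    // Adds up the amounts 1..n in parallel; no shared mutable state is involved.
    static long totalCents(int n) {
        return IntStream.rangeClosed(1, n)
                .parallel()
                .mapToObj(i -> new Money(i))
                .reduce(new Money(0), Money::plus)
                .cents();
    }
}
```

The same sum written with a shared mutable counter would need synchronization to be correct, which is exactly the kind of bookkeeping that becomes painful as the number of cores grows.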

Increased interest in functional programming languages has given rise to new languages that serve as an adequate bridge between the existing imperative programming mindset and the much needed shift to a world of parallelism. Functional programming is certainly not new, as many of the techniques have been implemented in languages like Scheme, Haskell and Erlang, amongst others. However, these languages and the ideas that they implement largely remained in academic circles until recently, when the software industry took a more proactive role in transferring the knowledge of academia to the industry. Programming languages like F# and Scala borrow heavily from the aforementioned pioneers of functional programming.

The newest in this growing list of new programming languages is Google’s Noop. The following is a description of Noop from the project’s web site:

… new language experiment that attempts to blend the best lessons of languages old and new, while syntactically encouraging what we believe to be good coding practices and discouraging the worst offenses. Noop is initially targeted to run on the Java Virtual Machine.

The basic assumptions in the design and development of Noop are certainly interesting. Integrating testing into the programming language can greatly improve code quality and making the language truly object oriented will improve its readability. I have found functional programming languages to have a pleasantly concise syntax that effortlessly achieves what would have required a ton of boilerplate code in supposedly OO languages like Java or C# which include primitive data types.

October 6, 2009 Posted by imma | Architecture & Design, Development, IT | | No Comments Yet

Who Owns Your Computer Anyway?

On the face of it, that is a rather silly question since within it lies the answer. Software is an important component of your computing experience – without it, you would not have a computer in the first place. However, having installed countless pieces of software of varying licenses, I have come to wonder what an End User License Agreement (EULA) really means.

It is a legal document as far as I can tell, so the wisdom of putting it in front of a lay person to indicate (with a handy little button) acceptance or refusal seems rather questionable. I have done my best to go through some of these licenses, but the legalese is just too convoluted to make any immediate sense. In a perfect world, you would retain a lawyer who would then break it down accordingly and explain to you what the license means and does not mean. Marching down to your lawyer every time you want to install a piece of software seems rather counterproductive, to say the least.

These licenses are an integral feature of proprietary software, in that there is a real chance that you may be breaking the law if you don’t abide by the stipulations contained therein. As they like saying, I am not a lawyer, but I would expect that for anything to hold in a court of law (more so the act of entering an agreement), the parties should understand what their respective obligations are. Without a lawyer present, any chance of a lay person making sense of a EULA is slim at best.

Once you have installed the software (after appropriately agreeing to the terms of the EULA), do you have ownership of the software that is now installed on your computer? With a proprietary piece of code you don’t own it, of course, hence the EULA is likely to explain that you are not supposed to reverse engineer it or tamper with it in any way. However, the EULA is also likely to stipulate that if you lose your precious data as a result of using the program in question, the producer of the software is not responsible. Such a situation makes you want to know why exactly you are paying for the software in the first place; for all intents and purposes it may not work as advertised, and you have no legal recourse for any harm that may have resulted from your use of the software.

And there are software manufacturers whose programs behave more like Trojan horses. You install a single piece of software from a company and later update it, or perhaps the software has an auto update feature which periodically checks for updates. Here is the problem: in addition to suggesting the new release, the update also installs additional, unrelated software onto your machine. In a sense, the original program acts as a gateway for the software manufacturer to invite even more software onto your hard disk.

This constant need to outdo each other in order to gain the end user’s favor looks remarkably like what a virus writer would do. I recently had to update the Windows Live suite produced by Microsoft and, somewhere along the way, I checked a box that would allow it to change the home page and default search engine on my browser, which I assumed (since Windows Live is a Microsoft product) would apply to and only affect Internet Explorer. Now the default search engine for address bar search on my Firefox installation is Bing. No, I didn’t want Bing, and there is no simple way of undoing the settings change. In yet another trespass, I had a .NET plug-in for Firefox installed while I was installing something completely unrelated.

With increased competition and jockeying for dominance, major industry players are hacking each other to bits. Google’s decision to integrate their Chrome browser into Internet Explorer using a plug-in seems like a good move on the surface, and understandably so, but there are far greater implications for control and ownership with such a move. As Mozilla points out, it confuses the boundaries between where Internet Explorer is and where things happen because of Chrome’s extension. It is easy to get excited at the thought of Google putting its engineering prowess to work and bringing cutting edge technologies to the most dominant browser in the market, but it has far greater implications than just new technologies. The very introduction of new technologies suggests that bugs will be discovered, so keeping the boundaries between software components clear is good, as this enables proactive management.

The Windows operating system has a number of utilities that have come up to address weaknesses in the manner in which the operating system runs and manages itself and the programs that have been installed on it. Recently, I had the misfortune of a failed installation: the installation process of a program stopped prematurely, and this meant that the program’s uninstaller was not installed. This became a problem that could not easily be fixed using the Windows Control Panel because I was not able to remove the program. I attempted to reinstall the program in an effort to get the uninstaller in place, but I was not able to, since the program was supposedly already installed. Just deleting the program would be the most logical thing to do, but traces of the software would still remain in the registry and hence lead to a slower system in the long run. This particular situation illustrates a very common problem with most software running on Windows: it is much easier to get a program installed than it is to get it removed/uninstalled properly. There are countless pieces of software that leave their skeletal remains on the hard disk and in the Windows Registry. Such sloppiness shows a disregard for the ownership of the computer on which the software runs, including the operating system.

In closing, users will want and should get the latest and greatest software available on the market, but software producers need to allow users to kick them off their hard disks, and to do so with finality and assurance that there are no skeletons left on the hard disk or in the registry. Even more importantly, stop with the production of Trojan horses. The fact that I downloaded and use iTunes does not mean that I want the latest and greatest version of Safari.

Would you not prefer to have the tools to remain in control of your computer?

October 5, 2009 Posted by imma | IT | | No Comments Yet

Programming Paradigms

In the past I enjoyed the concept and practice of programming because it provided an opportunity to explore a way of thinking about a problem without the usual constraints that one may face in the real world. The greater challenge (hence satisfaction) is in defining a model that will account for any potential failures and still be able to accomplish its intended purpose. As time has passed, I have come to focus specifically on design and the resulting architecture. Designing anything is a process of creating a model that can account for the solutions to aspects of the problem specified. That is reductive in and of itself, but there are much more insightful aspects of problem solving that need to be taken into account in designing and developing a solution.

In any design effort, the ability to abstract from the problem remains imperative. While the generally accepted adage that too much of [take-your-pick] is a poison applies, abstraction done right can provide a practical solution to a multitude of problems. Programming paradigms have always been about creating models that either provide a way for us to give instructions to computers or a way for us to describe the world in a manner that a computer can comprehend and hence process. Programming languages remain a way for humans (programmers, software engineers, etc.) to interact with a computer: giving it instructions on what to do and how to handle the particulars of our reality. The models that are implicitly encoded into programming languages represent our thinking, either adopting a machine-like view of the world or bringing the machine closer to the way we appreciate the world.

What are generally referred to as low level programming languages were essentially intended to enable us to communicate with computers, and as such they bear a close relationship to the way in which computers operate. Think of assembly language and how you program in it.

With time, additional abstractions were added that allow us to focus more on giving computers instructions as opposed to prescribing the manner in which the computer carries out our instructions. This focus on instructions gave rise to what are generally referred to as procedural programming languages, in which the emphasis was on the results of the operations that need to be accomplished. The ability to focus on what you want done and how it is achieved in steps obviously led to a greater interest in using computers to carry out what are essentially repetitive tasks, which could easily be encoded in a number of functions that are then executed to produce the desired result (or report errors, if any).

This focus on the procedures that are needed to accomplish a task leads to a huge codebase that is hard both to maintain and to evolve to meet new or changing circumstances. This problem would seem to come from the fact that the procedural way of software development does not adequately account for how the real world operates. In the real world, things exist and operate as a single unit: there is no difference between what something is and what it does.

Personally, I get the impression that this is the time when programming became a bit more philosophical, in the sense that there is a deliberate effort to model the world in terms of its nature and its essence. The nature of the world describes what the world is: in OOP, this is simply described as the state of an object, which is typically denoted by properties/attributes/fields, depending on the terminology of your platform of choice. You may notice that the nature of objects so defined does not need to change in order to make things happen, because OOP relies on message passing to get objects with the appropriate nature to carry out the intention of their essence as defined by their nature (what you do is defined by your nature and your nature defines what you do).

While OOP allows for a better abstraction of the real world, the manner in which it has been implemented thus far has a serious shortcoming. All the OOP languages that I have come across are rather verbose, as the design process needs to describe every element of the problem space in code. As programs grow larger, it becomes much more challenging to maintain them or to ensure that they are tested to the satisfaction of end users. So, testing frameworks have mushroomed around OOP languages, such as JUnit for Java (among so many others).

For all intents and purposes, OOP still bears some lingering association with how a machine would go about processing instructions. The so-called Fourth Generation Languages (4GL), like the Structured Query Language (SQL), have shown us how to go about expressing our intention to the machine and have the machine figure out the means of getting to our intentions, or at the very least as close to them as possible. The oft-referenced Moore’s law continues its march towards ever more powerful machines, albeit in a slightly different way. With powerful processors driving our computers, we do not have to be chained to the vagaries of machine type thinking.

Another, more poignant point to consider is the increased use of computers for entertainment (gaming etc.), business and socializing. The nature of the problems that face social networking applications is markedly different from what faced businesses at the advent and development of the current mainstream programming languages. A business environment invariably has some kind of structure around it, which is encoded in policies, procedures, organization structure and the processes that the organization runs. Starting from such a foundation, it is then possible to formulate a few procedures which can be executed on a regular or ad hoc basis to great effect. However, consider the way in which social networking sites are used: a single person would have a Facebook account, a Twitter account and a YouTube account, in addition to web mail accounts. These applications have become people centric, and the number of people involved can quickly become a challenge for social networking sites that have managed to garner a big enough following.

The social network craze reveals an interesting dimension of how programming languages have evolved over time. At the outset, a few academicians used computers to help with research, then the business world caught on, and now we have to face the reality that perhaps programming languages need to be less rigid. Often when discussing IT related subjects, less rigid may easily lead to less secure, though in this context less rigid but more robust would be the best outcome in the evolution of programming languages. Objects are good as a way to model the world, but they lack a certain degree of expressiveness in effectively illustrating and modeling the state of the world as seen by a person who cares more about getting things done and less about the steps taken to get to the end.

August 20, 2009 Posted by imma | Abstract, Architecture & Design, Development | | No Comments Yet

Cloud Ambition: Standardization Later!

If you have any interest in IT at all, you may have come across the history of the evolution of computers from humongous building complexes to today’s sample of gizmos and gadgets. It is often the case that the forward march of computers and computing in general has had a moment where the innovation in question is an island, isolated from sharing data, peripherals and perhaps even the experiences of the technicians. Within the Personal Computer (PC) industry, standards bodies have become the norm, as they define both hardware specifications and protocols that enable the exchange of data between computer hardware and software made by different developers. Some have argued that too much concern about standards and standardization leads to stagnation in innovation, as each vendor only does what is necessary to meet the standards. Breaking away from the cover of standards would perhaps lead to user base isolation, as users may not be able to work effectively in an almost always heterogeneous information systems environment.

As an example of the detrimental effects of standards and standardization on software development, consider the rate of development and innovation that went on at the height of the first browser wars between Netscape and Microsoft. These two players pushed the envelope by not waiting for the slow and arduous standards making process; instead they made extensions to existing standards and released products into the market. The standards bodies and their standardization processes eventually caught up with what was happening in the real world, but more often than not the drive to push the envelope was never reduced to strategic maneuvers and machinations. Those who are familiar with this subject will quickly espouse the virtues of standards: the harmony they provide, the freedom from vendor lock-in and the rest of the good stuff. Perhaps in support of the overall benefits of standards and standardization, Microsoft (the victor in the first browser wars) is now saddled with the responsibility of making its browser standards compliant without breaking web sites and applications that have come to rely on its extensions.

The story of standards and standardization does not, however, end there, as it is indeed very rare to find new, paradigm shifting products and services created by standards bodies. The main goal of standards bodies is to provide the venue and mechanism to build and maintain consensus amongst parties that should ideally have similar or at least comparable end results. Such an environment rarely lends itself to revolutionary ideas; the responsibility of pushing forward the state of the art falls on visionary and insightful individuals and corporations.

In this day and age, Cloud has become the new buzzword that you hear humming across the internet and tech news media. The cloud does present a unique architectural challenge, but at the same time it is perhaps the shape of things to come. Earlier in the day, I was reading an article that posits the demise of relational database management systems (RDBMS) because of the rise of the internet and its demands to scale up and down at will. We are talking about a technology, RDBMS, which countless enterprises and applications rely on in some way, but perhaps of greater concern is the amount of data that is locked up in these databases. Don’t get me wrong, I am not raising an alarm over the data stored in RDBMS, because when it comes down to it, a method of bridging RDBMS with whatever eventually succeeds it can be provided. Key aspects of the relational model have been standardized, which means that the degree of interoperability is much higher, as vendors strive to at least implement the basic specifications.

With the incessant push towards the cloud, a data storage paradigm that suits the demands of cloud computing could emerge. Amazon provides cloud services through its EC2 and S3 offerings, with additional services from other vendors at various stages of development and deployment. Of particular interest in these cloud storage services is their use of APIs to interact with data. If you have looked at the evolution of data storage, you will remember that once upon a time data and the application processing the data were one and the same. The obvious advantage is that such an application would perform quite well, but the value of the data locked up in its code becomes much less.

If anything, it has always been claimed that the data an application produces will outlast the application that created it. Thanks to the relational model, data became independent of applications. The push towards cloud computing, with its numerous APIs at this point, would perhaps seem like a step back from the uniformity and sanity that the relational model brought to application development.

Microsoft’s push towards the cloud is currently going by the name of Azure Services Platform. Architecturally, it is an interesting piece of infrastructure to put in place. It does represent the core of what Microsoft eventually intends to hoist upon the cloud in the hopes of giving its dominance on the desktop a new lease on life. Of greater significance is Microsoft’s approach to the cloud: Software + Services. Windows and Office represent a lucrative install base for Microsoft, and from that perspective it would make sense for them to attempt to migrate some of the technologies that have proven invaluable on the desktop onto the cloud. First of all, the revenue opportunity of enabling some sort of backward compatibility with the existing Windows and Office install base is too big to ignore. Keep in mind that data is immortal (well, almost immortal). The more interesting perspective is that transferring desktop standards onto the web would also present the legions of Windows and Office developers with the opportunity to work with tools that they are already familiar with. The aforementioned reasons are compelling enough to have Microsoft push for a relational implementation of some kind of cloud storage. That is already happening with their SQL Server Data Services push. Whether this works out in the long run is a matter of wait and see. From a strategic perspective, I really do hope that Microsoft sees the cloud as different from the desktop market, as the rules are different. It would be interesting to find out how well the relational model would adapt to future applications and demands. Being able to implement a cloud-aware relational model may not be enough, as it needs to be capable of addressing future demands and hence stop serving as a transition bridge from one era of computing to the coming wave.

Google and Amazon are not tied down by existing user bases and a need to protect and leverage existing market shares in the desktop space. These two companies are versed in the ways of running internet businesses. Google has the leading search engine, which Microsoft is yet to challenge successfully in the market. Amazon is well known for its online ecommerce stores, but it is also branching out in other directions. Beyond search and advertisement, Google has assembled a veritable collection of online services and infrastructure, such as the Google App Engine, which sits on top of a proprietary BigTable implementation. Access to such storage services is through Google defined APIs, and at this point of the game, interoperability between these cloud services is the last question on any of the players’ collective and individual minds.

It is not only the commercial worlds that are having cloud dreams; the next release of the popular Ubuntu Linux will also include technologies to allow it to participate in the cloud. The next release of Ubuntu called Karmic Koala will include hooks to Amazon cloud services.

March 3, 2009 Posted by imma | IT | | No Comments Yet

java.lang.NoSuchMethodError: net.sf.ehcache.Cache.<init>

This entry is about the aforementioned exception, which was thrown in an application I am working on.

Environment

The application uses the following libraries

  • Spring framework (version 2.5)
  • Hibernate JPA (version 3.2.x)
  • Acegi Security

Cause

As indicated in the exception, the error is thrown by the ehcache library, in my case version 1.2.3. In simple terms, a constructor of net.sf.ehcache.Cache (shown as <init> in the exception) with the signature the caller expects cannot be found in that version of ehcache. The stack trace of the exception indicates that the absence of the constructor affected the creation of a cache for use by Acegi. The cache bean has been configured in Spring to allow it to be injected into Acegi authentication & authorization beans.

How did version 1.2.3 of ehcache end up in the application’s class path? Well, that boils down to my recent decision to switch from using TopLink as a JPA library to using Hibernate JPA. The Hibernate JPA library in my IDE uses version 1.2.3 of ehcache instead of version 1.3.0, which I also have in my class path.
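When two versions of a library are on the class path, it helps to confirm which jar a class was actually loaded from. The following is a minimal sketch using only the standard Java API; the class name ClassOrigin and its locate method are my own hypothetical names, not part of ehcache, Spring or Hibernate.

```java
import java.net.URL;
import java.security.CodeSource;

// Hypothetical diagnostic helper: reports where a class was loaded from.
final class ClassOrigin {

    static String locate(Class<?> type) {
        // Classes loaded by the bootstrap class loader have no code source
        CodeSource source = type.getProtectionDomain().getCodeSource();
        if (source == null || source.getLocation() == null) {
            return "(bootstrap or unknown)";
        }
        URL location = source.getLocation();
        return location.toString();
    }
}
```

In an application like mine, printing ClassOrigin.locate(Class.forName("net.sf.ehcache.Cache")) at startup should show the jar that won the class path race, assuming ehcache is on the class path at that point.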

Solution

The solution that worked for me (suggested in the reference below) was to remove the old version of ehcache and use ehcache version 1.3.0.

Reference

March 2, 2009 Posted by imma | Development | | No Comments Yet