Wednesday, 17 August 2011

About JSR-310 - A New Java Date/Time API

Recently, the London Java Community was elected onto the JCP (Java Community Process) board. The JCP is the mechanism for developing standard technical specifications for Java technology. The London Java Community's membership on the board enables us to deliver content on upcoming and in-progress Java Specification Requests (JSRs) back to our growing community. It also gives us the unique opportunity to engage with people working on the JSR projects and with developers who have expressed an interest in contributing. We're currently trialling the process of adopting a JSR. Although the process has not yet been fully defined, the first example we have selected is JSR-310, an improvement to the Java date system. Last month I presented a lightning talk on the topic of JSR-310 and Java Dates. This blog post will go into detail on some of the current limitations of dates in Java and will discuss how you can get involved in the ThreeTen project. The London Java Community JCP group believes this is an important JSR that will lead to an improvement for all developers using Java. If you are interested in finding out more about the LJC/JCP committee please contact @kittylyst on Twitter, or drop me an email for more information.

Dates are a critical part of most software applications. Data models that represent something in the real world nearly always depend on timings or times of occurrence. Originally, Java had only the Date class, and there are a number of good discussions documenting its limitations. One of the largest issues is that Date objects are mutable and therefore not thread safe, which is an immediate problem when scaling out applications that make use of them. Furthermore, Date actually represents a date and a time, and, confusingly, the java.sql package contains further wrappers around it. The functionality in Date and java.sql.Date is so similar that, especially for new developers, it can be unclear when and where to use each. Finally, Date has no support for time zones, so building systems that run in different regions and convert back to a centralised time zone was very complicated before the introduction of the Calendar class.

The Calendar class was added to the language later for TimeZone support and for more complex date problems. The problem with Calendar is that it is still mutable, and formatting a date directly from a Calendar is impossible, once again pushing the developer to understand the exact details of each implementation and when to use each.
The following code example is from a presentation given at JavaOne several years ago, and it highlights the above in a single fragment. A simple and readable piece of code contains 6 different bugs:

// Bug 1: 2007 is not the year you think it is; Date adds 1900 to the year argument, so you need to subtract 1900 to get the true representation
// Bug 2: months are indexed from 0, so, in this instance, 12 is out of bounds
Date date = new Date(2007, 12, 13, 16, 40);
// Bug 3: the time zone ID is wrong and requires an underscore to be correct ("Asia/Hong_Kong")
TimeZone zone = TimeZone.getTimeZone("Asia/HongKong");
// Bug 4: you can't create a Calendar from a Date; you need to create the Calendar first and then apply the Date
Calendar cal = new GregorianCalendar(date, zone);
DateFormat fm = new SimpleDateFormat("HH:mm Z");
// Bugs 5 and 6: a DateFormat can't format a Calendar object like this, and the formatter has not been given the time zone
String str = fm.format(cal);
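
For comparison, here is a minimal sketch of how the same intent (16:40 on 13 December 2007, formatted for Hong Kong) might be written so that it works with the legacy classes. The class and method names (LegacyDateFixed, formatHongKong) are illustrative only, and Locale.ROOT is an addition of mine to keep the output independent of the machine's locale.

```java
import java.text.DateFormat;
import java.text.SimpleDateFormat;
import java.util.Calendar;
import java.util.GregorianCalendar;
import java.util.Locale;
import java.util.TimeZone;

public class LegacyDateFixed {

    // Hypothetical helper: builds 16:40 on 13 December 2007 in Hong Kong and formats it.
    static String formatHongKong() {
        // Correct zone ID; an unknown ID silently falls back to GMT.
        TimeZone zone = TimeZone.getTimeZone("Asia/Hong_Kong");

        // Start from a Calendar in the right zone instead of a Date.
        Calendar cal = new GregorianCalendar(zone);
        cal.clear(); // reset all fields, including seconds and milliseconds
        // Full year, and a named constant for the zero-based month.
        cal.set(2007, Calendar.DECEMBER, 13, 16, 40);

        DateFormat fm = new SimpleDateFormat("HH:mm Z", Locale.ROOT);
        fm.setTimeZone(zone);            // the formatter carries its own time zone
        return fm.format(cal.getTime()); // format the Date, not the Calendar
    }

    public static void main(String[] args) {
        System.out.println(formatHongKong()); // prints "16:40 +0800"
    }
}
```

Even the working version needs three mutable objects and two separate places where the time zone has to be set, which is the kind of friction the rest of this post is about.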

Joda Time

Joda Time introduced several new key concepts into the Date arena that make working with dates and times far simpler from a programmer's perspective.
  • Instant - There is a Joda-Time class of "Instant". "DateTime" is also an instant, but one that adds getters for the human fields. DateTime is immutable and therefore thread safe. Immediately we have a good replacement for Date.
  • Interval - An interval is a time measured from one instant to another. The limitation here is that both end points are in the same Chronology and TimeZone.
  • Duration - Duration is simply an amount of time measured in milliseconds; no timezone or chronology applies to a duration.
  • Period - A period of time defined in terms of fields. This is a really useful feature for working with dates from a human aspect. Whilst machines represent underlying dates as long numbers, we still split things down into months, days and hours. Consider saying "one month from now" in February and then saying the same thing in May. Even though the period is the same (one month), the underlying number of days varies. Periods allow the programmer to specify this kind of logic without having to convert to seconds and then add these to the value of the object.
  • Chronology - From the developer perspective, a chronology is under-the-hood and is the calculation engine which holds the complex calendar rules. In most cases this is ignored, as most users will be using ISO Chronology.
  • TimeZones - A representation of the TimeZone held within a DateTimeZone class.
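
To make the difference between a Period and a Duration concrete, here is a minimal sketch. It is not Joda Time code: it assumes the java.time package, the form in which JSR-310 was later released in Java 8, where Period and Duration carry the same ideas. The class and method names are illustrative.

```java
import java.time.Duration;
import java.time.LocalDate;
import java.time.Period;
import java.time.temporal.ChronoUnit;

public class PeriodVersusDuration {

    // "One month from start" as a Period: the number of days depends on the start date.
    static long daysInOneMonthFrom(LocalDate start) {
        LocalDate end = start.plus(Period.ofMonths(1));
        return ChronoUnit.DAYS.between(start, end);
    }

    public static void main(String[] args) {
        System.out.println(daysInOneMonthFrom(LocalDate.of(2011, 2, 1))); // 28
        System.out.println(daysInOneMonthFrom(LocalDate.of(2011, 5, 1))); // 31

        // A Duration is a fixed amount of machine time, whatever the calendar says.
        Duration thirtyDays = Duration.ofDays(30);
        System.out.println(thirtyDays.toMillis()); // 2592000000
    }
}
```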

JSR 310

JSR-310 builds on many of the concepts that were introduced in Joda Time. However, to give Java the most robust implementation possible, some key issues in Joda Time had to be addressed in JSR 310. Joda Time is not without its flaws, but this doesn't mean it shouldn't be used; it's currently the best production-ready system there is. Stephen Colebourne covers these issues in his blog.

Human and Machine Timelines - "I was born on 29th September 2005" is a human representation, whereas a machine representation is the number of milliseconds since 1970-01-01T00:00Z. JSR 310 separates these concepts in the underlying implementation to make the intent clear.
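
A minimal sketch of the two timelines, assuming the java.time package that JSR-310 was later released as in Java 8 (the class and helper names are illustrative). A human date only becomes a machine value once a time zone is supplied, and the same machine value can fall on different human dates depending on the zone.

```java
import java.time.Instant;
import java.time.LocalDate;
import java.time.ZoneId;
import java.time.ZoneOffset;

public class HumanAndMachineTime {

    // Machine timeline: milliseconds from the epoch for the start of a date in a zone.
    static long toEpochMillis(LocalDate date, ZoneId zone) {
        return date.atStartOfDay(zone).toInstant().toEpochMilli();
    }

    // Human timeline: the calendar date that an instant falls on in a given zone.
    static LocalDate toLocalDate(Instant instant, ZoneId zone) {
        return instant.atZone(zone).toLocalDate();
    }

    public static void main(String[] args) {
        LocalDate birthday = LocalDate.of(2005, 9, 29);
        long millis = toEpochMillis(birthday, ZoneOffset.UTC);
        System.out.println(millis); // 1127952000000

        Instant instant = Instant.ofEpochMilli(millis);
        System.out.println(toLocalDate(instant, ZoneOffset.UTC));                // 2005-09-29
        System.out.println(toLocalDate(instant, ZoneId.of("America/New_York"))); // 2005-09-28
    }
}
```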

Pluggable Chronology - A pluggable chronology can, by its very nature, introduce unexpected state to the system. Consider asking for the month of the year: you could get 13 if the chronology was incorrectly set. It is likely that people don't check this before they make the call. Keeping these things separate is good programming practice, leading to better unit testing and fewer curve balls when debugging problems later.

Nulls - The treatment of null values is always up for debate. Nulls could be treated as the epoch, rather than as a null value. This fuzzy behaviour hides errors and could lead to bugs where you simply didn't realise that you had passed a null. JSR-310 will seek to treat nulls as exactly that, and not assume they are the epoch.
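
The two attitudes to null can be sketched with a pair of made-up helper methods. Neither is taken from Joda Time or ThreeTen; they only illustrate why silently substituting the epoch hides mistakes.

```java
import java.util.Date;
import java.util.Objects;

public class NullHandling {

    // Lenient style: a null silently becomes the epoch, hiding the caller's mistake.
    static long lenientMillis(Date date) {
        return date == null ? 0L : date.getTime();
    }

    // Strict style: a null is rejected immediately with a clear message.
    static long strictMillis(Date date) {
        return Objects.requireNonNull(date, "date must not be null").getTime();
    }

    public static void main(String[] args) {
        System.out.println(lenientMillis(null)); // 0, indistinguishable from new Date(0)
        try {
            strictMillis(null);
        } catch (NullPointerException e) {
            System.out.println(e.getMessage()); // date must not be null
        }
    }
}
```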

Internal Implementation - Most users interacting with dates may not realise the depth of implementation that such a system requires; for example, most people will use standard chronologies and calculations with dates. However, the internal computations are complex, especially when coupled with chronologies and the human and machine timelines. Ensuring this architecture is production standard for all users is not a simple task.

In JSR 310 some key decisions were made to move Joda Time forward by reworking some of the design concepts. What the user sees in the Joda Time API is the tip of the iceberg. A lot had to change in order to have these design considerations in place.
The main issue that might compromise inclusion in Java 8 is the TCK (Technology Compatibility Kit) test suite, which is likely to be required in addition to the unit tests provided with the current ThreeTen implementation.

Getting Involved

The ThreeTen project (the reference implementation of JSR-310) is now on GitHub, so contributions can be made and pulled into the project. ThreeTen is developed entirely in the open and there are still a number of things to be done. Contributions around documentation, testing and performance optimisations are currently up for grabs. One such example would be working out the best way to derive the OffsetDateTime from an instant.
Getting involved is as easy as entering a few git commands and joining the mailing list.

Conclusion

JSR 310 is currently one of the key JSRs and any additional community support should help ensure that the project enters the earliest version of Java possible.

Jim Gough

Please post all comments on the LJC blog.

Monday, 15 August 2011

A cURL drop-in replacement for simplexml_load_file()

When using shared hosting (such as that provided by Dreamhost, Servage, etc.), the very nature of multiple users hosting from a single server (without their own virtual machine) means that a number of concessions have to be made in order to keep the service secure. At the "obvious" end of this spectrum are OS-level measures such as blocking root access, disabling useradd etc., but there are also myriad additional concerns that arise when configuring PHP (and other server-side scripting environments) for use in a shared hosting environment.

In the case of PHP, common measures include disabling functions that allow for file execution (exec, system, proc_open, etc.), turning register_globals and enable_dl off and, the focus of this post, disabling certain functions that allow remote files to be loaded from a URL. This can be achieved by setting allow_url_fopen (or, less stringently, allow_url_include) to false in the server's php.ini file. The main problem this presents for users of shared hosting is that this setting can only be set in php.ini (and can't, therefore, be modified at runtime using ini_set()). The implications of setting allow_url_fopen to false are fairly wide-ranging, as it disables all access to remote files through "URL-aware fopen wrappers", which includes anything that calls include(), include_once(), require() and require_once() with a remote URL.

In a recent project we were working on, we were hit by this problem during testing. The problem actually stemmed from our use of simplexml_load_file(), which makes use of an fopen wrapper internally. Happily, the client URL (cURL) library was installed and functional on the shared hosting, so we were able to write a simple replacement using cURL and simplexml_load_string():

$url = "http://example.com/remote.xml";

if (ini_get('allow_url_fopen')) {
    $xml = simplexml_load_file($url);
} else {
    // Set up a cURL request
    $curl_request = curl_init($url);
    curl_setopt($curl_request, CURLOPT_HEADER, false);
    curl_setopt($curl_request, CURLOPT_RETURNTRANSFER, true);

    // Execute the cURL request
    $raw_xml = curl_exec($curl_request);

    // ...Check for errors from cURL...

    $xml = simplexml_load_string($raw_xml);
}

The code is (hopefully!) self-explanatory, with the possible exception of the (fairly minimal) cURL options. Setting CURLOPT_HEADER to false omits the HTTP headers from the returned output, whilst setting CURLOPT_RETURNTRANSFER to true causes curl_exec() to return the response as a string rather than outputting it directly. Finally, the handling of cURL errors is a topic unto itself, but the error number from the last cURL operation can be obtained using the curl_errno() function, passing in the cURL handle returned from curl_init().

Friday, 12 August 2011

Java 7 Launch Event

Thursday the 7th of July saw one of the largest milestones in Java's recent history, the much anticipated launch of Java 7. My initial reaction was that it didn't seem that long since Java 6 was launched. However, it was almost 5 years ago, the technical equivalent of when dinosaurs still roamed the Earth.

I think it would be naive to assume that nothing has happened during the reign of Java 6; in fact, it's quite the opposite. The first step towards a better Java was the open sourcing of the JDK, giving developers the opportunity to fix and work directly on the platform. Although I don't have a direct quote, at the LJC Open Conference in November 2010 it was stated that between the first version of Java 6 and the latest there was a performance gain of over 50%. In Java 6 update 14 we saw the introduction of the G1 Garbage Collector, another revolutionary change to the options that developers have in terms of tuning and performance. Politically, Java has changed hands and governance, passing from Sun to Oracle, and people have their opinions on what this means for Java.

Up until Thursday I was very skeptical about where Oracle might go with Java and how that would change the language I have built my career on. Which brings me to the launch day itself, which began with a webcast held across the world by Oracle. If you missed the webcast you can view it here. We got to hear from some of the senior developers on the Java project, and also from community speakers from all corners of the globe. For the London Java Community, it was an absolute honor and pleasure to see what began as a small group of users, and has now risen to almost 1800 members, represented by Ben Evans. I really enjoyed the webcast and thank Adrian Woodhead at lastfm for their hospitality in hosting the webcast for other members of the LJC. That gesture of sharing was also not far from the content of Oracle's talk; as many people appreciate, one of the biggest successes of Java is us - The Community. The value of this has certainly not been underestimated by Oracle and this was evident from the content of the webcast. So what do I think the main themes were to take away from the talk and the event?
  • Java 7 is an evolutionary release, not a revolutionary release - Mark Reinhold stated that one of the best things about Java 7 is the fact it is shipping.
  • Increase in stability
  • Increase in performance
  • Increase in maintainability
  • ...Although not said explicitly, the feel of the community is Java 8 is definitely not as far away as the gap was from 6 to 7.
The event then continued with kind hospitality at the Oracle offices, where people from across the community came together with a definite buzz and excitement in the room about Java once more.

What is new in Java 7? This could be an entire post/book in itself and each individual improvement could be gone into in great detail. I'll attempt to pick out a few key points below and give links for more information. I'd also recommend reading Mark Reinhold's blog. The best summary of the below I have read so far is in Java 7 developer MEAP, by Ben Evans and Martijn Verburg.
  • JSR 292 Invoke Dynamic
    • This is one of the major steps that we are seeing towards a change in the way that we view Java. What even is Java? I think we are seeing a separation of Java the language and the Java Virtual Machine. Strictly speaking, it's not really the Java VM now, but just the VM. Highlighted by the gentleman in the video wearing a Python jacket over a Java T-Shirt, the VM is now home to many dynamic languages. Invoke Dynamic is another step towards multi-language support for the JVM. Simply put, it allows a non-Java call to be made and for the linkage to be determined at runtime... OK that wasn't so simple, but you can read more about the project here.
  • JSR 334 Project Coin
    • Project Coin is all about small changes to the Java language that make daily use of the language easier. A few of my personal favorites are:
      • Ability to switch on String! Finally, it's only taken 15 years.
      • Diamond Operator, no longer do you need to repeat the generic type arguments on the right hand side if they are present on the left:
        • Old Way: List<String> myFingersAlreadyHurt = new ArrayList<String>();
        • New Way: List<String> jsr334SavedMyFingers = new ArrayList<>();
      • try-with-resources - let's get rid of some boilerplate code
    • You can find the full list here.
  • Better Unicode Support
    • Not too excited about this because it's not a problem I run into. However, from speaking to a few people about this at the event - this was going to save them a lot of hassle.
  • JSR 203: NIO.2 and File System
    • Finally, we have a decent API for interacting with Files and a scalable approach to asynchronous I/O.
    • I'm not an expert on the rest of this JSR yet, so might be worth having a read here if you want to know more.
  • JSR 166y Fork Join Framework
    • Doug Lea and his concurrency experts have created the Fork Join Framework. It gives concurrent tasks that involve splitting a larger programmatic problem into smaller blocks of computation (e.g. Merge Sort) a nice framework in which to operate. The ForkJoinPool uses workers to perform the tasks placed upon it, and workers that are no longer busy are capable of stealing tasks from other workers.
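
To give a flavour of a few of the items above, here is a minimal sketch that uses the three Project Coin changes mentioned and a small Fork/Join task. All class, method and value names are made up for illustration, and the Fork/Join part sums an array rather than implementing merge sort, simply to keep the example short.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

public class Java7Highlights {

    // Project Coin: switch on a String (the level names are made up).
    static int priority(String level) {
        switch (level) {
            case "high":   return 3;
            case "medium": return 2;
            default:       return 1;
        }
    }

    // Project Coin: diamond operator and try-with-resources.
    static List<String> readLines(String text) throws IOException {
        List<String> lines = new ArrayList<>();
        // The reader is closed automatically, even if readLine() throws.
        try (BufferedReader reader = new BufferedReader(new StringReader(text))) {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        }
        return lines;
    }

    // Fork/Join: sum a range by splitting it in half until it is small enough.
    static class SumTask extends RecursiveTask<Long> {
        private static final int THRESHOLD = 1000;
        private final long[] values;
        private final int from;
        private final int to;

        SumTask(long[] values, int from, int to) {
            this.values = values;
            this.from = from;
            this.to = to;
        }

        @Override
        protected Long compute() {
            if (to - from <= THRESHOLD) {
                long sum = 0;
                for (int i = from; i < to; i++) {
                    sum += values[i];
                }
                return sum;
            }
            int mid = (from + to) >>> 1;
            SumTask left = new SumTask(values, from, mid);
            SumTask right = new SumTask(values, mid, to);
            left.fork();                     // queue the left half; an idle worker may steal it
            long rightSum = right.compute(); // work on the right half in this thread
            return left.join() + rightSum;
        }
    }

    static long parallelSum(long[] values) {
        ForkJoinPool pool = new ForkJoinPool();
        try {
            return pool.invoke(new SumTask(values, 0, values.length));
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws IOException {
        long[] values = new long[10000];
        for (int i = 0; i < values.length; i++) {
            values[i] = i + 1;
        }
        System.out.println(priority("high"));         // 3
        System.out.println(readLines("a\nb").size()); // 2
        System.out.println(parallelSum(values));      // 50005000
    }
}
```

The threshold of 1000 elements is an arbitrary choice; in practice it would be tuned so that each leaf task does enough work to outweigh the cost of forking.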
I think the next few months in the Java community will be exciting ones, with July being very much the month of the cloud and of the question: is Java the right language for the cloud? For those in London that haven't been along to a Java event yet, please join us on meetup.com and come along to our developer pub session to get involved in the discussion.