Monday, July 16, 2007

Class loading in Java and C#

Java and C# are often said to be almost identical languages, and C# is often accused of being just a remake of Java. I would like to show one very core difference in how things work in Java and C#. Of course Java and C# are competitors despite their differences, just as C++, PHP, Objective-C, Python and others are also competitors of Java in specific areas of programming.
I should apologize in advance for not having deep knowledge of C#, so if you find something that you think is wrong, let me know. I describe here a difference I discovered during a project I have been working on for some time. I don't want to talk about this project publicly yet, but some of my friends already know about it, and this blog will probably be the first place I write about this cool project.
The process of executing code has evolved over the years. It started with plain machine code being executed from a specified location. Dynamic libraries came later - these are the .dll, .so and other files you know from any operating system. The major advantage of dynamic libraries is that not all code has to be included in the program itself. This mechanism has been in use since the 80's. A special category of code execution is scripting languages. Applications in scripting languages usually come as a collection of text files that are executed when you open them or access them from the internet (JavaScript, PHP and a lot of others). The major advantage is that these scripts are human-readable and can be modified anytime, including right before execution (in extreme cases also during execution by the application itself - but please do not do this at home without the supervision of an adult person, as it leads to VERY MESSY code).
When Java came, it brought a revolutionary new mechanism for loading code, called dynamic class loading. The idea was great: the application can load any class from any location, including but not limited to the local file system, an internet server, an SQL database etc. However, this feature is not used very often in Java.


C#'s approach

Well, in fact, there is not much to write about. C# is not much different from any other C-family language. Although it uses syntax similar to Java and has garbage collection at runtime, it still behaves much like any older language.
C# in fact doesn't load classes one by one. Instead it loads so-called "assemblies", which are in fact several classes grouped together. As far as I know an assembly usually corresponds to a single .dll or .exe file, but this is not so important here. I also don't know why the classes are not loaded separately, but it may be for performance reasons and because the object-oriented design is partially sacrificed after compilation to CIL. This probably enables cheap non-virtual method calls (in Java, instance methods are virtual by default) and other optimizations made possible by the fact that the classes are not strictly separated and "know more" about each other.
The most important difference is that all code in C# is loaded by the system. As far as I know, the application has no way to take care of the class loading itself. This means the compiled code of a .dll can be cached, so the .dll doesn't have to be compiled from CIL to machine code every time the application runs.
I don't know why Microsoft has chosen this approach, but there are probably three reasons:
  1. C# is more C++ than Java, and compatibility with C++ requires using .dlls and loading in assemblies
  2. performance reasons - enabling caching of compiled code, non-virtual method calls etc.
  3. Microsoft in general is known for the poor architecture of its applications, and Java-like class loading could simply be a thing that wasn't considered important

Java approach

Although Java is older than C#, it uses a much more advanced approach, as it doesn't carry the heritage of older languages and was designed from the ground up without compatibility with any other language.
The class loading process in Java is pretty simple and flexible, and it can be customized by the developer. The key to class loading is, surprisingly, a class called ClassLoader. This class is abstract and can be extended by the developer. Class loaders are structured into a hierarchy. At the root is the bootstrap class loader built into the JVM; the so-called system class loader, which loads your classes from the class path, is the default parent of any class loader you create. If you don't use your own class loader, the system class loader usually does all the work.
So what is the purpose of class loaders? Their name explains it pretty clearly, doesn't it? But to be sure, here are the things the system class loader does:
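To make the hierarchy a bit more concrete, here is a small sketch that walks from a class loader up through its parents. The class name LoaderChain and the "<bootstrap>" label are my own inventions for illustration; getParent() and getSystemClassLoader() are standard ClassLoader methods. The loader built into the JVM is represented by null, which is why the loop stops there.

```java
import java.util.ArrayList;
import java.util.List;

class LoaderChain {
    /** Lists the given loader and all its ancestors, ending with the built-in one. */
    static List<String> chainOf(ClassLoader loader) {
        List<String> names = new ArrayList<>();
        for (ClassLoader l = loader; l != null; l = l.getParent()) {
            names.add(l.getClass().getName());
        }
        // The loader built into the JVM has no Java object; it shows up as null.
        names.add("<bootstrap>");
        return names;
    }

    public static void main(String[] args) {
        // The exact class names printed depend on the Java version in use.
        chainOf(ClassLoader.getSystemClassLoader()).forEach(System.out::println);
    }
}
```

The names you see differ between Java versions, so take the output only as a picture of the parent chain, not as something to rely on.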
  • locates byte-code of classes
  • locates resources (.properties files, image files and whatever else in your .jar archives)
  • passes byte-code of classes to JVM
So what's the great thing about that? Almost every language can load files and code dynamically. The great thing is that you can override all these methods :-). You still don't know what is so great about it? The great thing is that you can load the classes from anywhere (for example download them from an IMAP account if you are really crazy) and you can do anything with the byte-code before you pass it to the JVM.
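Here is a minimal sketch of that idea, not a finished implementation. MapClassLoader, its transform() hook, MapLoaderDemo and Greeter are all hypothetical names; only findClass(), defineClass() and loadClass() come from ClassLoader itself. The byte-code is taken from an in-memory map, which stands in for "anywhere", and I assume the compiled .class file of Greeter can be read as a resource.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical loader that takes byte-code from an in-memory map. */
class MapClassLoader extends ClassLoader {
    private final Map<String, byte[]> byteCode = new HashMap<>();
    int transformed = 0; // how many classes went through transform()

    MapClassLoader() {
        super(null); // parent is the built-in loader only, so our classes are not found there
    }

    void put(String className, byte[] bytes) { byteCode.put(className, bytes); }

    /** Called by loadClass() after the parent loader failed to find the class. */
    @Override
    protected Class<?> findClass(String name) throws ClassNotFoundException {
        byte[] raw = byteCode.get(name);
        if (raw == null) throw new ClassNotFoundException(name);
        byte[] ready = transform(raw);
        return defineClass(name, ready, 0, ready.length); // pass the bytes to the JVM
    }

    /** Hook for decrypting, instrumenting or optimizing; here it only counts calls. */
    protected byte[] transform(byte[] raw) { transformed++; return raw; }
}

class MapLoaderDemo {
    /** Tiny class that gets loaded a second time through MapClassLoader. */
    public static class Greeter {
        @Override public String toString() { return "hello"; }
    }

    /** Reads the compiled byte-code of a class that is already on the class path. */
    static byte[] bytesOf(Class<?> type) throws IOException {
        String path = type.getName().replace('.', '/') + ".class";
        try (InputStream in = type.getClassLoader().getResourceAsStream(path);
             ByteArrayOutputStream out = new ByteArrayOutputStream()) {
            if (in == null) throw new IOException("no class file for " + path);
            byte[] buf = new byte[4096];
            for (int n; (n = in.read(buf)) != -1; ) out.write(buf, 0, n);
            return out.toByteArray();
        }
    }

    static Class<?> loadGreeter(MapClassLoader loader) throws Exception {
        String name = Greeter.class.getName();
        loader.put(name, bytesOf(Greeter.class));
        return loader.loadClass(name);
    }
}
```

Overriding findClass() rather than loadClass() keeps the usual "ask the parent first" behaviour, which is what the ClassLoader Javadoc recommends for most custom loaders.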

So, is there any practical use for that? You probably think I can't seriously expect the IMAP example to convince you that class loading in Java is extra useful. Apart from the project I'm currently working on, which would be impossible without custom class loading, there are a few examples I can think of:
  • you can store your classes encrypted (when used on an untrusted server) and decrypt them at runtime
  • you can instrument the code for collecting statistics (for example number of method calls etc.)
  • you can download classes of specified version directly from Subversion
  • you can optimize byte-code and cache it
  • you can generate custom classes in your class loader
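As an illustration of the first item in the list, here is a sketch of a loader that decrypts a class right before defining it. Everything named here (XorClassLoader, XorDemo, Secret, the key 0x5A) is made up for the example, and the XOR "cipher" is only a toy that is not secure at all; a real setup would use a proper cipher. I again assume the compiled .class file of Secret can be read as a resource.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

/** Hypothetical loader that decrypts one class right before defining it. */
class XorClassLoader extends ClassLoader {
    private final String className;
    private final byte[] encrypted;
    private final byte key;

    XorClassLoader(String className, byte[] encrypted, byte key) {
        super(null); // built-in parent only, so findClass() gets asked for our class
        this.className = className;
        this.encrypted = encrypted;
        this.key = key;
    }

    /** Toy cipher, NOT secure: XOR is its own inverse, so it both encrypts and decrypts. */
    static byte[] xor(byte[] data, byte key) {
        byte[] out = new byte[data.length];
        for (int i = 0; i < data.length; i++) out[i] = (byte) (data[i] ^ key);
        return out;
    }

    @Override
    protected Class<?> findClass(String name) throws ClassNotFoundException {
        if (!name.equals(className)) throw new ClassNotFoundException(name);
        byte[] plain = xor(encrypted, key); // decrypt only in memory
        return defineClass(name, plain, 0, plain.length);
    }
}

class XorDemo {
    public static class Secret {
        @Override public String toString() { return "decrypted and loaded"; }
    }

    /** Reads Secret's byte-code and returns it encrypted, as it might be stored on a server. */
    static byte[] encryptedSecret(byte key) throws IOException {
        String path = Secret.class.getName().replace('.', '/') + ".class";
        try (InputStream in = Secret.class.getClassLoader().getResourceAsStream(path);
             ByteArrayOutputStream out = new ByteArrayOutputStream()) {
            if (in == null) throw new IOException("no class file for " + path);
            byte[] buf = new byte[4096];
            for (int n; (n = in.read(buf)) != -1; ) out.write(buf, 0, n);
            return XorClassLoader.xor(out.toByteArray(), key);
        }
    }

    static String run() throws Exception {
        byte key = 0x5A;
        String name = Secret.class.getName();
        Class<?> c = new XorClassLoader(name, encryptedSecret(key), key).loadClass(name);
        return c.getDeclaredConstructor().newInstance().toString();
    }
}
```

The same findClass() spot is where instrumentation, optimization or class generation from the list would plug in; only the step between getting the bytes and calling defineClass() changes.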
And how do you create a custom class loader? Well, use the Javadoc and your imagination. It's a really long story to tell everything you can do and how. When starting with the Javadoc, try the ClassLoader.loadClass(String) method first and follow everything related to it :-).

But I will tell a short story now. What can't you do with a custom class loader?
  • you can't define classes in java.* packages (the JVM refuses them)
  • you can't customize the compilation of byte-code to machine code (however, you can optimize the byte-code before passing it to the JVM if you feel good enough :-))
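The first restriction is easy to try out. In this small sketch, JavaPackageProbe and the class names passed to it are hypothetical; it relies on defineClass() rejecting any name that starts with "java." with a SecurityException before it even looks at the bytes, which is how the JDK versions I know behave.

```java
/** Hypothetical probe: asks the JVM to define a class under the given name. */
class JavaPackageProbe extends ClassLoader {

    /** Returns true if the JVM refuses the name itself with a SecurityException. */
    static boolean isRejected(String className) {
        JavaPackageProbe probe = new JavaPackageProbe();
        try {
            // Empty bytes are enough: the name is checked before the byte-code is parsed.
            probe.defineClass(className, new byte[0], 0, 0);
            return false;
        } catch (SecurityException e) {
            return true; // prohibited package name
        } catch (ClassFormatError e) {
            return false; // the name was accepted, only the (empty) byte-code was invalid
        }
    }

    public static void main(String[] args) {
        System.out.println("java.evil.Hack rejected: " + isRejected("java.evil.Hack"));
        System.out.println("com.example.Hack rejected: " + isRejected("com.example.Hack"));
    }
}
```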
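And one last thing. Remember that MyClass loaded by class loader A is not compatible with MyClass loaded by class loader B, although the classes are exactly the same. The JVM treats them as two different types, so casting an instance of one to the other fails with a ClassCastException.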
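A short sketch of this trap, with made-up names (ClassIdentityDemo, Payload, OneClassLoader): the same byte-code is defined by two separate loaders, and the two resulting classes share a name but nothing else. It assumes the compiled .class file of Payload can be read as a resource.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

class ClassIdentityDemo {
    /** The class that will be loaded twice, by two different loaders. */
    public static class Payload { }

    /** Hypothetical loader that defines exactly one class from the given bytes. */
    static final class OneClassLoader extends ClassLoader {
        private final String className;
        private final byte[] bytes;

        OneClassLoader(String className, byte[] bytes) {
            super(null); // built-in parent only, so this loader defines the class itself
            this.className = className;
            this.bytes = bytes;
        }

        @Override
        protected Class<?> findClass(String name) throws ClassNotFoundException {
            if (!name.equals(className)) throw new ClassNotFoundException(name);
            return defineClass(name, bytes, 0, bytes.length);
        }
    }

    /** Loads a fresh copy of Payload in a brand new loader. */
    static Class<?> loadCopy() throws IOException, ClassNotFoundException {
        String name = Payload.class.getName();
        String path = name.replace('.', '/') + ".class";
        try (InputStream in = Payload.class.getClassLoader().getResourceAsStream(path);
             ByteArrayOutputStream out = new ByteArrayOutputStream()) {
            if (in == null) throw new IOException("no class file for " + path);
            byte[] buf = new byte[4096];
            for (int n; (n = in.read(buf)) != -1; ) out.write(buf, 0, n);
            return new OneClassLoader(name, out.toByteArray()).loadClass(name);
        }
    }

    /** Returns true if the object cannot be cast to the given type. */
    static boolean isCastRejected(Object value, Class<?> type) {
        try {
            type.cast(value);
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) throws Exception {
        Class<?> a = loadCopy();
        Class<?> b = loadCopy();
        Object fromA = a.getDeclaredConstructor().newInstance();
        System.out.println("same name: " + a.getName().equals(b.getName()));
        System.out.println("same class: " + (a == b));
        System.out.println("cast rejected: " + isCastRejected(fromA, b));
    }
}
```

In practice this usually bites when an object created through one loader is handed to code loaded through another; a common way around it is to share an interface that lives in a common parent loader.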

Ufff. That was interesting. Let me know whether you understand the post or not. I will probably have to fix it a few times before it is clear enough.

Saturday, July 14, 2007

The very first post

OK, here I am. I have finally decided to start my own blog. In fact I have one already, but I haven't written a lot of posts there yet and haven't updated it for quite some time. This should change now. I'm working on some cool stuff now. Or at least I think it's cool stuff, and I would like to share a few ideas about it from time to time.

I would also like to be more personal from time to time and show the public something I usually write only for my drawer. But this won't happen very often, I guess.
So the question is "why another blog?". The answer is probably "because it's mine". I thought a bit about it and decided to start because I really like discussing programming, technology and all that stuff. I spend quite some time chatting with friends about it, and when I discussed such things with a very good friend of mine, we both wrote to each other at the same moment that I should write down some ideas from the chats and share them with the public. By the way, the good friend is Tomas P., or T. Prochazka to be anonymous. Tomas, if you don't want the two people who will read my blog to know your name, let me know and I will delete you :-).