First, some background. The original NSDL CRS, on which the MMS is being based, was intended to be written in Java with an Oracle database backend. One of the requirements that emerged was very much like the current NSDL Registry integration requirement: to merge the new system with a collection registration system that was already under development in PHP. This requirement was subject to abrupt change, so I decided to use a code generator that would allow easy switching of languages. And yes, until many of the custom functions had been written in PHP, switching the site from PHP to JSP or ASP or Perl would have been as simple as flipping a switch and regenerating all of the forms. At one point I actually had fully functional versions of the CRS deployed in both Apache/PHP and Tomcat/servlets.
The need to work in a more collaborative environment and to release the project for wide deployment as open source software makes continued use of a commercial code generator impossible. This also renders 80% of the current code base unusable: the generated code, while very efficient and highly modular, is a nightmare to maintain outside of the generator interface. Fortunately, most of the generated code tends to be fairly straightforward lists of database tables/rows and forms to edit the rows. The most unusual and difficult-to-replicate feature is a field-level, group-based access authorization interface that controls the ability to Create/Read/Update/Delete (CRUD) not just a complete record but individual fields as well.
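To make the field-level, group-based authorization idea concrete, here is a minimal sketch in Python. It is not the generated CRS code: the `PERMISSIONS` table, the group names, the `collection` record type, and the field names are all hypothetical, and a real system would keep the rules in the database rather than in a dictionary.

```python
# Hypothetical rule table: (group, record type, field) -> allowed CRUD letters.
# The field name "*" holds the record-level default for that group.
PERMISSIONS = {
    ("editors", "collection", "*"): "CRUD",
    ("editors", "collection", "owner_email"): "R",  # editors may read, not change
    ("guests", "collection", "*"): "R",
    ("guests", "collection", "owner_email"): "",     # hidden from guests entirely
}


def allowed(groups, record_type, field_name, action):
    """Return True if any of the user's groups may perform `action` on the field.

    `action` is one of "C", "R", "U", "D". A field-specific rule overrides the
    record-level default ("*") for the same group.
    """
    for group in groups:
        perms = PERMISSIONS.get((group, record_type, field_name))
        if perms is None:
            perms = PERMISSIONS.get((group, record_type, "*"), "")
        if action in perms:
            return True
    return False


def readable_fields(groups, record_type, record):
    """Return only the fields of `record` that the user is allowed to read."""
    return {
        name: value
        for name, value in record.items()
        if allowed(groups, record_type, name, "R")
    }


def apply_update(groups, record_type, record, changes):
    """Return an updated copy of `record`, refusing any field the user may not update."""
    denied = [name for name in changes if not allowed(groups, record_type, name, "U")]
    if denied:
        raise PermissionError("update not permitted for: " + ", ".join(sorted(denied)))
    updated = dict(record)
    updated.update(changes)
    return updated
```

With these sample rules, a guest sees a collection record without its `owner_email` field, while an editor sees every field and can change the title but not the email address.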
I've been doing quite a bit of research into redefining the technologies that we'll be using to rebuild the MMS and integrate that work with the NSDL Registry, including the choice of programming language, web development framework (to handle the standard CRUD stuff), and database. The most obvious choice would be to continue to use PHP and MySQL, given that most of my development tools and recent experience support that decision and I have a lot of existing and useful code written in PHP. But both the Registry and the MMS require strong support for Unicode and XML/RDF processing. PHP currently has some limitations with respect to native Unicode processing and appeared to have even more limitations in its ability to handle XML and RDF, plus much of my existing code could use an upgrade to PHP5. Another factor is that my partners at the U of W are as familiar with Python/Zope/Plone as I am with PHP and would be happiest if we were to use that environment.
Since no one has (yet) dictated that I use a particular language and framework, I set out to study my options. The basic requirements were:
- Strong support for XML/RDF
  - Currently Java has far and away the best XML/RDF support, and one of the options is to use a Java-based services layer to handle this. PHP has an available toolkit for working directly with Java libraries (http://php-java-bridge.sourceforge.net/). I should point out that there's a library to access Python libraries from PHP too (http://www.csh.rit.edu/~jon/projects/pip/). Surprisingly, one of the most extensive non-Java RDF libraries is for PHP (http://www.wiwiss.fu-berlin.de/suhl/bizer/rdfapi/).
- Native Unicode support
  - Currently this is great in Java, very good in Python, and crummy in PHP, although they're working on it (http://wiki.cc/php/Unicode). This work is being pushed hard by Oracle and IBM, especially IBM, whose widely respected and mature ICU library will be used.
- Available and useful tools for relieving me of the bulk of grunt coding chores (I don't enjoy writing code). In other words, a good web development framework like Zope. There's a long list of subrequirements for this, including simple creation of:
  - lists with column sorting, row highlighting, display of default sort column, paging, variable rows/page, interactive filtering, selectable link to item display/edit, variable column display based on ACL or user selection, selectable link to row detail, and persistence of user-set display parameters during and across sessions
  - pretty URLs
  - support for flexible caching
  - user/group authentication
- Support for internationalization (i18n)
- A good IDE and a good debugger (I especially don't enjoy writing code in text editors and I'm an integrated debugger junkie)
- Broad database support (at least MySQL, Postgres, Oracle, MS SQL, SQLite)
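As an illustration of why the native Unicode requirement matters, here is a small Python sketch (Python 3, where strings are sequences of code points). The sample text and the helper names are made up for the example. It shows the difference between counting characters and counting UTF-8 bytes, which is the kind of problem that byte-oriented string functions run into.

```python
# Sample text with two non-ASCII characters (written with escapes so the
# source file's encoding does not matter): "na\u00efve caf\u00e9".
TEXT = "na\u00efve caf\u00e9"

# A Unicode-aware string type counts characters (code points)...
code_points = len(TEXT)

# ...while the UTF-8 encoding of the same text is longer, because each of the
# two accented letters takes two bytes.
utf8_bytes = len(TEXT.encode("utf-8"))


def truncate_bytes(text, limit):
    """Byte-oriented truncation: may cut a multi-byte character in half."""
    return text.encode("utf-8")[:limit]


def truncate_chars(text, limit):
    """Character-oriented truncation: always returns valid text."""
    return text[:limit]


def is_valid_utf8(data):
    """Return True if `data` decodes cleanly as UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True
```

Here `truncate_bytes(TEXT, 3)` stops in the middle of the accented letter and produces bytes that no longer decode as UTF-8, while `truncate_chars(TEXT, 3)` keeps the letter whole.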
I started with Python. I spent several weeks getting familiar with the language (I like it a LOT) and looking at Python frameworks and helper modules:
- Zope without Plone
- TurboGears (my personal favorite)
- JonsPythonModules (I definitely liked the name)
- SQLObject (DB abstraction layer)
- SQLAlchemy (DB abstraction layer; very powerful when combined with Myghty -- http://www.myghty.org/trac/wiki/ZblogReadme)
- Wing IDE (my personal favorite)
I also looked at PHP frameworks again -- I had started looking at these last fall when I realized that I'd have to abandon my code generator:
- Symfony (my personal favorite)
- Qcodo (my other personal favorite, sigh)
- PHPHtmlLib (my other personal favorite, sigh)
- EZPDO (DB abstraction layer)
- Propel (DB abstraction layer)
Of course, I already have a PHP IDE that I like a lot.
I also gave a cursory glance at Ruby on Rails but haven't spent enough time with it, and I should.
This boils down to:
- Python: TurboGears or roll my own from parts
- PHP: Symfony or Qcodo (they are very different), leaning heavily toward Symfony at least in part because of the incredibly nifty sample app, or roll my own from parts (very tempting)
- Ruby: Ruby on Rails
So right now (01/18/2006), despite the fact that I really, really like Python, I lean very heavily toward PHP/Symfony: I have a bunch of code in PHP, I think I can work around the UTF-8 issues in the short term, the available RDF library is very impressive, XML support in PHP 5.1 is excellent, and Symfony looks like it will do just what I need (but I'm sooo tempted to roll my own).
I'll add links to all of these and some pro/con comments to some.