How to Customize An Application

Today's been an interesting day for me.

I found myself rereading this old blog entry from Jason Delmore from last year. It was about why Adobe doesn't give ColdFusion away for free like PHP or Ruby, which of course people still debate today although there's not really any reason to debate it anymore with there now being three separate free CFML engines in various stages of development (Railo, Open BlueDragon and SmithProject) and an upcoming CFML language standard committee.

Whatever your opinion on the idea of standards or for that matter committees, you can't argue that 2008 was anything but an eventful year for the ColdFusion community. Some of the news is just caching up to us like the announcement of the beta for the new Bolt IDE for ColdFusion. The community spoke and Adobe listened. Announcements regarding the features of CF9/Centaur (such as Hibernate ORM) are of course similarly exciting. And then there have been all the wow events in the community like Railo announcing their becoming part of the JBoss project and becoming free/open-source and Kristen Schofield announcing the free educational licensing for ColdFusion and releasing the evangelism kit that finally arms us with some great information for evangelizing the platform, allowing you to "be your own Ben Forta" as she described in her Max presentation. :)

These are all great things for the ColdFusion community, they will be great things for the CF Open Source community generally and personally I know they will be great things for the onTap framework community specifically as we grow rapidly over the next year or two.

ColdFusion is a bit unusual in the way it does things. Although it's not without its challenges as is the case with any language, CF has often been the first in new areas. Notably, CF was the first server of its kind to connect the web to databases seamlessly and easily over ten years ago when Allaire released the very first version. The onTap framework today is in some ways following in that pioneering spirit and right now I'd like to show you a few specific features that make it truly unique in today's CF open source community.

[More]

FireLadder Scaffolding Tool Released!

Today we've released version 1.0 beta of the FireLadder scaffolding tool. This plugin requires DataFaucet version 1.0. We also discovered and fixed a new issue in DataFaucet while developing it, so if you alreaady have DataFaucet installed, now's a good time to update. :) And there were a couple of small issues discovered and fixed in the onTap framework core also, so while it may not be necessary, an update of the onTap core distribution wouldn't hurt either.

So you're probably wondering, "where do I download it?"

The answer - it's on the framework webservice. Just go to your onTap framework application wherever you installed it, load up the plugin manager, select the "more" tab, hit "search" and select the "install" button next to the FireLadder plugin! Find and install it directly within the framework! (Just like Eclipse! Just like Firefox!)

After downloading an extracting the onTap framework 3.2 distribution, your plugin manager can be found at:

http://localhost/ontap/admin/plugins/

Simply replace "localhost/ontap" for wherever you've placed the framework.

Also in case you haven't seen it yet, the installation videos on the framework home page show how to download and install plugins from the webservice.

Although this is a 1.0 release of the scaffolding utility and it is not intended for complex applications, you can expect SOA-style integration. It will create a discrete package of model objects (2 per table), an IoC Container that will be wired into the framework's manager, two views per table and a handful of controller templates.

It was actually quite a bit more complicated than I had expected and turns out to be more complicated than it is for me personally to generally just create simple CRUD apps. A large part of the problem has to do with techniques I don't normally use. For example, I don't normally change the column names in my database when I create my objects - so if I have a table with a column named "productprice", then my object has a property named "productprice" (not just "price"). I also don't create table columns with underscores in my database.

So because the DataFaucet tools recently added features to do both of these things, I found myself being forced to recreate a lot of the automation that was previously built-in to the framework core. The previous automation is still there of course, I just couldn't use it for the scaffolding tool because of the introduction of inconsistencies in naming conventions.

And then I noticed because I was using the Galleon Forums tables to test it, that I also needed a "dePluralizer" because some folks like to name their tables in the plural. This is actually a prime example of why I feel this is a bad idea. Those in favor of plural table names like "galleon_conferences" site as the reason for preferring them that you're naming the table and the table will contain multiples. Okay, that's fine. My problem with this is that this is a non-functional argument. There is no *functional* basis for it.

There are however *functional* arguments for not naming tables in the plural, notably that no matter how much you like the idea of a plural table name, you're not going to name your business objects that way. You're going to name your business objects in the singular and that means ta-da! more work on both your part and on the part of the server, simply because you felt it was "propper" to have plural table names. That's actually a *functional* argument. And you can see why if you look at the FireLadder.cfc in the new scaffolding tool, because it has to dePluralize table names... and it can't even handle all of them because they're not consistent and there aren't any rules that could be applied to distinguish when the plural is created with an "i" vs. when the plural is created with an "s" vs. when the plural is created with "es".

Examples:

Ring Rings
Horse Horses
Status Statuses (keep or drop the e?)
Virus Virii (es or ii?)

There are no consistent rules for pluralization and therefore, no way to guarantee successful translation from singular to plural or vice versa. So the end result is that plural table names means more work and a more fragile and less scalable system.

All that being said, there is some support in FireLadder for plural table names.

DataFaucet ORM!!!

DataFaucet is an ORM tool.

Why DataFaucet? Well two reasons, first it originated with the onTap framework and I like the pun.

Secondly the goal of DataFaucet is to make ORM as easy as getting a glass of water from the tap in your kitchen.

Another ORM? Really? Why?

To give you a little history, my friend Doug Boude, who's been doing a lot of work with Model-Glue in the past couple years had posted a blog the other day about his experiences with ORM. I can't say that I "feel his pain" because honestly, I've never worked with any of the recent ORM frameworks for ColdFusion outside of my own. But my own is an integrated part of the onTap framework... and perhaps largely as a result of that, few people have really looked at it. Doug had asked me once a couple years ago if I could split it out into a separate project - actually several people have over the years - and I've always replied "yes", but just never bothered to do it. So when Doug asked, particularly after having just finished the first draft of the Galleon ports project, I decided to go ahead and split it out for everyone. Doug of course wants to use it with Model-Glue because that's the framework he and his team work with. And that's fine. OnTap has always been open source, so I never had any objection to that, but nobody else was undertaking the task of splitting it out.

A year or so ago I found myself explaining some of the features of the SQL abstraction library to a coworker who said to me "you know Transfer already does all that"... umm... sure... and Transfer wasn't even a twinkle in Mark Mandel's eye when I started working on my SQL abstraction tools... and it's as far as I know still not as robust. Nor is Reactor, in spite of Sean Corfield's comment when I spoke at cf.Objective a couple years ago that it "reminded me of the things we can do with Reactor". I'm not saying this to be derogatory here, I'm just trying to establish a baseline. I started working on ORM with ColdFusion 5, before there even were CFCs. At the time it was HIDEOUS and SLOW and required custom-tags as its syntax. But it did work. It wasn't until much later after the release of ColdFusion MX that I grabbed up the Java API docs and learned how to work with JDBC through reflection and wrote the cover article for the ColdFusion Developer's Journal titled Leveraging JDBC or Just Fetching Coffee. That article, my first for the journal, almost didn't get published. When I originally proposed it the editor at the time felt it would be too advanced for the average ColdFusion programmer. Someone else recommended that I reword the proposal and resubmit it and I suspect they also put a bug in the guy's ear themselves and it got accepted and made its way to the cover. Where it quickly became one of their more popular articles. As far as I know that was still before either Transfer or Reactor had been conceived. It wasn't until still a while later that I finally rewrote the entire SQL abstraction library using CFCs instead of custom tags.

What struck me about Doug's comments about their experiences with ORM is that none of the things he described as being major headaches with their choice of ORM would have been an issue if they'd been using the onTap framework. And that really may have been the thing that finally convinced me to go ahead and do the leg-work to separate these tools for everyone. It is a little more work than I'd expected, but not much. I'll be done with the initial draft in a few more hours.

Two of Doug's comments in particular are especially important to note here. The first is his comment about what he calls "gluttony" -- the ORM components being "resource hogs" and slow as hell... While I've never been the first to champion the cause of performance tuning, I told Doug that, "gee when I tested them, they only added about 35ms per query over the top of Ray's original ad-hoc sql" (from Galleon) -- which he said sounded reasonable, leading me to believe that it's more efficient than the ORM they chose. That's really interesting because the ORM they chose was a code-generator like Reactor or I presume Transfer. I don't know which one they chose. But the conventional wisdom is that generated code is more efficient... apparently except when compared to mine. DataFaucet will not be a code-generator -- it abstracts SQL on the fly at run-time, and in truth, a number of its features like and/or keyword search filtering could not be accomplished with generated code.

The second thing and probably more important than the performance question, is something I'll actually quote from Doug's blog:

And ah, the grandest reason of them all that I pretty much despise ORMs: losing my beloved sql. Now, I wouldn't have a problem with losing sql as long as my ORM provides me with a suitable substitute. But it does not. The task of translating a simple sql statement into "ORM-speak" is far, far from simple, my friend. I and others I know have quite literally spent an entire DAY trying to figure out how to write the code, leveraging our ORM, to execute a simple join.

You mean like this?

<cfscript>
statement = ds.getStatement("select").init("products"); // query the product table

cat = statement.join("productcategories",true,"productcategoryid"); // join the category table

cat.filter("productcategoryid",rc.productcategoryid); // filter products by category

query = statement.execute(); // run the query
</cfscript>

hmmm... makes me wonder what ORM-speak looks like... must be hideous... which brings me back to the name... DataFaucet -- relax, have a drink. :)

I actually wrote an article that's included with my ports of Ray's Galleon Forums about having converted Ray's business objects to use the DataFaucet CFCs and cataloged all the changes in an Excel spreadsheet. Overall I found that in addition to a goodly amount of decoupling (it's not limited to the big-4 databases anymore), I was also able to remove about 600 lines of code with a negligible change in performance. In fact, it's unlikely you would even notice the difference as a forum user.

Anyway, all this stuff is already built -- it has been for a while now... all I gotta do is decouple a few leftover helpers (half of which were holdovers from CF5). The other half were just because I didn't want to duplicate code. Although I'm going to have to put up with some duplicated code because if I'm going to support both projects then I'm not going to branch the code, I'm gonna have to rewrite some of the onTap framework core to use the new version of DataFaucet.

UPDATE: So it's actually approved on RIAForge now. There's some code in the SVN repository, but it's in an ugly state right now. You can look, but don't say I didn't warn you. :)

As I've been working on separating it from the framework I've also realized that, as clean and simple as I thought the syntax was before, there are several ways I can improve it even further, so I'm going to be adding some new methods to make those short query builds even smaller as a matter of syntax sugar. For simple queries, you'll be able to do something like <return ds.select(table="mytable",filters=arguments).execute() />. And since the project is already approved (thanks Ray) :) I posted a syntax sample on the DataFaucet blog.

Nightly Build

In the process of working on the Galleon ports I noticed that the sql library's subqueries feature wasn't working in update statements, so I resolved that and while I was at it, I added character masking to the form tools numeric validation so that when javascript is enabled, users won't be able to enter non-numeric characters into text fields that have numeric validation.

Here's a sample of how to use the sql update:

<cfscript>
ds = request.tapi.getObject("datasource").init();

ins = { hitcount = ds.getStatement("select").init("mytable","hitcount + 1") };
ins.hitcount.filter("primarykey","keyvalue","=");
update = ds.getStatement("update").init("mytable",ins);

update.setFirstFilter(ins.hitcount.getFirstFilter());

update.execute();
</cfscript>

I declared ins using ColdFusion 8's implicit structure syntax for convenience sake, since the update statement is initialized with the name of the table and a structure with the values you want to insert into the record. That 2nd to last line reuses the filter object from the select statement so that both the select and the update have the same filters - plus not having to instantiate another object is more efficient.

1492

... that must have been the last time I used a stored procedure because I realized earlier today that the code for executing stored procedures via the SQL Abstraction layer was riddled with typos and other syntax errors, even though I remember having no problems with it the last time I used it. So apparently I managed to make a bunch of changes to it in the interim without testing them. (Actually I thought it was kinda funny and ironic that the first error in the sqlagent.cfc was on line 1492.)

Truth is I just generally don't find stored procedures very helpful. I suppose if I had some specific need to make a single trip to the database to get several bits of data at one time for say a console page or a stats page that featured several independent reports, then I might use it. Or if I had an application that I knew needed to connect to a database that would be accessed outside of ColdFusion and I wanted to share some of the business logic with the other application(s) sharing the database, then sure. But on a daily basis I generally find select statements more flexible and less work.

So tonight I went through and updated the storedprocedure.cfc so it works again. I was surprised how much work that involved... Though the better part of the few hours I spent actually was in a hair pulling session over this (pseudocode):

<cffunction name="executeStoredProcedure" access="public" output="false" returntype="struct">
<cfset var param = getParameters() />
<cfset var cfstoredproc = 0 />
<cfstoredproc name="#theproc#" datasource="#thedsn#">
<cfloop index="x" from="1" to="#arraylen(param)#">
<cfprocparam value="#param[x].content#" ... />
</cfloop>
...
</cfstoredproc>
<cfreturn cfstoredproc.resultCode />
</cffunction>

What happens when you attempt to execute this function is that the ColdFusion Server tells you that you're trying to dereference a variable of type java.lang.String as a structure with keys. What's even more frustrating is that it tells you that's happening not on the stored procedure line, but on the cfloop line where it's using the array. The reason why it took me a few hours to figure it out is because the error itself actually has absolutely nothing to do with the parameter array, the loop, the procparam tags or the attributes of the cfstoredproc tag. The error occurs because the cfstoredproc variable which I smartly var'ed as all variables should be (whether they're query names or cffile variables), was set originally to "0" instead of to structNew().

Normally as far as I know, ColdFusion gets rid of whatever variable was there ahead of time, so for example, if you declare cffile to return its structure to "myfile" it will overwrite the myfile variable if there was one. But apparently with cfstoredproc there's a bug that causes it to attempt to use the existing variable as a structure.

So with that mystery solved, assuming you use lots of stored procedures, enjoy not having to declare your procparam tags again. :) Here's the generic pseudocode for executing a procedure:

<!--- get the primary datasource --->
<cfset ds = request.tapi.getObject("datasource").init()>
<!--- get a procedure --->
<cfset proc = ds.getProcedure().init("myproc","result1,result2",attributes)>
<!--- get the results and use them --->
<cfdump var="#proc.execute()#">

The attributes structure is a combination of form and url scopes just like the old Fusebox technique, and the procedure object maps those to the parameters of the procedure, so whatever you put into the browser's address bar will become a parameter to the stored procedure. This is the named proc parameters that Macromedia / Adobe haven't been able to figure out like I mentioned in the Iron Man blog. Output parameters if you use them will then be reset in the procedure object's parameter values.

cfqueryparam

This isn't really related to anything in particular. Someone just posted this on the cf-talk list:

We have a large content management application used by retailers to manage their website content and website storefront. The application uses ColdFusion 7.02, MSSQL Server 2000 and IIS. We have a load balanced environment with 5 webservers. The application has over 10,000 cfquery tags and until recently we did not use cfqueryparam. We upgraded the application to use cfqueryparam in all queries and noticed a significant increase in the JRUN Working Set usage on all webservers. Prior to the upgrade the JRUN working set was flat at roughly 550 mb and after the change the JRUN working set climbed over 800mb.

I use cfqueryparam religiously, so obviously any application I've written will probably do the same thing... but that's not why I'm posting this... I'm posting it because when I read this post I thought to myself "wow, 10,000... I have 1... not 1k, not 100, ONE"... made me chuckle. I get all the benefits of cfqueryparam but I never write a single (cryptic and redundant) cfsqltype attribute. :)

It's not just cool, it's awesome because it's really important to include them (to protect yourself against sql injection attacks) and yet so many of the places I've worked, like the guy who posted this to the mailing list, just haven't. And usually the reason is simple "laziness". People don't use them because they don't like being bothered to write the extra 20+ characters of code for each one. I don't blame them, it's tedious, cryptic and annoying, not to mention that cfsqltype adds coupling. So I don't. But I still use cfqueryparam. :)

(With the exception of cases where I'm using a Query of Query.)

Two New Builds

So I was working on my contact system and I wasn't entirely happy with the performance of a particular page and started timing portions of the page to figure out where I could tune it to get the response time down. It turned out that one of the bigger sources of the overhead was in the display of region names. It's part of the Members onTap plugin, there's a RegionManager object which manages geographic regions from the country down to as small an area as you want. Probably most folks will use it for Country > State > City -- but there's nothing keeping you from entering US > Northwest > Oregon > Portland > South East -- or even more specific if you really wanted to, though it'd probably become pretty cumbersome. Once those regions are in the database however, you get to display the whole region path with the xhtml syntax like this:

<cf_html>
<cfoutput><tap:region regionid="#myregion#" codes="true" /></cfoutput>
</cf_html>

And that would create something like this:

<span>
<span>US</span>/
<span>Oregon</span>/
<span>Portland<span>
</span>

So on the page it looks like

US / Oregon / Portland

But this whole nested set model thing is always kind of a pain in the ass, and the method was more overhead than was really needed. Plus unless it's a menu and the individual region names are linked (which is another option in that tag), it doesn't really need the individual spans in the middle, so I was able to get some better performance out of it by changing the display code as well as retuning the RegionManager object so that it stores 2 queries internally with all the data from both of the tables that store the region information - one for the region and another for the nested set info. So now it uses query of query to get the ancestors or descendants for a particular region. Also eliminated the caching for individual region record structures since it's caching those queries now.

Incidentally, screw Joe Celko -- I learned how to do nested set on my own, before I'd ever heard of Celko. Never read his book either. Just another in a long list of things I've learned on my own only to later discover that even though people looked at me funny when I explained them, they happened to be powerful design patterns written up by someone else, which usually become popular some time after I learned them. :)

I also realized that when I'd implemented the auto-increment code recently apparently I got the sample code from somewhere else and hadn't thought to check the documentation. So I didn't realize that the column name for it wasn't consistent across platforms, so I had to update the framework core to use the different names for different databases. Would have been nice if Adobe had provided an extra key like a cf_autoincrement or somesuch maybe in addition to if not in place of the individual values for different databases. Which is essentially what I did in my code, so I guess at least if I get to use my own development techniques then the issue is moot.

Anyway, this means two new builds today. One new build of the framework core and one new build of the Members onTap plugin.

"AND" / "OR" Keyword Search

I know that verity includes a lot of advanced search features, but I honestly just haven't worked with it much. So I'm not sure if verity offers an "and/or" keyword feature, though I wouldn't be surprised. Then again verity requires you to create and maintain collections and that means double-maintenance if the data is stored in your database and you don't always need the other features verity has. So a while back I decided I wanted the ability to include and/or keywords in searches, (similar to google I suppose) and rather than copying and pasting that code every time, the framework's SQL abstraction layer gave me the ability to simplify the process of creating those and/or filters. The framework code looks something like this:

<cfparam name="attributes.search" type="string" default="" />
<cfif len(trim(atributes.search))>
<cfset search = Datasource.getStatement("select").init("tbl_Content") />
<cfset search.andOrFilter("title,keywords,contentText",attributes.search) />
<cfdump var="#search.execute()#" />
</cfif>

This code then takes whatever you've entered in the search form, breaks it apart into phrases separated by the words "and" and "or"1 and spreads those phrases across the title, keywords and contentText columns in the tbl_content table. Being able to do this with just one line of code is awfully convenient. The resultant query is something like this:

searched for "fish and chicken"

select * from tbl_Content
where (
(title like '%fish%'
or keywords like '%fish%'
or contentText like '%fish%')
and
(title like '%chicken%'
or keywords like '%chicken%'
or contentText like '%chicken%')
)

So with this sql output, you can see that the query will return any record that contains both words, but it also returns results where the words are in different portions of the content, for example the word "fish" in the title and "chicken" in the contentText, so if there's an article titled "let's go shark fishing" with the phrase "are you chicken?" in the contentText, this search will return that record. What makes this really nice however is that as the number of columns increase and/or the complexity of the search string increases, the SQL statement becomes very large and very complicated rather quickly, however, the andOrFilter facade which handles the thankless task of creating the SQL syntax never becomes any more complicated than the one simple line above2.

<cfset search.andOrFilter("title,keywords,contentText",attributes.search) />

Today I realized that the facade was limited to only accessing columns in a single table, so although it worked it was still not as flexible as I wanted it to be. I was working on a contact manager and needing to include two columns from joined tables in the and/or column list. So I updated the facade to allow that and now you can also do this:

<cfset search.join("author a") />
<cfset search.andOrFilter(
"title,keywords,contentText,a.firstname,a.lastname"
,attributes.search) />

It was a really good thing for me, because the query I was working on was massive... Check this out: 3

Aside from the fact that the sql abstraction layer escapes all the column names for me and ensures case insensitivity (even with a case-sensitive database collation), creating this query required me to write a remarkably small amount of actual code, much of which is even reused in other queries.

So anyway, I uploaded a new archive with the enhancement for multi-table and/or filters4.

  1. Actually it's a bit more sophisticated than that even because it uses localized strings for the and/or keywords for international use, so for example, german users can use the keywords "und" and "oder" in their search.
  2. And really, without a facade like this, and/or keyword searching is one of those features that just never gets implemented because, as nice as it may be for the users, it's tedious and never makes it into the list of priorities we can spend time working on.
  3. Thanks to search.getSyntax() for the sql syntax - this lets you output the syntax instead of executing the query for use in generating scripts to be executed elsewhere, as in my case it's easier to debug a query sometimes by copying and pasting the SQL syntax into an IDE provided by the database vendor like Query Analyzer. This way I don't have to sit through the tedium of copying all the query-param values, sifting through the query and replacing question-marks, because getSyntax() performs the merge for me.
  4. I also discovered that when using filterGroups (they allow you to wrap parenthesis around groups of filters like in the query above) if you misspelled the name of a table alias, it was producing an unusual error -- it's supposed to throw a custom "column not found" error, which it wasn't doing, so I fixed that also.

BlogCFC was created by Raymond Camden. This blog is running version 5.5.006. | Protected by Akismet | Blog with WordPress