I just spent some time helping a very irate customer. You know, the "ColdFusion sucks" "we're going to dump CF" "how can you sell this ****" brand of irate. The cause of all of this anger and frustration was very serious ColdFusion performance issues. And I am sure the individual has been under a lot of pressure and so the ramblings are somewhat understandable.
But, at the end of the day, as is so often the case, the bottleneck was in database access and query processing. No, there was nothing wrong with the DBMS. And no, there was nothing wrong with the SQL used. And no, the connection and data source settings were not the problem.
So what was the problem? Simply, ColdFusion was being made to do the DBMS' job! I found dozens of queries that were returning data that was being used solely to feed other queries. I saw queries being returned and then filtered client-side, within ColdFusion. I even saw queries (some that were quite long running) that were never being used once executed (leftover code from previous edits, I assume). And more.
I've been saying this for years, but, let's go through this again. ColdFusion is NOT a DBMS. Whatever database back-end you are using, chances are that it is a big powerful application that is designed to do one thing and do it well, manage and manipulate data. It makes absolutely no sense to give your DBMS less work to do while making ColdFusion work hard doing what the DBMS is supposed to do, what it is designed to do, what it is optimized to do.
So, a few rules:
- Don't ever retrieve data you don't need.
- Don't ever retrieve too much data, only to filter the results within ColdFusion (via querying a query, or excluding data while looping).
- Don't ever retrieve data if that data is only going to be used to drive additional queries. If you have a query, and then are using the results (perhaps in a <cfloop>, or via a ValueList()) in subsequent queries, then you need to find a way to consolidate those queries and do the work on the DBMS. And yes, that may mean writing JOINs or using stored procedures.
- Don't ever perform calculations and aggregations within ColdFusion, unless the data being used is already retrieved for some other purpose. Yes, the SQL aggregate functions (and having to use GROUP BY and figuring out how to make that work while using WHERE or JOINs) can be a little tricky. But do it anyway.
- Don't ever underestimate the important of caching query results. If you use the data frequently, and it is not changing as frequently as it is used, then cache it!
- There's more, but you get the idea.
Keep the following in mind:
- ColdFusion is never going to be able to query/sort/filter/count/sum/average/etc. as fast as your DBMS will. That is not what ColdFusion is optimized to do, and it is exactly what your DBMS is optimized to do.
- Sending unnecessary data from your DBMS to ColdFusion creates unnecessary network traffic, puts unnecessary load on the ColdFusion server hardware, unnecessarily increases server memory load, and unnecessarily slows down your and other requests.
- More often that not, your ColdFusion box is going to be under heavy load while your DBMS box sits with cycles to spare. It makes no sense to not leverage that power and potential.
- And as a side benefit, by moving data processing and manipulation back where it belongs, on the DBMS, you significantly increase the likelihood that you'll be able to reuse that work and effort.
And a couple of side notes:
- None of these comments are ColdFusion specific, and are just as applicable to applications developed in PHP, ASP, JSP, etc.
- You MUST learn how to use your DBMS. In the situation that triggered this post, one particularly problematic query was taking over 10 seconds to execute. Adding an index to the table used dropped that down to a couple of milliseconds!
- "My DBA does not let me write stored procedures" is not a valid argument, and is all the justification you need to get a new DBMS or a new DBA!
Considering how data centric ColdFusion applications tend to be, all of this is critical!
Bottom line, you have a DBMS for a reason, use it!
Also, CF query caching is fine, but especially if your app "owns" the db (no third party apps change the db without your apps knowledge) consider an IsDirty caching strategy. If there are third party apps, try to implement some kind of distributed IsDirty notification - time based db caching really isn't an optimal approach for most use cases.
http://www.pbell.com/index.cfm/2006/6/26/isDirty--...
Ben - don't know if you ever say Charlies article:
http://carehart.org/blog/client/index.cfm/2006/10/...
The easier captcha is really nice and still seems to block the spam . . .
I once had a project where data is stored like this:
Users (Userid, mydata1, Mydata2, mydata3, mydata4, etc.. )
instead of like this:
Users (UserID )
Users_Data (userID, dataID)
Data (DataID, MyData)
The design violated some form of relational DB design for sure. The business decision (above me) was made to "deal with it" instead of fixing it. When we had to perform aggregate functions off the data, we ended up retrieving the data, creating a query inside CF along the lines of:
ComputedQuery: (UserID, DataID, MyData)
And then using CFs QOQ's aggregate functions to get the totals we needed.
I wish we could have fixed it, but...
Excellent post, SQL was created for the sole purpose of managing data. Let it do it's job!
Right you are.
Today, I can think of alternate "more-SQL" oriented approaches to the problem.
If I come across this again, I'll probably create a view which contains normalized data (using the Union operator) and the can just run the aggregate functions from the view.
At the time, we chose to address this in CF. I don't remember if there were other circumstances surround that decision.
--- Ben
--- Ben
This is very important if you have a project that you are sure that it will work on a platform that you know.
But we had some difficulties when I was working for a company which has a big content management system. Depending on customer needs we would like to suggest as much as possible different DB platform as solution and we had decided to do minimal work on DB level and we wre using DBs such as storage platform with some minimal work such as indexing etc.
Our system that we had created is still working on MSSQL, MySQL, PostgreSQL, Oracle, DB2 and Sybase. There are some DAOs for every database platform and they deals with SQL stuff depending on customer needs.
I mean it also depends on your project target.
I think the main issue is caching for application performance. I am working on a customer project now where my main problems are caching queries and objects. I have some pages that they gets same lookups from DB again and again or some pages where I have same object creation lots of time. We need to educate developers to learn how they can create optimized codes.
> management system. Depending on customer needs we would like to suggest as much as possible
> different DB platform as solution and we had decided to do minimal work on DB level and we were
> using DBs such as storage platform with some minimal work such as indexing etc.
To quickly comment on this, there is no reason why we should lower DB load process to the least common denominator. If we are going to support multiple DBs, let’s do it the right way by providing specialized interfaces that make good use of the different DB capabilities.
The same issue would occur with PHP, .NET or any other language. ColdFusion is good (well after 11 years using it fantastic). But as Ben says, let it do what it does best and let your DEDICATED support systems integrate nicely and not hamper it.
If you are doing anything serious with DBMS or email it is always better to share the load across servers rather than hulk it onto a single box running everything. Hardware is cheap. Patience is a virtue few of us have these days.
If running a decent ColdFusion shop you would at the least have a seperate server for ColdFusion, Email and DBMS, decent HDD space and RAM, and good bandwidth. With close monitoring you should rarely have major issues.
I can't believe that you even had to write something like this up. Are people out there still that stupid that they don't understand how to write queries from a database. Have people out there never heard of views and indexed views?
CFINPUT is used in conjunction w/ CFFORM, and has nothing to do with database queries. You probably meant CFINSERT and/or CFUPDATE?
Tony,
I don't think you can get very far in any web application w/o knowing how to write a query from the database. I would expect that anyone who can write a SQL query in a CF app should be able to write a view at the database level. The problem here is that people aren't taking advantage of the database. It may very well be that they haven't heard of views.
Perhaps it's also because "It works, and the boss / customer is happy, so why change it."
Altough when you start talking thousands of records, things can get a tad slow.
1) Good Database design helps write fast queries that return the data you require.
2) INNER JOIN (try and get everything you need in one go, with out going crazy with outer joins,etc) - don't loop over a master query to do 100+ seperate queries.
3) CREATE FAKE INDEXES (Use coldfusion to create structures with keys that allow you to "link" back to master records, this helps reduce amount of queries required, and also the amount of looping required.)
These are just a couple of things that help keep things under control.
Cheers
Gareth.
Not only does it feel good to see clients honestly smiling about the performance improvements through the database, but I've been able to generate that with code overhauls as well (rewriting a system from spagetti code to Fusebox, there was dramatic performance issues).
It's sad though that clients get to the point where they're so frustrated by the performance of their application, and in many ways IT/IS is to blame because we fail to use clear and simple language to communicate what their application does, how, and the process it interats with other components. We are terrible at talking shop and making systems "incomprehesable" to our clients, and even people or other departments as required.
Yeah, I often find that clients who have gotten to the stage of rage that you describe Ben have had IT or support individuals who never communicated in a manner that encouraged trust, faith, or belief in the system as well - leaving clients with the impression they're stuck with a "very expensive black box that's too expensive to fix" and needs to be replaced like a commodity - all of it in total.
Sometimes we focus so much on oo, flex, ria's that we forget the basics of making sure the applications are scalable.
I am somewhat knowledgeable about databases, sql server in particular. But I can't imagine a company that has a big and complex database, to not want to hire a dba, to help manage and maintain it.
Some areas do require special knowledge, skills and experience.
I work at a university where many of the colleges and departments have web sites and web applications hosted by the central IT office. More than a few of them have ColdFusion web applications that access data from one of our Oracle servers. The developers of these various applications range in experience from full-time developers to enlightened undergraduate students, but none of them are allowed direct access to the database system: the only way they can access their data is through ColdFusion calls. They have no line of communication to the DBAs, and even if they did, there’s no way the few DBAs we have would be able to handle their query requests in addition to their current workload, certainly not in a timely fashion anyway.
Let’s not forget that one of the wonderful aspects of ColdFusion is that is it very easy to learn how to build basic web applications with just a little time and effort. As a result, a number of ColdFusion developers (I won’t say many, I really don’t know) start out with very little formal training, working on small projects that utilize shared back-end resources they themselves have little control over. When they need to write a new query to retrieve data, they probably refer to whatever ColdFusion book they have at hand for guidance, and while that book might mention stored procedures, it’s not necessarily going to give them the details they need to write such a procedure in their DBMS.
I’m all for promoting good coding and development practices and encouraging people to follow such practices, but we shouldn’t characterize the developers who don’t follow those practices as being deliberately ignorant or “stupid”: they’ll just tune us out, and we’ll have missed a chance to have a positive impact on our development community as a whole.
Very valid point, there are exceptions to every rule. Check out another post I made a short while ago at http://www.forta.com/blog/index.cfm/2006/10/3/Use-..., I think we are in agreement. But having said that, I'd like to think that we can raise the bar somewhat (albeit slowly and as needed), failing to do so comes back and bites CF (and the CF developers) in the butt.
--- Ben
Such situations wouldn't arise if developers spent a little time learning how to correctly design and use a database first. If you are taking the time to learn CFML then why not everything else that goes with it? Really its common sense.
You wouldn't try and build a house just by reading a book on how to lay bricks.
Unfortunately the seeming simplicity of CFML probably does promote more poorly built apps than other languages.
Does this mean Scorpio now returns a query object for insert, update and delete and I can test the number (RecordCount) of rows that were actually inserted, updated and deleted? Or do I still need to write extra code (database or otherwise) to determine that number?
Also, if many people @ your workplace are having the same problems with limited access to data, you should get together and petition for your DBA to create "views." SQL Server and Oracle both provide views for your exact use/case: your DBA can define a view and say precisely what it contains and who has permission to use it. As long as your DBA is willing to manually index the views your performance will be almost (but not quite) as good as using stored procedures.
Anyway, great post Ben.
Please see this thread and comment if you have any better ideas.
http://www.adobe.com/cfusion/webforums/forum/messa...
Thanks, Dave :+)
Just one general comment. I am just good enough to be dangerous with SQL, but whenever I have a question I can't answer I go to Experts Exchange. For the price of a cup of diner coffee I get $90,000 a year SQL experts climbing all over themselves to help me out. They won't get all the column names right and test the code for you, but I can't remember the last time I didn't get what I needed to figure out a given SQL problem in under an hour.
--- Ben
Shouldn't that be select count(x) ? ? you're still returning many many rows with select x, just not as many columns.
P.S. Ben, I don't know how you do it, I really don't... the few hairs on your head you have left must be terribly grey! I know I'd have pulled out all of mine years ago if I was you!!!