Saturday, May 17, 2008    
Home My Books Blog ColdFusion About Me Back    

Calendar
<< Jan 2007 >>
S M T W T F S
  1 2 3 4 5 6
7 8 9 10 11 12 13
14 15 16 17 18 19 20
21 22 23 24 25 26 27
28 29 30 31      

Search

Categories
 • Adobe (61) [RSS]
 • AdobeMAX06 (45) [RSS]
 • AdobeMAX07 (59) [RSS]
 • AdobeMAX08 (6) [RSS]
 • AIR (96) [RSS]
 • Appearances (105) [RSS]
 • Books (66) [RSS]
 • CFEclipse (14) [RSS]
 • ColdFusion (1081) [RSS]
 • Flash (91) [RSS]
 • Flex (319) [RSS]
 • Jobs (81) [RSS]
 • JRun (12) [RSS]
 • Labs (27) [RSS]
 • LiveCycle (12) [RSS]
 • MAX (141) [RSS]
 • Regular Expressions (12) [RSS]
 • SQL (36) [RSS]
 • Stuff (492) [RSS]
 • Tips (CF Studio) (80) [RSS]
 • Tips (CF) (795) [RSS]
 • Tips (Dreamweaver) (91) [RSS]
 • Tips (Flex Builder) (2) [RSS]
 • Using CF (131) [RSS]
 • Wireless (96) [RSS]

Other BLOGs
 • Ray Camden
 • Tim Buntel
 • Sean Corfield
 • John Dowdell
 • Steven Erat
 • Brandon Purcell
 • Charlie Arehart
 • Full As A Goog

RSS Feeds
 • Feed
 • Subscribe

Join my mailing list and find out about new books and other topics of interest.

Thoughts, ideas, tips, musings, and pontifications (not necessarily in that order) by Ben Forta ...
NOTE: This is my personal blog, and the opinions and statements voiced here are my own.

Viewing By Entry / Main
January 4, 2007

Performing SQL Server CONTAINS Searches

SQL Server 2005 supports two forms of full-text search, FREETEXT and CONTAINS. CONTAINS is used to search for rows that contain words, phrases, partial phrases, words with the same stem, proximity searches, synonyms (using a thesaurus lookup), and more.

Here is a simple example:

SELECT note_id, note_text
FROM productnotes
WHERE CONTAINS(note_text, 'handsaw');

WHERE CONTAINS(note_text, 'handsaw') means find the word handsaw in column note_text.

CONTAINS also supports the use of wildcards:

SELECT note_id, note_text
FROM productnotes
WHERE CONTAINS(note_text, '"anvil*"');

'"anvil*"' means match any word that starts with anvil. Note that unlike LIKE, full-text searching uses * as the wildcard character (instead of %). Wildcards may be used at the beginning or end of a string. Also, when passing simple text to CONTAINS then that text is enclosed within single quotes. When passing wildcards each search phrase must be enclosed within double quotes inside those outer single quotes. Failing to do this will likely cause your searches to return no matches.

CONTAINS also supports Boolean operators AND, OR, and NOT. Here are a couple of examples:

SELECT note_id, note_text
FROM productnotes
WHERE CONTAINS(note_text, 'safe AND handsaw');

SELECT note_id, note_text
FROM productnotes
WHERE CONTAINS(note_text, 'rabbit AND NOT food');

When searching through extremely long text there is a greater likelihood of matches being found if search terms are near each other in the saved data. A simple AND search matches terms anywhere in the text, but NEAR can be used to instruct the full-text search engine to only match terms when they are close together. Here is an example:

SELECT note_id, note_text
FROM productnotes
WHERE CONTAINS(note_text, 'detonate NEAR quickly');

'detonate NEAR quickly' means match only rows that contain the words detonate and quickly near each other.

Sometimes you may want to match a word that is part of the same family (based on the same stem). For example, if you were searching for "life" you'd also want to match "lives". Obviously, a wildcard of life* could not help here, and using li* would likely match too many false positives. This is where inflectional matching helps. Here is an example:

SELECT note_id, note_text
FROM productnotes
WHERE CONTAINS(note_text, 'FORMSOF(INFLECTIONAL, life)');

'FORMSOF(INFLECTIONAL, life)' instructs the full-text engine to look for any words that share the same stem as the specified word, in this case "life".

FORMSOF() also supports THESAURUS searches, where words can match synonyms. To use this functionality you must first populate an XML thesaurus file with words and their synonyms.

TrackBacks
There are no trackbacks for this entry.

No trackback URL. Trackbacks are only allowed via interactive form.

Comments
Why should I use SQL Freetext instead of the Verity search function in ColdFusion?

Verity can index database querys aswell. And what about documents (word, pdf, html etc.) in SQL Freetext ?

-- Mikkel, from Denmark
# Posted By Mikkel Johansen | 1/4/07 5:26 PM
Mikkel, maybe you shouldn't. ;-)

The benefits are no limitation on document count, no external collections to have to manage, searching can be used by any database client (not just CF), it does indeed index Excel and Word and other formats, being able to combine WHERE clauses that perform full-text search and other filters is very useful.

The downsides are that Verity is more capable, SQL Server full-text can't do the 'did you mean' type search, it has a much less powerful query language, as an example.

--- Ben
# Posted By Ben Forta | 1/4/07 6:35 PM
I'm very interessted in freetext search (SQL and Verity) so I have more questions.

What about performance - Verity vs. SQL?

I guess that Verity is also better at showing summary, highlighting, categories.
# Posted By Mikkel Johansen | 1/4/07 7:02 PM
Mikkel, I have no performance data. Verity is far more feature rich, without question.

--- Ben
# Posted By Ben Forta | 1/4/07 8:07 PM
I've always been hesitant to enable full text search because I've always been under the impression the performance hit to the DB server is substantial. Any ideas on this? We tried it once at the office and when the DB was being indexed the server became slow and unresponsive to queries. Would that only happen on the initial full-text indexing or would it do so every time the index was refreshed?
# Posted By Joe Mastroianni | 1/5/07 8:21 AM
Joe, I've not used it with tables big enough to really get a feel for it. But, reportedly you have quite a bit of control over how indexing service resources arrre alllocated and used.

--- Ben
# Posted By Ben Forta | 1/5/07 10:06 AM
"it does indeed index Excel and Word and other formats"
This feels new! If it can easily index Excel and Word, AND pdf then I'm really excited about this. Verity has lovely feature sets, but (and this easily can be a product of my addled brain) I've always had issues with each use of the Verity engine over something: indexing, corruption, ease of use. Again, I accept that I've got the low-gear brain on Verity, but using SQL Server alone would make me more happy. Do you have any nice handy links on indexing external documents with 2005 SQL?
# Posted By Stephen Cassady | 1/5/07 11:05 AM
I'll chime in here since I've used both Verity and MSSQL FTS. FTS is by FAR slower then using verity. If you don't believe me, try doing a FTS against a table with 10,000 rows vs. Verity with 10,000 documents. Verity will beat it hands down. FTS has always had this problem. The only way to improve FTS is to throw alot of high performance harddrives, RAM and servers at it. Also hiring a professional DBA to come and fine tune would be a requirement if you're planning to use it in an enterprise environment. In any case plan to shell out some big bucks to get it up and running properly.
# Posted By Tony Petruzzi | 1/5/07 3:57 PM
"it does indeed index Excel and Word and other formats"
Yes, this has always been the case, ever since SQL FTS shipped with SQL Server 7.0. This feature and many others related to SQL Server Full Text Search (FTS) have been greatly imporved in SQL Server 2005 relative to past version of SQL Server, including vastly improved FT Indexing performance. I've worked with SQL FTS ever since it was first incorporated in SQL 7.0 in 1998 and have spoke at several conferences on this and related topics. I cannot speak about SQL FTS performance relative to Verity, but 10,000 rows is a trival table size and can be FT Indexed with SQL 2005 in minutes and no DBA is necessary. I also maintain a blog on this topic at http://jtkane.spaces.live.com/ and porting all of the technical content over to http://www.SQLFTS.com this month with a more formal lauch in Feb. 2007.
Thanks,
John
# Posted By John Kane | 1/6/07 6:18 PM
Thanks for this! But I still have a problem. I'm using dynamic variables and id somebody searches under 2 words. is it like this???

WHERE CONTAINS(note_text, #form.search# AND #form.search');
# Posted By NejcPass | 7/3/07 5:13 AM
Thanks for this! But I still have a problem
If i use "select TextSearch from CLGHistory where contains(TextSearch,'"1ALL*"') " then it will disply me the records.
but if i use "select TextSearch from CLGHistory where contains(TextSearch,'"*1ALL*"') " it will not display me any results
Pls help me
# Posted By Mitesh Patel | 8/22/07 9:29 AM

  © Copyright 1997-2008 Ben Forta, All Rights Reserved