SQL Server: How Many Times Does a Word Appear in a Sentence
This is, of course, ridiculously easy to do in almost any programming language on the market these days. But what if you want to do this in SQL Server? SQL Server doesn't provide a function for this, so we'll have to create our own. The most intuitive way to do it is to simply split the given phrase by whitespace and compare each word to the word that you want to search. But this is, of course, slow, especially as the phrase gets big.
I found a nice little function in the book Beginning ASP .NET 2.0 E-Commerce in C# 2005 by Christian Darie and Karli Watson. What they do is replace each instance of the word with a word that is one character longer. Then, the length of the new sentence - the length of the old sentence will give you the number of times the word appeared in the sentence.
An example:
- The word: "word"
- The sentence: "The word we're looking for is the word 'word.'"
- The replaced sentence: "The wordx we're looking for is the wordx 'wordx.'"
- Difference in lengths of the two sentences: 3, which is the number of times the word appears.
Pretty neat. ;-)
The reason to use the REPLACE function instead of doing it the normal way, of course, is because REPLACE is faster. The code, then, which I'm copying almost verbatim from page 174 of the book:
CREATE FUNCTION dbo.WordCount(
@Word VARCHAR(20),
@Phrase VARCHAR(1000))
RETURNS SMALLINT
/* @BiggerWord is a string one character longer than @Word */
DECLARE @BiggerWord VARCHAR(21)
SELECT @BiggerWord = @Word + 'x'
/* Replace @Word with @BiggerWord in @Phrase */
DECLARE @BiggerPhrase VARCHAR(2000)
SELECT @BiggerPhrase = REPLACE (@Phrase, @Word, @BiggerWord)
/* The length difference between @BiggerPhrase & @Phrase is the number we're looking for */
RETURN LEN(@BiggerPhrase) - LEN(@Phrase)