Here’s another little function for some string parsing, whilst basic and limited has it’s uses. I needed something to parse some very simple text that we were getting from somewhere as part of an import. We needed to get the string values between a ‘start’ and ‘end’ string. It was never going to change format so I didn’t need to over complicate it, the text we were getting was similar to this for example:-
entry:value1:end…entry:value2:end…entry:value3:end
and so on…
CREATE FUNCTION [dbo].[fnFindStringMatchesBetweenSimpleStartAndEndStrings] (
@text NVARCHAR(MAX)
,@startString NVARCHAR(1024)
,@endString NVARCHAR(1024)
)
RETURNS @t TABLE (
StringMatch NVARCHAR(MAX)
)
AS
BEGIN
-- we'll use this to store our current 'start index' position
DECLARE @startIndex INT = CHARINDEX(@startString, @text)
-- and this will of course be the end index position
DECLARE @endIndex INT = CHARINDEX(@endString, @text, @startIndex + LEN(@startString))
-- now loop through and get the rest, until no more are found
WHILE @startIndex > 0 AND @endIndex > 0
BEGIN
-- save our match
INSERT INTO @t (
StringMatch
) VALUES (
SUBSTRING(@text, @startIndex + LEN(@startString), @endIndex - (@startIndex + LEN(@startString)))
)
-- next positions
SET @startIndex = CHARINDEX(@startString, @text, @endIndex)
SET @endIndex = CHARINDEX(@endString, @text, @startIndex + LEN(@startString))
END
-- send back our matches (if any)
RETURN
END
So to use the function to find the ‘value**’ strings inbetween the ‘entry:’ and ‘:end’ strings you would use:-
SELECT
StringMatch
FROM
dbo.fnFindStringMatchesBetweenSimpleStartAndEndStrings(N'entry:value1:end...entry:value2:end...entry:value3:end', N'entry:', N':end')
The function just returns a simple table with a single column ‘StringMatch‘ with all the values:-
- value1
- value2
- value3
As I mentioned earlier this function is simple and limited, so you couldn’t use it for parsing text that could have nested ‘start’ and ‘end’ strings for example. That being said, I am a firm believer in writing the least amount of code to solve a specific purpose and in this particular case to extend the function to parse nested strings would have added a little performance hit and for no reason – essentially I would be writing redundant code.
What would you do in this case, would you make it more advanced and write the redundant code or not?