Here’s another little function for some string parsing, whilst basic and limited has it’s uses. I needed something to parse some very simple text that we were getting from somewhere as part of an import. We needed to get the string values between a ‘start’ and ‘end’ string. It was never going to change format so I didn’t need to over complicate it, the text we were getting was similar to this for example:-
and so on…
CREATE FUNCTION [dbo].[fnFindStringMatchesBetweenSimpleStartAndEndStrings] ( @text NVARCHAR(MAX) ,@startString NVARCHAR(1024) ,@endString NVARCHAR(1024) ) RETURNS @t TABLE ( StringMatch NVARCHAR(MAX) ) AS BEGIN -- we'll use this to store our current 'start index' position DECLARE @startIndex INT = CHARINDEX(@startString, @text) -- and this will of course be the end index position DECLARE @endIndex INT = CHARINDEX(@endString, @text, @startIndex + LEN(@startString)) -- now loop through and get the rest, until no more are found WHILE @startIndex > 0 AND @endIndex > 0 BEGIN -- save our match INSERT INTO @t ( StringMatch ) VALUES ( SUBSTRING(@text, @startIndex + LEN(@startString), @endIndex - (@startIndex + LEN(@startString))) ) -- next positions SET @startIndex = CHARINDEX(@startString, @text, @endIndex) SET @endIndex = CHARINDEX(@endString, @text, @startIndex + LEN(@startString)) END -- send back our matches (if any) RETURN END
So to use the function to find the ‘value**’ strings inbetween the ‘entry:’ and ‘:end’ strings you would use:-
SELECT StringMatch FROM dbo.fnFindStringMatchesBetweenSimpleStartAndEndStrings(N'entry:value1:end...entry:value2:end...entry:value3:end', N'entry:', N':end')
The function just returns a simple table with a single column ‘StringMatch‘ with all the values:-
As I mentioned earlier this function is simple and limited, so you couldn’t use it for parsing text that could have nested ‘start’ and ‘end’ strings for example. That being said, I am a firm believer in writing the least amount of code to solve a specific purpose and in this particular case to extend the function to parse nested strings would have added a little performance hit and for no reason – essentially I would be writing redundant code.
What would you do in this case, would you make it more advanced and write the redundant code or not?