Here’s another little function for some string parsing, whilst basic and limited has it’s uses. I needed something to parse some very simple text that we were getting from somewhere as part of an import. We needed to get the string values between a ‘start’ and ‘end’ string. It was never going to change format so I didn’t need to over complicate it, the text we were getting was similar to this for example:-

entry:value1:end…entry:value2:end…entry:value3:end

and so on…

CREATE FUNCTION [dbo].[fnFindStringMatchesBetweenSimpleStartAndEndStrings] (
	@text NVARCHAR(MAX)
	,@startString NVARCHAR(1024)
	,@endString NVARCHAR(1024)
)
RETURNS @t TABLE (
	StringMatch NVARCHAR(MAX)
)
AS
BEGIN
	-- we'll use this to store our current 'start index' position			
	DECLARE @startIndex INT = CHARINDEX(@startString, @text)

	-- and this will of course be the end index position
	DECLARE @endIndex INT = CHARINDEX(@endString, @text, @startIndex + LEN(@startString)) 		
	
	-- now loop through and get the rest, until no more are found
	WHILE @startIndex > 0 AND @endIndex > 0
	BEGIN
		-- save our match			
		INSERT INTO @t (
			StringMatch
		) VALUES (
			SUBSTRING(@text, @startIndex + LEN(@startString), @endIndex - (@startIndex + LEN(@startString)))
		)		

		-- next positions
		SET @startIndex = CHARINDEX(@startString, @text, @endIndex)
		SET @endIndex = CHARINDEX(@endString, @text, @startIndex + LEN(@startString))
	END

	-- send back our matches (if any)
	RETURN
END

So to use the function to find the ‘value**’ strings inbetween the ‘entry:’ and ‘:end’ strings you would use:-

SELECT
StringMatch

FROM
dbo.fnFindStringMatchesBetweenSimpleStartAndEndStrings(N'entry:value1:end...entry:value2:end...entry:value3:end', N'entry:', N':end')

The function just returns a simple table with a single column ‘StringMatch‘ with all the values:-

  • value1
  • value2
  • value3

As I mentioned earlier this function is simple and limited, so you couldn’t use it for parsing text that could have nested ‘start’ and ‘end’ strings for example. That being said, I am a firm believer in writing the least amount of code to solve a specific purpose and in this particular case to extend the function to parse nested strings would have added a little performance hit and for no reason – essentially I would be writing redundant code.

What would you do in this case, would you make it more advanced and write the redundant code or not?

%d