Despite the best efforts of programmers to eliminate them through error trapping, sometimes unwanted and, usually, invisible characters sneak into user data. A common example is the presence of carriage returns (CRs) or line feed (LF) characters in data fields that should be presented on a single line. The purpose of this is to aid you in uncovering hidden characters in a SQL database.
Sometimes the user-interface will indicate the presence of such otherwise invisible characters by placing small rectangular placeholders in their stead when viewed in the user-interface.
The problem is, sometimes code in the database or reporting engines may be confused by these improper characters in the data being supplied to the application. The results may range from data not being presented as expected all the way to crashing an entire user-interface.
Finding the culprits in the data can be a problem.
Here is a Transact-SQL stored procedure that can help you uncover these problem characters:
if object_ID (N'dbo.spExplodeASCIIValues_RKL', 'P') is not null
drop procedure dbo.spExplodeASCIIValues_RKL
go
create procedure dbo.spExplodeASCIIValues_RKL
@_iString varchar(800) = NULL
, @_iPrintFlag bit = 0
as
/******************************************************************************
CREATE PROCEDURE spExplodeASCIIValues_RKL
*******************************************************************************
This procedure will return the list of ASCII-value equivalents for the string
passed to the procedure. The list will be by character position in the string.
*******************************************************************************
PARAMETERS
*******************************************************************************
@_iString varchar(255)
@_iPrintFlag bit
*******************************************************************************
USAGE
*******************************************************************************
Useful when trying to figure out why a string might be acting other than ex-
pected. For example, if there are hidden control characters in a string (such
as carriage returns, line feeds, spaces).
This is how we discovered, for example, that using the <Enter> key as <Tab>
key was placing CR/LFs in some fields in the MAS 500 database.
Example:
EXECUTE dbo.spExplodeASCIIValues_RKL 'This is a |-] test
! of this proCEDure 1.2,3 !?', 1
-- compared to:
EXECUTE dbo.spExplodeASCIIValues_RKL
'This is a |-] test ! of this proCEDure 1.2,3 !?', 0
*******************************************************************************
(c)2008-2013 RDCushing RKL eSolutions, LLC
*******************************************************************************/
set nocount on
declare @Char smallint
set @Char = 1
if object_ID(N'dbo.tASCII_Value_RKL','U') is null
begin
create table dbo.tASCII_Value_RKL
( CharSeq int
, ASCIIVal int
, CreateDate datetime
, SrcString nvarchar(255)
, StringSeq int
)
end
declare @StringSeq int
select @StringSeq = coalesce(max(StringSeq),0) + 1
from dbo.tASCII_Value_RKL
while @Char <= len(@_iString )
begin
if @_iPrintFlag = 1
begin
print 'Character ' + cast(@Char as char(3)) + ': ''' + substring(@_iString,@Char,1)
+ ''' = ASCII: ' + cast(ascii(substring(@_iString,@Char,1)) as varchar(3))
--print ''
end
insert into dbo.tASCII_Value_RKL
( CharSeq
, ASCIIVal
, CreateDate
, SrcString
, StringSeq
)
values ( @Char
, ascii(substring(@_iString,@Char,1))
, cast(convert(varchar(10),getdate(),101) as datetime)
, @_iString
, coalesce(@StringSeq,1)
)
set @Char = @Char + 1
end
select * from dbo.tASCII_Value_RKL where StringSeq = @StringSeq
-- drop table dbo.tASCII_Value_RKL
set nocount off
go
grant exec on dbo.spExplodeASCIIValues_RKL to public
go
The procedure takes two parameters:
- A string up to 255 characters in length to be parsed into ASCII characters
- A bit-flag to let the procedure know if you would like printable results
If the bit-flag for printing is passed in as a ‘1’, then you will get a list that looks like this:
Otherwise, the data will merely end up in a table (tASCII_Value_RKL) for further analysis, like this:
Each time the procedure is called, the “StringSeq” is incremented by one.
Executing a T-SQL statement like the following would help you uncover CRs and LFs in a string where they, perhaps, should not be and are causing problems with data displaying in the user-interface:
*******************************************************************************
SPECIFIC EXAMPLE USAGE: Looking for specific ASCII values
*******************************************************************************
select a.CharSeq
, a.ASCIIVal 'ASCIIVal_1'
, b.ASCIIVal 'ASCIIVal_2'
from dbo.tASCII_Value_RKL a
join dbo.tASCII_Value_RKL b
on a.CreateDate = b.CreateDate
and a.CharSeq = b.CharSeq
and a.StringSeq < b.StringSeq
where a.ASCIIVal in (10,13) -- Carriage Return, Line Feed
or b.ASCIIVal in (10,13) -- Carriage Return, Line Feed
order by a.CharSeq
While these kinds problems may not arise frequently (hopefully), when they do, using a tool like this to help uncover the root of the problem can be a real lifesaver.
An extended application of this procedure might be during data transformation in an ETL (extract-transform-load) sequence. Use code like this to uncover potential troublesome data. Then use REPLACE() in T-SQL to eliminate hidden characters in the extracted data before it has a chance to cause problems in its new environment.
Let us know your thoughts on this matter. Leave your comments here, or feel free to contact us directly, if you prefer.