{"id":47,"date":"2018-09-01T18:00:06","date_gmt":"2018-09-01T18:00:06","guid":{"rendered":"https:\/\/datablog.roman-halliday.com\/?p=47"},"modified":"2018-09-01T19:29:14","modified_gmt":"2018-09-01T19:29:14","slug":"cleaning-varchars-removing-replacing-extra-characters","status":"publish","type":"post","link":"https:\/\/datablog.roman-halliday.com\/index.php\/2018\/09\/01\/cleaning-varchars-removing-replacing-extra-characters\/","title":{"rendered":"Cleaning Strings: Removing &#038; replacing extra characters"},"content":{"rendered":"<p>Part of managing data is cleaning text. Tabs, spaces and line breaks can sneak in, and confuse output. When working with whitespace characters and line breaks, we can&#8217;t see them. This can cause a lot of confusion for users (and developers) when unexpected behavior occurs.<\/p>\n<p>Keep in mind we have to be careful: often we do want to store and recreate text exactly. Not all strings\/varchars need cleaning.<\/p>\n<h1>Why does it happen?<\/h1>\n<p>Trailing space characters can come in from the source data when the original data type is &#8216;fixed width&#8217; (padded with spaces).<\/p>\n<p>Most often, extra characters come from a user pasting something into a front end. If the input isn&#8217;t sanitised or checked, then the extra characters will get pulled in with everything else.<\/p>\n<h1>Methods for cleaning<\/h1>\n<h2>Trimming strings<\/h2>\n<p>Usually the first step to cleaning strings is to remove spaces from the start\/end. There are two functions for this, <code>LTRIM()<\/code> and <code>RTRIM()<\/code>, which take care of spaces before (to the left of) and after (to the right of) the text.<\/p>\n<p>Note: Oracle (and some other databases) has a plain <code>TRIM()<\/code>, which does the same as applying both <code>LTRIM()<\/code> and <code>RTRIM()<\/code> to a string. 
In Microsoft SQL Server, <code>TRIM()<\/code> is only available from SQL Server 2017 onwards; on earlier versions, nest <code>LTRIM()<\/code> and <code>RTRIM()<\/code>.<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"sql\">SELECT string_value               AS string_value,\r\n       LTRIM(string_value)        AS string_value_LTRIM,\r\n       RTRIM(string_value)        AS string_value_RTRIM,\r\n       LTRIM(RTRIM(string_value)) AS string_value_TRIM_BOTH\r\n  FROM [test_strings]\r\n;<\/pre>\n<p>See some examples in SQL Fiddle:<\/p>\n<ul>\n<li><a href=\"http:\/\/sqlfiddle.com\/#!18\/79388\/2\">SQL Server &#8211; SQL Fiddle &#8211; LTRIM and RTRIM<\/a><\/li>\n<li><a href=\"http:\/\/sqlfiddle.com\/#!4\/79388\/4\">Oracle &#8211; SQL Fiddle &#8211; LTRIM, RTRIM and TRIM<\/a><\/li>\n<\/ul>\n<p>Note: in the examples I have used the <code>REPLACE()<\/code> function, explored below, to show where the spaces are.<\/p>\n<h2>Replacing characters<\/h2>\n<p>Line breaks and tabs are different from leading and trailing spaces: <code>LTRIM()<\/code> and <code>RTRIM()<\/code>, used as above, don&#8217;t remove them. Fortunately there is the <code>REPLACE()<\/code> function, which does exactly what it says on the tin.<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"sql\">SELECT REPLACE([column_name], '&lt;text to replace&gt;', '&lt;new text&gt;') AS [altered_text]\r\n  FROM [data_table]\r\n;<\/pre>\n<p>For special characters, this is best combined with <code>CHAR()<\/code> (<code>CHR()<\/code> in Oracle), which returns a character from its numeric code, rather than trying to copy\/paste a line break or tab into your search string. 
The key ones are:<\/p>\n<ul>\n<li><code>CHAR(9)<\/code> &#8211; Tab<\/li>\n<li><code>CHAR(10)<\/code> &#8211; Line feed<\/li>\n<li><code>CHAR(13)<\/code> &#8211; Carriage return<\/li>\n<\/ul>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"null\">SELECT REPLACE([column_name], CHAR(9),  '') AS [column_name_NO_TAB],\r\n       REPLACE([column_name], CHAR(10), '') AS [column_name_NO_LINE_FEED],\r\n       REPLACE([column_name], CHAR(13), '') AS [column_name_NO_CARRIAGE_RETURN]\r\n  FROM [data_table]\r\n;<\/pre>\n<p>Documentation on these functions:<\/p>\n<ul>\n<li><a href=\"https:\/\/docs.microsoft.com\/en-us\/sql\/t-sql\/functions\/replace-transact-sql?view=sql-server-2017\">MS Documentation &#8211; REPLACE<\/a><\/li>\n<li><a href=\"https:\/\/docs.microsoft.com\/en-us\/sql\/t-sql\/functions\/char-transact-sql?view=sql-server-2017\">MS Documentation &#8211; CHAR<\/a><\/li>\n<li><a href=\"https:\/\/docs.oracle.com\/cd\/B19306_01\/server.102\/b14200\/functions019.htm\">Oracle Documentation &#8211; CHR<\/a><\/li>\n<\/ul>\n<p>These methods can be combined: cleaning a string normally takes a mix of trimming and replacing.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Part of managing data is cleaning text. Tabs, spaces and line breaks can sneak in, and confuse output. When working with whitespace characters and line breaks, we can&#8217;t see them. This can cause a lot of confusion for users (and developers) when unexpected behavior occurs. 
Keep in mind we have to be careful, often we&hellip;<\/p>\n<p class=\"read-more\"><a class=\"readmore-btn\" href=\"https:\/\/datablog.roman-halliday.com\/index.php\/2018\/09\/01\/cleaning-varchars-removing-replacing-extra-characters\/\">Read More<span class=\"screen-reader-text\">  Read More<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[6,4],"tags":[],"class_list":["post-47","post","type-post","status-publish","format-standard","hentry","category-oracle","category-sql-server"],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v27.4 - https:\/\/yoast.com\/product\/yoast-seo-wordpress\/ -->\n<title>Cleaning Strings: Removing &amp; replacing extra characters - Rows Across The Lake<\/title>\n<meta name=\"description\" content=\"Part of managing data is cleaning text. Tabs, spaces and line breaks can sneak in, and confuse output.\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/datablog.roman-halliday.com\/index.php\/2018\/09\/01\/cleaning-varchars-removing-replacing-extra-characters\/\" \/>\n<meta property=\"og:locale\" content=\"en_GB\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Cleaning Strings: Removing &amp; replacing extra characters - Rows Across The Lake\" \/>\n<meta property=\"og:description\" content=\"Part of managing data is cleaning text. 
Tabs, spaces and line breaks can sneak in, and confuse output.\" \/>\n<meta property=\"og:url\" content=\"https:\/\/datablog.roman-halliday.com\/index.php\/2018\/09\/01\/cleaning-varchars-removing-replacing-extra-characters\/\" \/>\n<meta property=\"og:site_name\" content=\"Rows Across The Lake\" \/>\n<meta property=\"article:published_time\" content=\"2018-09-01T18:00:06+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2018-09-01T19:29:14+00:00\" \/>\n<meta name=\"author\" content=\"david\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:creator\" content=\"@d_roman_h\" \/>\n<meta name=\"twitter:site\" content=\"@d_roman_h\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"david\" \/>\n\t<meta name=\"twitter:label2\" content=\"Estimated reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"2 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\\\/\\\/schema.org\",\"@graph\":[{\"@type\":\"Article\",\"@id\":\"https:\\\/\\\/datablog.roman-halliday.com\\\/index.php\\\/2018\\\/09\\\/01\\\/cleaning-varchars-removing-replacing-extra-characters\\\/#article\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/datablog.roman-halliday.com\\\/index.php\\\/2018\\\/09\\\/01\\\/cleaning-varchars-removing-replacing-extra-characters\\\/\"},\"author\":{\"name\":\"david\",\"@id\":\"https:\\\/\\\/datablog.roman-halliday.com\\\/#\\\/schema\\\/person\\\/575f96d2590c3085923ff9e1b565748b\"},\"headline\":\"Cleaning Strings: Removing &#038; replacing extra 
characters\",\"datePublished\":\"2018-09-01T18:00:06+00:00\",\"dateModified\":\"2018-09-01T19:29:14+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\\\/\\\/datablog.roman-halliday.com\\\/index.php\\\/2018\\\/09\\\/01\\\/cleaning-varchars-removing-replacing-extra-characters\\\/\"},\"wordCount\":360,\"commentCount\":1,\"publisher\":{\"@id\":\"https:\\\/\\\/datablog.roman-halliday.com\\\/#\\\/schema\\\/person\\\/575f96d2590c3085923ff9e1b565748b\"},\"articleSection\":[\"Oracle\",\"SQL Server\"],\"inLanguage\":\"en-GB\",\"potentialAction\":[{\"@type\":\"CommentAction\",\"name\":\"Comment\",\"target\":[\"https:\\\/\\\/datablog.roman-halliday.com\\\/index.php\\\/2018\\\/09\\\/01\\\/cleaning-varchars-removing-replacing-extra-characters\\\/#respond\"]}]},{\"@type\":\"WebPage\",\"@id\":\"https:\\\/\\\/datablog.roman-halliday.com\\\/index.php\\\/2018\\\/09\\\/01\\\/cleaning-varchars-removing-replacing-extra-characters\\\/\",\"url\":\"https:\\\/\\\/datablog.roman-halliday.com\\\/index.php\\\/2018\\\/09\\\/01\\\/cleaning-varchars-removing-replacing-extra-characters\\\/\",\"name\":\"Cleaning Strings: Removing & replacing extra characters - Rows Across The Lake\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/datablog.roman-halliday.com\\\/#website\"},\"datePublished\":\"2018-09-01T18:00:06+00:00\",\"dateModified\":\"2018-09-01T19:29:14+00:00\",\"description\":\"Part of managing data is cleaning text. 
Tabs, spaces and line breaks can sneak in, and confuse output.\",\"breadcrumb\":{\"@id\":\"https:\\\/\\\/datablog.roman-halliday.com\\\/index.php\\\/2018\\\/09\\\/01\\\/cleaning-varchars-removing-replacing-extra-characters\\\/#breadcrumb\"},\"inLanguage\":\"en-GB\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\\\/\\\/datablog.roman-halliday.com\\\/index.php\\\/2018\\\/09\\\/01\\\/cleaning-varchars-removing-replacing-extra-characters\\\/\"]}]},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\\\/\\\/datablog.roman-halliday.com\\\/index.php\\\/2018\\\/09\\\/01\\\/cleaning-varchars-removing-replacing-extra-characters\\\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\\\/\\\/datablog.roman-halliday.com\\\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Cleaning Strings: Removing &#038; replacing extra characters\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\\\/\\\/datablog.roman-halliday.com\\\/#website\",\"url\":\"https:\\\/\\\/datablog.roman-halliday.com\\\/\",\"name\":\"Rows Across The Lake\",\"description\":\"Data &amp; 
Databases\",\"publisher\":{\"@id\":\"https:\\\/\\\/datablog.roman-halliday.com\\\/#\\\/schema\\\/person\\\/575f96d2590c3085923ff9e1b565748b\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\\\/\\\/datablog.roman-halliday.com\\\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"en-GB\"},{\"@type\":[\"Person\",\"Organization\"],\"@id\":\"https:\\\/\\\/datablog.roman-halliday.com\\\/#\\\/schema\\\/person\\\/575f96d2590c3085923ff9e1b565748b\",\"name\":\"david\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-GB\",\"@id\":\"https:\\\/\\\/secure.gravatar.com\\\/avatar\\\/acddbc676a1d5c73795edcf0627ee39e5aa947da9033b58373e03d93122cb3b7?s=96&d=mm&r=g\",\"url\":\"https:\\\/\\\/secure.gravatar.com\\\/avatar\\\/acddbc676a1d5c73795edcf0627ee39e5aa947da9033b58373e03d93122cb3b7?s=96&d=mm&r=g\",\"contentUrl\":\"https:\\\/\\\/secure.gravatar.com\\\/avatar\\\/acddbc676a1d5c73795edcf0627ee39e5aa947da9033b58373e03d93122cb3b7?s=96&d=mm&r=g\",\"caption\":\"david\"},\"logo\":{\"@id\":\"https:\\\/\\\/secure.gravatar.com\\\/avatar\\\/acddbc676a1d5c73795edcf0627ee39e5aa947da9033b58373e03d93122cb3b7?s=96&d=mm&r=g\"}}]}<\/script>\n<!-- \/ Yoast SEO plugin. -->","yoast_head_json":{"title":"Cleaning Strings: Removing & replacing extra characters - Rows Across The Lake","description":"Part of managing data is cleaning text. 
Tabs, spaces and line breaks can sneak in, and confuse output.","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/datablog.roman-halliday.com\/index.php\/2018\/09\/01\/cleaning-varchars-removing-replacing-extra-characters\/","og_locale":"en_GB","og_type":"article","og_title":"Cleaning Strings: Removing & replacing extra characters - Rows Across The Lake","og_description":"Part of managing data is cleaning text. Tabs, spaces and line breaks can sneak in, and confuse output.","og_url":"https:\/\/datablog.roman-halliday.com\/index.php\/2018\/09\/01\/cleaning-varchars-removing-replacing-extra-characters\/","og_site_name":"Rows Across The Lake","article_published_time":"2018-09-01T18:00:06+00:00","article_modified_time":"2018-09-01T19:29:14+00:00","author":"david","twitter_card":"summary_large_image","twitter_creator":"@d_roman_h","twitter_site":"@d_roman_h","twitter_misc":{"Written by":"david","Estimated reading time":"2 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/datablog.roman-halliday.com\/index.php\/2018\/09\/01\/cleaning-varchars-removing-replacing-extra-characters\/#article","isPartOf":{"@id":"https:\/\/datablog.roman-halliday.com\/index.php\/2018\/09\/01\/cleaning-varchars-removing-replacing-extra-characters\/"},"author":{"name":"david","@id":"https:\/\/datablog.roman-halliday.com\/#\/schema\/person\/575f96d2590c3085923ff9e1b565748b"},"headline":"Cleaning Strings: Removing &#038; replacing extra 
characters","datePublished":"2018-09-01T18:00:06+00:00","dateModified":"2018-09-01T19:29:14+00:00","mainEntityOfPage":{"@id":"https:\/\/datablog.roman-halliday.com\/index.php\/2018\/09\/01\/cleaning-varchars-removing-replacing-extra-characters\/"},"wordCount":360,"commentCount":1,"publisher":{"@id":"https:\/\/datablog.roman-halliday.com\/#\/schema\/person\/575f96d2590c3085923ff9e1b565748b"},"articleSection":["Oracle","SQL Server"],"inLanguage":"en-GB","potentialAction":[{"@type":"CommentAction","name":"Comment","target":["https:\/\/datablog.roman-halliday.com\/index.php\/2018\/09\/01\/cleaning-varchars-removing-replacing-extra-characters\/#respond"]}]},{"@type":"WebPage","@id":"https:\/\/datablog.roman-halliday.com\/index.php\/2018\/09\/01\/cleaning-varchars-removing-replacing-extra-characters\/","url":"https:\/\/datablog.roman-halliday.com\/index.php\/2018\/09\/01\/cleaning-varchars-removing-replacing-extra-characters\/","name":"Cleaning Strings: Removing & replacing extra characters - Rows Across The Lake","isPartOf":{"@id":"https:\/\/datablog.roman-halliday.com\/#website"},"datePublished":"2018-09-01T18:00:06+00:00","dateModified":"2018-09-01T19:29:14+00:00","description":"Part of managing data is cleaning text. 
Tabs, spaces and line breaks can sneak in, and confuse output.","breadcrumb":{"@id":"https:\/\/datablog.roman-halliday.com\/index.php\/2018\/09\/01\/cleaning-varchars-removing-replacing-extra-characters\/#breadcrumb"},"inLanguage":"en-GB","potentialAction":[{"@type":"ReadAction","target":["https:\/\/datablog.roman-halliday.com\/index.php\/2018\/09\/01\/cleaning-varchars-removing-replacing-extra-characters\/"]}]},{"@type":"BreadcrumbList","@id":"https:\/\/datablog.roman-halliday.com\/index.php\/2018\/09\/01\/cleaning-varchars-removing-replacing-extra-characters\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/datablog.roman-halliday.com\/"},{"@type":"ListItem","position":2,"name":"Cleaning Strings: Removing &#038; replacing extra characters"}]},{"@type":"WebSite","@id":"https:\/\/datablog.roman-halliday.com\/#website","url":"https:\/\/datablog.roman-halliday.com\/","name":"Rows Across The Lake","description":"Data &amp; Databases","publisher":{"@id":"https:\/\/datablog.roman-halliday.com\/#\/schema\/person\/575f96d2590c3085923ff9e1b565748b"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/datablog.roman-halliday.com\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-GB"},{"@type":["Person","Organization"],"@id":"https:\/\/datablog.roman-halliday.com\/#\/schema\/person\/575f96d2590c3085923ff9e1b565748b","name":"david","image":{"@type":"ImageObject","inLanguage":"en-GB","@id":"https:\/\/secure.gravatar.com\/avatar\/acddbc676a1d5c73795edcf0627ee39e5aa947da9033b58373e03d93122cb3b7?s=96&d=mm&r=g","url":"https:\/\/secure.gravatar.com\/avatar\/acddbc676a1d5c73795edcf0627ee39e5aa947da9033b58373e03d93122cb3b7?s=96&d=mm&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/acddbc676a1d5c73795edcf0627ee39e5aa947da9033b58373e03d93122cb3b7?s=96&d=mm&r=g","caption":"david"},"log
o":{"@id":"https:\/\/secure.gravatar.com\/avatar\/acddbc676a1d5c73795edcf0627ee39e5aa947da9033b58373e03d93122cb3b7?s=96&d=mm&r=g"}}]}},"_links":{"self":[{"href":"https:\/\/datablog.roman-halliday.com\/index.php\/wp-json\/wp\/v2\/posts\/47","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/datablog.roman-halliday.com\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/datablog.roman-halliday.com\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/datablog.roman-halliday.com\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/datablog.roman-halliday.com\/index.php\/wp-json\/wp\/v2\/comments?post=47"}],"version-history":[{"count":4,"href":"https:\/\/datablog.roman-halliday.com\/index.php\/wp-json\/wp\/v2\/posts\/47\/revisions"}],"predecessor-version":[{"id":236,"href":"https:\/\/datablog.roman-halliday.com\/index.php\/wp-json\/wp\/v2\/posts\/47\/revisions\/236"}],"wp:attachment":[{"href":"https:\/\/datablog.roman-halliday.com\/index.php\/wp-json\/wp\/v2\/media?parent=47"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/datablog.roman-halliday.com\/index.php\/wp-json\/wp\/v2\/categories?post=47"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/datablog.roman-halliday.com\/index.php\/wp-json\/wp\/v2\/tags?post=47"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}