Article: Backslashes

Dealing with backslashes, escapes, ...

category 'KB', language several, created 21-Aug-2025, version V1.1, by Luc Pattyn

with the assistance of Microsoft Copilot.


License: The author hereby grants you a worldwide, non-exclusive license to use and redistribute the files and the source code in the article in any way you see fit, provided you keep the copyright notice in place; when code modifications are applied, the notice must reflect that. The author retains copyright to the article, you may not republish or otherwise make available the article, in whole or in part, without the prior written consent of the author.

Disclaimer: This work is provided as is, without any express or implied warranties or conditions or guarantees. You, the user, assume all risk in its use. In no event will the author be liable to you on any legal theory for any special, incidental, consequential, punitive or exemplary damages arising out of this license or the use of the work or otherwise.


The Backslash as an Escape Mechanism

Most programming languages support literal strings, often enclosed in double quotes. To insert non-printable characters, escape sequences like \n (newline), \r (carriage return), and \t (tab) are used. To include a literal backslash (\), it must be doubled (\\).
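
As a minimal JavaScript sketch of these basics (the variable names are only illustrative), each recognized escape sequence takes two characters in the source code but produces a single character in the resulting string:

```javascript
// Each escape sequence below is written as two source characters
// but yields exactly one character in the resulting string.
const newline = "\n";      // line feed, character code 10
const tab = "\t";          // horizontal tab, character code 9
const backslash = "\\";    // one literal backslash, character code 92

console.log(newline.length, tab.length, backslash.length);   // 1 1 1
console.log("a\tb".length);                                   // 3: a, tab, b
console.log(backslash.charCodeAt(0));                         // 92
```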

But now the question is: what happens when a backslash is followed by a character that is not in the list of escapable characters?

  • Strict languages throw an error: e.g. CS1009 Unrecognized Escape Sequence in C#.
  • Some loose languages (e.g. PHP, in double-quoted strings) keep both characters, as if the backslash had been doubled, so doubling it makes no difference.
  • Other loose languages (e.g. JavaScript) silently swallow the unexpected backslash, so dropping it makes no difference whatsoever.
  • Still other languages (e.g. AutoHotkey v2.0) use a different escape character altogether (the backtick for AHK2).

Try the following code snippets, and observe the various lengths they report:

C#:

string someString = "st\z";		// does not compile, CS1009 !

PHP:

foreach(["stz", "\stz", "\\stz", "s\tz", "s\\tz", "st\z", "st\\z"] as $s) {
    echo "<p>strlen('".$s."') = ".strlen($s)."</p>";
}

JavaScript:

const samples = ["stz", "\\stz", "\stz", "s\\tz", "s\tz", "st\\z", "st\z"];
samples.forEach(s => {
    const p = document.createElement("p");
    p.textContent = `length of "${s}" = ${s.length}`;
    document.body.appendChild(p);
});

AHK2:

F1::{
	samples := ["stz", "\stz", "\\stz", "s\tz", "s\\tz", "st\\z", "st\z"]
	output := ""
	for s in samples {
		output .= "StrLen(" s ") = " StrLen(s) "`n"
	}
	MsgBox(output)
}
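
For reference, here is a variant of the JavaScript snippet that runs outside the browser (Node.js is assumed here; the version above writes to the page instead). The lengths in the comments follow from the rules listed earlier: JavaScript drops the backslash of an unrecognized escape such as \s or \z, while \t becomes a single tab character.

```javascript
// Same seven samples as in the browser snippet, reduced to their lengths.
const lengths = [
    "stz",      // 3: no escapes at all
    "\\stz",    // 4: \\ is one backslash, followed by s, t, z
    "\stz",     // 3: \s is not a known escape, the backslash is dropped
    "s\\tz",    // 4: s, one backslash, t, z
    "s\tz",     // 3: s, a tab character, z
    "st\\z",    // 4: s, t, one backslash, z
    "st\z",     // 3: \z is not a known escape, the backslash is dropped
].map(s => s.length);

console.log(lengths.join(" "));   // 3 4 3 4 3 4 3
```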

Backslashes in Regex Patterns

Regex — short for “regular expressions” — is a powerful way to search for patterns in text. Instead of checking character by character, you describe what you're looking for using a compact, symbolic language. A regex pattern can match anything from a simple word to complex file paths, email addresses, or even entire formats.

Regex patterns look rather cryptic at first sight, and even beyond. A regex engine, in whatever environment, gives special meaning to several characters such as ., *, +, ?, (, ) and \. When one of these characters is needed literally, an escape mechanism is required; fortunately, that mechanism is always based on the backslash.

On top of the regex escapes, and depending on the programming language, you may also need to escape the backslash itself. So there generally are two levels of escape mechanisms; when reading or writing them, always consider the regex first, then add to that whatever the programming language dictates.
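
The two levels can be made visible in JavaScript, where a pattern can be written either as a regex literal (regex-level escaping only) or as an ordinary string handed to the RegExp constructor (both levels). This is only a sketch, and the names are illustrative.

```javascript
// Goal: a regex that matches exactly one literal backslash.
// At the regex level, that takes a backslash escaped by another backslash.
const fromLiteral = /\\/;                // regex literal: regex-level escaping only
const fromString = new RegExp("\\\\");   // string literal: each backslash doubled once more

// Both objects hold the same two-character pattern source.
console.log(fromLiteral.source === fromString.source);   // true
console.log(fromString.source.length);                    // 2

// A path containing a single backslash (written as \\ in the string literal).
const samplePath = "C:\\Progs";
console.log(samplePath.length);                                          // 8
console.log(fromLiteral.test(samplePath), fromString.test(samplePath));  // true true
```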

This makes it hard to read and understand a correct regex pattern, let alone create one from scratch. To make things concrete, we’ll focus on one example: confirming that a file path refers to an .exe file inside the folder C:\Progs\..., including any subfolders. Several problems emerge immediately:

  • We need to match literal backslashes, which require escaping (even in AHK2, the regex engine uses \ to escape special characters).
  • Matching nested subfolders will take some effort.
  • The filename itself needs some care.
  • The dot in .exe is a special character in regex, so it must be escaped.

Let us build the pattern step by step for the easiest environment: an AHK2 script. Remember, AHK2 is a programming language that does not use backslashes for its own escape needs.

We may end up with something like "i)^C:\\Progs\\(?:[^\\]+\\)*[^\\]+\.exe$", but don't worry, we'll get there step-by-step.


Building the Pattern

Let’s start constructing the regex pattern from left to right, ignoring the programming language for the moment. Each step adds one piece to the pattern, and the comment explains its purpose in just a few words.


Pattern                                 Comment
^C:                                     ^ marks the start of the string; C: are literal characters
^C:\\                                   \\ is a literal backslash
^C:\\Progs\\                            folder name and another literal backslash
^C:\\Progs\\.*                          . = any character, * = 0 or more times
^C:\\Progs\\.*exe$                      literal exe; $ = end of string
^C:\\Progs\\.*\.exe$                    \. escapes the dot in .exe
^C:\\Progs\\.*[^\\]+\.exe$              [^\\] = anything but a backslash, + = 1 or more times
^C:\\Progs\\(?:[^\\]+\\)*[^\\]+\.exe$   subfolder chain: anything but a backslash, 1 or more times,
                                        then a backslash; (?: and ) turn that into a group,
                                        and * repeats the group 0 or more times
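
To see what each refinement buys, some of the intermediate patterns and the final one can be tried against a few paths. The sketch below uses JavaScript regex literals, so the patterns appear exactly as in the table; the sample paths and variable names are made up for illustration.

```javascript
// Three stages from the table, written as regex literals.
const dotUnescaped = /^C:\\Progs\\.*exe$/;                      // the dot of ".exe" is not required yet
const anyChars     = /^C:\\Progs\\.*\.exe$/;                    // .* also accepts empty names
const finalPattern = /^C:\\Progs\\(?:[^\\]+\\)*[^\\]+\.exe$/;   // explicit folder chain and file name

// Hypothetical sample paths; each \\ in these string literals is one backslash.
const paths = [
    "C:\\Progs\\tool.exe",             // file directly in the folder
    "C:\\Progs\\sub\\dir\\tool.exe",   // file in nested subfolders
    "C:\\Progs\\notexe",               // ends in "exe" but has no dot
    "C:\\Progs\\.exe",                 // extension only, empty file name
    "C:\\Progs\\\\tool.exe",           // empty folder name (two backslashes in a row)
    "D:\\Progs\\tool.exe",             // wrong drive
];

// One row of three booleans per path, in the order of the patterns above.
const results = paths.map(p => [dotUnescaped.test(p), anyChars.test(p), finalPattern.test(p)]);

paths.forEach((p, i) => console.log(p, results[i].join(" ")));
```

With these inputs, the final pattern accepts the first two paths and nothing else, while the looser variants also accept a path such as C:\Progs\.exe or one with an empty folder name.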

Language-Specific Embedding

Whatever programming language we choose, we still have to put our regex pattern into the source code, usually as a literal string, and apply some minor additions. Each example enables case-insensitive matching using a tiny prefix and/or postfix modifier.

Starting point is ^C:\\Progs\\(?:[^\\]+\\)*[^\\]+\.exe$


Code                                                        Language and Notes

Regex.IsMatch(path,                                         C#
    @"(?i)^C:\\Progs\\(?:[^\\]+\\)*[^\\]+\.exe$")           A verbatim string avoids quadruple backslashes.

preg_match(                                                 PHP
    '/^C:\\\\Progs\\\\(?:[^\\\\]+\\\\)*[^\\\\]+\.exe$/i',   Single-quoted and double-quoted strings both turn \\ into \,
    $path)                                                  so quadruple backslashes are needed; only a nowdoc avoids them.

/^C:\\Progs\\(?:[^\\]+\\)*[^\\]+\.exe$/i.test(path)         JavaScript
                                                            A regex literal avoids quadruple backslashes;
                                                            new RegExp("...", "i") with a string pattern needs them.

RegExMatch(path,                                            AHK2
    "i)^C:\\Progs\\(?:[^\\]+\\)*[^\\]+\.exe$")              The simplest of them all!

Notes:

  1. Several regex methods accept options as a separate argument or flag, which is an alternative to the inline instruction for case-insensitive matching.
  2. By choosing the most appropriate approach, one can almost always avoid quadruple backslashes; PHP is the big exception here, short of using a nowdoc.
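
For JavaScript, the two notes can be sketched as follows: the case-insensitive option is passed as a flag rather than inline, and the same pattern is written three ways to compare the number of backslashes each one needs. String.raw is a standard JavaScript template tag; the variable names and the sample path are illustrative only.

```javascript
// 1. Regex literal: regex-level escaping only; the i flag follows the closing slash.
const viaLiteral = /^C:\\Progs\\(?:[^\\]+\\)*[^\\]+\.exe$/i;

// 2. Ordinary string: every backslash doubled again; the flag is the second argument.
const viaString = new RegExp("^C:\\\\Progs\\\\(?:[^\\\\]+\\\\)*[^\\\\]+\\.exe$", "i");

// 3. String.raw template: backslashes are taken literally, so no extra doubling.
const viaRaw = new RegExp(String.raw`^C:\\Progs\\(?:[^\\]+\\)*[^\\]+\.exe$`, "i");

// All three compile to the same pattern source.
console.log(viaLiteral.source === viaString.source);   // true
console.log(viaLiteral.source === viaRaw.source);      // true

// Hypothetical path; matching ignores case because of the i flag.
console.log(viaLiteral.test("c:\\progs\\Tools\\Demo.EXE"));   // true
```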

Epilogue

Backslashes are deceptively simple — until you meet them in a regex pattern. Understanding how escape mechanisms work at both the regex level and the language level is essential for writing correct and readable patterns. Whether you're working in C#, PHP, JavaScript, or AHK2, the key is to think regex first, then adapt to the language's string rules.

Regex is a powerful tool, but it rewards precision. Start small, build incrementally, and always test your patterns in the actual environment where they’ll run.



Perceler

Copyright © 2012, Luc Pattyn

Last Modified 21-May-2025