JX Application Framework
JSubstitute Class Reference

#include <JSubstitute.h>


Classes

class  IllegalControlChar
 
class  LoneDollar
 
class  TrailingBackslash
 

Public Member Functions

 JSubstitute ()
 
 JSubstitute (const JSubstitute &source)
 
virtual ~JSubstitute ()
 
JSubstitute & operator= (const JSubstitute &source)
 
void Substitute (JString *s) const
 
JError ContainsError (const JString &s, JCharacterRange *errRange) const
 
bool EscapeExists (const unsigned char character) const
 
bool GetEscape (const unsigned char character, const JString **value) const
 
bool SetEscape (const unsigned char character, const JUtf8Byte *value)
 
bool ClearEscape (const unsigned char character)
 
void ClearAllEscapes ()
 
void SetNonprintingEscapes ()
 
void ClearNonprintingEscapes ()
 
void SetWhitespaceEscapes ()
 
void ClearWhitespaceEscapes ()
 
void SetCEscapes ()
 
void ClearCEscapes ()
 
void SetRegexExtensions ()
 
void ClearRegexExtensions ()
 
void DefineVariable (const JUtf8Byte *name, const JString &value)
 
bool SetVariableValue (const JUtf8Byte *name, const JString &value)
 
void DefineVariables (const JUtf8Byte *regexPattern)
 
void UndefineVariable (const JUtf8Byte *name)
 
void UndefineAllVariables ()
 
void Reset ()
 
bool IsUsingControlEscapes () const
 
void UseControlEscapes (const bool use=true)
 
bool WillIgnoreUnrecognized () const
 
void IgnoreUnrecognized (const bool ignore=true)
 
bool IsPureEscapeEngine () const
 
void SetPureEscapeEngine (const bool is=true)
 

Static Public Attributes

static const JUtf8Byte * kLoneDollar = "LoneDollar::JSubstitute"
 
static const JUtf8Byte * kTrailingBackslash = "TrailingBackslash::JSubstitute"
 
static const JUtf8Byte * kIllegalControlChar = "IllegalControlChar::JSubstitute"
 

Protected Member Functions

virtual bool Evaluate (JStringIterator &iter, JString *value) const
 
virtual bool GetValue (const JString &name, JString *value) const
 

Detailed Description

JSubstitute replaces escaped characters like \a and variables of the
form $name in a JString.  "name" can either be a literal, in which case
we store the value and perform the replacement, or it can be a regular
expression, in which case we call the virtual function GetValue().  An
example of the latter case is [+-]?[0-9]+, which is used in regular
expression replace patterns to denote submatches.

By default C escapes are not expanded since this is most convenient for
patterns specified in source code; in user-specified patterns in
interactive programs, it may be better to add these escapes so that
non-printing characters may be entered conveniently.

Constructor & Destructor Documentation

◆ JSubstitute() [1/2]

JSubstitute::JSubstitute ( )

◆ JSubstitute() [2/2]

JSubstitute::JSubstitute ( const JSubstitute & source)

◆ ~JSubstitute()

JSubstitute::~JSubstitute ( )
virtual

Member Function Documentation

◆ ClearAllEscapes()

void JSubstitute::ClearAllEscapes ( )

Clears all ordinary escape bindings.

◆ ClearCEscapes()

void JSubstitute::ClearCEscapes ( )

◆ ClearEscape()

bool JSubstitute::ClearEscape ( const unsigned char  c)

Clears any value for the given character and returns true, if it exists, otherwise returns false.

◆ ClearNonprintingEscapes()

void JSubstitute::ClearNonprintingEscapes ( )

◆ ClearRegexExtensions()

void JSubstitute::ClearRegexExtensions ( )

◆ ClearWhitespaceEscapes()

void JSubstitute::ClearWhitespaceEscapes ( )

◆ ContainsError()

JError JSubstitute::ContainsError ( const JString & s,
JCharacterRange * errRange
) const

Checks that every unescaped $ is followed by a valid variable name and every \c is followed by [A-_].

If an error is found, returns one of our JError objects and sets errRange to the offending character range. If there are multiple errors, only the first one is reported.
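The check described above can be sketched without JX as a single left-to-right scan. This is only a rough model, not the library's implementation: ScanError, isNameStart, and findFirstError are hypothetical names, std::string stands in for JString, and "valid variable name" is assumed to mean "starts with a letter or underscore", whereas the real class consults the variables that were actually defined.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical error codes mirroring the three nested error classes.
enum class ScanError { None, LoneDollar, TrailingBackslash, IllegalControlChar };

// Assumption for this sketch: a variable name starts with a letter or underscore.
static bool isNameStart(const char c)
{
    return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z') || c == '_';
}

// Returns the first problem found in s. On error, [*errStart, *errEnd) is the
// offending character range; later errors are not reported.
ScanError findFirstError(const std::string& s, const bool controlEscapes,
                         std::size_t* errStart, std::size_t* errEnd)
{
    assert(errStart != nullptr && errEnd != nullptr);
    for (std::size_t i = 0; i < s.size(); ++i) {
        if (s[i] == '\\') {
            if (i + 1 == s.size()) {
                *errStart = i;
                *errEnd   = i + 1;
                return ScanError::TrailingBackslash;
            }
            if (controlEscapes && s[i + 1] == 'c') {
                const bool ok = i + 2 < s.size() && s[i + 2] >= 'A' && s[i + 2] <= '_';
                if (!ok) {
                    *errStart = i;
                    *errEnd   = (i + 2 < s.size()) ? i + 3 : s.size();
                    return ScanError::IllegalControlChar;
                }
                i += 2;  // skip 'c' and the control letter
            } else {
                ++i;     // skip the escaped character, so "\$" is not a variable
            }
        } else if (s[i] == '$') {
            if (i + 1 == s.size() || !isNameStart(s[i + 1])) {
                *errStart = i;
                *errEnd   = i + 1;
                return ScanError::LoneDollar;
            }
        }
    }
    return ScanError::None;
}
```

Because the function returns at the first failure, a string containing both a lone $ and a trailing backslash reports only whichever comes first, matching the "only the first one is reported" rule.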

◆ DefineVariable()

void JSubstitute::DefineVariable ( const JUtf8Byte * name,
const JString & value
)

◆ DefineVariables()

void JSubstitute::DefineVariables ( const JUtf8Byte * regexPattern)

If a regex matches after $, GetValue() is called.

◆ EscapeExists()

bool JSubstitute::EscapeExists ( const unsigned char  c) const
inline

Returns true if an escape exists for the given character, false otherwise.

◆ Evaluate()

bool JSubstitute::Evaluate ( JStringIterator & iter,
JString * value
) const
protectedvirtual

Returns true if it found a variable name starting at startIndex. It checks all variables to find the longest match. *value contains the variable's value.

Derived classes can override this if they have names that can't be expressed as regular expressions, e.g., $(w $( y) z)
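The "longest match" rule can be illustrated with a small standalone function. This is a sketch under stated assumptions, not the JX code: findLongestVariable is a hypothetical name, only literal names are handled (no regex-named variables), and a std::map with a byte index stands in for the variable list and the JStringIterator.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// Finds the longest defined name that begins at s[start]. On success, stores
// the variable's value in *value and the matched name length in *nameLength.
bool findLongestVariable(const std::map<std::string, std::string>& vars,
                         const std::string& s, const std::size_t start,
                         std::string* value, std::size_t* nameLength)
{
    assert(value != nullptr && nameLength != nullptr);
    if (start > s.size()) {
        return false;
    }

    bool found = false;
    std::size_t best = 0;
    for (const auto& entry : vars) {
        const std::string& name = entry.first;
        // Only a strictly longer name can replace the current best match.
        if (name.size() > best && s.compare(start, name.size(), name) == 0) {
            best   = name.size();
            *value = entry.second;
            found  = true;
        }
    }
    if (found) {
        *nameLength = best;
    }
    return found;
}
```

With variables "x" and "xy" defined, the text "xyz" resolves to "xy" rather than "x", which is why every candidate has to be checked instead of stopping at the first hit.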

◆ GetEscape()

bool JSubstitute::GetEscape ( const unsigned char  c,
const JString **  value 
) const
inline

Returns true if there is a definition for the given character.

◆ GetValue()

bool JSubstitute::GetValue ( const JString & name,
JString * value
) const
protectedvirtual

Returns true if there is a variable with the given name. The default is to return false. This function can't be pure virtual because one shouldn't have to create a derived class if one only has literal variable names.

Reimplemented in JInterpolate.
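The design choice here, a virtual hook with a "not found" default rather than a pure virtual, can be shown with a standalone model. VariableSource and SubmatchSource are hypothetical classes written for this illustration; they are not JSubstitute or JInterpolate, and the sketch only accepts unsigned decimal names because the meaning of signed indices is not described on this page.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the base class: usable as-is, knows no
// pattern-named variables.
class VariableSource
{
public:
    virtual ~VariableSource() = default;

    // Mirrors the documented default: report that no such variable exists.
    virtual bool GetValue(const std::string& /* name */, std::string* /* value */) const
    {
        return false;
    }
};

// Hypothetical derived class: resolves numeric names such as "1" to submatches.
class SubmatchSource : public VariableSource
{
public:
    explicit SubmatchSource(std::vector<std::string> submatches)
        : itsSubmatches(std::move(submatches))
    {
    }

    bool GetValue(const std::string& name, std::string* value) const override
    {
        assert(value != nullptr);
        if (name.empty()) {
            return false;
        }
        std::size_t index = 0;
        for (const char c : name) {
            if (c < '0' || c > '9') {
                return false;  // not an unsigned decimal name
            }
            index = index * 10 + static_cast<std::size_t>(c - '0');
        }
        if (index >= itsSubmatches.size()) {
            return false;
        }
        *value = itsSubmatches[index];
        return true;
    }

private:
    std::vector<std::string> itsSubmatches;
};
```

Code that only uses literal variable names can instantiate the base class directly; only callers that need computed values pay the cost of deriving.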

◆ IgnoreUnrecognized()

void JSubstitute::IgnoreUnrecognized ( const bool  ignore = true)
inline

◆ IsPureEscapeEngine()

bool JSubstitute::IsPureEscapeEngine ( ) const
inline

Set this flag to true to avoid treating $ as an operator. The default value is false.

◆ IsUsingControlEscapes()

bool JSubstitute::IsUsingControlEscapes ( ) const
inline

This flag enables special Perl-style two-character escapes for control characters; "\cX" is replaced with the control-X character. The default is no control escapes. When enabled, '\c' obeys the special rules below; when disabled, it obeys the ordinary rules. Any ordinary value bound to '\c' is untouched.

Refer to the documentation for Substitute() for more information.
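As a rough illustration of what "control-X" means, the helper below maps the letter after "\c" to its control character. It is a sketch, not the JX code: controlCharFor is a hypothetical name, and the arithmetic assumes an ASCII-compatible character set, where control-A is 1 and control-_ is 31.

```cpp
#include <cassert>

// Converts X from "\cX" to the corresponding control character.
// Returns false if x lies outside the accepted range 'A'..'_'.
bool controlCharFor(const char x, char* result)
{
    assert(result != nullptr);
    if (x < 'A' || x > '_') {
        return false;
    }
    // In ASCII, '@' is 64, so 'A' maps to 1, 'I' to 9 (tab), '[' to 27 (escape).
    *result = static_cast<char>(x - '@');
    return true;
}
```

Lowercase letters fall outside 'A'..'_' in ASCII, so in this model "\ca" is rejected rather than treated as control-A.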

◆ operator=()

JSubstitute & JSubstitute::operator= ( const JSubstitute & source)

◆ Reset()

void JSubstitute::Reset ( )
inline

Resets the object to its initial state.

◆ SetCEscapes()

void JSubstitute::SetCEscapes ( )

Adds entries to the table corresponding to the non-numeric escapes specific to C (to get exactly C's escape set you also need to call SetNonprintingEscapes and SetWhitespaceEscapes, then remove the '\e' escape). The escapes and their values are:

\\  \
\'  '
\"  "
\?  ?

The numeric values are naturally those chosen by the compiler for those escape sequences; this means they not only vary by character set but also by system.

Note that ANSI does not define what happens if you backslash a character other than one of the character escapes or an octal or hex code, so you have to choose the behavior you want with IgnoreUnrecognized().

◆ SetEscape()

bool JSubstitute::SetEscape ( const unsigned char  c,
const JUtf8Byte * value
)

Changes the value of the 'character' escape to the given value. Returns true if 'character' already had a value, false otherwise.
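The return-value conventions of SetEscape(), ClearEscape(), and GetEscape() can be modeled with a small table class. This is a minimal sketch assuming a map keyed by the escaped character; EscapeTable and its members are hypothetical, and setting *value to nullptr on a miss is a choice made for this sketch, not documented behavior.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical model of the escape table's bookkeeping.
class EscapeTable
{
public:
    // Binds c to value; returns true if c already had a value.
    bool Set(const unsigned char c, const char* value)
    {
        assert(value != nullptr);
        const bool existed = itsTable.count(c) > 0;
        itsTable[c] = value;
        return existed;
    }

    // Removes any value for c; returns true if one existed.
    bool Clear(const unsigned char c)
    {
        return itsTable.erase(c) > 0;
    }

    // Returns true and points *value at the stored string if c is defined.
    bool Get(const unsigned char c, const std::string** value) const
    {
        assert(value != nullptr);
        const auto it = itsTable.find(c);
        if (it == itsTable.end()) {
            *value = nullptr;  // sketch-only convention for "no definition"
            return false;
        }
        *value = &it->second;
        return true;
    }

private:
    std::map<unsigned char, std::string> itsTable;
};
```

Note the asymmetry: Set() reports whether a previous binding was overwritten, not whether the call succeeded.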

◆ SetNonprintingEscapes()

void JSubstitute::SetNonprintingEscapes ( )

Adds entries to the table corresponding to standard escapes for certain non-printing characters. The escapes and their values are:

\a   bell
\b   backspace
\e   escape

The numeric values for \a and \b are those chosen by the compiler for those escape sequences; \e is not a C escape (though it appears in Perl) and has the value 1B hex.

Note that ANSI does not define what happens if you backslash a character other than these or an octal or hex code, so you have to choose the behavior yourself. If you want other backslashed characters to represent themselves (so that the backslash is effectively removed), call IgnoreUnrecognized() first. If you want no changes to be made other than those listed here, call IgnoreUnrecognized(false).

◆ SetPureEscapeEngine()

void JSubstitute::SetPureEscapeEngine ( const bool  is = true)
inline

◆ SetRegexExtensions()

void JSubstitute::SetRegexExtensions ( )

Adds entries to the table corresponding to escapes useful as shorthands in defining regular expressions with JRegex. The escapes and their values are:

\d   a digit, [0-9]
\D   a non-digit
\w   a word character, [a-zA-Z0-9_]
\W   a non-word character
\s   a whitespace character, [ \f\n\r\t\v]
\S   a non-whitespace character
\<   an anchor just before a word (between \W and \w)
\>   an anchor just after a word (between \w and \W)

These escapes behave as atoms so they can be quantified normally and will not affect parenthesis numbering (this last requirement is why certain popular shorthands will not be added until Spencer's regexes acquire non-capturing parentheses so they can be defined atomically).

Note: these are normally most useful when the behavior for unrecognized escapes is to leave them alone. This is true whenever the string will be passed to another object which will do further backslash escape processing, such as JRegex (where "[" begins a character class while "\[" inserts a literal "[").
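To make the "leave unrecognized escapes alone" point concrete, here is a small standalone sketch that expands only \d and \w and passes every other backslash pair through untouched for the regex engine to interpret. expandRegexShorthands is a hypothetical function, not part of JX; the other shorthands are omitted because their exact replacement text is not given on this page, and the sketch makes no attempt to handle a shorthand that appears inside a bracketed character class.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Expands \d and \w in place; all other backslash pairs are copied verbatim.
void expandRegexShorthands(std::string* pattern)
{
    assert(pattern != nullptr);
    std::string out;
    for (std::size_t i = 0; i < pattern->size(); ++i) {
        const char c = (*pattern)[i];
        if (c != '\\' || i + 1 == pattern->size()) {
            out += c;
            continue;
        }
        const char next = (*pattern)[++i];
        switch (next) {
            case 'd':
                out += "[0-9]";
                break;
            case 'w':
                out += "[a-zA-Z0-9_]";
                break;
            default:
                // Unrecognized here: keep the backslash for the regex engine.
                out += '\\';
                out += next;
                break;
        }
    }
    *pattern = out;
}
```

If the backslash in "\." or "\[" were stripped at this stage, the regex engine would see a wildcard or the start of a character class instead of a literal, which is why leaving unknown escapes intact is the useful setting in this situation.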

◆ SetVariableValue()

bool JSubstitute::SetVariableValue ( const JUtf8Byte * name,
const JString & value
)

Set the value of a non-regex variable.

◆ SetWhitespaceEscapes()

void JSubstitute::SetWhitespaceEscapes ( )

Adds entries to the table corresponding to the standard (in C, Perl, and other unixy things anyway) codes for whitespace. The escapes and their values are:

\f   form feed
\n   newline
\r   carriage return
\t   horizontal tab
\v   vertical tab

The numeric values are naturally those chosen by the compiler for those escape sequences; this means they not only vary by character set but also by system.

◆ Substitute()

void JSubstitute::Substitute ( JString * s) const

Scans the given JString for each backslash and dollar symbol.

If the backslash is followed by a character that has a value, that value is substituted for the backslash plus character. Otherwise, the backslash is removed if IgnoreUnrecognized is not set.

If ControlEscapes is set, and the backslash is followed by 'c', then the next character is converted to a control character if it is between 'A' and '_'. Otherwise, the '\c' is removed if IgnoreUnrecognized is not set. '@' is not included because this would produce the null character, which is the C string terminator.

If PureEscapeEngine is not set and a $ is found, the value of the longest matching variable name is used to replace the $ and the variable name. If nothing matches, the $ is removed.

If a special character ('\', '\c' if ControlEscapes, '$' if not PureEscapeEngine) is found at the end of the string, it is removed.

To avoid infinite loops, substituted values are not re-scanned, so backslashes and dollars in value strings are left untouched.
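The scanning rules above can be condensed into a single-pass model. This is a sketch under stated assumptions rather than the JX implementation: Options and substitute are hypothetical, std::string replaces JString, only literal variable names are supported, control escapes are left out, and the dropBackslashIfUnknown switch is a stand-in introduced for this example that is deliberately not mapped onto either value of IgnoreUnrecognized.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// Hypothetical settings bundle for the sketch.
struct Options
{
    std::map<char, std::string> escapes;           // escaped character -> value
    std::map<std::string, std::string> variables;  // literal names only
    bool dropBackslashIfUnknown = true;            // sketch-only switch
    bool pureEscapeEngine = false;                 // true: '$' is ordinary text
};

void substitute(const Options& opt, std::string* s)
{
    assert(s != nullptr);
    std::string out;
    std::size_t i = 0;
    while (i < s->size()) {
        const char c = (*s)[i];
        if (c == '\\') {
            if (i + 1 == s->size()) {
                ++i;  // a special character at the end of the string is removed
                continue;
            }
            const char next = (*s)[i + 1];
            const auto esc = opt.escapes.find(next);
            if (esc != opt.escapes.end()) {
                out += esc->second;  // appended to the output, never re-scanned
            } else {
                if (!opt.dropBackslashIfUnknown) {
                    out += '\\';
                }
                out += next;
            }
            i += 2;
        } else if (c == '$' && !opt.pureEscapeEngine) {
            const std::string* best = nullptr;
            std::size_t bestLen = 0;
            for (const auto& v : opt.variables) {
                if (v.first.size() > bestLen &&
                    s->compare(i + 1, v.first.size(), v.first) == 0) {
                    best    = &v.second;
                    bestLen = v.first.size();
                }
            }
            if (best != nullptr) {
                out += *best;  // longest matching name wins
            }
            i += 1 + bestLen;  // with no match, only the '$' is dropped
        } else {
            out += c;
            ++i;
        }
    }
    *s = out;
}
```

Writing results to a separate output string is what guarantees termination: a variable whose value is "$x" is copied as-is, so its dollar sign is never looked at again.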

◆ UndefineAllVariables()

void JSubstitute::UndefineAllVariables ( )

◆ UndefineVariable()

void JSubstitute::UndefineVariable ( const JUtf8Byte * name)

We remove all variables that match, in case the same string is used as both a literal and a regular expression. (evil, but possible)

◆ UseControlEscapes()

void JSubstitute::UseControlEscapes ( const bool  use = true)
inline

◆ WillIgnoreUnrecognized()

bool JSubstitute::WillIgnoreUnrecognized ( ) const
inline

This flag controls the behavior when an escape is not recognized. false (the default) indicates that unknown escapes should be ignored, leaving the backslash in place. true means that the backslash should be removed, so that unrecognized escapes just represent the escaped character itself.

Member Data Documentation

◆ kIllegalControlChar

const JUtf8Byte * JSubstitute::kIllegalControlChar = "IllegalControlChar::JSubstitute"
static

◆ kLoneDollar

const JUtf8Byte * JSubstitute::kLoneDollar = "LoneDollar::JSubstitute"
static

◆ kTrailingBackslash

const JUtf8Byte * JSubstitute::kTrailingBackslash = "TrailingBackslash::JSubstitute"
static
