3. Data types and literals

HSL has multiple data types: strings, booleans, numbers, arrays (which also work as ordered maps to store key-value pairs, similar to PHP’s array), objects (created by classes) and functions (both anonymous functions and named function pointers). Some of these data types may be represented as literals. There is also a none (or null) data type that is often used to represent an error or the absence of a valid value or response (e.g. a return statement without a value or a failed json_decode(), both of which return none).

3.1. Boolean

The keywords true and false represent boolean true and boolean false. They are treated as 1 and 0 in arithmetic operations.

Warning

Avoid comparing values against boolean true and false in if statements unless you are fully aware of how truthiness and loose comparison work.

if (5 == true) { } // false: 5 is not equal to 1
if (5) { } // true: 5 is not false, hence true

3.2. Number

The number type is a double-precision 64-bit IEEE 754 value. If converted to a string, it is presented in the most accurate way possible without trailing decimal zeros. A numeric separator (_) is allowed between digits for readability; it does not affect the value of the number.

echo 1.0; // 1
echo 1_000_000; // 1000000

Note

The number type can safely represent all integers between -9007199254740991 and 9007199254740991 (that is, up to (2 ** 53) - 1 in magnitude).

Warning

After some arithmetic operations on floating-point numbers, the equality (==) of two floating-point numbers may not hold even if they mathematically “should” be equal. This caveat is not unique to HSL; it is a result of how computers calculate and store floating-point numbers. Arithmetic operations on numbers without decimals are not affected.
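The caveat above can be illustrated with a minimal sketch. It assumes ordinary IEEE 754 behavior, where 0.1 + 0.2 is not stored as exactly 0.3; the tolerance value and the variable names are illustrative choices, not part of HSL.

```hsl
$sum = 0.1 + 0.2;

// Direct equality is not expected to hold, because neither operand
// has an exact binary representation
if ($sum == 0.3)
        echo "equal";

// Comparing within a small tolerance is a common workaround
$epsilon = 0.000001; // illustrative tolerance, choose one that fits the data
$diff = $sum - 0.3;
if ($diff < 0)
        $diff = 0 - $diff;
if ($diff < $epsilon)
        echo "close enough";
```

The right tolerance depends on the magnitude of the values being compared; a fixed epsilon is only a starting point.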

3.2.1. Hexadecimal

Numbers may be entered in hexadecimal form (also known as base 16) using the 0x prefix followed by [0-9a-f]+. A numeric separator (_) is allowed between digits for readability; it does not affect the value of the number.

echo 0xfa; // 250
echo 0x00_fa; // 250

3.2.2. Octal

Numbers may be entered in octal form (also known as base 8) using the 0o prefix followed by [0-7]+. A numeric separator (_) is allowed between digits for readability; it does not affect the value of the number.

echo 0o372; // 250

3.2.3. Binary

Numbers may be entered in binary form (also known as base 2) using the 0b prefix followed by [0-1]+. A numeric separator (_) is allowed between digits for readability; it does not affect the value of the number.

echo 0b11111010; // 250
echo 0b1111_1010; // 250
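Since every notation produces the same number type, literals written in different bases can be compared and mixed freely. A short sketch (the variable name is illustrative):

```hsl
$decimal = 250;

// The same value written in three other bases
if (0xfa == $decimal)
        echo "hexadecimal matches";
if (0o372 == $decimal)
        echo "octal matches";
if (0b1111_1010 == $decimal)
        echo "binary matches";

// Literals in different bases can be combined in one expression
echo 0xff + 0b1; // 256
```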

3.3. String

There are two kinds of string literals: double-quoted strings and raw strings. Double-quoted strings support language features such as variable interpolation and escape sequences. Most functions (e.g. length() and str_slice()) are not UTF-8 aware, with the exception of regular expression matching (e.g. pcre_match()), which may be configured to be UTF-8 aware with the /u modifier.

3.3.1. Double-quoted string

Variable interpolation replaces $variable placeholders within string literals. Variables are matched in strings with the following pattern: $[a-zA-Z_]+[a-zA-Z0-9_]*. If needed, there is also a more explicit syntax, ${variable}, which allows a variable to be placed mid-word. Interpolating an undeclared variable raises a runtime error.

"$variable"
"${variable}abc"

Escape sequence   Meaning
\\                Backslash (\)
\"                Double quote (")
\$                Dollar sign ($)
\n                ASCII Linefeed (LF)
\r                ASCII Carriage Return (CR)
\t                ASCII Horizontal Tab (TAB)
\xhh              Character with hex value hh
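The following sketch combines interpolation and the escape sequences listed above. The variable names and values are illustrative.

```hsl
$user = "alice";
$count = 3;

echo "Hello $user";        // Hello alice
echo "${count}rd attempt"; // 3rd attempt ("$countrd" would look up a variable named countrd)
echo "Price: \$5";         // Price: $5 (the escaped dollar sign is not interpolated)
echo "one\ttwo";           // "one" and "two" separated by a horizontal tab
echo "\x41";               // A (the character with hex value 41)
```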

3.3.2. Raw string

Raw strings support neither variable interpolation nor escape sequences. This makes them suitable for regular expressions. Raw strings start and end with two single quotes on each side (''), with an optional delimiter in between. The delimiter can be any of [\x21-\x26\x28-\x7e]*; simply put, any word made of printable ASCII characters other than the space and the single quote.

''raw string''
'DELIMITER'raw string'DELIMITER'
'#'raw string'#'

Note

There is no performance difference between double-quoted and raw strings containing the same value. However, if special characters need to be escaped, raw strings are recommended for clarity.
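As a sketch of why raw strings read better for patterns, the two assignments below are intended to hold the same value. The pattern itself (a decimal number such as 3.14) and the variable names are illustrative.

```hsl
// Double-quoted: every backslash has to be escaped
$escaped = "\\d+\\.\\d+";

// Raw: the pattern is written exactly as the regex engine receives it
$raw = ''\d+\.\d+'';

if ($escaped == $raw)
        echo "same value";

// A delimiter makes it possible to keep two consecutive single quotes
// inside the string without ending it
$quoted = '#'it''s raw'#';
```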

3.4. Regex

A regex literal is a pre-compiled regular expression object. The regular expression implementation is “Perl Compatible” (hence the function names pcre_…); for the syntax, see the perlre documentation. The supported pattern modifiers are listed in section 3.4.1.

#/pattern/[modifiers]

A regex literal is mainly used with the regular expression operators, and it can also be passed as an argument to the pcre_match() family of functions.

if ($string =~ #/hello/i) {}
if (pcre_match(#/hello/i, $string)) {}

3.4.1. Pattern modifiers

Use pattern modifiers to change the behavior of the pattern engine. Among other things, they can make the match case-insensitive or activate UTF-8 support (where one UTF-8 character may be matched using a single dot).

Modifier   Internal define   Description
i          PCRE_CASELESS     Do case-insensitive matching
m          PCRE_MULTILINE    See Perl documentation
u          PCRE_UTF8         Enable UTF-8 support
s          PCRE_DOTALL       See Perl documentation
x          PCRE_EXTENDED     See Perl documentation
U          PCRE_UNGREEDY     See Perl documentation
X          PCRE_EXTRA        See Perl documentation
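A brief sketch of how modifiers change the outcome of a match, assuming the Perl-compatible semantics described above. The subject strings are illustrative, and the two-byte string is the UTF-8 encoding of "å" written with \xhh escapes so the example does not depend on the file encoding.

```hsl
$subject = "Hello World";

// Without modifiers the match is case-sensitive, so this is not expected to match
if ($subject =~ #/hello/)
        echo "case-sensitive match";

// i: case-insensitive, expected to match
if ($subject =~ #/hello/i)
        echo "case-insensitive match";

// m: ^ and $ also match at line boundaries inside the subject
if (pcre_match(#/^world$/im, "hello\nworld"))
        echo "multiline match";

// u: a single dot matches one UTF-8 character, even if it spans several bytes
$char = "\xc3\xa5";
if ($char =~ #/^.$/u)
        echo "one character";
```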

3.5. Array

An array is a very useful container; it can act as an indexed array (automatically indexed from zero, or from the highest current index + 1) or as an ordered map (associative array) with keys and values of any, possibly mixed, data types. The short array syntax for literal arrays, [], is recommended.

// indexed arrays
echo array("value", "value2");
echo ["value", "value2"];
echo [0 => "value", 1 => "value2"];

// associative arrays
echo array("key" => "value");
echo ["key" => "value"];

// multidimensional arrays
echo ["key" => ["key" => "value"]];

// automatic indexing
echo ["foo", 3=>"bar", "baz"]; // 0=>foo, 3=>bar, 4=>baz

// delete index
$foo = [];
$foo["bar"] = "hello";
unset($foo["bar"]);
echo $foo; // []

Note

Accessing any element in a zero-indexed array using the subscript or slice operator is very fast (it has a complexity of O(1)).

3.6. Function

Both anonymous functions (closures) and named function pointers (references to functions) are available. This data type is primarily used for passing callbacks to other functions.

3.6.1. Anonymous functions

An anonymous function is an unnamed function; it can be passed as a value to a function or assigned to a variable. An anonymous function can also act as a closure. The global variable scoping rules apply.

$multiply = function ($x, $y) { return $x * $y; };
echo $multiply(3, 5); // 15
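Passing an anonymous function as a callback can be sketched as follows. The helper apply_twice() is hypothetical and defined here only for illustration; it is not a builtin.

```hsl
// Hypothetical helper: calls the given callback twice on a value
function apply_twice($callback, $value) {
        return $callback($callback($value));
}

// Anonymous function assigned to a variable, then passed as a value
$double = function ($x) { return $x * 2; };
echo apply_twice($double, 5); // 20

// Anonymous function passed directly as an argument
echo apply_twice(function ($x) { return $x + 1; }, 5); // 7
```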

3.6.2. Named function pointers

A named function pointer is a reference to a named function. It can reference either a builtin function or a user-defined function. Prepending the function name with the builtin keyword references the builtin function, even if a user-defined function with the same name exists.

function strlen($str) { return 42; }

$function = strlen;
echo $function("Hello"); // 42

$function = builtin strlen;
echo $function("Hello"); // 5

3.7. Object

An object is an instance of a class, either one defined with a class statement or a builtin class (such as Socket or File).

class Iterator

The builtin iterator class is used for iterators such as generators or Map.

next()

Return the next value from the iterator.

Returns: iterator data

Return type: array

The iterator data returned is an associative array with two fields:

  • value (any): the value of the iterator

  • done (boolean): whether the iteration/iterator is completed

Note

Some iterators (such as Map) return a [key, value] array as value. The foreach statement will then map these to its key and value variables.
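The next() protocol described above can be driven by hand. The sketch below uses a hypothetical helper, collect(), which is not a builtin; it assumes that an item with done set to true marks the end of the sequence and that its value is not part of it.

```hsl
// Hypothetical helper: gather every value an iterator produces into an array
function collect($iterator) {
        $values = [];
        $item = $iterator->next();
        while ($item["done"] == false) {
                $values[] = $item["value"];
                $item = $iterator->next();
        }
        return $values;
}
```

In most scripts the foreach statement is the simpler choice; calling next() directly is mainly useful when values need to be consumed one at a time.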

3.8. None

This data type is represented by the keyword none. It may be used to indicate an error result or the absence of a return value from functions such as json_decode() (in case of a decode error), or from a user-defined function with no return statement or an empty one. This data type should not be used as an argument to other built-in functions, as doing so yields undefined behavior for the most part. The only functions safe to handle this data type are:

$obj = json_decode("...");
if ($obj == none)
        echo "None";
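The same check applies to user-defined functions. In the sketch below, find_user() is a hypothetical function whose empty return statement yields none; the names and data are illustrative.

```hsl
// Hypothetical lookup: returns an array on success, none otherwise
function find_user($name) {
        if ($name == "admin")
                return ["id" => 1];
        return; // empty return statement, the caller receives none
}

$user = find_user("guest");
if ($user == none)
        echo "No such user";
```

Checking for none right after the call keeps the value from being passed on to other built-in functions.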