Rulz Programming Language

Rulz and Rulz Language is Copyright © G.A.Jennings, 2015-2022.

This is language specification 2-1/2; last updated 24 June 2022.

This document is slightly out of date.

Rulz Data

A Rulz statement (a "rule") is made of two things: Operators and Arguments. Arguments exist in two states: 1) literals, which are plain, "unknown what they are" tokens in a source file (source state); 2) values, which exist in computer memory after being read and converted (interpolated) by the Rulz code (computer state).

The first are tokens (actually string literals, though I can't use that term yet): sequences of characters that "fit" a "grammar" that defines just how a sequence of characters is converted to a token.

(A file's character encoding is beyond the scope of this document.[How's that for a cop out?])

The second are values in a binary or machine state, as the computer sees them. Each value has a type. This is where we can speak of "string" as a type of value: the token "Hello" is, in memory, the ASCII values for the letters H, e, l, l and o (the five hexadecimal values 48 65 6c 6c 6f), whereas the token 100 is a value like 0064 (depending on the CPU).
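As a rough illustration of source state versus computer state, here is a small Python sketch. Python is used only as a neutral notation and says nothing about how Rulz is implemented. The 16-bit width and big-endian byte order are assumptions chosen to match the 0064 example.

```python
# Source state: tokens are just characters in a file.
string_token = "Hello"
integer_token = "100"

# Computer state: the string becomes its ASCII byte values.
string_bytes = string_token.encode("ascii")
string_hex = string_bytes.hex(" ")

# The integer token is converted to a number; shown here as 16 bits,
# big-endian (an assumption; the real layout depends on the CPU).
integer_value = int(integer_token)
integer_hex = integer_value.to_bytes(2, "big").hex()

print(string_hex)   # 48 65 6c 6c 6f
print(integer_hex)  # 0064
```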

Rulz defines tokens for the following types: boolean, integer, float, string, list, hash and the PHP types null and handle (resource).

The type name number is a placeholder for a value that can be an integer or a float. The type name scalar is a placeholder for a value that can be a number or a string. (Rulz also includes boolean and null in a scalar.)

Rulz introduces other tokens, such as word, file, argument and punctuation, though these describe how arguments are parsed, or separated and tokenized, to be used by operators or passed to subroutines, where they are usually treated as strings.

Argument Tokens

Example             Name                 Type

TRUE                true                 boolean
FALSE               false                boolean
NULL                NULL                 null
CONST               constant             scalar
123                 decimal              integer
0.123               float                float
1e4                 float                float
0123                octal                integer
0x1A                hexadecimal          integer
1Ah                 hexadecimal          integer
0b0001              binary               integer
0001b               binary               integer
(a,b,c)             list                 list
(a b c)             list                 list
(a=1,b=2,c=3)       hash                 list
{a,1,b,2,c,3}       hash                 list
simple              bareword             string
-a --help           argument             string
/lib/test.php       file                 string
./*.txt             glob                 list
\n \t               escape               string (length of 1)
"" ''               empty                string (length of 0)
"double quotes"     interpolated         string
'single quotes'     non-interpolated     string
[ ) +               punctuation          string (length of 1)

The list and hash types (type name array) differ in format, and in name in PHP and Perl (PHP calls hashes associative arrays), but are the same internally: lists are hashes with integer keys.

The hexadecimal extension (trailing [hH]) must have a leading digit.

A glob must start with a ~/, ./ or /.
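The integer token forms in the table can be modelled with a handful of patterns. The Python sketch below is only an illustration: the function name parse_integer_token, the order in which the patterns are tried, and the fallback of returning None are all assumptions, not the Rulz parser.

```python
import re

def parse_integer_token(token):
    """Return the integer value of a Rulz-style integer token, or None."""
    # Each rule: (pattern, numeric base, function that strips prefix/suffix).
    rules = [
        (r"0x[0-9A-Fa-f]+", 16, lambda t: t[2:]),          # 0x1A
        (r"0b[01]+", 2, lambda t: t[2:]),                  # 0b0001
        (r"[0-9][0-9A-Fa-f]*[hH]", 16, lambda t: t[:-1]),  # 1Ah, leading digit required
        (r"[01]+b", 2, lambda t: t[:-1]),                  # 0001b
        (r"0[0-7]+", 8, lambda t: t),                      # 0123
        (r"[0-9]+", 10, lambda t: t),                      # 123
    ]
    for pattern, base, strip in rules:
        if re.fullmatch(pattern, token):
            return int(strip(token), base)
    return None  # not an integer token (assumed fallback)

print(parse_integer_token("1Ah"))   # 26
print(parse_integer_token("0123"))  # 83
print(parse_integer_token("Ah"))    # None
```

A token such as Ah has no leading digit, so under this model it is not a number and would fall through to some other token class (a bareword, presumably).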

Variables

Basic variables are like $var where "var" is one or more lowercase letters, and, PHP-like, can be assigned any type. Variable assignment is Bash-like with the $ not used.

Like Bash and Perl, all variables are global, and there are "my" and "local" attributes for subroutines.

There are Perl-like special variables of the format $[0-9[:punct:]]. The special variables $0 and $_ differ from their Perl counterparts and are used as defaults for most Builtins and Commands and for some Functions.

See Variables for Rulz special variables.

Internal variables (similar to Perl's named special variables) are like $VAR: like user variables, but uppercase. Most are read-only, but some control Rulz run-time options.
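The three naming rules above (lowercase user variables, single-character special variables, uppercase internal variables) can be summarized as patterns. This Python sketch is an approximation: the function name classify_variable is hypothetical, and Python's string.punctuation is used as a stand-in for the ASCII [:punct:] class.

```python
import re
import string

# Character class approximating [0-9[:punct:]] for ASCII input.
_SPECIAL = "[0-9" + re.escape(string.punctuation) + "]"

def classify_variable(token):
    """Classify a $ variable by the naming rules described above."""
    if re.fullmatch(r"\$[a-z]+", token):
        return "user"
    if re.fullmatch(r"\$[A-Z]+", token):
        return "internal"
    if re.fullmatch(r"\$" + _SPECIAL, token):
        return "special"
    return None  # mixed case, empty name, etc. (assumed to be invalid)

print(classify_variable("$var"))  # user
print(classify_variable("$_"))    # special
print(classify_variable("$VAR"))  # internal
```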

Subroutine Parameters

The parameter variables are unique: no other variable starts with @. The syntax was stolen from Perl just for the parameter arguments.

@_                      list of parameters
@#                      count of parameters
@*                      all parameters expanded to a string
@0                      first parameter
@1                      second parameter, etc., with @9 the limit
@A                      first parameter
@B                      second parameter, etc., with @Z the limit

The special variable $@ is an alias to @*.
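The parameter variables map onto a parameter list in a fairly mechanical way. The Python sketch below models that mapping; the function name lookup_parameter is hypothetical, and two behaviours are assumptions not stated above: @* joins parameters with a single space, and a missing parameter yields None.

```python
import string

def lookup_parameter(name, params):
    """Resolve a parameter variable such as @0, @B, @#, @* or @_."""
    key = name[1:]  # drop the leading @
    if key == "_":
        return list(params)
    if key == "#":
        return len(params)
    if key == "*":
        return " ".join(str(p) for p in params)  # separator is assumed
    if len(key) == 1 and key in string.digits:
        index = int(key)                 # @0 .. @9
    elif len(key) == 1 and key in string.ascii_uppercase:
        index = ord(key) - ord("A")      # @A .. @Z
    else:
        raise ValueError("not a parameter variable: " + name)
    # Missing parameters yield None here (assumed behaviour).
    return params[index] if index < len(params) else None

params = ["x", "y", "z"]
print(lookup_parameter("@0", params))  # x
print(lookup_parameter("@B", params))  # y
print(lookup_parameter("@*", params))  # x y z
```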

Variable Variables

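Variable variables are partially supported: a variable given where a name is expected is interpolated first, and its value is used as the name.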

= a b                   assign $a with 'b'
+= $a                   increments $b ($a is interpolated to 'b')
^ $b                    displays "1"
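The three rules above can be traced with a plain dictionary standing in for the global variable table. This Python sketch is only a model; the helper names are hypothetical, and it assumes that incrementing an unset variable starts from 0.

```python
variables = {}

def assign(name, value):
    variables[name] = value

def increment(name):
    # An unset variable is treated as 0 (assumed behaviour).
    variables[name] = variables.get(name, 0) + 1

assign("a", "b")           # = a b
increment(variables["a"])  # += $a   ($a interpolates to 'b')
print(variables["b"])      # ^ $b    displays 1
```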

Variable References

Variables are arguments by themselves or within double quotes. As arguments they are passed by value. To pass a variable by reference the notation is \$var.

$var                    variable
\$var                   reference

References are like PHP references, not Perl references. An alternative to a sub-string function is to add an index to a reference (see also String Indexing).

\$var[1]
\$var:1

String Indexing

= var foobar
$var                    'foobar'
$[0]var                 'f'   (index)
$[-1]var                'r'   (rindex)
$[0:3]var               'foo' (sub string)
$[3:]var                'bar' (sub string)
$[0,3]var               'fb'  (index concatenation)

There is an alternative index notation.

$var:0                  'f'   (index)
$var:-1                 'r'   (rindex)
$var:0:3                'foo' (sub string)
$var:3:                 'bar' (sub string)
$var:0,3                'fb'  (index concatenation)
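Both notations carry the same index spec (0, -1, 0:3, 3:, 0,3). The Python sketch below evaluates such a spec against a string. The function name index_string is hypothetical, and the a:b form is read as offset:length, which is an assumption: the examples above would come out the same if it meant start:end. Only non-negative offsets are handled for substrings.

```python
def index_string(value, spec):
    """Apply an index spec such as '0', '-1', '0:3', '3:' or '0,3'."""
    if "," in spec:
        # Index concatenation: pick each listed character and join them.
        return "".join(value[int(i)] for i in spec.split(","))
    if ":" in spec:
        # Substring, read here as offset:length (an assumption).
        offset_text, length_text = spec.split(":", 1)
        offset = int(offset_text)
        if length_text == "":
            return value[offset:]
        return value[offset:offset + int(length_text)]
    return value[int(spec)]

var = "foobar"
print(index_string(var, "0"))    # f
print(index_string(var, "-1"))   # r
print(index_string(var, "0:3"))  # foo
print(index_string(var, "3:"))   # bar
print(index_string(var, "0,3"))  # fb
```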

String Expansion

String expansion (interpolation inside double quotes) is based on Bash shell expansion.

= foo foobar            $foo = "foobar"
${foo}                  same as $foo        "foobar"
${#foo}                 length              6
${^foo}                 uppercase first     "Foobar"
${^^foo}                uppercase           "FOOBAR"
${,foo}                 lowercase first     "foobar"
${,,foo}                lowercase           "foobar"
${[0]foo}               character index     "f"
${[5]foo}               character index     "r"
${[6]foo}               character index     "r"
${0:3:foo}              substr              "foo"
${3::foo}               substr              "bar"
${fo#foo}               delete from start   "obar"
${ar%foo}               delete from end     "foob"
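Several of the expansion forms above (length, case changes, and the two delete forms) can be modelled in a few lines. This Python sketch handles only those forms; the function name expand is hypothetical, and it assumes the delete pattern is matched literally rather than as a glob.

```python
import re

def expand(spec, variables):
    """Evaluate the text between ${ and } for a few of the forms above."""
    m = re.fullmatch(r"(\^\^|\^|,,|,|#)?([a-z]+)", spec)
    if m:
        op, value = m.group(1), variables[m.group(2)]
        if op == "#":
            return len(value)
        if op == "^":
            return value[:1].upper() + value[1:]
        if op == "^^":
            return value.upper()
        if op == ",":
            return value[:1].lower() + value[1:]
        if op == ",,":
            return value.lower()
        return value  # plain ${foo}
    m = re.fullmatch(r"([^#%]+)([#%])([a-z]+)", spec)
    if m:
        pattern, op, value = m.group(1), m.group(2), variables[m.group(3)]
        # The pattern is compared literally (an assumption).
        if op == "#" and value.startswith(pattern):
            return value[len(pattern):]
        if op == "%" and value.endswith(pattern):
            return value[:-len(pattern)]
        return value
    raise ValueError("unsupported expansion: " + spec)

variables = {"foo": "foobar"}
print(expand("^foo", variables))    # Foobar
print(expand("fo#foo", variables))  # obar
print(expand("ar%foo", variables))  # foob
```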

There is an alternative notation for variable arguments (outside of quoted strings).

$foo:#                  length              6
$foo:^                  uppercase first     "Foobar"
$foo:^^                 uppercase           "FOOBAR"
$foo:,                  lowercase first     "foobar"
$foo:,,                 lowercase           "foobar"
$foo:fo#                delete from start   "obar"
$foo:ar%                delete from end     "foob"

And a function (or tilde) notation (see String Functions)—though currently only non-argument functions.

$foo~u                  uppercase
$foo~l                  lowercase
$foo~n                  length
$foo~q                  quotemeta 
$foo~c                  chop
$foo~C                  chomp
$foo~t                  trim
$foo~h                  htmlentities
$foo~e                  urlencode
$foo~E                  urldecode
$foo~y                  crypt
$foo~b                  basename
$foo~d                  dirname
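The tilde notation amounts to a lookup from a one-letter suffix to a string function. The Python sketch below models a subset of the letters. It assumes Perl semantics for chop (drop the last character) and chomp (drop one trailing newline), and uses Python's posixpath functions as rough stand-ins for basename and dirname; the names TILDE_FUNCTIONS and apply_tilde are hypothetical.

```python
import posixpath

TILDE_FUNCTIONS = {
    "u": str.upper,
    "l": str.lower,
    "n": len,
    "c": lambda s: s[:-1],                             # chop: drop last character
    "C": lambda s: s[:-1] if s.endswith("\n") else s,  # chomp: drop one trailing newline
    "t": str.strip,
    "b": posixpath.basename,
    "d": posixpath.dirname,
}

def apply_tilde(value, letter):
    """Apply the function selected by a tilde suffix letter."""
    return TILDE_FUNCTIONS[letter](value)

print(apply_tilde("foobar", "u"))         # FOOBAR
print(apply_tilde("foobar", "n"))         # 6
print(apply_tilde("/lib/test.php", "b"))  # test.php
```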

List Constructors

Lists can hold values of any type, even other lists, though not explicitly. List constructors have four formats.

()                      empty list
(1,2,3)                 list of three integers
(1 2 3)                 same
(1..3)                  same

Non-integer elements and variable interpolation are supported.

(a..z)                  letters 'a' through 'z'
($a,$b)                 interpolation

Nested lists can occur through interpolation.

= a (a,b)         
= b ($a,c)              (('a','b'),'c')
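The constructor formats can be modelled with a small parser. This Python sketch covers the empty list, comma or space separated elements, and numeric or single-letter ranges; it does not handle variable interpolation or nesting. The function name parse_list is hypothetical, and ranges are assumed to include both end points, as the (1..3) example suggests.

```python
import re

def _convert(element):
    """Turn a digit-only element into an int; leave anything else a string."""
    return int(element) if element.isdecimal() else element

def parse_list(text):
    """Parse (), (1,2,3), (1 2 3), (1..3) and (a..z) into a Python list."""
    inner = text.strip()[1:-1].strip()
    if inner == "":
        return []
    m = re.fullmatch(r"(\d+)\.\.(\d+)", inner)
    if m:
        # Numeric range, both end points included.
        return list(range(int(m.group(1)), int(m.group(2)) + 1))
    m = re.fullmatch(r"([a-z])\.\.([a-z])", inner)
    if m:
        # Letter range, both end points included.
        return [chr(c) for c in range(ord(m.group(1)), ord(m.group(2)) + 1)]
    return [_convert(e) for e in re.split(r"[,\s]+", inner)]

print(parse_list("(1 2 3)"))  # [1, 2, 3]
print(parse_list("(1..3)"))   # [1, 2, 3]
print(parse_list("(a..e)"))   # ['a', 'b', 'c', 'd', 'e']
```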

List Assignment

List elements can be assigned, either implicitly.

= []a 1
= []a 2

Or explicitly.

= [0]a 1
= [1]a 2

List Indexing

= var (a b c)
^ $var                  result is ('a','b','c')
$[0]var                 'a'   (index)
$[-1]var                'c'   (index)
$[0:1]var               (a,b) (slice)
$[0,3]var               (a,c) (index concatenation)

Hash Constructors

Hashes can hold values of any type, even lists and other hashes, though not explicitly. Hash constructors have two formats.

(a=1,b=2,c=3)           hash with three key/value pairs
{a,1,b,2,c,3}           same

Keys can consist only of lowercase letters, and quotes are not used. A hash cannot be empty.

= foo (a='b')           hash 'foo' w/ element 'a' set to 'b'
= bar (a=$foo)          hash 'bar' w/ element 'a' set to hash 'foo'

Hash Assignment

Like lists, hash elements can be assigned.

= [a]foo b              hash 'foo' element 'a' set to 'b'
= [b]foo (a,b)          hash 'foo' element 'b' set to ('a','b')

Hash Indexing

Hashes are also $ variables.

= var {a,1,b,2,c,3}
$[a]var                 1     (index)
$[a:b]var               (1,2) (slice)
$[a,c]var               (1,3) (index concatenation)

Hash slices and index concatenation result in lists.
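The three hash index forms can be modelled with an ordered dictionary. In this Python sketch the function name index_hash is hypothetical, and the slice a:b is assumed to run over keys in insertion order and to include both end keys, which is what the (1,2) result above suggests.

```python
def index_hash(table, spec):
    """Apply a hash index spec such as 'a', 'a:b' or 'a,c'."""
    if "," in spec:
        # Index concatenation: one value per listed key, as a list.
        return [table[key] for key in spec.split(",")]
    if ":" in spec:
        first, last = spec.split(":", 1)
        keys = list(table)  # Python dicts keep insertion order
        start, end = keys.index(first), keys.index(last)
        # Inclusive slice in insertion order (an assumption).
        return [table[key] for key in keys[start:end + 1]]
    return table[spec]

var = {"a": 1, "b": 2, "c": 3}
print(index_hash(var, "a"))    # 1
print(index_hash(var, "a:b"))  # [1, 2]
print(index_hash(var, "a,c"))  # [1, 3]
```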

Lists and hashes can be combined in the PHP way.

= [0]a 1
= [1]a 2
= [a]a a
= [b]a b                result is ( 0 = 1, 1 = 2, a = 'a', b = 'b' )
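The PHP-style combination of integer and string keys in one container can be modelled with a single ordered dictionary. In this Python sketch the helper name assign_element is hypothetical, and the implicit form (= []a value) is assumed to append at one past the largest integer key, as PHP does.

```python
def assign_element(array, key, value):
    """Set array[key]; a key of None appends at the next free integer key."""
    if key is None:
        int_keys = [k for k in array if isinstance(k, int)]
        key = max(int_keys) + 1 if int_keys else 0
    array[key] = value
    return array

a = {}
assign_element(a, 0, 1)      # = [0]a 1
assign_element(a, 1, 2)      # = [1]a 2
assign_element(a, "a", "a")  # = [a]a a
assign_element(a, "b", "b")  # = [b]a b
print(a)  # {0: 1, 1: 2, 'a': 'a', 'b': 'b'}
```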

There are ways to construct complex lists and hashes, with complex string keys for example...

Escape Sequences

\n                      linefeed
\r                      carriage return
\t                      tab
\e                      escape
\\                      backslash
\$                      dollar sign
\"                      double quote
\'                      single quote

Escape Extensions

\N                      null
\T                      true
\F                      false

These extensions are not recognized within double quoted strings.

Perl-like Escape Sequences

\l                      lowercase next character
\u                      uppercase next character
\E                      end \L \U \q \Q
\L                      lowercase to \E
\U                      uppercase to \E
\q                      single quote to \E
\Q                      double quote to \E
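The case-changing escapes can be modelled as a small state machine: \l and \u affect one character, while \L and \U stay in force until \E. This Python sketch covers only those five escapes and passes everything else through unchanged; \q and \Q are not modelled. The function name apply_case_escapes is hypothetical, and a one-character escape is assumed to take priority over an active \L or \U run.

```python
def apply_case_escapes(text):
    """Apply \\l, \\u, \\L, \\U and \\E to a string."""
    out = []
    mode = None      # "lower" or "upper" while inside \L or \U, else None
    one_shot = None  # "lower" or "upper" for the next character only
    i = 0
    while i < len(text):
        ch = text[i]
        if ch == "\\" and i + 1 < len(text) and text[i + 1] in "luLUE":
            code = text[i + 1]
            if code == "l":
                one_shot = "lower"
            elif code == "u":
                one_shot = "upper"
            elif code == "L":
                mode = "lower"
            elif code == "U":
                mode = "upper"
            else:  # \E ends the current \L or \U run
                mode = None
            i += 2
            continue
        style = one_shot or mode  # one-shot wins over a run (assumed)
        one_shot = None
        if style == "lower":
            ch = ch.lower()
        elif style == "upper":
            ch = ch.upper()
        out.append(ch)
        i += 1
    return "".join(out)

print(apply_case_escapes(r"\uhello \Uworld\E done"))  # Hello WORLD done
```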

Quote Sequences

Perl's quote-like operators are supported with extensions; like Perl 6 but with trailing modifiers.

|string|                single quoted string
|string|q               double quoted string
|string|w               list
|string|x               exec
|string|m               meta

|string|s               in single quotes
|string|d               in double quotes

^ |have "3" 4's|s       same as ^ "'have \"3\" 4's'"
^ |have "7" 9's|d       same as ^ "\"have \"7\" 9's\""

Unlike Perl both interpolate variables and some escape sequences.
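The s, d and w modifiers can be modelled as simple transformations of the delimited text. This Python sketch is only an illustration: the function name quote_sequence is hypothetical, interpolation is not modelled, and the w modifier is assumed to split on whitespace the way Perl's qw does.

```python
def quote_sequence(text, modifier):
    """Model the s, d and w modifiers of |string|modifier."""
    if modifier == "s":
        return "'" + text + "'"   # wrap in single quotes
    if modifier == "d":
        return '"' + text + '"'   # wrap in double quotes
    if modifier == "w":
        return text.split()       # whitespace splitting is assumed
    raise ValueError("modifier not modelled: " + modifier)

print(quote_sequence('have "3" 4\'s', "s"))  # 'have "3" 4's'
print(quote_sequence('have "7" 9\'s', "d"))  # "have "7" 9's"
print(quote_sequence("a b c", "w"))          # ['a', 'b', 'c']
```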

Character Names

Each character has a name. Rulz, like all languages, uses some characters as special delimiters, and uses specific names for them, which differ from those used by some other authors.

Here are the important characters, their pronounceable names and how they are used.

Character   Name            Usage

.           decimal point   used in rational numbers
.           dot             used in file names and as a range for lists
*           star            file globs, as in *.php, which is "star dot PHP"
{}          brace           variable string expansion and hash constructors
[]          bracket         variable index notation and subroutine definitions

Braces are not "curly braces", as that implies that there are some non-curly brace characters, which there are not. (If brackets were "square braces"... but they are not.) "Square brackets" is also redundant: there are no "round brackets", only parentheses.

(Slashes are "forward slash" and "backslash". That the latter is not a "backward slash" is weird.)

There are however, two characters that mess this up.

<           less than
>           greater than

When these two are used as a delimiting pair they are no longer verbs, and saying "between less than and greater thans" may be a bit too odd. So these are, in this case, "angle brackets".

I prefer, for braces, brackets and angles, to use the names "squiglies", "squares" and "anglies". But that's just me and may be too square, dare I say, for actual use.