In the previous chapter, we explored arrays and dictionaries. Array elements are accessed by index, and dictionary elements are accessed by string key. However, what if we access an array index that is beyond the number of elements in the array? What if we attempt to access a non-existent dictionary key?
    import System;

    int[] arr = [ 1, 2, 3 ];
    arr[1000]; // ?

    Dictionary<int> dict = { "a": 1 };
    dict["b"]; // ?
The idea of accessing non-existent elements of containers, such as arrays and dictionaries, led to the invention of "existent types." Existent types are exclusive to JS++ and were co-invented by me and Anton Rapetov.
In languages preceding JS++, such as C++, Java, and C#, accessing a non-existent element would result in program termination from segmentation faults or uncaught exceptions. In JS++, your program can't crash or exit prematurely from an "out-of-bounds" access, and this is checked by the compiler. The checking incurs almost no compile time overhead; in fact, we've shown that existent types can result in only a ± 1ms (millisecond) difference in compile times for complex projects.
An existent type describes whether a container access is within-bounds or out-of-bounds. Here's a basic example:
    int[] arr = [ 1, 2, 3 ];
    int+ x = arr[0];    // within-bounds
    int+ y = arr[1000]; // out-of-bounds
Existent types use the + type annotation syntax. Existent types are also known as "bounds-checked types." To aid your understanding of existent types, here's how code for existent types might be generated:
    int[] arr = [ 1, 2, 3 ];
    int+ x = 0 < arr.length ? arr[0] : undefined;       // within-bounds
    int+ y = 1000 < arr.length ? arr[1000] : undefined; // out-of-bounds
In other words, existent types don't just stop at the type checker. When existent types are encountered, code will be generated to perform bounds checking. Bounds checking means that, at runtime, the container access will be checked to ensure that it is "within-bounds" and not attempting to access a non-existent element. If an out-of-bounds access occurs, the variable with the existent type will be assigned a value of undefined.
In our example above, x is within-bounds and will have the value of the first element of arr (1). Meanwhile, y is out-of-bounds because the index 1000 is larger than the array's size of three elements. Thus, y has a value of undefined.
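The generated check described above can be tried out in plain JavaScript. The following is only a sketch of the idea: the helper name `checkedGet` is hypothetical and is not part of JS++ or of the code the compiler actually emits, and the extra guard against negative indexes is an assumption added for the sketch.

```javascript
// Hypothetical helper: returns the element when the index is within bounds,
// otherwise returns undefined (the "out-of-bounds" value described above).
// The negative-index guard is an assumption added for this sketch.
function checkedGet(arr, index) {
  return index >= 0 && index < arr.length ? arr[index] : undefined;
}

const arr = [1, 2, 3];
const x = checkedGet(arr, 0);    // within-bounds
const y = checkedGet(arr, 1000); // out-of-bounds

console.log(x); // 1
console.log(y); // undefined
```

The difference in JS++ is that the compiler inserts this kind of check for you and tracks the possibility of undefined in the type (int+), so it cannot be forgotten.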
An existent type cannot be the element type for a container such as an array. It's a compile-time error if you try:
int+[] arr = [ 1, 2, undefined, 3 ];
[ ERROR ] JSPPE5204: Existent type `int+' cannot be used as the element type for arrays at line 1 char 0
In order to understand this concept, we have to understand some JavaScript. In principle, JavaScript differentiates between two values: null and undefined. null means the value exists but is an "empty value" while undefined means no value exists at all. This is most easily understood via variable declarations:
    var a = null;
    var b;

    console.log(a); // null
    console.log(b); // undefined
In the above JavaScript code, the variable 'a' was declared and initialized to null (a value exists but represents an "empty" value). Meanwhile, the variable 'b' was declared but not initialized. Thus, b has the value 'undefined' (no value exists at all).
This sounds fine at first. However, JavaScript doesn't actually apply this rule in practice:
    var a = null;
    var b;
    var c = undefined;

    console.log(a); // null
    console.log(b); // undefined
    console.log(c); // undefined
JS++ has different semantics. We want you to be able to express "empty" values. For example, a file that was just created might have a creation date, but it won't necessarily have a "last access" date. JS++ allows you to express "empty" values with nullable types. In JS++, null means "empty" value, but undefined only means "out-of-bounds access." This distinction is important to understand if you want to understand why JS++ does not allow existent types to be the element type. In JavaScript, you can have an array of undefined values:
var arr = [ undefined, undefined, undefined ];
Thus, JavaScript is unable to distinguish a within-bounds 'undefined' value from an out-of-bounds 'undefined' value. JS++ does not have this problem. However, if you want to express emptiness...
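A short JavaScript snippet makes the ambiguity described above concrete. It relies only on standard JavaScript behavior; the variable names are illustrative.

```javascript
const values = [undefined, undefined, undefined];

const stored = values[0];    // within-bounds: the element really is undefined
const missing = values[100]; // out-of-bounds: there is no such element

// Both reads produce the same value, so the value alone cannot tell us
// whether the index was valid.
console.log(stored === missing); // true

// An explicit length comparison is needed to tell the two cases apart.
console.log(0 < values.length);   // true  (within-bounds)
console.log(100 < values.length); // false (out-of-bounds)
```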
Nullable types allow you to declare that data can also have the value null:
    int? x = 1;
    x = null; // OK
Note that a nullable type cannot be assigned the value 'undefined'. Besides this, nullable types don't suffer the restrictions of existent types and should be used when emptiness needs to be expressed. For example, the following array can contain int values and empty (null) values:
int?[] arr = [ 1, null, 2, null, 3 ];
However, the above array now poses a new problem: what happens if we access an out-of-bounds element?
When we have an array of nullable-type elements, out-of-bounds accesses are still represented with existent types. JS++ allows us to combine nullable and existent types using the ?+ syntax:
    int?[] arr = [ 1, null, 2, null, 3 ];
    int?+ x = arr[0];   // 1, within-bounds
    int?+ y = arr[1];   // null, within-bounds
    int?+ z = arr[100]; // undefined, out-of-bounds
The following illustrates the possible type combinations for nullable and existent types:
    int a = 1;   // 'int' only
    int? b = 1;  // 'int' or 'null'
    int+ c = 1;  // 'int' or 'undefined'
    int?+ d = 1; // 'int' or 'null' or 'undefined'
So far, we've only discussed arrays. However, nullable and existent types can also be used for dictionaries and other containers. Here's a creative example using System.Dictionary<T>:
    import System;

    Dictionary<bool?> inviteeDecisions = {
        "Roger": true,
        "Anton": true,
        "James": null, // James is undecided
        "Qin": false
    };

    bool?+ isJamesAttending = inviteeDecisions["James"]; // 'null'
    bool?+ isBryceAttending = inviteeDecisions["Bryce"]; // 'undefined'
In the above code, we use the ?+ syntax to combine nullable and existent types. We're throwing a party, and we want to keep track of the decisions of our invitees. If the invitee's decision is 'true', he's coming to the party. If the invitee's decision is 'false', he won't be attending. If the invitee's decision is 'null', he is undecided. Finally, if the invitee's decision evaluates to 'undefined', he was not actually invited.
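For comparison, here is a rough JavaScript sketch of the same four-way distinction using a standard `Map`. JavaScript has no compiler-checked equivalent of bool?+, so `describeDecision` is a hypothetical helper written only for illustration.

```javascript
const inviteeDecisions = new Map([
  ["Roger", true],
  ["Anton", true],
  ["James", null], // James is undecided
  ["Qin", false],
]);

// Hypothetical helper that names each of the four possible outcomes.
function describeDecision(name) {
  const decision = inviteeDecisions.get(name); // undefined when the key is missing
  if (decision === undefined) return "not invited";
  if (decision === null) return "undecided";
  return decision ? "attending" : "not attending";
}

console.log(describeDecision("Roger")); // attending
console.log(describeDecision("Qin"));   // not attending
console.log(describeDecision("James")); // undecided
console.log(describeDecision("Bryce")); // not invited
```

In the sketch, nothing forces the caller to handle the null and undefined cases. In JS++, the bool?+ type makes both possibilities part of the declaration.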
By default, there is no automatic conversion from an existent type T+ to T. Concretely, you cannot assign a value of type int+ to int without getting an error:
    int[] arr = [ 1, 2, 3 ];
    int+ x = arr[0];    // within-bounds
    int+ y = arr[1000]; // out-of-bounds
    int a = x;
[ ERROR ] JSPPE5206: Cannot convert existent type (`int+') to `int'. To manually convert and avoid exceptions, try using the safe default operator '??'. You can also perform an explicit cast to `int', but it can cause a runtime 'System.Exceptions.CastException' at line 5 char 8
We will explore later in this chapter how to cast correctly (to avoid runtime exceptions... despite what the error message says), but, first, let's explore the better option: the safe default operator ??:
a ?? b
The safe default operator ?? will start by evaluating the expression on its left-hand side (a). If the expression on the left-hand side evaluates to undefined (or null for nullable types), then the evaluated value on the right-hand side of the operator (b) is returned. If the left-hand side does not evaluate to undefined (or null for nullable types), then the evaluated value of the left-hand side (a) is returned.
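JavaScript's own ?? operator (nullish coalescing) follows the same rule for null and undefined, so the behavior can be tried out there. This is an analogy only: JavaScript does not check the operand types at compile time the way JS++ does.

```javascript
const arr = [1, 2, 3];

const a = arr[0] ?? 0;    // left side is 1, so the left side is used
const b = arr[1000] ?? 0; // left side is undefined, so the right side is used
const c = null ?? 0;      // left side is null, so the right side is used
const d = 0 ?? 42;        // 0 is neither null nor undefined, so it is kept

console.log(a, b, c, d); // 1 0 0 0
```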
In simpler terms, the safe default operator (??) allows you to provide an alternative value if an out-of-bounds access occurred (or if a null value is encountered for nullable types). Let's change our code above to use the safe default operator:
    import System;

    int[] arr = [ 1, 2, 3 ];
    int+ x = arr[0];    // within-bounds
    int+ y = arr[1000]; // out-of-bounds

    int a = x ?? 0;
    int b = y ?? 0;

    Console.log(a);
    Console.log(b);
On Windows, right-click the file and select "Execute with JS++". On Mac or Linux, run the following command in your terminal:
> js++ --execute test.jspp
You should see the following output:
1
0
Observe that our within-bounds access (x) was successfully converted from int+ to int and retained its value: the first element of arr (1). Meanwhile, our out-of-bounds access (y) was also converted, but we used the "alternative value" (the right-hand side) of the ?? safe default operator. Thus, y was also converted to int but with a value of zero.
When using the safe default operator, the right-hand side will usually be the default value for the type you want to convert to: for example, zero (0) for int, an empty string ("") for string, and false for bool. JS++ does not provide default values for you because conversions involving existent types should be handled on a case-by-case basis; for example, accidentally default initializing a timeout or price value to zero will incur different bug severities compared to default initializing a word count to zero for a missing word. Requiring a default value via the safe default operator is by design. It would have been a simple change for the design team to add an automatic conversion from T+ to T to make existent types less verbose, but we didn't want to open up your code to bugs.
The following table describes what the safe default operator checks for each type:
Type | Checks for...
---|---
Nullable (?) | null
Existent (+) | undefined
Nullable + Existent (?+) | null and undefined
It's not uncommon to write code inside a loop or function that assumes only within-bounds accesses occur. Oftentimes, rather than desiring an exception that can terminate the program, we'd rather just "skip" complex logic when we detect an out-of-bounds error. We can do that in JS++ with an 'if' statement that checks for 'undefined':
    import System;

    int[] arr = [ 1 ];

    for (int i = 0; i < 10; ++i) {
        int+ element = arr[i];
        if (element == undefined) {
            continue;
        }

        int x = element ?? 0; // the ?? 0 path never gets followed
        Console.log(x + 1);
        Console.log(x + 2);
        Console.log(x + 3);
    }
In the code above, we simply skip the iteration if an out-of-bounds access was detected. The rest of the code, after the 'if' statement, operates with the assumption that we are only dealing with within-bounds values. The output will be:
2
3
4
Notice I marked a line with a comment:
int x = element ?? 0; // the ?? 0 path never gets followed
The ?? 0 path never gets followed because we've already checked that the 'element' variable does not equal 'undefined'. Alternatively, we can cast...
In JS++, some type conversions cannot be proven to be safe by the compiler and need to be performed explicitly. Consider the following example:
    import System;

    int x = 1;
    byte y = x;
[ ERROR ] JSPPE5016: Cannot convert `int' to `byte'. A cast is available at line 4 char 9
Since we know the value one (1) is within the range of the 'byte' data type (0-255), we can provide an explicit cast to make the error go away:
    import System;

    int x = 1;
    byte y = (byte) x;
The syntax for a type cast is:
(type) expression
Let's re-visit a previous existent types example where we received an error suggesting to use either the safe default operator or a cast. We decided to go with the safe default operator because it was... safe. The safe default operator can never result in runtime program termination. In using existent types, the safe default operator will suffice for the vast majority of cases. However, there are times when you might know the cast is safe or might not want to supply a default value.
Here's our previous example:
    int[] arr = [ 1, 2, 3 ];
    int+ x = arr[0];    // within-bounds
    int+ y = arr[1000]; // out-of-bounds
    int a = x;
[ ERROR ] JSPPE5206: Cannot convert existent type (`int+') to `int'. To manually convert and avoid exceptions, try using the safe default operator '??'. You can also perform an explicit cast to `int', but it can cause a runtime 'System.Exceptions.CastException' at line 5 char 8
Let's first experiment by casting the variable 'x' from 'int+' to 'int':
    import System;

    int[] arr = [ 1, 2, 3 ];
    int+ x = arr[0];    // within-bounds
    int+ y = arr[1000]; // out-of-bounds

    int a = (int) x;
    Console.log(a);
There should be no problems, and you should see this output:
1
However, let's try casting the variable 'y' (which made an out-of-bounds access) from 'int+' to 'int':
    import System;

    int[] arr = [ 1, 2, 3 ];
    int+ x = arr[0];    // within-bounds
    int+ y = arr[1000]; // out-of-bounds

    int a = (int) y;
    Console.log(a);
Execution Error: System.Exceptions.CastException: Failed to cast `undefined' to `int'
An incorrect cast led to a runtime error. Let's re-visit the original error message before we started casting (with relevant details in bold):
JSPPE5206: Cannot convert existent type (`int+') to `int'. To manually convert and avoid exceptions, try using the safe default operator '??'. You can also perform an explicit cast to `int', but it can cause a runtime 'System.Exceptions.CastException'
Now you can see why we started with and recommended the safe default operator: it cannot cause runtime errors. Now you can also see why the error message warns you against the explicit cast. However, there is a way to use casts safely.
Now that we've learned what unsafe casting looks like, we can learn how to cast safely and correctly when using existent types. The rule is very simple: check that you don't have an out-of-bounds access before you perform a cast. Here's an example:
    int[] arr = [ 1, 2, 3 ];
    int+ x = arr[0];    // within-bounds
    int+ y = arr[1000]; // out-of-bounds

    if (x != undefined) {
        int a = (int) x;
    }
    if (y != undefined) {
        int b = (int) y;
    }
Now you should never get runtime errors if you want to "cast away" the existent type.
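The "check first, then cast" rule can be mimicked in plain JavaScript. This is a sketch under stated assumptions: `castAwayExistent` is a hypothetical stand-in for the JS++ cast, and it throws a `TypeError` only to loosely mirror the CastException shown earlier.

```javascript
// Hypothetical stand-in for the cast: throws when the value is undefined,
// loosely mirroring the CastException shown earlier.
function castAwayExistent(value) {
  if (value === undefined) {
    throw new TypeError("Failed to cast `undefined'");
  }
  return value;
}

const arr = [1, 2, 3];
const x = arr[0];    // within-bounds
const y = arr[1000]; // out-of-bounds

const results = [];
if (x !== undefined) {
  results.push(castAwayExistent(x)); // runs: x is 1
}
if (y !== undefined) {
  results.push(castAwayExistent(y)); // skipped: y is undefined
}

console.log(results); // [ 1 ]
```

Because each cast sits behind a check for undefined, the throwing branch is never reached.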
Once again, we want to stress: When existent types are used correctly, you should never get premature or unexpected program termination.
Before we announced existent types, we tested the feasibility and usability of existent types on 11,000 lines of code. Our findings were that you should almost never need to cast, and the safe default operator '??' will handle most cases. In our refactoring of the 11,000 lines of code, we only had one or two instances where casts were needed, and, in those instances, the casts were performed safely by checking for 'undefined' first.
Here's the relevant original code (before existent types were introduced):
    FeedItem item = this.queue.pop();
    Crawler _this = this;
    auto tokenizer = new TextTokenizer();
    auto docProcessor = new DocumentProcessor();
    string url = item.url;
    Date pagePublishedDate = item.date;
And here's the refactoring:
    FeedItem+ item = this.queue.pop();
    if (item == undefined) {
        return;
    }
    FeedItem feedItem = (FeedItem) item;
    Crawler _this = this;
    auto tokenizer = new TextTokenizer();
    auto docProcessor = new DocumentProcessor();
    string url = feedItem.url;
    Date pagePublishedDate = feedItem.date;
It would also be equivalent to refactor like so (without the need for casts):
    FeedItem+ item = this.queue.pop();
    if (item == undefined) {
        return;
    }
    Crawler _this = this;
    auto tokenizer = new TextTokenizer();
    auto docProcessor = new DocumentProcessor();
    string url = item?.url ?? "";
    Date+ pagePublishedDate = item?.date;
    if (pagePublishedDate == undefined) {
        return;
    }
Notice in our code that we are applying the same concepts we've been teaching in this chapter: checking for 'undefined', "skipping" code (e.g. with the 'return' or 'continue' statements), casting safely, etc.
If the code from our test project is confusing, it's because we haven't taught classes and user-defined types yet. We'll get to that starting in the next chapter. For readers coming to JS++ from languages that have classes, the above code should illustrate further how to use existent types. Casts are the one corner case with existent types that can cause runtime exceptions, but, if used correctly, it should never happen.
Nevertheless, in the vast majority of cases, the safe default operator '??' should suffice and should be favored over casts.
In C++, there is operator[] and .at(). The former does not perform bounds checking; the latter performs bounds checking and throws a std::out_of_range exception on an out-of-bounds access. Needless to say, this design exists in C++ because C++ programmers intuitively have a sense of whether they might perform an out-of-bounds access and don't want to pay the performance penalty for bounds checking.
... And it's not just C++ programmers. Most programmers have a sense of intuition beyond the compiler's knowledge. Intuition was one of the fundamentals behind JS++ "type guarantees," and JS++ introduced a sound, gradual type system that is fault-tolerant and scales for complex projects. Likewise, for containers, we have a sense of intuition:
    import System;

    int[] arr = [ 1, 2, 3 ];
    for (int i = 0, len = arr.length; i < len; ++i) {
        arr[i]++;
    }
A programmer only needs to look at the above code to know that an out-of-bounds error will never occur.
Why is intuition important? Because, for existent types, it solves major usability issues that would have prevented existent types from becoming practical. The above increment code is actually valid JS++ code with existent types. For common operations (such as ++ and +=), code will be generated so that an error value (undefined) will be returned if the operation occurred on an out-of-bounds element. If the operation occurred on a within-bounds element, it will succeed.
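One way to picture this behavior is the following JavaScript sketch. The shape of the check is an assumption made for illustration, not the code JS++ actually generates, and `checkedPostIncrement` is a hypothetical helper name.

```javascript
// Hypothetical helper: post-increments arr[index] when the index is in
// bounds and returns the previous value; otherwise does nothing and
// returns undefined.
function checkedPostIncrement(arr, index) {
  return index >= 0 && index < arr.length ? arr[index]++ : undefined;
}

const arr = [1, 2, 3];

const ok = checkedPostIncrement(arr, 0);      // within-bounds
const skipped = checkedPostIncrement(arr, 5); // out-of-bounds: no operation

console.log(ok);      // 1 (the value before the increment)
console.log(skipped); // undefined
console.log(arr);     // [ 2, 2, 3 ]
```

The out-of-bounds call leaves the array untouched, including its length, and reports the failure only through its return value.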
Back to intuition. If you intuitively believe you might make an out-of-bounds access, you can customize how you want to handle the error by first checking for the error value occurring:
    import System;

    int[] arr = [ 1, 2, 3 ];
    for (int i = 0, len = arr.length; i < len; ++i) {
        int+ x = arr[i]++;

        if (x != undefined) {
            Console.log("Success!");
        }
        else {
            Console.error("Increment on out-of-bounds element.");
            continue;
        }

        // ...
    }
I explicitly added the continue statement even though it seems unnecessary. Oftentimes, we don't actually want our program to potentially terminate with an IndexOutOfBoundsException or the like. When we're iterating an array, usually we can just "skip" to the next iteration (e.g. via continue) if an error condition occurred. This is much more eloquently described with if/else than try/catch.
Effectively, you get a NOP ("no operation") instead of an exception for ++, +=, and assignment (=) if the operation is performed on an out-of-bounds element. Semantically, a NOP operation remains correct, and it's also better than an exception that can terminate the program. We could have designed this more "safely," but we chose user experience (UX). If you intuitively fear there might be an error condition, check for undefined at runtime.