Or
An underrated language construct
In my programming language Martinaise, I don't have a fixed set of operators that are built into the language. Instead, you can define your own operators by writing a function:
fun ==(a: Bool, b: Bool): Bool {
  if a then b else not(b)
}
fun !=[T](a: T, b: T): Bool {
  not(a == b)
}
fun main() {
  println(true != false)
}
This is also how I initially implemented boolean operators (because | is already the character for comments, I use / for or):
fun &(a: Bool, b: Bool): Bool { if a then b else false } | and
fun /(a: Bool, b: Bool): Bool { if a then true else b } | or
fun ^(a: Bool, b: Bool): Bool { if a then not(b) else b } | xor
However, there's a subtle difference between how these operators behave compared to those of most mainstream languages: In Martinaise, foo() & bar() first calls both foo() and bar(), and only then calls the & operator. In most popular languages, equivalent code first calls foo(), and bar() is called only if foo() returns true. Otherwise, the entire expression directly evaluates to false. This "short-circuiting" behavior is useful in lots of situations. Consider this example written in C:
if (index < buffer_len && buffer[index] == 'f') {
  ...
}
If the && operator were not short-circuiting, an index that's greater than or equal to buffer_len would still access buffer[index], potentially accessing memory outside of the buffer (and thereby triggering undefined behavior). Naively translating the C code into Martinaise code will fail in exactly this way. Instead, you have to nest ifs:
if index < buffer.len then
  if buffer.get(index) == #f then
    ...
Well, you'd probably rather check if buffer.get_maybe(index) == some(#f), but the point still stands: For complex condition checks, I want to have short-circuiting behavior. So, I looked at what other programming languages do.
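To make the difference concrete, here is a small sketch in Python (not Martinaise, so that it runs as-is). The names in_bounds, char_is_f, and eager_and are made up for this illustration; eager_and stands in for an operator that is defined as an ordinary function, so both of its arguments are evaluated before it runs.

```python
calls = []  # records which helper functions actually ran


def in_bounds(buffer, index):
    calls.append("in_bounds")
    return index < len(buffer)


def char_is_f(buffer, index):
    calls.append("char_is_f")
    return buffer[index] == "f"  # raises IndexError when index is out of range


def eager_and(a, b):
    # An ordinary function: both arguments are already evaluated when it runs.
    return b if a else False


buffer = "abc"
index = 10

# The built-in `and` short-circuits: char_is_f is never called.
calls.clear()
short_circuit_result = in_bounds(buffer, index) and char_is_f(buffer, index)
short_circuit_calls = list(calls)

# The function-based version evaluates both sides first and fails.
calls.clear()
try:
    eager_and(in_bounds(buffer, index), char_is_f(buffer, index))
    eager_failed = False
except IndexError:
    eager_failed = True
eager_calls = list(calls)
```

With the built-in operator, only the bounds check runs. With the function-based version, the out-of-range access happens before the "operator" gets a chance to decide anything.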
Short-circuiting && and ||
I could change && and || to be short-circuiting, similar to how they work in C, Rust, JavaScript, Java, Dart, and many more languages. This would allow you to do this:
if condition then { ... }
// equivalent:
condition && { ... }
Variations of this work in several of the languages mentioned above (condition && doSomething() is a common JavaScript idiom, for example), and I think this behavior is weird. It feels like these languages are in denial about the fact that && and || are full-blown control flow constructs (which usually have keywords). Instead, they disguise them as operators, even though they behave differently from all the other operators.
Short-circuiting and and or
Some languages (notably Python and Zig) go another way: They acknowledge that short-circuiting boolean operations influence the program flow, so they give them just as much visibility as other control flow constructs. They give them keywords.
if foo() or bar():
...
In this Python code, you intuitively understand that or may affect the control flow just as much as if.
Using short-circuiting for more than Bools
I got curious how and and or behave for non-boolean objects in dynamically typed languages. In Python, the docs say this:
By default, an object is considered true unless its class defines either a __bool__() method that returns False or a __len__() method that returns zero, when called with the object. Here are most of the built-in objects considered false:
- constants defined to be false: None and False
- zero of any numeric type: 0, 0.0, 0j, Decimal(0), Fraction(0, 1)
- empty sequences and collections: '', (), [], {}, set(), range(0)
The docs also contain this handy table:
Operation   Result
x or y      if x is true, then x, else y
x and y     if x is false, then x, else y
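This means you can use or to select default values (using .get, because indexing a missing key with my_map[key] would raise a KeyError before or is ever reached):
foo = my_map.get(key) or "default"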
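Here is a small, hedged sketch of how these truthiness rules interact with or-based defaults. The Inbox class and the settings dictionary are hypothetical and exist only for this example. Note the caveat at the end: a legitimate but falsy value is replaced by the default as well.

```python
class Inbox:
    """Hypothetical container; it is falsy when empty because of __len__."""

    def __init__(self, messages):
        self.messages = messages

    def __len__(self):
        return len(self.messages)


# `or` returns one of its operands, not necessarily a bool.
name = "" or "anonymous"
first_truthy = [] or {} or "fallback"

# Custom objects take part through __len__ (or __bool__).
empty = Inbox([])
full = Inbox(["hi"])
chosen = empty or full  # empty has length 0, so the right side is picked

# Caveat: a stored 0 is falsy, so `or` replaces it with the default.
settings = {"retries": 0}
retries_with_or = settings.get("retries") or 3   # the stored 0 is lost
retries_with_get = settings.get("retries", 3)    # only a missing key falls back
```

When falsy values such as 0 or an empty string are meaningful, the two-argument form of dict.get is the safer way to express a default.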
and and or in Martinaise
I tried to unify and, or, and default values into one concept, and make it extensible to custom types. If you write foo() and bar() in your code, several things happen:
1. The left side, foo(), gets evaluated.
2. An and function is called with the result of the left expression. This function must return a ControlFlow:
   enum ControlFlow[S, M] {
     short_circuit: S, | evaluate to S immediately
     evaluate_more: M, | evaluate to the right side, passing M as a binding
   }
3. Function overloading is used to find type-specific behavior for and. For example, these are the and and or functions for Bools:
   fun and(bool: Bool): ControlFlow[Bool, Nothing] {
     if bool
     then ControlFlow[Bool, Nothing].evaluate_more
     else ControlFlow[Bool, Nothing].short_circuit(false)
   }
   fun or(bool: Bool): ControlFlow[Bool, Nothing] {
     if bool
     then ControlFlow[Bool, Nothing].short_circuit(true)
     else ControlFlow[Bool, Nothing].evaluate_more
   }
4. Depending on the variant of the ControlFlow, the right behavior is chosen: short_circuit makes the whole expression evaluate to its payload, while evaluate_more causes the right side to be evaluated.
If you're wondering about the M binding, this is useful for handling errors of results:
var content = read_file("hello.txt") or(error) panic("Couldn't open file: {error}")
Essentially, or becomes a switch:
| equivalent to the above:
var content =
  switch read_file("hello.txt").or()
  case short_circuit(content) content
  case evaluate_more(error) panic("Couldn't open file: {error}")
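The desugaring described above can be modeled roughly in Python. This is only a sketch of the idea, not how the Martinaise compiler implements it: ShortCircuit, EvaluateMore, Ok, Err, result_or, and desugar are all names invented for this illustration, and the right side is passed as a function so that it runs only when needed.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class ShortCircuit:
    value: Any  # the whole expression evaluates to this value


@dataclass
class EvaluateMore:
    binding: Any  # handed to the right side as a binding


def bool_and(b: bool):
    return EvaluateMore(None) if b else ShortCircuit(False)


def bool_or(b: bool):
    return ShortCircuit(True) if b else EvaluateMore(None)


@dataclass
class Ok:
    value: Any


@dataclass
class Err:
    error: Any


def result_or(r):
    # Assumed behavior for a Result-like type: ok short-circuits with its
    # value, an error is bound for the right side to handle.
    return ShortCircuit(r.value) if isinstance(r, Ok) else EvaluateMore(r.error)


def desugar(control_flow, right: Callable[[Any], Any]):
    """Model of `left and/or(binding) right`: run `right` only if asked to."""
    if isinstance(control_flow, ShortCircuit):
        return control_flow.value
    return right(control_flow.binding)


evaluated = []


def bar(_binding):
    evaluated.append("bar")
    return True


and_result = desugar(bool_and(False), bar)  # short-circuits, bar is not called
or_result = desugar(bool_or(False), bar)    # evaluates the right side
content = desugar(result_or(Ok("hello")), lambda error: f"failed: {error}")
fallback = desugar(result_or(Err("no such file")), lambda error: f"failed: {error}")
```

The same desugar step serves booleans and result-like values; only the per-type function that produces the control flow value differs, which mirrors the role of function overloading in the description above.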
Using and and or
Once you acknowledge that and and or are control flow constructs just as much as ifs, they can make your code a lot clearer. Unlike if, an or allows you to first state the expected expression (usually a Bool, Maybe, or Result) and handle the exceptional case after the fact.
Let's look at a few real-world use cases! I won't explain much about the code – the details are pretty irrelevant. I'll just let you admire the or keyword in the wild.
Comparing two slices:
fun <=>[T](a: Slice[T], b: Slice[T]): Ordering {
  var i = 0
  loop {
    if i == a.len and i == b.len then return Ordering.equal
    if i == a.len then return Ordering.less
    if i == b.len then return Ordering.greater
    var ord = a.get(i) <=> b.get(i)
    ord is equal or return ord
    i = i + 1
  }
}
Parsing imports in the Martinaise compiler:
fun parse_imports(parser: &Parser): Result[Vec[AstStr], Str] {
  var imports = vec[AstStr]()
  loop imports.&.push(parser.parse_import()? or break)
  ok[Vec[AstStr], Str](imports)
}
Copying slices:
fun copy_to[T](from: Slice[T], to: Slice[T]) {
  from.len == to.len or
    panic("copy_to slice lens don't match ({from.len} and {to.len})")
  memcopy(from.data, to.data, from.len * stride_size_of[T]())
}
Conclusion
If you're creating a programming language, consider making and and or keywords. Also, allowing developers to customize their behavior for types is a huge opportunity for improving ergonomics. Happy coding!