Or

An underrated language construct

by Marcel Garus · 2024-9-16
available at www.marcelgarus.dev/or

In my programming language Martinaise, I don't have a fixed set of operators that are built into the language. Instead, you can define your own operators by writing a function:

fun ==(a: Bool, b: Bool): Bool {
  if a then b else not(b)
}

fun !=[T](a: T, b: T): Bool {
  not(a == b)
}

fun main() {
  println(true != false)
}

This is also how I initially implemented boolean operators (because | is already the character for comments, I use / for or):

fun &(a: Bool, b: Bool): Bool { if a then b else false }  | and
fun /(a: Bool, b: Bool): Bool { if a then true else b }   | or
fun ^(a: Bool, b: Bool): Bool { if a then not(b) else b } | xor

However, there's a subtle difference between how these operators behave compared to those of most mainstream languages: In Martinaise, foo() & bar() first calls both foo() and bar(), and then calls the & operator with the results. In most popular languages, equivalent code first calls foo() and only calls bar() if foo() returned true. Otherwise, the entire expression directly evaluates to false. This "short-circuiting" behavior is useful in lots of situations. Consider this example written in C:

if (index < buffer_len && buffer[index] == 'f') {
  ...
}

If the && operator were not short-circuiting, an index that's not smaller than buffer_len would still access buffer[index], potentially accessing memory outside of the buffer (and thereby triggering undefined behavior). Naively translating the C code into Martinaise code will fail in exactly this way. Instead, you have to nest ifs:

if index < buffer.len then
  if buffer.get(index) == #f then
    ...

Well, you'd probably rather check if buffer.get_maybe(index) == some(#f), but the point still stands: For complex condition checks, I want to have short-circuiting behavior. So, I looked at what other programming languages do.
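
To see the difference outside of Martinaise, here's a minimal Python sketch. eager_and is a hypothetical helper that mimics an operator defined as a plain function: both arguments are evaluated before the call. Python's built-in and short-circuits instead. The calls list only exists to record which functions ran.

```python
calls = []

def foo():
    calls.append("foo")
    return False

def bar():
    calls.append("bar")
    return True

def eager_and(a, b):
    # An operator implemented as an ordinary function: by the time this body
    # runs, both arguments have already been evaluated.
    return b if a else False

calls.clear()
eager_result = eager_and(foo(), bar())
eager_calls = list(calls)    # both foo and bar ran

calls.clear()
lazy_result = foo() and bar()
lazy_calls = list(calls)     # only foo ran; bar was skipped
```

Both expressions evaluate to False, but only the eager version pays for (and suffers the side effects of) calling bar.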

Short-circuiting && and ||

I could change && and || to be short-circuiting, similar to how they work in C, Rust, JavaScript, Java, Dart, and many more languages. This would allow you to do this:

if condition then { ... }

// equivalent:

condition && { ... }

This works in the languages mentioned above and I think this behavior is weird. It feels like they are in denial about the fact that && and || are full-blown control flow constructs (which usually have keywords). Instead, they try to disguise them as operators, even though they behave differently from all the other operators.

Short-circuiting and and or

Some languages (notably Python and Zig) go another way: They acknowledge that short-circuiting boolean operations influence the program flow, so they give them just as much visibility as other control flow constructs. They give them keywords.

if foo() or bar():
    ...

In this Python code, you intuitively understand that or may affect the control flow just as much as if.

Using short-circuiting for more than Bools

I got curious how and and or behave for non-boolean objects in dynamically typed languages. In Python, the docs say this:

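By default, an object is considered true unless its class defines either a __bool__() method that returns False or a __len__() method that returns zero, when called with the object. Here are most of the built-in objects considered false:

  - constants defined to be false: None and False
  - zero of any numeric type: 0, 0.0, 0j, Decimal(0), Fraction(0, 1)
  - empty sequences and collections: '', (), [], {}, set(), range(0)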

The docs also contain this handy table:

Operation    Result
x or y       if x is true, then x, else y
x and y      if x is false, then x, else y

This means you can use or to select default values:

foo = my_map[key] or "default"
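
Here's a small sketch of what that table means in practice. The names settings and lookup are made up for illustration. Note that my_map[key] raises a KeyError for a missing key, so the sketch uses dict.get, which returns None instead. There's also a catch: or replaces every falsy value, not just missing ones.

```python
settings = {"name": "", "retries": 0, "host": "example.org"}

def lookup(key, default):
    # dict.get returns None for a missing key; `or` then picks the default.
    return settings.get(key) or default

host = lookup("host", "localhost")   # present and truthy: value is kept
port = lookup("port", 8080)          # missing: default is used
retries = lookup("retries", 3)       # present, but 0 is falsy: default is used

# `and` returns the left operand if it is falsy, otherwise the right one.
first = [] and "unused"
second = [1] and "used"
```

The retries case is the reason this idiom only fits values where "falsy" and "absent" mean the same thing.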

and and or in Martinaise

I tried to unify and, or, and default values into one concept, and make it extensible to custom types. If you write foo() and bar() in your code, several things happen:

  1. The left side, foo(), gets evaluated.

  2. An and function is called with the result of the left expression. This function must return a ControlFlow:

    enum ControlFlow[S, M] {
      short_circuit: S,  | evaluate to S immediately
      evaluate_more: M,  | evaluate to the right side, passing M as a binding
    }
    

    Function overloading is used to find type-specific behavior for and. For example, these are the and and or functions for Bools:

    fun and(bool: Bool): ControlFlow[Bool, Nothing] {
      if bool
      then ControlFlow[Bool, Nothing].evaluate_more
      else ControlFlow[Bool, Nothing].short_circuit(false)
    }

    fun or(bool: Bool): ControlFlow[Bool, Nothing] {
      if bool
      then ControlFlow[Bool, Nothing].short_circuit(true)
      else ControlFlow[Bool, Nothing].evaluate_more
    }
    
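  3. Depending on the variant of the ControlFlow, the expression either evaluates to the short_circuit value immediately, or the right side is evaluated with the evaluate_more value available as a binding.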
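
The three steps above can be sketched in Python. This is only an illustration of the idea, not how the Martinaise compiler does it: ShortCircuit, EvaluateMore, and_flow, and eval_and are names made up for this sketch, functools.singledispatch stands in for function overloading, and the right side is passed as a function so that it only runs when needed.

```python
from dataclasses import dataclass
from functools import singledispatch
from typing import Any, Callable

@dataclass
class ShortCircuit:
    value: Any      # the whole expression evaluates to this value

@dataclass
class EvaluateMore:
    binding: Any    # handed to the right side when it gets evaluated

@singledispatch
def and_flow(left):
    # Step 2: look up the type-specific behavior for `and`.
    raise TypeError(f"`and` is not defined for {type(left).__name__}")

@and_flow.register
def _(left: bool):
    return EvaluateMore(None) if left else ShortCircuit(False)

def eval_and(left: Any, right: Callable[[Any], Any]) -> Any:
    # `left` is the already evaluated left side (step 1).
    flow = and_flow(left)
    # Step 3: act on the variant.
    if isinstance(flow, ShortCircuit):
        return flow.value
    return right(flow.binding)

evaluated = []

def bar(_binding):
    evaluated.append("bar")
    return True

skipped = eval_and(False, bar)   # bar is not called
ran = eval_and(True, bar)        # bar is called once
```

Supporting another type then only takes one more registered and_flow function; eval_and stays untouched.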

If you're wondering about the M binding, this is useful for handling errors of results:

var content = read_file("hello.txt") or(error) panic("Couldn't open file: {error}")

Essentially, or becomes a switch:

| equivalent to the above:
var content =
  switch read_file("hello.txt").or()
  case short_circuit(content) content
  case evaluate_more(error) panic("Couldn't open file: {error}")
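
The same desugaring can be sketched in Python for a result type. Everything here is hypothetical scaffolding for illustration: Ok, Err, or_flow, eval_or, and the fake read_file are not Martinaise or library APIs. The point is that the error travels as the binding into the right side.

```python
from dataclasses import dataclass
from typing import Any, Callable

@dataclass
class Ok:
    value: Any

@dataclass
class Err:
    error: Any

@dataclass
class ShortCircuit:
    value: Any

@dataclass
class EvaluateMore:
    binding: Any

def or_flow(result):
    # For results, `or` short-circuits on success and passes the error on.
    if isinstance(result, Ok):
        return ShortCircuit(result.value)
    return EvaluateMore(result.error)

def eval_or(left, right: Callable[[Any], Any]):
    flow = or_flow(left)
    if isinstance(flow, ShortCircuit):
        return flow.value
    return right(flow.binding)

def read_file(path):
    # Stand-in for a real file read: only "hello.txt" exists.
    if path == "hello.txt":
        return Ok("Hello, world!")
    return Err("file not found")

content = eval_or(read_file("hello.txt"), lambda error: f"<{error}>")
fallback = eval_or(read_file("nope.txt"), lambda error: f"<{error}>")
```

In the success case the lambda never runs; in the failure case it receives the error, just like the or(error) binding.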

Using and and or

Once you acknowledge that and and or are control flow constructs just as much as ifs, they can make your code a lot clearer. Unlike if, an or allows you to first state the expected expression (usually a Bool, Maybe, or Result) and handle the exceptional case after the fact.

Let's look at a few real-world use cases! I won't explain much about the code – the details are pretty irrelevant. I'll just let you admire the or keyword in the wild.

Comparing two slices:

fun <=>[T](a: Slice[T], b: Slice[T]): Ordering {
  var i = 0
  loop {
    if i == a.len and i == b.len then return Ordering.equal
    if i == a.len then return Ordering.less
    if i == b.len then return Ordering.greater
    var ord = a.get(i) <=> b.get(i)
    ord is equal or return ord
    i = i + 1
  }
}

Parsing imports in the Martinaise compiler:

fun parse_imports(parser: &Parser): Result[Vec[AstStr], Str] {
  var imports = vec[AstStr]()
  loop imports.&.push(parser.parse_import()? or break)
  ok[Vec[AstStr], Str](imports)
}

Copying slices:

fun copy_to[T](from: Slice[T], to: Slice[T]) {
  from.len == to.len or
    panic("copy_to slice lens don't match ({from.len} and {to.len})")
  memcopy(from.data, to.data, from.len * stride_size_of[T]())
}

Conclusion

If you're creating a programming language, consider making and and or keywords. Also, allowing developers to customize their behavior for types is a huge opportunity for improving ergonomics. Happy coding!