Just How Big Is Max, Really? Weekly Challenge 358 part 1

Task 1: Max Str Value:

You are given an array of alphanumeric strings, @strings.

Write a script to find the max value of the alphanumeric strings in the given array. The value of a string is its numeric representation if it consists of digits only; otherwise, it is the length of the string.

I'm going to assume digits and alphanumeric here mean ASCII. Numeric representation of a sequence including non-ASCII digits would be quite a can of worms[1], and for non-numbers, length of string could mean glyphs, not characters, so we'll just keep it simple.

Or not so simple; let's see.

In Python:

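import re

def max_str_value(strings: list[str]) -> int:
    # fullmatch already anchors both ends, so no ^ or \z is needed
    # (\z is only accepted by Python's re module from 3.14 on)
    return max(int(s) if re.fullmatch(r'[0-9]+', s) else len(s) for s in strings)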

Other than Python's out-of-order ternary operator syntax (b if a else c instead of a ? b : c), this is both nicely terse and readable. The only problem is that it doesn't cleanly handle an empty list of strings: max() raises a ValueError instead.
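If an empty list should produce a value rather than an exception, one option is the default argument of max(). This is a minimal sketch, assuming 0 is the desired result for empty input (the task statement doesn't say); the name max_str_value_or_zero is mine, purely for illustration.

```python
import re

def max_str_value_or_zero(strings: list[str]) -> int:
    # Hypothetical variant: default=0 is returned when strings is empty.
    # The generator must be parenthesized because max() gets a second argument.
    return max(
        (int(s) if re.fullmatch(r'[0-9]+', s) else len(s) for s in strings),
        default=0,
    )
```

Whether 0 or an exception is the better answer depends on the caller; 0 is also a legitimate result for a list containing only "0" or an empty string, so the two cases become indistinguishable.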

In Go:

import (
    "regexp"
    "strconv"
)

var allDigits = regexp.MustCompile(`^[0-9]+\z`)

func MaxStrValue(strings []string) (int, error) {
    maxValue := 0
    for _, s := range strings {
        var value int
        if allDigits.MatchString(s) {
            // all digits, but still could be larger than an int, bail if so
            var err error
            value, err = strconv.Atoi(s)
            if err != nil {
                return 0, err
            }
        } else {
            value = len([]rune(s))
        }
        if value > maxValue {
            maxValue = value
        }
    }
    return maxValue, nil
}

Using a regex here is kind of a cop-out. The code could more efficiently loop through the characters, keeping track of both the length and the numeric value so far, but since there are two types of input, it doesn't seem awful to just separate them into two cases. And if we are limited to ASCII, the conversion to runes isn't actually needed, but it is good practice anyway.
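For comparison, here is roughly what the single-pass alternative looks like. It is sketched in Python rather than Go to keep it short, and the helper name str_value_single_pass is hypothetical. A Go version would additionally have to check for overflow by hand while accumulating, which is part of why leaning on the standard library is attractive.

```python
def str_value_single_pass(s: str) -> int:
    # Hypothetical helper: one pass over the string, tracking the length
    # and the numeric value so far at the same time.
    number = 0
    length = 0
    all_digits = len(s) > 0  # an empty string is not a number
    for ch in s:
        length += 1
        if all_digits and '0' <= ch <= '9':  # ASCII digits only
            number = number * 10 + (ord(ch) - ord('0'))
        else:
            all_digits = False
    return number if all_digits else length
```

Once a non-digit is seen, the accumulated number is simply ignored and only the length matters.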

But we are using the standard library to get the numeric value, which only works up to the maximum int (which may be 32 or 64 bits, signed). In Python, ints are unbounded, so we mostly don't have to worry about larger numbers (CPython 3.11 and later do refuse by default to convert strings of more than 4300 digits, but that is far beyond any fixed-width integer). For Go, I've chosen to assume a standard int type as the result, so limiting to that makes sense; uint or int64 or uint64 or a math/big Int could be used instead (changing the conversion function from Atoi as appropriate). Nevertheless, unless math/big is used, there is a possible error result. If given an empty slice of strings, I've chosen to return 0, not an error.

In Perl:

sub max_str_value($strings) {
    List::Util::max(map((/^\d+\z/a ? 0+$_ : length), @$strings)) // 0
}

Essentially the same as the Python code, but even terser and more readable, and providing a self-documenting return value for when the array is empty. But this code handles too-large numbers worse: Perl numbers can morph between storage as integers, unsigned integers, or floating point as needed, which is very flexible but here causes large values to be output in scientific notation and to lose precision. Math::BigInt could be used, but for many purposes, you don't want a sub that returns an object representing a number instead of a number, so I've left it simple.

The moral for me here is to be aware of the expected range of your inputs and outputs, and to use and document appropriate types. Even with Python, which seemingly has no issue with large numbers, if its large integers are output in JSON, this JSON may not be parseable in other languages, at least without some loss of precision.
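The JSON point can be seen without leaving Python. In this small sketch, parse_int=float is my stand-in for a consumer that stores every JSON number as an IEEE 754 double (as JavaScript's JSON.parse does); that stand-in is an assumption for illustration, not a claim about any particular parser.

```python
import json

big = 2**64 + 1                   # too large for any 64-bit integer type
text = json.dumps({"max": big})   # Python writes out every digit

# Python's own parser round-trips the value exactly...
exact = json.loads(text)["max"]

# ...but a parser that keeps numbers as doubles cannot.
as_double = json.loads(text, parse_int=float)["max"]
lost = int(as_double) != big      # the trailing +1 has been rounded away
```

Here exact equals big, while as_double comes back as 2**64, silently off by one.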

full script, Python
full script, Go
full script, Perl

Comments greatly appreciated. See you next time.


[1] Though Unicode::UCD::num() makes the attempt.




