Not So Loopy Digits: Weekly Challenge 352 Task 2

Task 2: Binary Prefix:

Given an array of 0s and 1s, return Booleans indicating whether each non-empty prefix of the array (shortest to longest) is divisible by 5 when interpreted as a binary number (highest bit first).

Excellent, a problem involving arrays and math; what a great opportunity to use a regex.

We start with a DFA for binary numbers divisible by 5[1], where each state is the current value modulo 5, and transitions represent adding a new lowest binary digit (so multiplying the state by 2 and adding 0 or 1, modulo 5). S₀ is the initial and final state[2]:

S₀:
    0       -> S₀
    1       -> S₁
S₁:
    0       -> S₂
    1       -> S₃
S₂:
    0       -> S₄
    1       -> S₀
S₃:
    0       -> S₁
    1       -> S₂
S₄:
    0       -> S₃
    1       -> S₄

Remove each non-accepting node one by one and replace transitions to it with regular expressions that end where the removed node did. First S₄:

S₂:
    01*0    -> S₃
    1       -> S₀

Then remove transitions to S₃:

S₁:
    0|11    -> S₂
    10      -> S₁
S₂:
    01*00   -> S₁
    01*01   -> S₂
    1       -> S₀

Remove self-transition in S₂:

S₂:
    (01*01)*01*00   -> S₁
    (01*01)*1       -> S₀

Then remove S₁ transition to S₂:

S₁:
    (0|11)(01*01)*01*00|10  -> S₁

Remove self-transition in S₁:

S₁:
    ((0|11)(01*01)*01*00|10)*(0|11)(01*01)*1    -> S₀

And remove S₀ transition to S₁:

S₀:
    0|1((0|11)(01*01)*01*00|10)*(0|11)(01*01)*1 -> S₀

Leaving us with this regular expression:

(0|1((0|11)(01*01)*01*00|10)*(0|11)(01*01)*1)+

Removing nodes in a different order will produce different but equivalent regexes.

So our task is to join the digits into a string and match against prefixes of it. But that involves iterating over the possible prefixes or lengths; wouldn't it be nicer to have the regex matching itself do that? The trouble is that finding all matches, with most regular expression implementations, only allows finding non-overlapping matches. You can use lookahead to capture some overlaps, but that still doesn't allow multiple non-empty matches from the same starting point, as we have here.

The answer is to reverse the string and the regex and the matches returned.

Reversing the regex[3] produces:

(1(101*0)*(11|0)(01|001*0(101*0)*(11|0))*1|0)+

Instead of matching prefixes of a string of binary digits ordered from high bits to low bits, we will use this to match suffixes of the reversed string — the substrings starting at each point in the string before the end, giving us a different starting point for each match.

To allow overlap, capture that (with the divisibility-regex parens changed to non-capturing) in a lookahead, followed by a single character match (that last to prevent matching an empty string at the beginning). That gets us all the matches of binary numbers divisible by 5; to include returning false for those not divisible, we will add an empty alternation in the capture and return false where the capture is empty.

In Python:

def binary_prefix(nums: list[int]) -> tuple[bool, ...]:
    return tuple(len(m) != 0 for m in re.findall(r'(?=((?:1(?:101*0)*(?:11|0)(?:01|001*0(?:101*0)*(?:11|0))*1|0)+\z|)).', ''.join(str(d) for d in nums[::-1])))[::-1]

In Perl:

sub binary_prefix($nums) {
    reverse map !!length, reverse(join '', @$nums) =~ /(?=((?:1(?:101*0)*(?:11|0)(?:01|001*0(?:101*0)*(?:11|0))*1|0)+\z|))./g
}

full script, Python
full script, Perl

Comments greatly appreciated. See you next week.


  1. This approach can be used for generating a DFA and then a regex for numbers in any base, divisible by any number. ↩︎

  2. This is sloppy wording; since the empty string is not a binary number divisible by 5, S₀ cannot be both the initial and final state; really, there's an additional state S that is the initial state, with transitions the same as S₀'s. ↩︎

  3. Including reversing the order of alternations, which isn't actually necessary. ↩︎





Read more