Back in the good old days, I was invited to Redmond for an interview with Microsoft.
They asked me standard interview questions about trees and recursion and pointers. No problem. And then, I was asked this question:
Given a coloured grid, find the number of contiguous regions of each colour.
Easy enough? NO!
After the interview, I spent days staring at this puzzle, trying to figure it out.
I went back to it last night, and I'm pretty proud of my solution. I made a visualization here.
In fairness to me-from-the-past, I didn't know about what BFS looked like at the implementation level back then. But there are still a few subtleties. In particular, there are two phases:
- Pick a region to count, and count it
- Make sure no block in that region is ever double-counted
Phase 2 is accomplished by 'burning' all blocks which are in the same region as a given block, and this bit can be done with BFS. Phase 1 is accomplished by maintaining a border list, which acts as an open list. Elements in this list are blocks from the regions adjacent to recently burned regions. It gives us an idea of what the frontier is for the search. However, notice that it is expensive to remove blocks from here as their region is visited by the algo. Therefore, we simply leave them. Once they are popped from this border list for examination, we just check if they have been visited (lazily) and only then are they examined. Burned frontier blocks are simply discarded. In summary, a nested BFS is used, with a lazy open list for the outer implementation.
The code is here in a Javascript implementation.
I've also provided a live demo here.
As an aside, the graphs are almost randomly-generated in the demo, but not exactly.
There is some affinity for clumping there. I think that's pretty cool. If you want to know how I did that, you can tweet at me and then I can do a blog post about it.
Or if you're a DIY sort of person, check out the source (it's JS after all, and I didn't obfusticate it).