Why adieu is a terrible first guess in wordle

Adieu is a terrible first word in Wordle. It's not just a little bad, and not just in a few rare cases, like Matt Damon movies. It's really bad, basically all the time, like Adam Sandler movies. On average, "adieu" is about 2.4 times worse than the optimal word. What is the optimal word? read more

send + more = money

A friend nerd-sniped me with this problem:

      send
    + more
    ------
     money

where each letter represents a digit 0-9 and must take on a unique value.

I solved this problem on pen and paper, but I wondered how we might solve it computationally. We can of course try brute force search (108 is not very big for a computer), but is there a smarter way? I evaluated three different solutions... read more

Introducing BotSight

I've been working on disinformation since the beginning of 2018, and BotSight is the culmination of that effort. BotSight is an inline real-time Twitter annotator which computes the likelihood that each account in your feed is a bot or sockpuppet. BotSight is available as a app for iOS or a browser extension for all major browsers.

BotSight has been featured in the press a little. Here is some partial coverage:

You can learn more by going to the official BotSight download page.

The Privacy Challenges of Digital Contact Tracing

There has been a lot of recent discussion about contact-tracing apps and how to balance correctness and privacy. While there are a lot of positives around the Apple-Google contact tracing protocol, it is not nearly as private as its proponents suggest. In this blog for NortonLifeLock, I suggest a (to my knowledge) novel attack on the Apple-Google contact tracing protocol, complete with some great visualizations by Alex Gurbin.

Planetary Orbits

Did you know the relative distances between the planets change as they move around the sun? A recent paper shows that this effect means Mercury, not Venus, is actually Earth's closest neighbour most of the time. I wrote a little simulator to show the effect, which can be found here.

Civ6 Timeline

After trying to get better at Civ6 (Civilization 6, a turn-based strategy game), I found no good way to really analyze my games. Since I couldn't find anything, I built it myself. I present to you (tadadada) a Civ6 timeline visualizer.

You can read more about how it works here.

Here are some example games from my own playthroughs:

A Gentle Introduction to Attribute-Based Encryption

After some recent problems suffered by Facebook, my friend Will and I started thinking about how developers can build secure systems in an enterprise setting. What does secure mean, and how secure is secure enough? While Will tried to tackle the issue from a legal perspective, I've been thinking about building provably secure systems using recent advances in cryptography. I don't have all the answers, but I think some clues on how to build future systems lie in Attribute-Based Encryption. I wrote a (perhaps long-winded) explainer, which is largely my journey of understanding the math myself.

State of Python in 2018

After attending, and speaking at PyCon Canada 2018, I was blown away by how much the Python ecosystem had grown. But no one likes to gripe like engineers, so I had many hallway conversations about how things didn't work, or how hard something was to set up. I wanted to write down those whispers into something more cohesive, and add some of my thoughts on where we can go from here. You can find it here.

Book Classics - Intro

Growing up, I was encouraged to read "classic works of literature". These, I was told, are the shared culture of humanity, passed down through the ages. But what is a classic?

Read more

St. Petersburg Board Game Simulator

I wrote a board game simulator and AI for the Saint Petersburg board game. You can read more about how I did that or just play the game against the AI and have a good time.

SmokeDetector Accepted!

My first project at Symantec has been accepted to the ACSAC conference! I worked primarily on the AWS Spark implementation, and the evaluation section, which involved generating graphs which "look like" production data. I enjoyed this project immensely and it was a pleasure working with Kevin Roundy, who nurtured the key ideas behind SmokeDetector from conception until publication. You can read the full paper in the proceedings of ACSAC 2017.

Crane Takes Flight!

After a lot of work and some late nights, we finally published the work which contains the bulk of my Master's Thesis! You can read the full paper in the proceedings of SYSTOR 2017.

Meeting Bingo

Have you ever looked at your calendar and seen a long and pointless meeting bearing down on you from a distance? Like a tidal wave observed from a small boat in the ocean, you know of the terrible fate about to befall you, yet you are powerless to stop it. But what if you could prepare for just such a calamity? I introduce to you the solution: meeting bingo!

read more

Cafeteria Intervals

Over the past month, I've been interviewing and have seen quite a few interesting questions as a result. Here is my favourite so far: cafeteria intervals. What makes this problem interesting is the different ways that it can be solved, and the weird time/space complexity of some of the solutions.

Visa Map

A friend of mine asked an interesting question:

Which passports should I hold to travel the world without needing a visa?

I tried looking it up, but it turns out there is no central database of visa information, except for "passport-power" indices, which provide the number of countries you can visit with a given passport without providing the countries themselves. I decided to scrape Wikipedia for the answers, and made an interactive map from the results. It turned out surprisingly well, and is linked here. You can see the code for both the scraping and the map itself here.

This was a quick-and-dirty project, so I used Google GeoCharts for map visualization, which is far from perfect. Let me know what you think on Twitter!

I Wrote a Password Manager

I wrote a password manager called PassZero and made it open source. It runs on the web, but you can also host it locally by cloning the GitHub repo.

A password manager is a place to keep all your passwords securely, by remembering a single strong password, instead of 100-200 weak ones (or repeating passwords, which is the devil).

Read more about it here, or just try it out.

ICPC Regionals - Generalized Roman Numerals

My team and I wrote the ICPC regionals last week, and brought the 2014 season to a close. This is likely the last problem postmortem that I will write about from this year's ACM, but it's a good one.

The problems at regionals this year were characterized by either being quite easy to reason about but very bug-prone to implement, or else rested on some fine-grained optimization of C++ code. I can't say I like these sorts of problems on the whole, but at least one of them was interesting nevertheless.

On to the problem! Roman numerals have very specific rules associated with how they are written, but what if those rules were relaxed? This leads to an ambiguous number system. The goal of this problem was calculating all the numbers which corresponded to some Roman numeral string under these relaxed assumptions.

Check out the full writeup here, or my solution on Github.

ICPC Practice - Sign Games

When preparing for ICPC last week, I didn't have a chance to post this summary of a fun little problem. But better late than never, here is the writeup! The full write-up is here. The Github code is here.

ICPC Practice - Picky Lounges

Another week, another tough contest problem. This one comes from the 2014 Nordic contest, and draws on some cool graph theory ideas. It also requires some creativity in the graph representation. Read about it here, or see the solution directly on github.

ICPC Practice - Clan Warface

This week was midterm week, so I didn't have a chance to write up the ICPC problem my group failed to solve from last week until today. During the previous practice round, we went through the ICPC problem set from the 2013 Pacific Northwest Regionals. This week's challenge was to quickly identify easy problems are write up solutions, then focus the remaining time on solving 1-2 difficult ones.

We failed in the last regard, but with the benefit of time, I present here my solution to one of the 2 'difficult' problems - CLAN WARFARE.

Check out my analysis here. Visualization to come (hopefully)!

ICPC Practice - Bankrupt Kingdoms

As I said last week, I'm on the U of T ICPC team. This week, we went through the 2012 Europe regionals problems. On the whole, the problems were of medium difficult, and the main challenge was to write as many bug-free programs as possible.

One problem which interested me was a problem about cascading bankruptcy filings. Initially I classified it as difficult, and skipped it (I tend to complete easier problems first), and didn't go back to the problem until the end of the contest, at which point I failed to write a correct solution in time. After the contest, I wrote up the correct solution, along with a visualisation (as usual).

This problem is tricky, in that it deceives you twice. Check out my analysis here.

ICPC Practice - Wards and Triangles

This year, I'm on the University of Toronto ICPC team. ICPC is an annual competition for university students to solve tough programming problems given 5 hours.

Two weeks ago, while doing past contests, I found a particularly tough problem. Check out my solution, and a visualization here

Microsoft Interview Revisited

Back in the good old days, I was invited to Redmond for an interview with Microsoft. They asked me standard interview questions about trees and recursion and pointers. No problem. And then, I was asked this question:

Given a coloured grid, find the number of contiguous regions of each colour.
Easy enough? NO! After the interview, I spent days staring at this puzzle, trying to figure it out. I went back to it last night, and I'm pretty proud of my solution. I made a visualization here.

In fairness to me-from-the-past, I didn't know about what BFS looked like at the implementation level back then. But there are still a few subtleties. In particular, there are two phases:

  1. Pick a region to count, and count it
  2. Make sure no block in that region is ever double-counted

Phase 2 is accomplished by 'burning' all blocks which are in the same region as a given block, and this bit can be done with BFS. Phase 1 is accomplished by maintaining a border list, which acts as an open list. Elements in this list are blocks from the regions adjacent to recently burned regions. It gives us an idea of what the frontier is for the search. However, notice that it is expensive to remove blocks from here as their region is visited by the algo. Therefore, we simply leave them. Once they are popped from this border list for examination, we just check if they have been visited (lazily) and only then are they examined. Burned frontier blocks are simply discarded. In summary, a nested BFS is used, with a lazy open list for the outer implementation. The code is here in a Javascript implementation. I've also provided a live demo here.

As an aside, the graphs are almost randomly-generated in the demo, but not exactly. There is some affinity for clumping there. I think that's pretty cool. If you want to know how I did that, you can tweet at me and then I can do a blog post about it. Or if you're a DIY sort of person, check out the source (it's JS after all, and I didn't obfusticate it).