For historical purposes, this was the rolling post throughout the day…

Hi everyone,

This is just a very quick update (in between actually trying to figure out what’s happening!) to say we know something is wrong with our systems, and we’ve been working to get them back online all morning.

Feel free to write to Lisa or me (support@activeinboxhq.com), but I’ll announce here when we’re back up. (Although forgive me if I don’t respond quickly, as I’m fully in code trying to fix it.)

PS To the handful of people who tried to call, I apologise that we couldn’t answer earlier. As you might imagine, it’s panic stations here. Lisa is now on top of email and manning the phone (but we probably can’t add any more detail than is posted here… feel free to vent frustration though, I know exactly how serious this is).

ETA on it being fixed – It’s fixed!

It depends how deep we have to go to find a solution.

Currently we’re looking for a quick fix by attempting to resolve the security certificate issue (mentioned in Update 1). That means it’ll be online in a matter of hours.

If that’s not enough, then we’re going to have to rebuild the server from scratch. That might still be done today, but it could roll into tomorrow… mainly because we have to do tons of integrity checks on your data before committing to anything so major.

Update: It’s requiring a rebuild. I’m about halfway through and hopeful it can be done today. (Or rather, I’ll keep going until it’s done… I mean I hope it can be done within business hours for most of your time zones.)

Update: And 8 hours later, we’re fixed!

Keeping working while it’s down

Thanks to Aaron for reminding me to say this (it’s fair to say my brain is a little fragmented)…

Because ActiveInbox’s organization system is based entirely on Gmail labels, you can just use labels to move around.

For example, if you want to look at items you marked as due today, go to Gmail’s sidebar and load the label ZD/20241007.

(For reference: ZD/ is a prefix to tell ActiveInbox it’s a date; then it’s formatted as yyyymmdd… so 20241007 is the 7th October 2024.)

You can also add Gmail labels manually, e.g. to set a deadline for tomorrow, add a label for ZD/20241008.
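
If you’d rather not work out the label by hand, here’s a minimal sketch in Python of the format described above. The function name `deadline_label` is just something I’ve made up for illustration (it isn’t part of ActiveInbox), and it assumes nothing beyond the ZD/ prefix followed by yyyymmdd.

```python
from datetime import date, timedelta


def deadline_label(day: date) -> str:
    """Build a deadline label for a given day: the ZD/ prefix, then yyyymmdd.

    Hypothetical helper for illustration only, not ActiveInbox code.
    """
    return "ZD/" + day.strftime("%Y%m%d")


# The dates used in the examples above.
due_today = date(2024, 10, 7)
due_tomorrow = due_today + timedelta(days=1)

print(deadline_label(due_today))     # ZD/20241007
print(deadline_label(due_tomorrow))  # ZD/20241008
```

Swap in `date.today()` to get the label for whatever day you’re reading this.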

What has caused it?

Our hosting platform (Heroku – who for well over a decade have been rock solid, but in the last 2 years have had an increasing number of mishaps) appears to have modified a security certificate with no warning, which stopped our multiple servers from talking to one another.

It’s going to take some time to digest exactly how to deal with this in the future (e.g. moving platforms); but for now I’m still trying to get the server back online. I’ll write a full post mortem when it’s all calmed down.

The updates so far…

Update 1

All signs point to our hosting platform having changed something at 00:00 this morning, which started the problem.

I’m speaking to them but they’re not responding quickly (a separate discussion for when I have more breathing space for a post-mortem); so we’ve also been combing through the systems to try to identify what changed and broke.

I’m homing in on it – it looks like it’s an altered security certificate (that doesn’t mean any of your data is exposed, it just means two or more systems are out of sync and not talking to each other).

Also, ActiveInbox should keep working even if it can’t connect to the server – at least for a few days – so I regret that this has seemingly disabled ActiveInbox instantly. That’s another investigation for the post-mortem.

Update 2

The attempts to find a quick fix so far haven’t worked.

I’m now looking at rebuilding the server from scratch. I’ll know very quickly if that will resolve the underlying problem – and if it does, I’ll then need several hours to verify that the data will be safely migrated to the new system.

(If you’re wondering why it takes so long… that’s the nature of the beast… but normally I’d have an entire weekend to do this safely while virtually no one is logged in. Doing it on a cold Monday in business hours is a regrettable first!)

Update 3

I’ve managed to confirm that the rebuild is going to work to solve the problem. I’m just moving methodically – especially with your data – to make sure this fairly drastic action goes as smoothly as it can.

In the meantime, see “Keeping working while it’s down” for advice on how you can use deadlines at least.

Update 4 – WE ARE BACK!

What. A. Horrible. Day. But we’re up and running again.

Just refresh Gmail.

I’ll write a post mortem tomorrow, but for now I just need to lie in a dark room.

If anyone is in any doubt, I’m really sorry for the downtime. It’s been a manic 8 hours, and I don’t think we’re actually responsible for the root cause (well, we are in the sense we’re still using a platform I’m losing faith in – but I’ll solve that as a next step), but all the time I was incredibly conscious that your workflows were broken and every second counted.

Thank you to everyone who wrote in the comments – that gave me a real boost (and Lisa tells me lots of folks were lovely with support too – we feel very grateful for that).



Andy

  

This was written by Andy Mitchell