Hi again everyone,

I just want to let people here know that I've just closed our
TPA-RFC-71 milestone, which aimed at deploying new mail infrastructure
to deal with deliverability issues, alongside the Mailman 3 upgrade.

It was, to a certain extent, more complicated than we were expecting,
and unearthed a lot of old mail issues, but I think we're better off
than we were before.

Details of the work performed are visible in this milestone:

https://gitlab.torproject.org/groups/tpo/tpa/-/milestones/16

As a reminder, here is the original proposal:

https://gitlab.torproject.org/tpo/tpa/team/-/wikis/policy/tpa-rfc-71-emergency-email-deployments-round-2

This also side-tracked into "getting everyone to use the submission
server", AKA "you should be able to send mail as @torproject.org":

https://gitlab.torproject.org/groups/tpo/tpa/-/milestones/19

But we are not done with email yet! I still have, in my back pocket, a
proposal I've been working on for years at this point, labeled
TPA-RFC-45:

https://gitlab.torproject.org/tpo/tpa/team/-/issues/41009

Many of the ideas from that proposal were actually implemented in
TPA-RFC-71, so what's left is essentially "do we host our own mailboxes
or outsource" at this point.

I hope to work on this some time this year, possibly with a proposal in
2026.

A.

On 2024-10-02 11:27:35, Antoine Beaupré wrote:
> Hi,
>
> So TPA has adopted this proposal, internally, to make yet another set of
> emergency changes to our mail system, to respond to critical issues
> affecting delivery and sustainability of our infrastructure.
>
> I encourage you to read the "Affected users" section and "Timeline"
> below. In particular, we will be experimenting with "sender rewriting"
> soon, which will involve mangling emails we forward around to try and
> fix deliverability on those.
>
> The schleuder mailing list will also move servers.
>
> Maintenance windows for those changes will be communicated separately.
>
> Thank you for your attention!
>
> PS: and no, we didn't submit this for adoption to everyone, because it
> was felt it was mostly technical changes that didn't warrant outside
> approval. Let me know if that doesn't make sense, of course.
>
> --
> Antoine Beaupré
> torproject.org system administration
>
> From: Antoine Beaupré via tpa-team <tpa-team@xxxxxxxxxxxxxxxxxxxx>
> Subject: [tpa-team] TPA-RFC-71: Emergency email deployments, phase B
> To: tpa-team@xxxxxxxxxxxxxxxxxxxx
> Cc: micah anderson <micah@xxxxxxxxxxxxxx>
> Date: Thu, 26 Sep 2024 16:09:20 -0400
>
> ---
> title: TPA-RFC-71: Emergency email deployments, phase B
> costs: staff
> approval: TPA
> affected users: all torproject.org email users
> deadline: 5 days, 2024-10-01
> status: draft
> discussion: https://gitlab.torproject.org/tpo/tpa/team/-/issues/41778
> ---
>
> Summary: deploy a new sender-rewriting mail forwarder ASAP, migrate
> mailing lists off the legacy server to a new machine, migrate the
> remaining Schleuder list to the Tails server, upgrade `eugeni`.
>
> Table of contents:
>
> - Background
> - Proposal
>   - Actual changes
>     - Mailman 3 upgrade
>     - New sender-rewriting mail exchanger
>     - Schleuder migration
>     - Upgrade legacy mail server
>   - Goals
>     - Must have
>     - Nice to have
>     - Non-Goals
>   - Scope
>   - Affected users
>     - Personas
>   - Timeline
>     - Optimistic timeline
>     - Worst case scenario
> - Alternatives considered
> - References
>   - History
>   - Personas descriptions
>     - Ariel, the fundraiser
>     - Blipblop, the bot
>     - Gary, the support guy
>     - John, the contractor
>     - Mallory, the director
>     - Nancy, the fancy sysadmin
>     - Orpheus, the developer
>
> # Background
>
> In [#41773][], we had yet another report of issues with mail delivery,
> particularly with email forwards, that are plaguing Gmail-backed
> aliases like grants@ and travel@.
>
> [#41773]: https://gitlab.torproject.org/tpo/tpa/team/-/issues/41773
>
> This is becoming critical.
> It has been impeding people's ability to use their email at work for a
> while, but it's been more acute since Google's recent changes in email
> validation (see [#41399][]) as now hosts that have adopted the
> SPF/DKIM rules are bouncing.
>
> [#41399]: https://gitlab.torproject.org/tpo/tpa/team/-/issues/41399
>
> On top of that, we're way behind on our buster upgrade schedule. We
> still have to upgrade our primary mail server, `eugeni`. The plan for
> that ([TPA-RFC-45][], [#41009][]) was to basically re-architect
> everything. That won't happen fast enough for the LTS retirement, which
> we crossed two months ago (in July 2024) already.
>
> [TPA-RFC-45]: https://gitlab.torproject.org/tpo/tpa/team/-/wikis/policy/tpa-rfc-45-mail-architecture
> [#41009]: https://gitlab.torproject.org/tpo/tpa/team/-/issues/41009
>
> So, in essence, our main mail server is unsupported now, and we need
> to fix this as soon as possible.
>
> Finally, we also have problems with certain servers (e.g. `state.gov`)
> that seem to dislike our bespoke certificate authority (CA), which
> makes *receiving* mails difficult for us.
>
> # Proposal
>
> So those are the main problems to fix:
>
> - Email forwarding is broken
> - Email reception is unreliable over TLS for some servers
> - Mail server is out of date and hard to upgrade (mostly because of
>   Mailman)
>
> ## Actual changes
>
> The proposed solution is:
>
> - **Mailman 3 upgrade** ([#40471][])
>
> - **New sender-rewriting mail exchanger** ([#40987][])
>
> - **Schleuder migration**
>
> - **Upgrade legacy mail server** ([#40694][])
>
> [#40471]: https://gitlab.torproject.org/tpo/tpa/team/-/issues/40471
> [#40987]: https://gitlab.torproject.org/tpo/tpa/team/-/issues/40987
> [#40694]: https://gitlab.torproject.org/tpo/tpa/team/-/issues/40694
>
> ### Mailman 3 upgrade
>
> Build a new mailing list server to host the upgraded Mailman 3
> service.
> Move old lists over and convert them while retaining the old
> archives available for posterity.
>
> This includes lots of URL changes and user-visible disruption; little
> can be done to work around that necessary change. We'll do our best to
> come up with redirections and rewrite rules, but ultimately this is a
> disruptive change.
>
> This involves yet another authentication system being rolled out, as
> Mailman 3 has its own user database, just like Mailman 2. At least
> it's one user per site, instead of per list, so it's a slight
> improvement.
>
> This is issue [#40471][].
>
> ### New sender-rewriting mail exchanger
>
> This step is carried over from [TPA-RFC-45][], mostly unchanged.
>
> [Sender Rewriting Scheme]: https://en.wikipedia.org/wiki/Sender_Rewriting_Scheme
> [postsrsd]: https://github.com/roehling/postsrsd
> [postforward]: https://github.com/zoni/postforward
>
> Configure a new "mail exchanger" (MX) server with TLS certificates
> signed by our normal public CA (Let's Encrypt). This replaces that
> part of `eugeni`, and will hopefully resolve issues with `state.gov`
> and others ([#41073][], [#41287][], [#40202][], [#33413][]).
>
> [#33413]: https://gitlab.torproject.org/tpo/tpa/team/-/issues/33413
> [#40202]: https://gitlab.torproject.org/tpo/tpa/team/-/issues/40202
> [#41287]: https://gitlab.torproject.org/tpo/tpa/team/-/issues/41287
> [#41073]: https://gitlab.torproject.org/tpo/tpa/team/-/issues/41073
>
> This would handle forwarding mail to other services (e.g. mailing
> lists) but also end-users.
>
> To work around reputation problems with forwards ([#40632][],
> [#41524][], [#41773][]), deploy a [Sender Rewriting Scheme][] (SRS)
> with [postsrsd][] (packaged in Debian, but [not in the best shape][])
> and [postforward][] (not packaged in Debian, but zero-dependency
> Golang program).
>
> It's possible that deploying [ARC][] headers with [OpenARC][],
> Fastmail's [authentication milter][] (which [apparently works
> better][]), or [rspamd's arc module][] might be sufficient as well;
> this remains to be tested.
>
> [OpenARC]: https://tracker.debian.org/pkg/openarc
>
> [not in the best shape]: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1017361
>
> Having it on a separate mail exchanger will make it easier to swap in
> and out of the infrastructure if problems occur.
>
> The mail exchangers should also sign outgoing mail with DKIM, and
> *may* start doing better validation of incoming mail.
>
> [authentication milter]: https://github.com/fastmail/authentication_milter
> [apparently works better]: https://old.reddit.com/r/postfix/comments/17bbhd2/about_arc/k5iluvn/
> [rspamd's arc module]: https://rspamd.com/doc/modules/arc.html
> [#41524]: https://gitlab.torproject.org/tpo/tpa/team/-/issues/41524
> [#40632]: https://gitlab.torproject.org/tpo/tpa/team/-/issues/40632
> [ARC]: http://arc-spec.org/
>
> ### Schleuder migration
>
> Migrate the last remaining mailing list (the Community Council) to the
> Tails Schleuder server, retiring our Schleuder server entirely.
>
> This requires configuring the Tails server to accept mail for
> `@torproject.org`.
>
> Note that this may require changing the addresses of the existing
> Tails list to `@torproject.org` if Schleuder doesn't support virtual
> hosting (which is likely).
>
> ### Upgrade legacy mail server
>
> Once Mailman has been safely moved aside and is shown to be working
> correctly, upgrade `eugeni` using the normal procedures. This should be
> a less disruptive upgrade, but is still risky because it's such an old
> box with lots of legacy.
>
> One key idea of this proposal is to keep the legacy mail server,
> `eugeni`, in place. It will continue handling the "MTA" (Mail Transfer
> Agent) work, which is to relay mail for other hosts, as a legacy
> system.
>
> The full `eugeni` replacement is seen as too complicated and
> unnecessary at this stage. The legacy server will be isolated from the
> rewriting forwarder so that outgoing mail is mostly unaffected by the
> forwarding changes.
>
> ## Goals
>
> This is not an exhaustive solution to all our email problems;
> [TPA-RFC-45][] is that longer-term project.
>
> ### Must have
>
> - Up to date, supported infrastructure.
>
> - Functional legacy email forwarding.
>
> ### Nice to have
>
> - Improve email forward deliverability to Gmail.
>
> ### Non-Goals
>
> - **Clean email forwarding**: email forwards *may* be mangled and
>   rewritten to appear as coming from `@torproject.org` instead of the
>   original address. This will be figured out at the implementation
>   stage.
>
> - **Mailbox storage**: out of scope, see [TPA-RFC-45][]. It is hoped,
>   however, that we *eventually* are able to provide such a service, as
>   the sender-rewriting stuff might be too disruptive in the long run.
>
> - **Technical debt**: we keep the legacy mail server, `eugeni`.
>
> - **Improved monitoring**: we won't have a better view into how well
>   we can deliver email.
>
> - **High availability**: the new servers will not add new single
>   points of failure, but will not improve our availability situation
>   (issue [#40604][]).
>
> [#40604]: https://gitlab.torproject.org/tpo/tpa/team/-/issues/40604
>
> ## Scope
>
> This proposal affects all inbound and outbound email services
> hosted under `torproject.org`. Services hosted under `torproject.net`
> are *not* affected.
>
> It also does *not* directly address phishing and scamming attacks
> ([#40596][]), but it is hoped the new mail exchanger will provide
> a place where it is easier to make such improvements in the future.
>
> [#40596]: https://gitlab.torproject.org/tpo/tpa/team/-/issues/40596
>
> ## Affected users
>
> This affects all users who interact with `torproject.org` and its
> subdomains over email.
> It particularly affects all "tor-internal" users, users with LDAP
> accounts, or forwards under `@torproject.org`, as their mails will get
> rewritten on the way out.
>
> ### Personas
>
> Here we collect a few "personas" and try to see how the changes will
> affect them, largely derived from [TPA-RFC-45][], but without the
> alpha/beta/prod test groups.
>
> For *all* users, a common impact is that emails will be rewritten by
> the sender rewriting system. As mentioned above, the impact of this
> still remains to be clarified, but at least the hidden `Return-Path`
> header will be changed for bounces to go to our servers.
>
> Actual personas are in the References section, see [Personas
> descriptions][].
>
> | Persona  | Task        | Impact                                                                   |
> |----------|-------------|--------------------------------------------------------------------------|
> | Ariel    | Fundraising | Improved incoming delivery                                               |
> | Blipblop | Bot         | No change                                                                |
> | Gary     | Support     | Improved incoming delivery, new moderator account on mailing list server |
> | John     | Contractor  | Improved incoming delivery                                               |
> | Mallory  | Director    | Same as Ariel                                                            |
> | Nancy    | Sysadmin    | No change in delivery, new moderator account on mailing list server      |
> | Orpheus  | Developer   | No change in delivery                                                    |
>
> [Personas descriptions]: #personas-descriptions
>
> ## Timeline
>
> ### Optimistic timeline
>
> - Late September (W39): issue raised again, proposal drafted (now)
> - October:
>   - W40: proposal approved, installing new rewriting server
>   - W41: rewriting server deployment, new mailman 3 server
>   - W42: mailman 3 mailing list conversion tests, users required for testing
>   - W43: mailman 2 retirement, mailman 3 in production
>   - W44: Schleuder mailing list migration
> - November:
>   - W45: `eugeni` upgrade
>
> ### Worst case scenario
>
> - Late September (W39): issue raised again, proposal drafted (now)
> - October:
>   - W40: proposal approved, installing new rewriting server
>   - W41-W44: difficult rewriting server
>     deployment
> - November:
>   - W44-W48: difficult mailman 3 mailing list conversion and testing
> - December:
>   - W49: Schleuder mailing list migration vetoed, Schleuder stays on
>     `eugeni`
>   - W50-W51: `eugeni` upgrade postponed to 2025
> - January 2025:
>   - W3: `eugeni` upgrade
>
> # Alternatives considered
>
> We decided to not just run the sender-rewriting on the legacy mail
> server because too many things are tangled up in that server. It is
> just too risky.
>
> We have also decided to not upgrade Mailman in place for the same
> reason: it's seen as too risky as well, because we'd first need to
> upgrade the Debian base system and if that fails, rolling back is too
> hard.
>
> # References
>
> - [discussion issue][]
>
> [discussion issue]: https://gitlab.torproject.org/tpo/tpa/team/-/issues/41778
>
> ## History
>
> This is the *fifth* proposal about our email services; here are
> the previous ones:
>
> * [TPA-RFC-15: Email services][] (rejected, replaced with TPA-RFC-31)
> * [TPA-RFC-31: outsource email services][] (rejected, in favor of
>   TPA-RFC-44 and following)
> * [TPA-RFC-44: Email emergency recovery, phase A][] (standard, and
>   mostly implemented except the sender-rewriting)
> * [TPA-RFC-45: Mail architecture][] (still draft)
>
> [TPA-RFC-15: Email services]: policy/tpa-rfc-15-email-services
> [TPA-RFC-31: outsource email services]: policy/tpa-rfc-31-outsource-email
> [TPA-RFC-44: Email emergency recovery, phase A]: policy/tpa-rfc-44-email-emergency-recovery
> [TPA-RFC-45: Mail architecture]: policy/tpa-rfc-45-mail-architecture
>
> ## Personas descriptions
>
> ### Ariel, the fundraiser
>
> Ariel does a lot of mailing. From talking to fundraisers through
> their normal inbox to doing mass newsletters to thousands of people on
> CiviCRM, they get a lot done and make sure we have bread on the table
> at the end of the month. They're awesome and we want to make them
> happy.
>
> Email is absolutely mission critical for them.
> Sometimes email gets lost and that's a major problem. They frequently
> tell partners their personal Gmail account address to work around
> those problems. Sometimes they send individual emails through CiviCRM
> because it doesn't work through Gmail!
>
> Their email forwards to Google Mail and they now have an LDAP account
> to do email delivery.
>
> ### Blipblop, the bot
>
> Blipblop is not a real human being, it's a program that receives
> mails and acts on them. It can send you a list of bridges (bridgedb),
> or a copy of the Tor program (gettor), when requested. It has a
> brother bot called Nagios/Icinga who also sends unsolicited mail when
> things fail.
>
> There are also bots that send email when commits get pushed to some
> secret git repositories.
>
> ### Gary, the support guy
>
> Gary is the ticket overlord. He eats tickets for breakfast, then
> files 10 more before coffee. A hundred tickets is just a normal day at
> the office. Tickets come in through email, RT, Discourse, Telegram,
> Snapchat and soon, TikTok dances.
>
> Email is absolutely mission critical, but some days he wishes there
> could be slightly less of it. He deals with a lot of spam, and surely
> something could be done about that.
>
> His mail forwards to Riseup and he reads his mail over Thunderbird
> and sometimes webmail. Some time after TPA-RFC-44, Gary managed to
> finally get an OpenPGP key set up and TPA made him an LDAP account so
> he can use the submission server. He has already abandoned the Riseup
> webmail for TPO-related email, since it cannot relay mail through the
> submission server.
>
> ### John, the contractor
>
> John is a freelance contractor who's really into privacy. He runs his
> own relays with some cool hacks on Amazon, automatically deployed
> with Terraform. He typically runs his own infra in the cloud, but
> for email he just got tired of fighting and moved his stuff to
> Microsoft's Office 365 and Outlook.
>
> Email is important, but not absolutely mission critical. The
> submission server doesn't currently work because Outlook doesn't allow
> you to add just an SMTP server. John does have an LDAP account,
> however.
>
> ### Mallory, the director
>
> Mallory also does a lot of mailing. She's on about a dozen aliases
> and mailing lists from accounting to HR and other unfathomable
> things. She also deals with funders, job applicants, contractors,
> volunteers, and staff.
>
> Email is absolutely mission critical for her. She often fails to
> contact funders and critical partners because `state.gov` blocks our
> email -- or we block theirs! Sometimes, she gets told through LinkedIn
> that a job application failed, because mail bounced at Gmail.
>
> She has an LDAP account and it forwards to Gmail. She uses Apple Mail
> to read her mail.
>
> ### Nancy, the fancy sysadmin
>
> Nancy has all the elite skills in the world. She can configure a
> Postfix server with her left hand while her right hand writes the
> Puppet manifest for the Dovecot authentication backend. She browses
> her mail through a UUCP over SSH tunnel using mutt. She has run her
> own mail server in her basement since 1996.
>
> Email is a pain in the back and she kind of hates it, but she still
> believes she is entitled to run her own mail server.
>
> Her email is, of course, hosted on her own mail server, and she has
> an LDAP account. She has already reconfigured her Postfix server to
> relay mail through the submission servers.
>
> ### Orpheus, the developer
>
> Orpheus doesn't particularly like or dislike email, but sometimes has
> to use it to talk to people instead of compilers. They sometimes have
> to talk to funders (`#grantlyfe`), external researchers, teammates or
> other teams, and that often happens over email. Sometimes email is
> used to get important things like ticket updates from GitLab or
> security disclosures from third parties.
>
> They have an LDAP account and it forwards to their self-hosted mail
> server on an OVH virtual machine. They have already reconfigured their
> mail server to relay mail over SSH through the jump host, to the
> surprise of the TPA team.
>
> Email is not mission critical, and it's kind of nice when it goes
> down because they can get in the zone, but it should really be working
> eventually.
>
> --
> Antoine Beaupré
> torproject.org system administration
> --
> tpa-team mailing list
> tpa-team@xxxxxxxxxxxxxxxxxxxx
> https://lists.torproject.org/cgi-bin/mailman/listinfo/tpa-team
> _______________________________________________
> tor-project mailing list
> tor-project@xxxxxxxxxxxxxxxxxxxx
> https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-project

--
Antoine Beaupré
torproject.org system administration