
tools.wmflabs.org email isn't received (puppet failure at tools-mail-02)
Closed, Resolved (Public)

Description

Email sent to username@tools.wmflabs.org isn't being received. I'm guessing it's not being forwarded, because I didn't get a bounce-back email.

Event Timeline

DNS looks fine:

zhuyifei1999@zhuyifei1999-ThinkPad-T480 ~ $ host -v tools.wmflabs.org
[...]
Trying "tools.wmflabs.org"
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 2905
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 0

;; QUESTION SECTION:
;tools.wmflabs.org.		IN	MX

;; ANSWER SECTION:
tools.wmflabs.org.	58	IN	MX	10 mail.tools.wmflabs.org.

Received 56 bytes from 66.253.214.16#53 in 1 ms
zhuyifei1999@zhuyifei1999-ThinkPad-T480 ~ $ host mail.tools.wmflabs.org
mail.tools.wmflabs.org has address 185.15.56.63
zhuyifei1999@zhuyifei1999-ThinkPad-T480 ~ $ host 185.15.56.63
63.56.15.185.in-addr.arpa domain name pointer mail.tools.wmflabs.org.
63.56.15.185.in-addr.arpa domain name pointer instance-tools-mail-02.tools.wmflabs.org.
63.56.15.185.in-addr.arpa domain name pointer mailsender.tools.wmflabs.org.
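
The manual check above (name to address, then address back to names) can be sketched as a small forward-confirmed reverse DNS test. This is only an illustration: `forward_confirmed` and `_norm` are hypothetical names, and whether `socket.gethostbyaddr` reports every PTR record as an alias depends on the local resolver, so a False result is not proof of a DNS problem.

```python
import socket


def _norm(name):
    """Lower-case a DNS name and drop any trailing root dot."""
    return name.rstrip(".").lower()


def forward_confirmed(hostname,
                      forward=socket.gethostbyname,
                      reverse=socket.gethostbyaddr):
    """Return True if the reverse lookup of hostname's address names hostname.

    `forward` and `reverse` are injectable so the logic can be exercised
    without network access; they default to the standard library resolvers.
    """
    address = forward(hostname)
    # gethostbyaddr returns (primary_name, alias_list, address_list).
    primary, aliases, _addresses = reverse(address)
    ptr_names = {_norm(primary)} | {_norm(alias) for alias in aliases}
    return _norm(hostname) in ptr_names
```

Calling `forward_confirmed("mail.tools.wmflabs.org")` on a machine with working DNS repeats the two `host` lookups shown above in one step.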

I see lots of failures in the exim4 mainlog that look like a config issue:

07:54:21 0 ✓ zhuyifei1999@tools-mail-02: ~$ sudo tail /var/log/exim4/mainlog
2020-03-28 07:52:04 H=[REDACTED] X=TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256 CV=no temporarily rejected MAIL <info@[REDACTED]>: failed to expand ACL string "${lookup{$sender_host_address}iplsearch{/etc/exim4/ratelimits/host_hourly_limits}}": failed to open /etc/exim4/ratelimits/host_hourly_limits for linear search: No such file or directory
2020-03-28 07:52:05 H=[REDACTED] X=TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256 CV=no temporarily rejected MAIL <info@[REDACTED]>: failed to expand ACL string "${lookup{$sender_host_address}iplsearch{/etc/exim4/ratelimits/host_hourly_limits}}": failed to open /etc/exim4/ratelimits/host_hourly_limits for linear search: No such file or directory
2020-03-28 07:52:06 H=[REDACTED] X=TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256 CV=no temporarily rejected MAIL <info@[REDACTED]>: failed to expand ACL string "${lookup{$sender_host_address}iplsearch{/etc/exim4/ratelimits/host_hourly_limits}}": failed to open /etc/exim4/ratelimits/host_hourly_limits for linear search: No such file or directory
2020-03-28 07:52:06 H=[REDACTED] X=TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256 CV=no temporarily rejected MAIL <info@[REDACTED]>: failed to expand ACL string "${lookup{$sender_host_address}iplsearch{/etc/exim4/ratelimits/host_hourly_limits}}": failed to open /etc/exim4/ratelimits/host_hourly_limits for linear search: No such file or directory
2020-03-28 07:52:07 H=[REDACTED] X=TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256 CV=no temporarily rejected MAIL <info@[REDACTED]>: failed to expand ACL string "${lookup{$sender_host_address}iplsearch{/etc/exim4/ratelimits/host_hourly_limits}}": failed to open /etc/exim4/ratelimits/host_hourly_limits for linear search: No such file or directory
2020-03-28 07:52:08 H=[REDACTED] X=TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256 CV=no temporarily rejected MAIL <info@[REDACTED]>: failed to expand ACL string "${lookup{$sender_host_address}iplsearch{/etc/exim4/ratelimits/host_hourly_limits}}": failed to open /etc/exim4/ratelimits/host_hourly_limits for linear search: No such file or directory
2020-03-28 07:52:09 H=[REDACTED] X=TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256 CV=no temporarily rejected MAIL <info@[REDACTED]>: failed to expand ACL string "${lookup{$sender_host_address}iplsearch{/etc/exim4/ratelimits/host_hourly_limits}}": failed to open /etc/exim4/ratelimits/host_hourly_limits for linear search: No such file or directory
2020-03-28 07:52:10 H=[REDACTED] X=TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256 CV=no temporarily rejected MAIL <info@[REDACTED]>: failed to expand ACL string "${lookup{$sender_host_address}iplsearch{/etc/exim4/ratelimits/host_hourly_limits}}": failed to open /etc/exim4/ratelimits/host_hourly_limits for linear search: No such file or directory
2020-03-28 07:52:11 H=[REDACTED] X=TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256 CV=no temporarily rejected MAIL <info@[REDACTED]>: failed to expand ACL string "${lookup{$sender_host_address}iplsearch{/etc/exim4/ratelimits/host_hourly_limits}}": failed to open /etc/exim4/ratelimits/host_hourly_limits for linear search: No such file or directory
2020-03-28 07:52:11 H=[REDACTED] X=TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256 CV=no temporarily rejected MAIL <info@[REDACTED]>: failed to expand ACL string "${lookup{$sender_host_address}iplsearch{/etc/exim4/ratelimits/host_hourly_limits}}": failed to open /etc/exim4/ratelimits/host_hourly_limits for linear search: No such file or directory

failed to open /etc/exim4/ratelimits/host_hourly_limits for linear search: No such file or directory
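
When the mainlog is long, the same conclusion can be reached by counting which lookup files exim reports as missing. The following is a minimal sketch that assumes the message wording seen in the log lines above; `missing_lookup_files` is a hypothetical helper name, not part of exim.

```python
import re
from collections import Counter

# Matches the tail of the rejection lines quoted above.
MISSING_FILE = re.compile(
    r"failed to open (\S+) for linear search: No such file or directory"
)


def missing_lookup_files(lines):
    """Count temporarily rejected messages per missing lookup file."""
    counts = Counter()
    for line in lines:
        if "temporarily rejected" not in line:
            continue
        match = MISSING_FILE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts
```

It could be fed the log with something like `missing_lookup_files(open("/var/log/exim4/mainlog"))`, run as a user allowed to read that file.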

Empty directory:

07:58:23 0 ✓ zhuyifei1999@tools-mail-02: ~$ sudo ls -al /etc/exim4/ratelimits/
total 8
dr-xr-x--- 2 root Debian-exim 4096 Mar 27 21:55 .
drwxr-xr-x 6 root root        4096 Mar 27 21:55 ..

This file is supposed to be provisioned in https://github.com/wikimedia/puppet/blob/1a925f799baaa8bb9f0727c9e5cc5cd7667ed594/modules/profile/manifests/toolforge/mailrelay.pp#L43:

file { '/etc/exim4/ratelimits/host_hourly_limits':
    ensure  => present,
    owner   => 'root',
    group   => 'Debian-exim',
    mode    => '0440',
    require => File['/etc/exim4/ratelimits'],
    source  => 'puppet:///modules/profile/toolforge/mailrelay/ratelimits/host_hourly_limits',
}

Puppet failure?

root@tools-mail-02:~# puppet agent -tv
Info: Using configured environment 'production'
Info: Retrieving pluginfacts
Info: Retrieving plugin
Info: Loading facts
Info: Caching catalog for tools-mail-02.tools.eqiad.wmflabs
Notice: /Stage[main]/Base::Environment/Tidy[/var/tmp/core]: Tidying 0 files
Info: Applying configuration version '(1a925f799b) Bstorm - Add rate limiting to profile::toolforge::mailrelay with warn action'
Notice: The LDAP client stack for this host is: classic/sudoldap
Notice: /Stage[main]/Profile::Ldap::Client::Labs/Notify[LDAP client stack]/message: defined 'message' as 'The LDAP client stack for this host is: classic/sudoldap'
Error: /Stage[main]/Profile::Toolforge::Mailrelay/File[/etc/exim4/ratelimits/sender_hourly_limits]: Could not evaluate: Could not retrieve information from environment production source(s) puppet:///modules/profile/toolforge/mailrelay/ratelimits/sender_hourly_limits
Error: /Stage[main]/Profile::Toolforge::Mailrelay/File[/etc/exim4/ratelimits/host_hourly_limits]: Could not evaluate: Could not retrieve information from environment production source(s) puppet:///modules/profile/toolforge/mailrelay/ratelimits/host_hourly_limits
Notice: /Stage[main]/Profile::Toolforge::Mailrelay/Letsencrypt::Cert::Integrated[tools_mail]/Exec[acme-setup-acme-tools_mail]/returns: executed successfully
Info: Class[Profile::Toolforge::Mailrelay]: Unscheduling all events on Class[Profile::Toolforge::Mailrelay]
Info: Stage[main]: Unscheduling all events on Stage[main]
Notice: Applied catalog in 12.70 seconds
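
The two errors above say the agent could not retrieve the `source` URIs. With Puppet's standard `modules` file mount, `puppet:///modules/<module>/<path>` is served from `modules/<module>/files/<path>`, so one way to catch this before merging is to check that each URI maps to a file in the repository checkout. A rough sketch, assuming the checkout has `modules/` at its root; both function names are hypothetical:

```python
from pathlib import Path, PurePosixPath

PREFIX = "puppet:///modules/"


def source_uri_to_repo_path(uri):
    """Map puppet:///modules/<module>/<path> to modules/<module>/files/<path>."""
    if not uri.startswith(PREFIX):
        raise ValueError(f"not a modules-mount URI: {uri}")
    module, _, rest = uri[len(PREFIX):].partition("/")
    if not module or not rest:
        raise ValueError(f"URI lacks a module name or file path: {uri}")
    return PurePosixPath("modules") / module / "files" / rest


def missing_sources(repo_root, uris):
    """Return the URIs whose backing file does not exist under repo_root."""
    missing = []
    for uri in uris:
        relative = source_uri_to_repo_path(uri)
        if not Path(repo_root, *relative.parts).is_file():
            missing.append(uri)
    return missing
```

For the resource quoted earlier, the URI maps to `modules/profile/files/toolforge/mailrelay/ratelimits/host_hourly_limits`; if no file exists at that path in the checkout, the agent fails in the way shown above.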

zhuyifei1999 renamed this task from "tools.wmflabs.org email isn't received" to "tools.wmflabs.org email isn't received (puppet failure at tools-mail-02)". (Mar 28 2020, 8:03 AM)

Change 584102 had a related patch set uploaded (by Zhuyifei1999; owner: Zhuyifei1999):
[operations/puppet@production] toolforge/mailrelay: Fix wrong source URI

https://gerrit.wikimedia.org/r/584102

Change 584102 merged by Bstorm:
[operations/puppet@production] toolforge/mailrelay: Fix wrong source URI

https://gerrit.wikimedia.org/r/584102

It's working again. The log messages this produces may or may not be useful, but we'll be able to test the logic at least.

2020-03-30 14:56:08 H=tools-sgeexec-0921.tools.eqiad.wmflabs [172.16.1.235] Warning: Sender address tools.jimmy@tools.wmflabs.org has exceeded rate limit of messages per 1h

Bstorm claimed this task.

From the logs, I think mail is being sent now.

Yep, it's working; I got a test email I sent through. Thanks for the help.

test email sent/received via (tool).maintainers@tools.wmflabs.org - appears to be working as intended.