Re: [tor-bugs] #24857 [Core Tor/Tor]: tor 0.3.1.9 100% cpu load
#24857: tor 0.3.1.9 100% cpu load
--------------------------+-----------------------------------
Reporter: Eugene646 | Owner: (none)
Type: defect | Status: needs_information
Priority: Medium | Milestone:
Component: Core Tor/Tor | Version: Tor: 0.3.1.9
Severity: Normal | Resolution:
Keywords: cpu, windows | Actual Points:
Parent ID: | Points:
Reviewer: | Sponsor:
--------------------------+-----------------------------------
Comment (by creideiki):
I think I'm seeing this same bug, or at least something very similar.
I run Tor as a non-exit relay in an LXC container on a Gentoo Linux box on
an Intel Core i7-6700K, with Tor constantly shuffling 10Mb/s of traffic.
Some random time after starting the Tor service, it starts eating one
entire CPU core.
I'm not sure if it's related, since I don't have CPU usage history for
this container and thus don't know when the 100% CPU usage started, but
the Tor log contains a lot of these messages:
Jan 24 05:17:24.000 [warn] Failing because we have 4063 connections already. Please read doc/TUNING for guidance. [over 16000001 similar message(s) suppressed in last 21600 seconds]
If that suppression count is correct, that's about 740 messages per second
(16000001 / 21600), which seems a bit excessive.
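As a sanity check on that rate, here is a minimal sketch of the arithmetic. The two numbers are copied from the warning above; the variable names are just mine.

```python
# Figures copied from the suppressed-message warning in the Tor log.
suppressed = 16000001      # "over 16000001 similar message(s) suppressed"
window_seconds = 21600     # "in last 21600 seconds", i.e. 6 hours

# Average rate of suppressed warnings over the reporting window.
rate = suppressed / window_seconds
print(f"{rate:.1f} suppressed messages per second")
```

Since the log says "over 16000001", this is a lower bound on the real rate.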
I'm currently running this package:
net-vpn/tor-0.3.2.9::gentoo was built with the following:
USE="-libressl -lzma -scrypt seccomp (-selinux) -systemd -test
tor-hardening -web -zstd" ABI_X86="(64)"
The logs suggest the following timeline:
Dec 9: First occurrence of the "Failing because we have X connections
already" message, X=29967, on a Tor 0.3.2.2-alpha with 55 days of uptime.
Dec 10: Next message, X=29967.
Dec 15: Tor upgraded to 0.3.2.6-alpha.
Dec 22: Next message, X=29967, at 6 days 6 hours uptime.
Dec 30: Next message, X=29967, at 15 days 6 hours uptime.
Jan 1: 2 messages, X=29967, 6 hours apart.
Jan 2: 1 message, X=29967, 21 hours after the last one.
Jan 7: Machine is rebooted to kernel 4.14.11 to get initial Meltdown
patches. Failure messages start coming 4 minutes after starting the
service, with X=4063.
Jan 7-21: Failure messages come regularly every 6 hours, X=4063.
Jan 21: Tor is upgraded to 0.3.2.9. Failure messages start coming 7
minutes after starting the service, with X=4063.
Jan 21-24 (now): Failure messages come regularly every 6 hours, X=4063.
The Tor process uses 100% CPU (i.e. one core), but not very much memory -
currently 900M virtual, 300M resident.
"perf top" on the Tor process isn't very helpful; at the top is perf's own
BPF stuff at 5-10% CPU time, followed by pthread functions in glibc at
~2-3%. Actual Tor code is way down the list; the only function visible
without going ridiculously low is "assert_connection_ok" at 1% CPU time.
"strace" on the Tor process sees mostly calls to epoll_pwait():
[pid 23279] 1516784382.485502 epoll_pwait(3, [{EPOLLIN, {u32=9, u64=9}}], 512, 9, NULL, 8) = 1
Actually, a whole lot of them; strace counts around 69000 such calls per
second, with other syscalls during one random second coming in at much
lower counts:
     1 accept4
     1 close
     1 setsockopt
   174 getpid
   311 futex
   682 write
   967 getsockopt
   967 ioctl
  1116 epoll_ctl
  1561 read
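For anyone who wants to reproduce per-second counts like these, here is a rough sketch of how strace output can be tallied. It assumes each syscall line starts with an optional "[pid N]" prefix followed by an epoch timestamp, as in the epoll_pwait line quoted above (the timestamp format strace prints with -ttt). The function name count_syscalls and the regular expression are my own illustration, not anything shipped with strace or Tor.

```python
import re
from collections import Counter, defaultdict

# Matches the start of lines such as:
#   [pid 23279] 1516784382.485502 epoll_pwait(3, ...
# Group 1 is the whole epoch second, group 2 is the syscall name.
LINE_RE = re.compile(r"^(?:\[pid\s+\d+\]\s+)?(\d+)\.\d+\s+(\w+)\(")


def count_syscalls(lines):
    """Return {epoch_second: Counter({syscall_name: number_of_calls})}."""
    per_second = defaultdict(Counter)
    for line in lines:
        match = LINE_RE.match(line)
        if match is None:
            # Skip anything that is not the start of a syscall record
            # (resumed calls, signal notices, exit messages).
            continue
        second = int(match.group(1))
        name = match.group(2)
        per_second[second][name] += 1
    return per_second


# The only input here is the single line quoted earlier in this comment.
sample = [
    "[pid 23279] 1516784382.485502 epoll_pwait(3, [{EPOLLIN, {u32=9, "
    "u64=9}}], 512, 9, NULL, 8) = 1",
]
counts = count_syscalls(sample)
print(dict(counts[1516784382]))
```

Feeding it a full capture instead of the one-line sample gives a table per second; sorting each Counter by value yields a listing shaped like the one above.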
--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/24857#comment:4>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online
_______________________________________________
tor-bugs mailing list
tor-bugs@xxxxxxxxxxxxxxxxxxxx
https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-bugs