{"id":411,"date":"2015-09-07T16:41:19","date_gmt":"2015-09-07T12:41:19","guid":{"rendered":"http:\/\/blog.mgvmi-hipl.ru\/?p=411"},"modified":"2015-09-07T16:41:19","modified_gmt":"2015-09-07T12:41:19","slug":"ntp-%d0%bd%d0%b5-%d1%81%d0%b8%d0%bd%d1%85%d1%80%d0%be%d0%bd%d0%b8%d0%b7%d0%b8%d1%80%d1%83%d0%b5%d1%82-%d0%b2%d1%80%d0%b5%d0%bc%d1%8f","status":"publish","type":"post","link":"https:\/\/blog.5flor.ru\/?p=411","title":{"rendered":"NTP does not synchronize the time"},"content":{"rendered":"<p>Fixing NTP Refusing to Sync<br \/>\nI have just been confronted by NTP absolutely refusing to touch my system\u2019s clock. The trouble with NTP is that it is an absolute PITA to debug, since when it does not get in sync with its peers, it goes to great lengths to make its reasons as incomprehensible as possible.<br \/>\nFor some reason, my system had absolutely massive drift \u2013 something on the order of half a second per minute, making the clock drift by more than ten minutes per day. So I installed NTP and hoped that it would magically fix the issue, but it turns out that NTP by itself is absolutely unhelpful not only in cases of big offset, but also in cases of big drift \u2013 it will fix your clock when it is slightly inaccurate, but not when it is inaccurate a lot (\u2026that is, when you would want to use it all the more).<br \/>\nThe first thing I did was check the hardware\u2019s opinion. Comparing date and hwclock --show showed that the hardware clock was doing fine; only the kernel\u2019s idea of time was drifting off. Next, it\u2019s time to see what NTP thinks about its peers:<\/p>\n<p>ntpq&gt; peers<br \/>\nremote refid st t when poll reach delay offset jitter<br \/>\n==============================================================================<br \/>\ntik.cesnet.cz .GPS. 1 u 12 64 377 0.641 8494.05 2911.29<br \/>\ntak.cesnet.cz .GPS. 
1 u 2 64 377 0.636 8594.86 2945.05<\/p>\n<p>NTP polls each peer every \u201cpoll\u201d seconds; \u201cwhen\u201d is the time since the last poll; \u201creach\u201d is an octal bitmask of the last eight polls, and 377 means all eight succeeded. \u201cDelay\u201d is the network delay in milliseconds; this is fine. \u201cOffset\u201d is the offset between the local and the peer clock, also in milliseconds; it\u2019s at 8.5s now \u2013 not so good, but the trouble is that it gets bigger quickly. But the real culprit is \u201cjitter\u201d \u2013 it\u2019s huge! This means that the variance of offsets is huge \u2013 to put it simply, the offset is very different each time it is measured. Since no symbols are printed in the first column of the output, there is no peer synchronization going on.<\/p>\n<p>So if we know a lot about NTP already, the high jitter should hint to us that the offset measurements are unreliable. But the network connection of our server is very good, so it would be nice to look at the actual measurements. Instead of peers, let\u2019s look at their associations:<\/p>\n<p>ntpq&gt; as<\/p>\n<p>ind assID status conf reach auth condition last_event cnt<br \/>\n===========================================================<br \/>\n1 55713 9014 yes yes none reject reachable 1<br \/>\n2 55714 9014 yes yes none reject reachable 1<br \/>\nNTP is not liking our peers. No surprise, with the big jitter. 
But what we are after are the assID numbers:<\/p>\n<p>ntpq&gt; rv 55713<br \/>\nassID=55713 status=9014 reach, conf, 1 event, event_reach,<br \/>\nsrcadr=tik.cesnet.cz, srcport=123, dstadr=195.113.20.142, dstport=123,<br \/>\nleap=00, stratum=1, precision=-20, rootdelay=0.000,<br \/>\nrootdispersion=0.000, refid=GPS, reach=377, unreach=0, hmode=3, pmode=4,<br \/>\nhpoll=6, ppoll=6, flash=400 peer_dist, keyid=0, ttl=0, offset=13041.231,<br \/>\ndelay=0.602, dispersion=0.944, jitter=2918.331,<br \/>\nreftime=cf803b51.ddd3e70e Mon, Apr 26 2010 18:18:25.866,<br \/>\norg=cf803b83.e9b29181 Mon, Apr 26 2010 18:19:15.912,<br \/>\nrec=cf803b76.df382c7c Mon, Apr 26 2010 18:19:02.871,<br \/>\nxmt=cf803b76.df0d40c7 Mon, Apr 26 2010 18:19:02.871,<br \/>\nfiltdelay= 0.60 0.64 0.60 0.51 0.82 0.67 0.69 0.64,<br \/>\nfiltoffset= 13041.2 12385.8 11720.4 11075.2 10409.6 9774.54 9129.22 8494.06,<br \/>\nfiltdisp= 0.00 0.98 1.97 2.93 3.92 4.86 5.82 6.77<br \/>\nLooking at the last three lines, the reason for the huge jitter finally seems clear! Our clock drifts so fast that the offset goes up by several seconds over the eight measurements shown: roughly 650 ms per 64-second poll, or about 1% (10,000 ppm).<\/p>\n<p>Unfortunately, NTP does not seem to be giving us the actual estimated drift value between the local clock and the peer. This would be very useful since that\u2019s actually what makes NTP decide whether to go ahead and sync or keep its hands off the clock; it is said that 500ppm is the max. drift value for possible synchronization, and the by-hand estimate above is far beyond that, but NTP itself does not print a number to compare against it; when the clock is already in sync, it is probably the \u2018frequency\u2019 value in \u2018rv\u2019 (and it is stored in the drift file), but this value stays untouched before synchronization. Too bad.<\/p>\n<p>So, now we know the issue is that the kernel clock is going too slow and that NTP is not going to fix it for us. 
So, we must resort to manual tinkering using adjtimex:<\/p>\n<p># adjtimex -p<br \/>\nmode: 0<br \/>\noffset: 0<br \/>\nfrequency: 0<br \/>\nmaxerror: 0<br \/>\nesterror: 0<br \/>\nstatus: 64<br \/>\ntime_constant: 4<br \/>\nprecision: 1<br \/>\ntolerance: 32768000<br \/>\ntick: 9900<br \/>\nraw time: 1272299204s 17444us = 1272299204.017444<br \/>\nreturn value = 5<br \/>\nWow, a lot of numbers. But the one that tells how fast the clock is going is the \u2018tick\u2019 value, and you can adjust it using adjtimex -t 10000 \u2013 going from 9900 to 10000 makes the clock run about 1% faster, and 10000 is also the sort-of canonical value. Let\u2019s just do that and restart ntpd:<\/p>\n<p>remote refid st t when poll reach delay offset jitter<br \/>\n==============================================================================<br \/>\ntik.cesnet.cz .GPS. 1 u 1 64 7 0.659 16852.5 1.840<br \/>\ntak.cesnet.cz .GPS. 1 u 2 64 7 0.665 16852.5 1.863<br \/>\nThis is MUCH better! In fact, after a few minutes NTP will decide to step the clock to compensate for the offset, and after another while it will finally get in sync with the peers. If the jitter is still too big (but different), keep tweaking the tick value.<\/p>\n<p>EDIT: It seems that alternatively, you can try to change your clock source \u2013 this might help especially in case of virtualization:<\/p>\n<p># cat \/sys\/devices\/system\/clocksource\/clocksource0\/available_clocksource<br \/>\nhpet acpi_pm jiffies tsc<br \/>\n# cat \/sys\/devices\/system\/clocksource\/clocksource0\/current_clocksource<br \/>\nhpet<\/p>\n<p>Hope this helps if your NTP also refuses to fix your clock.<\/p>\n<p>Open questions remain:<\/p>\n<p>Why was my tick value so off? I guess I will never know. 
Maybe a reboot would fix it too, but I wasn\u2019t keen to do that.<br \/>\nHow to determine the drift-per-peer value to see how much out of bounds it is?<br \/>\nHow to make NTP automatically fix even huge drifts?<br \/>\nWhy is NTP crafted to be so hard to debug without spending tens of minutes googling, staring at bunches of floats and decoding bitmasks manually?<br \/>\nThanks to prema and otis for ideas and help.<\/p>\n<p>Comments<\/p>\n<p>Stefan Seyfried<br \/>\nApril 28th, 2010 at 18:58 | #1<br \/>\nHi pasky,<br \/>\nI have seen similar stuff when \/etc\/adjtime was totally off. \/etc\/adjtime is used to record the time drift and to initialize adjtimex values.<br \/>\nI think it should not be used when NTP is in use, unless the kernel time is off grossly, which is seldom the case nowadays.<br \/>\nUnfortunately, many distributions still do a \u201chwclock --systohc\u201d on shutdown \u2013 which updates \/etc\/adjtime, even nowadays, when even the CMOS clocks on PCs are pretty accurate. I think this is a bug, but I gave up arguing about it, at least for SUSE, and I just disable it manually in the shutdown scripts.<br \/>\nHope this helps,<br \/>\nseife<\/p>\n<p>Stefan Seyfried<br \/>\nApril 28th, 2010 at 18:59 | #2<br \/>\nOh, I forgot: usually just removing \/etc\/adjtime and doing \u201chwclock --systohc --utc\u201d once is enough to get it fixed for all time.<\/p>\n<p>Larry McCarthy<br \/>\nMay 29th, 2010 at 17:56 | #3<br \/>\nYou asked: \u201cWhy was my tick value so off? I guess I will never know. Maybe a reboot would fix it too, but I wasn\u2019t keen to do that.\u201d<br \/>\nI was having drift problems, so I installed adjtime. When adjtime installed, it spent 70 seconds \u201cinitializing\u201d itself and set ticks to 9768. 
As you can imagine, this made the problem waaaay worse, to the point that NTP wouldn\u2019t sync and I\u2019d lost ~30 minutes overnight.<br \/>\nSo, your excellent adjtime vs. NTP analysis was just the thing I needed! Thanks!<\/p>\n<p>Larry McCarthy<br \/>\nMay 29th, 2010 at 18:20 | #4<br \/>\nSo, based on this really excellent article (I\u2019m stoked; this helped *a lot*), here\u2019s a little trick (sorry; don\u2019t know how to tag \u201ccode\u201d here):<br \/>\n# \/etc\/init.d\/ntp stop ; ntpd -q ; sleep 100s ; ntpd -q; \/etc\/init.d\/ntp start<br \/>\nThis stops NTP, forces it to sync the clock (to \u201cprime the pump\u201d), sleeps for 100 seconds, forces a second clock sync, and restarts NTP. It produces output like this:<br \/>\nStopping NTP server: ntpd.<br \/>\nntpd: time set +12.262938s<br \/>\nntpd: time set +2.623381s &lt;-- drift per 100s<br \/>\nStarting NTP server: ntpd.<br \/>\nThe second \u201ctime set\u201d \u2013 +2.623381s \u2013 is your 100s drift. Take that drift as a proportion of the current ticks and add it to the adjtimex tick value (respect the sign of the drift \u2013 if the sign on the drift is \u201c-\u201d, you\u2019d subtract ticks), then repeat \u2019til satisfied, like so:<br \/>\n# adjtimex -p<br \/>\n[\u2026]<br \/>\ntick: 10000<br \/>\n[\u2026]<br \/>\n# # add (+2.62s \/ 100s) * 10000 = 262 ticks<br \/>\n# adjtimex -t 10262<br \/>\n# \/etc\/init.d\/ntp stop ; ntpd -q ; sleep 100s ; ntpd -q; \/etc\/init.d\/ntp start<br \/>\nStopping NTP server: ntpd.<br \/>\nntpd: time set +3.044932s<br \/>\nntpd: time set -0.259021s<br \/>\nStarting NTP server: ntpd<br \/>\nNow, -0.26s drift per 100s is much better, though at roughly 2,600 ppm it is still above the 500ppm limit mentioned in the article, so NTP may not be able to correct it yet. 
If not, repeat this process some more\u2026<br \/>\n\u2013 Larry<\/p>\n<p>Larry McCarthy<br \/>\nMay 29th, 2010 at 18:53 | #5<br \/>\nSorry to go on, but just one more thing, I promise\u2026<br \/>\nSo, once you get the drift below a second, you probably want to do a longer drift sample. To keep the arithmetic simple, continuing with the above example:<br \/>\n# \/etc\/init.d\/ntp stop ; ntpd -q ; sleep 1026s ; ntpd -q; \/etc\/init.d\/ntp start<br \/>\nStopping NTP server: ntpd.<br \/>\nntpd: time set -1.428262s &lt;-- Remember, ignore this. Take a 17m coffee break.<br \/>\nntpd: time set -2.333034s<br \/>\nStarting NTP server: ntpd.<br \/>\n# # OK, remember, [tick] is currently 10262, so ((-2.33 \/ 1026) * 10262) ~ (-23)<br \/>\n# # and 10262 + (-23) = 10239, so\u2026<br \/>\n# adjtimex -t 10239<br \/>\nTo watch the results, I do:<br \/>\n# watch -n 8 ntpq -p<br \/>\nYeah, 8s is kinda fast, but I&#8217;m the kind of guy who takes surface streets when the freeway&#8217;s slow, just to have something to do\u2026<br \/>\nThanks again for this great post!<br \/>\n\u2013 Larry<\/p>\n<p>Larry McCarthy<br \/>\nMay 29th, 2010 at 19:04 | #6<br \/>\nI know I promised, but this is important:<br \/>\nOnce you\u2019re happy with your jitter values \u2013 Single digits! Finally! \u2013 as Stefan said, you MUST do:<br \/>\n# rm \/etc\/adjtime<br \/>\n# hwclock --systohc --utc<br \/>\n*NOTE*: Windows dual-booters probably want to do:<br \/>\n# hwclock --systohc --localtime<br \/>\nOK. That\u2019s it. I promise. 
\ud83d\ude42<\/p>\n<p>Deiva<br \/>\nMay 10th, 2013 at 07:27 | #7<br \/>\nHi,<br \/>\nFor me, syncing manually with ntpdate -u works, but when we start the ntp service it gives the output below\u2026 Please advise: do I need to change adjtimex here? I am using Red Hat Linux.<br \/>\n[root@deiva ~]# ntpq -pn<br \/>\nremote refid st t when poll reach delay offset jitter<br \/>\n==============================================================================<br \/>\n*127.127.1.0 .LOCL. 10 l 7 64 37 0.000 0.000 0.001<br \/>\n172.16.8.4 .GPS. 1 u 8 64 37 5.462 -3029.9 1910.44<br \/>\n172.16.8.5 .GPS. 1 u 3 64 37 5.198 -3091.9 1946.43<br \/>\n[root@deiva ~]# service ntpd restart<br \/>\nShutting down ntpd: [ OK ]<br \/>\nntpd: Synchronizing with time server: [ OK ]<br \/>\nSyncing hardware clock to system time [ OK ]<br \/>\nStarting ntpd: [ OK ]<br \/>\n[root@deiva ~]# date<br \/>\nSat May 4 17:36:25 CEST 2013<br \/>\n[root@deiva ~]# date<br \/>\nSat May 4 17:39:56 CEST 2013<br \/>\n[root@deiva ~]# ntpq -pn<br \/>\nremote refid st t when poll reach delay offset jitter<br \/>\n==============================================================================<br \/>\n*127.127.1.0 .LOCL. 10 l 33 64 17 0.000 0.000 0.001<br \/>\n172.16.8.4 .GPS. 1 u 32 64 17 4.721 -56.787 2147.63<br \/>\n172.16.8.5 .GPS. 1 u 31 64 17 4.971 -72.407 2161.70<br \/>\n[root@deiva ~]# exit<br \/>\n== manual service ==<br \/>\n[root@deiva ~]# ntpdate -u 172.16.8.4<br \/>\n30 Apr 20:12:28 ntpdate[96890]: step time server 172.16.8.4 offset -0.975695 sec<br \/>\n[root@deiva ~]# ntpq -pn<br \/>\nremote refid st t when poll reach delay offset jitter<br \/>\n==============================================================================<br \/>\n127.127.1.0 .LOCL. 10 l 21 64 3 0.000 0.000 0.001<br \/>\n172.16.8.4 .GPS. 1 u 21 64 3 5.175 -55.482 0.702<br \/>\n172.16.8.5 .GPS. 
1 u 19 64 3 5.673 -71.212 15.585<br \/>\nntpq&gt; as<br \/>\nind assID status conf reach auth condition last_event cnt<br \/>\n===========================================================<br \/>\n1 56510 9614 yes yes none sys.peer reachable 1<br \/>\n2 56511 9014 yes yes none reject reachable 1<br \/>\n3 56512 9014 yes yes none reject reachable 1<br \/>\nntpq&gt; rv 56511<br \/>\nassID=56511 status=9014 reach, conf, 1 event, event_reach,<br \/>\nsrcadr=172.16.8.4, srcport=123, dstadr=10.69.23.2, dstport=123, leap=00,<br \/>\nstratum=1, precision=-9, rootdelay=0.000, rootdispersion=5.676,<br \/>\nrefid=GPS, reach=377, unreach=0, hmode=3, pmode=4, hpoll=10, ppoll=10,<br \/>\nflash=400 peer_dist, keyid=0, ttl=0, offset=-23003.227, delay=4.651,<br \/>\ndispersion=12.835, jitter=10710.871,<br \/>\nreftime=d529fedf.146a7ef9 Tue, Apr 30 2013 10:27:11.079,<br \/>\norg=d529fef8.9c49ba5e Tue, Apr 30 2013 10:27:36.610,<br \/>\nrec=d529ff17.88f783ee Tue, Apr 30 2013 10:28:07.535,<br \/>\nxmt=d529ff17.8242ad4e Tue, Apr 30 2013 10:28:07.508,<br \/>\nfiltdelay= 4.95 4.65 5.95 6.56 4.99 4.82 5.27 5.94,<br \/>\nfiltoffset= -30922. -23003. -19036. -15038. -13023<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Fixing NTP Refusing to Sync I have just been confronted by NTP absolutely refusing to touch my system\u2019s clock. 
The trouble with NTP is that it is an absolute PITA to debug it at all since when it does not get &hellip; <a href=\"https:\/\/blog.5flor.ru\/?p=411\">Read more <span class=\"meta-nav\">&rarr;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[3,1],"tags":[],"class_list":["post-411","post","type-post","status-publish","format-standard","hentry","category-linux","category-1"],"_links":{"self":[{"href":"https:\/\/blog.5flor.ru\/index.php?rest_route=\/wp\/v2\/posts\/411"}],"collection":[{"href":"https:\/\/blog.5flor.ru\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/blog.5flor.ru\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/blog.5flor.ru\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/blog.5flor.ru\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=411"}],"version-history":[{"count":0,"href":"https:\/\/blog.5flor.ru\/index.php?rest_route=\/wp\/v2\/posts\/411\/revisions"}],"wp:attachment":[{"href":"https:\/\/blog.5flor.ru\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=411"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/blog.5flor.ru\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=411"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/blog.5flor.ru\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=411"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}