DansGuardian Documentation Wiki

Differences

This shows you the differences between the selected revision and the current version of the page.

common_problems 2009/06/30 15:04 common_problems 2010/05/25 22:26 current
Line 121: Line 121:
Use your favorite administration tool or text editor or a GUI or commands like\ \ <color #351>chmod\ -R\ ...+...\ ...</color>\ \ and\ \ Use your favorite administration tool or text editor or a GUI or commands like\ \ <color #351>chmod\ -R\ ...+...\ ...</color>\ \ and\ \
<color #351>chown\ -R\ ...**:**...\ ...</color>\ \ to do these things. <color #351>chown\ -R\ ...**:**...\ ...</color>\ \ to do these things.
 +
 +(There is disagreement in the DansGuardian community about whether nobody:nobody (or\ nouser:nogroup) is just a safety fallback value indicating that <color #351>daemonuser=</color> and <color #351>daemongroup=</color> have not yet been set when they should have been, or is a real value that should be made to work.
 +[This is related to the larger question of whether many of the build (./configure) defaults are just minimal values, or are actually intended to be used by typical DansGuardians.]
 +Be sure to follow the lead of your distribution in this regard: for example if your distribution supplies a pre-built DansGuardian and says to //not// use <color #351>daemonuser=</color> and <color #351>daemongroup=</color>, follow their instructions.)
If you're using anti-virus scanning, optimally DansGuardian and clamd If you're using anti-virus scanning, optimally DansGuardian and clamd
Line 128: Line 132:
so just clamd being a member of the so just clamd being a member of the
DansGuardian daemon group is sufficient. DansGuardian daemon group is sufficient.
- + 
 +Incompletely set DansGuardian daemon permissions can be another cause of the rather mysterious message <color maroon>Unable\ to\ getgrnam():\ Success</color>. (The most frequent cause is that the value of daemongroup= is not defined in /etc/group, usually because only `useradd` was executed and `groupadd` was forgotten.) The user:group that DansGuardian runs as (possibly nobody:nogroup) must not be forbidden from accessing /etc/group. //For example//, you may need to make the userid that DansGuardian runs as a member of group 'daemon'.
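A quick way to rule out the most frequent cause is to check whether the group named in daemongroup= actually has an entry in the group file. The following is only a minimal sketch: the function name `group_defined` is made up for illustration, and the group name `dansguardian` in the usage comment is an assumption (substitute whatever your daemongroup= says).

```shell
# group_defined NAME [FILE]
# Succeeds (exit status 0) if NAME appears as a group name in FILE.
# FILE defaults to /etc/group; the function name is hypothetical.
group_defined() {
    name=$1
    file=${2:-/etc/group}
    # Field 1 of each colon-separated line is the group name.
    awk -F: -v g="$name" '$1 == g { found = 1 } END { exit !found }' "$file"
}

# Example use (group name is an assumption, not a DansGuardian default):
#   group_defined dansguardian || echo "group missing: run groupadd first"
```

If the check fails, creating the group with `groupadd` before restarting DansGuardian is the usual fix. If the check succeeds but the message persists, look at whether the DansGuardian user can read /etc/group at all.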
==== Don't Prematurely Lock Down == ==== Don't Prematurely Lock Down ==
Line 172: Line 177:
Activate the lines for all the categories you care about. Activate the lines for all the categories you care about.
-But don't activate too many more lines than necessary (especially not those for languages none of your users ever access anyway).+But don't activate too many more lines than necessary (especially not those for languages  
 +none of your users can read anyway,
 +most especially not for Chinese, Japanese, or Malay).
Because of the inevitable false positives even with sophisticated weighted phrase list scoring, Because of the inevitable false positives even with sophisticated weighted phrase list scoring,
every category you activate will block a few more legitimate web pages. every category you activate will block a few more legitimate web pages.
Line 211: Line 218:
you can by [[Log File Analysis#Using Squid Stub Logs|setting some configuration options]] you can by [[Log File Analysis#Using Squid Stub Logs|setting some configuration options]]
in DansGuardian (and maybe in Squid too). in DansGuardian (and maybe in Squid too).
- 
==== Many Blacklists Actually Categorize Rather Than Ban ==== ==== Many Blacklists Actually Categorize Rather Than Ban ====
Many "blacklists" actually categorize websites; Many "blacklists" actually categorize websites;
Line 219: Line 225:
you consider "bad", you consider "bad",
rather than all the website categories. rather than all the website categories.
-For example you probably don't want to ban the "homerepair" category,+For example you probably don't want to ban the "homerepair" category (or maybe you do),
and depending on your environment you may or may not want to ban the "mail" category. and depending on your environment you may or may not want to ban the "mail" category.
Line 240: Line 246:
and so tries to match the single most frequently used blacklist, and so tries to match the single most frequently used blacklist,
but DansGuardian configuration can be expanded far beyond the defaults if you wish.) but DansGuardian configuration can be expanded far beyond the defaults if you wish.)
 +
==== Squid Works By Itself, But Not With DansGuardian === ==== Squid Works By Itself, But Not With DansGuardian ===
When an end user computer accesses Squid directly, When an end user computer accesses Squid directly,
Line 311: Line 318:
  - Omit any leading period\\ (this may be different from some other software that won't work right with//out// the leading period)   - Omit any leading period\\ (this may be different from some other software that won't work right with//out// the leading period)
  - Use the longest possible (i.e. most specific) entry that will work yet remain flexible   - Use the longest possible (i.e. most specific) entry that will work yet remain flexible
-  - If shorter entries already exist and they conflict with your new entry, first lengthen the existing entries (without making them inoperative)+  - If shorter entries already exist and they conflict with your new entry, try using both 'banned...' and 'exception...' lists (the 'exception...' lists take precedence, but only for exactly what's specified in them; for example, banning "foobar.org" then excepting "bake.foobar.org" allows any webservers named *.bake.foobar.org but disallows all the rest of the foobar.org webservers)\\ \ Another alternative is to try lengthening the existing 'banned...' entries (without making them inoperative)
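The precedence rule in the last item can be modelled in a few lines of shell. This is only an illustration of the behaviour described above, not DansGuardian's actual matching code. The function names and the two variables are hypothetical, and the entries are the "foobar.org" / "bake.foobar.org" example from the text.

```shell
# Hypothetical stand-ins for the contents of a 'banned...' list and an
# 'exception...' list (one entry each, taken from the example above).
BANNED_ENTRIES="foobar.org"
EXCEPTION_ENTRIES="bake.foobar.org"

# domain_matches HOST ENTRY
# True if HOST is exactly ENTRY or ends in ".ENTRY" (no leading period
# is stored in the entry itself).
domain_matches() {
    case "$1" in
        "$2" | *."$2") return 0 ;;
        *) return 1 ;;
    esac
}

# site_verdict HOST
# Prints "allowed" or "banned". Exceptions are consulted first, so they
# take precedence, but only for what they actually cover.
site_verdict() {
    host=$1
    for entry in $EXCEPTION_ENTRIES; do
        if domain_matches "$host" "$entry"; then echo allowed; return 0; fi
    done
    for entry in $BANNED_ENTRIES; do
        if domain_matches "$host" "$entry"; then echo banned; return 0; fi
    done
    echo allowed
}
```

With these entries, `site_verdict www.bake.foobar.org` prints `allowed` while `site_verdict www.foobar.org` prints `banned`, which is the combination the list item describes.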
==== Operation Under NetBSD/FreeBSD/OpenBSD Is Somewhat Unreliable ==== ==== Operation Under NetBSD/FreeBSD/OpenBSD Is Somewhat Unreliable ====
-For years there has been a very low level but naggingly persistent series of reports that DansGuardian doesn't run as reliably under OpenBSD as it does under Linux. +For years there has been a low level but naggingly persistent series of reports that DansGuardian doesn't run as reliably under OpenBSD as it does under Linux.
Most users never see any problem at all\ \ ...but a few unlucky ones do. Most users never see any problem at all\ \ ...but a few unlucky ones do.
 +Occasional failure of a DansGuardian child process may be tolerable, as recovery is automatic and the jerky operation is visible to only a single user. However, frequent failure of all (or at least most) DansGuardian child processes, or failure of the DansGuardian parent process, will not be tolerable.
-To put it as briefly as possible (perhaps oversimplifying), __BSD-derived kernels may need to be tuned in order to obtain stable DansGuardian operation__. If there are kernel issues, DansGuardian is likely to start up and run for a while but then fail with a segment fault (SIGSEGV) in an "impossible" location. The kernel should be tuned for "peak" conditions; if it's been closely tuned for "average" conditions (or worse tuned to "minimize" kernel size), unstable DansGuardian operation is almost inevitable. +To improve the web search rankings of this important question, its detailed answer has been moved out to [[Operation Under NetBSD/FreeBSD/OpenBSD|its own separate document]]. (Also see questions Installation#26 and Installation#26b in the [[FAQ|Wiki FAQ]].)
-Fortunately even though the problem has not yet been completely pinned down, it's fairly well understood.  The current OpenBSD kernel doesn't handle a couple of conditions as well as a typical Linux kernel. One of those problem conditions is sustained high load; the other condition is long-lived processes whose memory address space gets very fragmented (usually because they handle lots and lots of different small requests). The three common applications that are most likely to expose these kernel limitations are Apache (the web server), Squid, and DansGuardian. +==== Eliminate Weird ClamAV Library Dependency ====
-The programmers behind OpenBSD are //very// aware of these problems, and keep fiddling with the problematic parts of their kernel. As a result, the exact failure symptoms can change considerably from one kernel version to another. The kernel itself might panic (thus shutting down the entire system), or an application may just hang, or an application may disappear without proper warning or notice, or an application may shut itself down after receiving more failure return codes than it can handle. +In some circumstances some DansGuardian executables 
 +will refuse to start up after issuing a message something like this: 
 +<code> 
 +dansguardian: error while loading shared libraries: libclamav.so.5: cannot open 
 +shared object file: No such file or directory 
 +</code> 
 +This strange dependency on ClamAV can manifest //even if//  
 +you don't use any anti-virus at all and have configured 
 +your dansguardian.conf accordingly.
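To confirm that an executable really does carry this dependency, you can ask the dynamic linker which shared libraries it cannot resolve. The sketch below assumes a Linux system whose `ldd` prints unresolved libraries as `name => not found`; the helper name `missing_libs` is made up for illustration.

```shell
# missing_libs
# Reads `ldd` output on standard input and prints the name of every
# shared library reported as "not found" (one per line).
missing_libs() {
    awk '/=> not found/ { print $1 }'
}

# Example use (assumes dansguardian is on the PATH):
#   ldd "$(command -v dansguardian)" | missing_libs
```

If a libclamav entry shows up in the output, the executable was built with the library-based ClamAV interface, and no dansguardian.conf setting alone will make it start.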
-The chief problems seem to be i)\ an apparent shortage of memory because of massive address space fragmentation and ii)\ a lack of socket structures. It may also be the case that iii)\ there are not enough "file descriptor" structures. Recoding applications to better handle the known OpenBSD limitations does not seem to be a reasonable option, as it would probably both a)\ require the tremendous effort of a complete rewrite and b)\ just trade stability under OpenBSD for //in//stability under Linux. +Eliminating this weird ClamAV library dependency is  
 +always possible (in fact straightforward)
 +//but __both__// build-time (./configure) and run-time 
 +(dansguardian.conf) options may need to be adjusted 
 +the first time. 
 +The easiest way to correct the build-time options may be 
 +to obtain a corrected DansGuardian package.  
 +(Another alternative is to //re-build//  
 +the dansguardian executable yourself.)
-Such problems are performance-related; the faster an application runs, the less time the kernel has to cover over these flaws before they grow large enough to become visible. Since performance is continually improved in most applications, later versions of most applications expose worse problems. +When building DansGuardian, use the <color #351>--enable-clamd</color> ./configure option, but //not// the <color #351>--enable-clamav</color> 
 +option too.  
 +In an ideal world,  
 +all DansGuardian packages obtained from distribution repositories 
 +should already be built this way. 
 +However in the real (not ideal) world, repository errors are possible. 
 +Once DansGuardian is built correctly,  
 +you can then control whether or not to use ClamAV 
 +purely through the configuration options in dansguardian.conf; 
 +in other words once the build/configure options are correct, 
 +you will never need to revisit them  
 +no matter what you do with anti-virus.
-Problems are often noticed right after an application upgrade. Administrators focus more attention on the application right after an upgrade. The system load level has likely risen over time, but slowly so nobody noticed. Newer application versions typically provide somewhat better performance, making it more likely the kernel will exhibit problems. And other applications and services may have been changed at the same time. As a result of all these things, it's easy to mis-conclude problems have something to do with a "bug" that was recently introduced into the application. +In dansguardian.conf, use the 'clamdscan' option rather than the 'clamav' option. The 'clamdscan' option interfaces to ClamAV through the interprocess named pipe socket provided by the clam daemon. (The old 'clamav' option tries to interface to ClamAV through a version dependent library [a *nix "shared object" (.so) is analogous to a Windows "dynamic link library" (.dll)] which is probably no longer supported nor even available.)
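One way to see which scanner plugins a given dansguardian.conf actually activates is to list its uncommented `contentscanner` lines. This is a rough sketch: the helper names are hypothetical, and the default path /etc/dansguardian/dansguardian.conf and the plugin file names `clamav.conf` and `clamdscan.conf` are assumptions based on a typical installation, so adjust them to match yours.

```shell
# active_scanners [CONF]
# Prints every uncommented "contentscanner = ..." line in CONF.
# The default path is an assumption about a typical install.
active_scanners() {
    conf=${1:-/etc/dansguardian/dansguardian.conf}
    grep -E '^[[:space:]]*contentscanner[[:space:]]*=' "$conf"
}

# uses_old_clamav_plugin [CONF]
# Succeeds if the old library-based 'clamav' plugin is still active.
uses_old_clamav_plugin() {
    active_scanners "$1" | grep -q 'clamav\.conf'
}

# Example use:
#   uses_old_clamav_plugin && echo "switch this line to clamdscan"
```

If you use anti-virus scanning, the goal is for only the clamdscan line to be active. If you use none at all, no contentscanner line needs to be active.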
-If you're one of the unlucky ones, you could of course either switch away from OpenBSD or learn to live with the occasional problem. But it's quite likely neither of these options is desirable. So what else can you do? Here's a thorough list of suggestions; most likely the first thing you should try is __tune the kernel__; simply tuning the kernel may completely resolve the problem. (Some suggestions mainly address high load, and probably won't help the memory fragmentation problem very much. Some suggestions mainly address process memory, and probably won't help very much if you suffer from frequent overloads.) Find the suggestions that fit your situation, and pursue them.  
-  * Upgrade your kernel\\ Each kernel version seems to be an improvement over the previous one. (The problems may not be //completely// fixed yet though; problems have been reported on kernels at least as late as version 4.3 and perhaps later.)  
-  * Add RAM\\ This helps the problem in three different ways. First, newer OpenBSD kernels (but not older ones) reconfigure themselves every boot depending on how much RAM they see; if there's more RAM, all the kernel configuration options are increased. Second, more memory allows applications to spread out a little more so problems don't become visible quite so soon. And third, more memory makes everything run a little faster, including the kernel which has a bit more time to repair small problems before they grow too large. 
-  * Purposely de-tune DansGuardian\\ If DansGuardian doesn't handle requests quite as quickly, the kernel will have more time to cover its errors before they get out of hand. Whatever you did to improve DansGuardian performance, undo parts of it.  
-  * Cause DansGuardian child tasks to stop and restart more frequently (or alternatively //less// frequently)\\ The idea behind stopping and restarting tasks more frequently is to reduce memory fragmentation and its subsequent problems. Try this and see what happens. If there's no improvement, return the variables to their original values.  Reduce //minsparechildren//, //maxsparechildren//, and //preforkchildren//. And reduce //maxagechildren//, perhaps to 300 or even 200. (Doing this will almost certainly have the side effect of reducing performance, perhaps noticeably.) \\ \ Sometimes the opposite change of stopping and restarting child tasks __less__ frequently will improve stability. So also experiment with increasing //minsparechildren//, //maxsparechildren//, and //preforkchildren//, and greatly increasing //maxagechildren//, perhaps to 10000 or even 50000. 
-  * Reduce the average load\\ Maybe you can tune other things or provide other capabilities so your users don't hit the web quite so hard. Or maybe you'll just have to change your users' behavior\ \ ...if you can; if you're not sure, it may be better not to even try. Maybe users can be persuaded to drop their load just a few percent. But then again maybe they won't change and won't change and won't change until they suddenly drop out altogether (and even worse don't come back). 
-  * Cap the peak system load\\ Use the DansGuardian //maxips// parameter to set a hard limit on how many computers can access the web at the same time. Set the number slightly lower than current peaks: high enough to not overly inconvenience users, but low enough to provide the desired reliability.  
-  * Manually tune the kernel\\ __This is frequently required for BSD-derived kernels.__ \\ \ Perhaps all you need to do is increase //maxusers// (//kern.maxusers// ?) beyond 256, as most other parameters are connected to it. (Don't even do just this if you have a kernel that automatically adjusts //maxusers// at boot depending on how much RAM it finds.)  If you want to get more detailed, consider increasing OPEN_MAX,  or  BUFCACHEPERCENT, MAX_KMAPENT, NKMEMPAGES, NKMEMPAGES_MAX and //de//creasing NMBCLUSTERS. \\ \ Remember you're tuning for //peak// conditions (not //average// conditions). Even a performance monitoring tool that displays every second won't show conditions that last less than 100 milliseconds. Yet these short load spikes on an inadequately tuned kernel may be the main cause of DansGuardian crashing. \\ \ Only change runtime values; __//do not//__ rebuild a kernel (except as a last resort if you really really know what you're doing). //Rebuilding OpenBSD kernels is no longer recommended// (or even acceptable in most cases). Currently manually re-tuning a BSD-derived kernel often involves either the //sysctl// command or modifying the file /etc/sysctl.conf.  
-  * Use an older version of DansGuardian\\ If you have a borderline case (DansGuardian doesn't fail very often, and just a very small improvement in reliability would be enough), an easy way to slightly de-tune DansGuardian may be to run an older version which does not include recent performance improvements. \\ \ (Note this may not be reasonably possible. All 2.10.x.x versions of DansGuardian provide such similar performance that changing would not be worthwhile. And the 2.8.x.x versions of DansGuardian are now several years old.)  
-  * Add an auto-restart capability (for example a 'cron' auto-restart script)\\ Have a script wake up every few minutes and check if DansGuardian is still responsive. If not, stop both DansGuardian and Squid then start them fresh (Squid first then DansGuardian). 
-  * Add a 'cron' periodic restart script\\ Have a script wake up once in a while (say every day or every few hours), and forcibly stop both DansGuardian and Squid then start them again (Squid first, then DansGuardian) no matter what. This will ensure that there's a properly functioning DansGuardian most of the time even if the server is unattended. It may however dramatically inconvenience users by aborting their web connection once in a while.
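The auto-restart suggestion above can be sketched as a small function that a cron job calls every few minutes. Everything here is an assumption for illustration: the function name `watchdog` is made up, and the probe and the four stop/start commands are passed in as arguments because they differ from system to system (init scripts, service managers, ports).

```shell
# watchdog PROBE STOP_DG STOP_SQUID START_SQUID START_DG
# Runs PROBE; if it succeeds, DansGuardian is considered responsive and
# nothing happens. Otherwise both daemons are stopped, then started
# again with Squid first and DansGuardian second.
# All five arguments are command strings supplied by the caller.
watchdog() {
    if sh -c "$1" >/dev/null 2>&1; then
        return 0                  # still responsive, nothing to do
    fi
    sh -c "$2"                    # stop DansGuardian
    sh -c "$3"                    # stop Squid
    sh -c "$4" && sh -c "$5"      # start Squid first, then DansGuardian
}
```

For the periodic (unconditional) restart variant, the same function can be called with a probe that always fails, such as `false`.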