How to Use Antivirus Software on CDP Hosts

If you use antivirus software on your servers, consider configuring it to skip scans on certain types of Hadoop-specific resources.

It can take a long time to scan large files or directories with a large number of files. In addition, if your antivirus software locks files or directories as it scans them, those resources will be unavailable to your Hadoop processes during the scan, and can cause latency or unavailability of resources in your cluster. Consider skipping scans on the following types of resources:
  • Scratch directories used by services such as Impala, and Data directories, which can grow to petabytes in size, including Kudu directories on disk.
    To find the local data directories and files:
    1. Go to Cloudera Manager.
    2. On the Home > Status tab, click a cluster name.
    3. Select Configuration > Local Data Directories and Files.
    4. Find the link of the page.

      For example, https://CM_SERVER:port/cmf/clusters/#/config2?task=ALL_LOCAL_DATA_DIRS_AND_FILES

  • Log directories used by various Hadoop services.
    To find log directories:
    1. Go to Cloudera Manager.
    2. On the Home > Status tab, click a cluster name.
    3. Select Configuration > Log Directories.
    4. Find the link of the page.

      For example, https://CM_SERVER:port/cmf/clusters/#/config2?task=ALL_LOG_DIRECTORIES

The specific directory names and locations depend on the services your cluster uses and your configuration. In general, avoid scanning very large directories and filesystems. Instead, limit write access to these locations using security mechanisms such as access controls at the level of the operating system, HDFS, or at the service level.