Sunday 23 August 2015

Improve Server Performance

What’s your goal?

Optimizing for Extracts
The data engine process stores extracts and answers queries; the background process refreshes extracts. Because both are CPU-intensive, the best approach to improving performance for an extract-intensive deployment is to isolate these two processes from one another, and from the other server processes. This may take three machines. If you don’t have three machines to work with, there are still strategies you can use.
Optimizing for Users and Viewing
The VizQL server process handles loading and rendering views for Tableau Server users. If you are trying to optimize your deployment for a high number of users and a lot of view interaction, this is the process you should focus on.

How Many Processes to Run

This topic assumes that you are running the 64-bit version of Tableau Server on a 64-bit operating system, on a computer with 8 cores and 16 GB of RAM. In this situation, two instances of each process should meet your needs. If your machine has just 4 cores or only meets the minimum RAM requirement for Tableau Server, which is 8 GB, your limit should be one instance of each process.
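The sizing rule above can be sketched as a small helper. This is only an illustration, not part of Tableau Server: the function name is hypothetical, and treating any machine below 8 cores or 16 GB as a one-instance machine is an assumption, since the text only names the 4-core and 8 GB cases.

```python
MIN_RAM_GB = 8  # minimum RAM requirement stated in the text


def instances_per_process(cores: int, ram_gb: int) -> int:
    """Suggest how many instances of each process to run on one machine.

    Assumption: machines between the two cases named in the text
    (for example 6 cores) are treated conservatively as one-instance machines.
    """
    if ram_gb < MIN_RAM_GB:
        raise ValueError("machine is below the 8 GB minimum RAM requirement")
    if cores >= 8 and ram_gb >= 16:
        return 2
    return 1


print(instances_per_process(8, 16))  # reference machine from the text
print(instances_per_process(4, 8))   # minimum-spec machine
```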
Background Process
A single background process can consume 100% of a single CPU core, and sometimes even more for certain tasks. As a result, the total number of instances you should run depends on the machine’s available cores as well as on what you’re trying to improve. The deployment examples below use N to represent the machine’s total number of cores, and each suggests a different strategy where the background process is concerned. When in doubt, start with the low end of the suggested range and assess performance before increasing the number.
Data Engine and Repository Processes
There are scenarios where the data engine process should be isolated on its own node, for example when you are trying to improve an extract-intensive deployment and you want to emphasize querying more than extract refreshes. The deployment examples below provide specifics. Because the data engine stores real-time data, transferring it is a multi-phased procedure. For more information, see Move the Data Engine and File Store Processes.
Another reason to isolate the data engine (and/or the repository) is to minimize your deployment’s potential for downtime. Unless you’re configuring for high availability, the repository can usually remain on the primary Tableau Server. For more information, see High Availability.

Where to Configure Processes

You configure the type and number of processes any machine is running using the Tableau Server Configuration dialog box. If you are adding new machines as part of your reconfiguration, they must already have Tableau Worker software installed on them. For more information, see Install and Configure Worker Nodes.
If you are reconfiguring the processes on your primary or standalone Tableau Server, see Reconfigure Processes.

Optimizing the Extracts and Workbooks

Fast server performance with extracts is partly a function of the extracts and workbooks themselves. Workbook authors can help improve server performance by keeping the extract’s data set short, through filtering or aggregating, and narrow, by hiding unused fields. Use the Tableau Desktop options Hide All Unused Fields and Aggregate data for visible dimensions to do this. For steps, see Creating an Extract (Tableau Desktop help). For general tips on building well-performing workbooks, search for “performance” in the Tableau Desktop help. To see how workbooks perform after they've been published to Tableau Server, you can create a performance recording. For more information, see Create a Performance Recording.

Assessing View Responsiveness

When a user opens a view, the components of the view are first retrieved and interpreted, then displayed in the user's web browser. For most views, the display rendering phase occurs in the user's web browser and in most cases, this yields the fastest results and highest level of interactive responsiveness. Handling most interactions in the client web browser reduces bandwidth and eliminates round-trip request latencies. If a view is very complex, Tableau Server handles the rendering phase on the server instead of in the client web browser—because that generally results in the best performance. If you find that views aren't as responsive as you'd like, you can test and change the threshold that causes views to be rendered by the server instead of in the client web browser. For more information, see About Client-Side Rendering.

One-Machine Example: Extracts

A 64-bit Tableau Server installation with heavy extract usage can run on a single 64-bit machine configured as follows:
This configuration would look like the following Process Status table on the Server Status page.
Configuration Notes:
  • Run 2 VizQL server processes. Run a cache server process for every VizQL server process.
  • To calculate the minimum number of backgrounder processes to run, divide the machine’s total number of cores by 4. To determine the maximum number, divide by 2.
  • Both the backgrounder and data engine processes are CPU-intensive and the configuration shown above balances them.
  • Schedule extract refreshes for off-peak times so that the VizQL server, application server, data engine, and background processes do not compete with one another for system resources.
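
As a rough illustration of the backgrounder rule in the notes above (cores divided by 4 for the minimum, by 2 for the maximum), here is a minimal sketch. The function name is hypothetical, and rounding down with a floor of one process is an assumption; the text does not say how to handle core counts that do not divide evenly.

```python
def extract_backgrounder_range(cores: int) -> tuple[int, int]:
    """Return (minimum, maximum) backgrounder processes for a machine
    that also runs the other server processes.

    Assumption: fractional results are rounded down, never below 1.
    """
    if cores < 1:
        raise ValueError("cores must be a positive integer")
    low = max(1, cores // 4)
    high = max(low, cores // 2)
    return low, high


low, high = extract_backgrounder_range(8)
print(f"Start with {low} backgrounder(s); do not exceed {high}.")
```

Starting at the low end and measuring before increasing follows the advice given earlier for the background process.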

Two-Machine Example: Extracts

This example shows a possible configuration for a two-machine Tableau Server deployment that handles heavy extract usage. Note that the VizQL server, application server, cache server, data server, and data engine processes are isolated from the background processes.
With this configuration, the Server Status page would look like this:
Configuration Notes:
  • Run 2 VizQL server processes on the primary server. Run a cache server process for every VizQL server process.
  • Isolate the backgrounder processes on the worker. To figure out the minimum number of background processes to run, take the machine’s total number of cores and divide it by 4. For the maximum number, divide by 2.
  • Isolate the backgrounder processes from the VizQL server, application server, data server, and data engine processes.
  • Cache servers added on the worker node that runs the backgrounders can make cache requests on behalf of users or jobs.
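
The isolation rules in these notes can be checked mechanically before a layout is applied. The sketch below is not a Tableau Server API: the dictionary format, process keys, and function name are hypothetical, and the counts for the application server, data server, and data engine are illustrative placeholders because the original status table is not reproduced here.

```python
# Processes the notes say should not share a node with backgrounders.
ISOLATED_FROM_BACKGROUNDER = {"vizql", "application", "data_server", "data_engine"}

# Hypothetical two-machine layout; counts other than vizql/cache are placeholders.
layout = {
    "primary": {"vizql": 2, "cache": 2, "application": 1,
                "data_server": 1, "data_engine": 1},
    "worker": {"backgrounder": 2},
}


def check_extract_layout(nodes: dict[str, dict[str, int]]) -> list[str]:
    """Return a list of rule violations; an empty list means none were found."""
    problems = []
    for node, procs in nodes.items():
        if procs.get("backgrounder", 0) > 0:
            shared = sorted(p for p in ISOLATED_FROM_BACKGROUNDER
                            if procs.get(p, 0) > 0)
            if shared:
                problems.append(f"{node}: backgrounder shares a node with {shared}")
        # One cache server process for every VizQL server process.
        if procs.get("cache", 0) < procs.get("vizql", 0):
            problems.append(f"{node}: fewer cache servers than VizQL servers")
    return problems


print(check_extract_layout(layout))
```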

Two-Machine Example: Viewing

A two-machine deployment with light extract usage and heavier viewing can be configured as follows:
The Process Status table for this configuration would look like this:
Configuration Notes:
  • Run 2 VizQL server processes on the primary server. Run a cache server process for every VizQL server process.
  • A minimum of 2 background processes should be run on the worker. The maximum number you should run is equal to the machine’s total number of cores.
  • Run the data engine process on both nodes to split view requests between the two nodes. In a deployment where extracts are refreshed infrequently, the data engine and background processes can be on the same node.
  • If extract refresh jobs run only during off hours, you can add more background processes on each node to maximize their parallelism.
  • Cache servers added on the worker node that runs the backgrounders can make cache requests on behalf of users or jobs.
  • The number of nodes in the cluster is determined by the total number of cores and main memory available across all nodes.
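
For the viewing-oriented worker described in the notes above, the backgrounder range runs from 2 up to the worker's core count. A minimal sketch follows, with a hypothetical function name; keeping the minimum of 2 on a worker with fewer than 2 cores is an assumption the text does not address.

```python
def viewing_backgrounder_range(worker_cores: int) -> tuple[int, int]:
    """Return (minimum, maximum) backgrounder processes for the worker
    in a deployment with light extract usage and heavier viewing.

    Assumption: the stated minimum of 2 is kept even on very small workers.
    """
    if worker_cores < 1:
        raise ValueError("worker_cores must be a positive integer")
    low = 2
    high = max(low, worker_cores)
    return low, high


print(viewing_backgrounder_range(8))
```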

Three-Machine Example: Extracts & Viewing

Three machines is the recommended minimum for the best performance if you have both heavy extract refreshing and usage and a high number of concurrent users.
The Process Status table for this configuration would look like this:
Configuration Notes:
  • For this configuration, 16 cores are recommended.
  • Run 2 VizQL Server processes on the primary and the worker that is not running the background processes. Run a cache server process for every VizQL server process.
  • The background processes are on their own machine so that their work does not compete with that of the other processes. Because the machine is dedicated to background processes and they can consume 100% of the CPU resources, the low end of the suggested range equals the total number of cores. Depending on the size of the data being refreshed, it’s possible for some deployments to run up to twice as many background processes as cores and still obtain parallel speed-up.
  • Run the data engine process on the primary and the worker that is not running background processes to split view requests between the two nodes.
  • The user loads for the application server and data server processes can typically be handled by 1 process each but they can be set to 2 to provide redundancy.
  • Under most conditions, the primary Tableau Server and the data engine will not be a bottleneck for the system’s overall throughput as long as sufficient CPU cycles exist for them. To increase viewing capacity, add nodes dedicated to the VizQL server process. To increase capacity for refreshing extracts, add nodes dedicated to the background process.
  • Cache servers added on the worker node that runs the backgrounders can make cache requests on behalf of users or jobs.
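
On a machine dedicated to background processes, the notes above put the low end of the range at the core count and allow up to twice that for some deployments. The sketch below shows that arithmetic; the function name is hypothetical, and whether the upper bound pays off depends on the size of the data being refreshed, which the code does not model.

```python
def dedicated_backgrounder_range(cores: int) -> tuple[int, int]:
    """Return (low, high) backgrounder processes for a node that runs
    only background processes: low = cores, high = 2 * cores.
    """
    if cores < 1:
        raise ValueError("cores must be a positive integer")
    return cores, 2 * cores


low, high = dedicated_backgrounder_range(8)
print(f"Start at {low}; some deployments may benefit from up to {high}.")
```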
