Showing posts from 2019

Some important points to consider when developing an Apache Storm topology

- Kafka consumer lag: monitor the consumer group with a Kafka tool.
- Configuration of the Flux YAML that wires the topology together.
- Storm UI metrics, such as keeping capacity below 1.
- Spout parallelism should equal the partition count of the relevant topic.
- Bolt parallelism should be increased according to the following equation:

  **Throughput = Executor count * 1000 / (process latency) * Capacity**

- Topology worker and supervisor server configurations should be sized accordingly.
- HBase database connection bottlenecks in a Kerberized environment.
- Other Storm-related configuration parameters to be set in the Flux topology YAML or in Ambari.
- Maintain tuple anchoring to guarantee message processing through the downstream bolts.

References:

- http://storm.apache.org/releases/1.0.6/Guaranteeing-message-processing.html
- http://storm.apache.org/releases/1.0.6/Understanding-the-parallelism-of-a-Storm-topology.html
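The bolt parallelism equation above can be rearranged to estimate how many executors a bolt needs for a target throughput. The following is a minimal sketch, not a Storm API: the function names are hypothetical, and it assumes throughput is measured in tuples per second, process latency in milliseconds (as the Storm UI reports it), and capacity as a fraction between 0 and 1.

```python
import math


def bolt_throughput(executors: int, process_latency_ms: float, capacity: float) -> float:
    """Tuples per second, using Throughput = executors * 1000 / latency * capacity.

    Assumes latency is in milliseconds and capacity is a fraction in (0, 1].
    """
    return executors * 1000.0 / process_latency_ms * capacity


def executors_needed(target_tps: float, process_latency_ms: float, capacity: float = 0.8) -> int:
    """Smallest executor count that reaches target_tps at the given capacity.

    This is the same equation solved for the executor count and rounded up.
    The default capacity of 0.8 is an illustrative choice, not a Storm default.
    """
    if process_latency_ms <= 0:
        raise ValueError("process_latency_ms must be positive")
    if not 0 < capacity <= 1:
        raise ValueError("capacity must be in (0, 1]")
    return math.ceil(target_tps * process_latency_ms / (1000.0 * capacity))


if __name__ == "__main__":
    # Hypothetical numbers: 5000 tuples/s target, 2 ms process latency.
    n = executors_needed(5000, 2.0, capacity=0.8)
    print(f"executors needed: {n}")
    print(f"estimated throughput: {bolt_throughput(n, 2.0, 0.8):.0f} tuples/s")
```

With these hypothetical inputs the estimate comes to 13 executors. Treat the result as a starting point for benchmarking rather than a final setting, since measured latency usually shifts once parallelism changes.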

Apache Storm Running on PROD

Short Description: A series of operational steps to troubleshoot issues with Storm topologies in production.

## General Recommendations

- Only fail tuples if there's a recoverable exception.
- Use an exception/error logging bolt or stream and send its output to a Kafka topic so that all logs are consolidated in one place.
- Be careful about supervisor saturation or overallocation, i.e. running more threads than there are available CPU resources.
- Configure parallelism based on throughput requirements (use the throughput equation to calculate it).
- Be aware of data spikes and plan for them in your topology, use cases, and operations.
- If using Kafka, set up Kafka producer-consumer lag analysis and a dashboard.
- Benchmark topologies to establish "healthy" operating numbers for individual bolt performance.

## Troubleshooting topology performance issues:

- Check the topology stats from the Nimbus UI.
- Look for failures in the spout or in a specific bolt; if there are any, that component is having issues.
-...