2023-08-15: Real-time Interactive Dashboard for Data in a Distributed System

A distributed system is a collection of interconnected computers, or nodes, that work together to achieve a common goal. Real-time visualization of data collected from multiple nodes in a distributed system allows us to identify trends and patterns as they emerge. Moreover, interactive visualizations let us explore, analyze, and share insights into data through interactive controls, filtering options, and drill-down capabilities. In a distributed system, a real-time interactive dashboard enables timely decision-making, supports exploratory data analysis, facilitates communication and collaboration, enhances monitoring and alerting, and strengthens predictive analytics. In this blog post, I present an approach for developing an interactive visualization dashboard for real-time data in a distributed system using Python.

In this approach, we use MQTT as the messaging protocol between the computers and the dashboard (with an MQTT broker relaying the messages), the HoloViz ecosystem to implement the interactive dashboard, and Streamz to pass data received from the MQTT broker into the dashboard in real time.

What is MQTT?

MQTT (commonly expanded as Message Queuing Telemetry Transport) is a lightweight publish/subscribe messaging protocol designed for connections with remote locations. It targets devices with resource constraints and networks with limited bandwidth, high latency, or unreliable connections, such as those found in the Internet of Things (IoT).
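
To make the publish/subscribe model concrete, here is a toy in-memory sketch. It is not MQTT and involves no network: the `ToyBroker` class, the topic names, and the payload format are all hypothetical and only illustrate how a broker routes messages from publishers to the subscribers of a topic.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List

Callback = Callable[[str, str], None]


class ToyBroker:
    """In-memory stand-in for a broker: routes messages by exact topic name."""

    def __init__(self) -> None:
        self._subscribers: DefaultDict[str, List[Callback]] = defaultdict(list)

    def subscribe(self, topic: str, callback: Callback) -> None:
        # Register a callback to be invoked for every message on this topic.
        self._subscribers[topic].append(callback)

    def publish(self, topic: str, payload: str) -> int:
        # Deliver the payload to each subscriber of the topic and
        # return how many subscribers received it.
        callbacks = self._subscribers.get(topic, [])
        for callback in callbacks:
            callback(topic, payload)
        return len(callbacks)


broker = ToyBroker()
received = []
broker.subscribe("weather/norfolk", lambda topic, payload: received.append((topic, payload)))
broker.publish("weather/norfolk", "temperature=24.5")
broker.publish("weather/richmond", "temperature=22.0")  # nobody subscribed to this topic
```

The publisher never addresses the subscriber directly. Both sides only know the topic name, which is what lets devices and dashboards be added or removed independently.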

What is HoloViz?

Python offers several libraries for data visualization, such as Matplotlib and Plotly. HvPlot is one of the libraries maintained by HoloViz. It is a high-level API that can generate plots with Bokeh, Matplotlib, or Plotly from Pandas DataFrames and many other data structures of the PyData ecosystem. It lets us easily lay out plots, add widgets for more interactive control of plots, visualize data, and interactively explore data in Python.

What is Streamz?

The Streamz Python library helps us build pipelines to manage continuous streams of data.
 

Integrate MQTT broker and HvPlot to visualize real-time data streams

Figure 1: The architecture of the interactive dashboard integration with the MQTT broker

As illustrated in Figure 1, the architecture of the real-time dashboard integration with the MQTT broker contains the following components.

MQTT Broker

We require a message broker that implements the MQTT protocol. Here, we use an open-source message broker, Eclipse Mosquitto. Mosquitto is lightweight and can be run on devices from low-power single-board computers to full servers.

We can download and install the Eclipse Mosquitto package on various operating systems.

To start the broker,
  • On Windows,
    • Open Command Prompt and go to the location where Mosquitto is installed.
      • eg: cd c:\Program Files\mosquitto
    • Run the following command to start the broker.
      • net start mosquitto
    • To find the active and listening ports, run the following command.
      • netstat -a
    • Find the IP address and port associated with the MQTT broker.
      • eg: "127.0.0.1" and 1883
  • On Linux,
    • Run the following command in Terminal to start the MQTT broker.
      • mosquitto
  • On macOS,
    • Run the following command in Terminal to start the MQTT broker.
      • /usr/local/sbin/mosquitto
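
Once the broker is started, a quick way to confirm that something is listening on the expected address is to try opening a TCP connection from Python. The sketch below assumes the example address and port mentioned above ("127.0.0.1" and 1883). The function name `broker_is_listening` is hypothetical. The check only shows that the port accepts connections, not that the listener actually speaks MQTT.

```python
import socket


def broker_is_listening(host: str = "127.0.0.1", port: int = 1883, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to (host, port) can be opened."""
    try:
        # create_connection raises OSError if the connection is refused or times out.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Calling `broker_is_listening()` before starting the publishers or the dashboard gives an early, readable failure instead of a connection error deep inside the MQTT client.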

Publisher Devices

Let's consider a sample application that visualizes real-time weather data collected from multiple devices. Each device in this distributed system is considered a publisher. We implement the publisher using the Paho MQTT Python client library. As mentioned earlier, the MQTT broker is associated with an IP address and a port. The following code snippet can be used in the publisher to connect to the MQTT broker, subscribe to a topic, and publish the weather data to the topic.
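
The original snippet is not reproduced here, so what follows is only a minimal sketch of a publisher under stated assumptions. The topic name `weather/readings`, the JSON field names, and the randomly generated values are hypothetical stand-ins for real sensor data. For brevity it uses the `paho.mqtt.publish.single` helper, which connects, publishes one message, and disconnects, so it does not keep a persistent connection or subscribe to the topic.

```python
import json
import random
import time


def make_reading(location: str) -> dict:
    """Build one fake weather reading; a real device would read its sensors here."""
    return {
        "location": location,
        "timestamp": time.time(),  # seconds since the epoch
        "temperature": round(random.uniform(15.0, 35.0), 1),  # degrees Celsius (assumed unit)
        "humidity": round(random.uniform(30.0, 90.0), 1),  # percent (assumed unit)
        "wind_speed": round(random.uniform(0.0, 20.0), 1),  # metres per second (assumed unit)
    }


def publish_readings(location, host="127.0.0.1", port=1883,
                     topic="weather/readings", interval=1.0, count=None):
    """Publish JSON-encoded readings, one every `interval` seconds.

    Runs forever when `count` is None, otherwise stops after `count` messages.
    """
    # Imported here so make_reading() can be used without paho-mqtt installed.
    import paho.mqtt.publish as publish

    sent = 0
    while count is None or sent < count:
        payload = json.dumps(make_reading(location))
        # single() connects to the broker, publishes one message, and disconnects.
        publish.single(topic, payload=payload, hostname=host, port=port)
        sent += 1
        time.sleep(interval)
```

With a broker running, each device could call something like `publish_readings("norfolk")` with its own location name. A long-running device would more likely keep one connection open through `paho.mqtt.client.Client`, which this sketch leaves out.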

Dashboard Client

Similar to the publishers, we implement the dashboard client using the Paho MQTT Python client library. Using the following code snippet, our real-time weather data visualization application can connect to the MQTT broker, subscribe to a topic, receive weather data from publisher devices, and emit the data into the data stream.
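
As with the publisher, the original snippet is not reproduced here. The sketch below shows one way the dashboard client could be written, assuming JSON payloads on a hypothetical `weather/readings` topic. `stream` can be any object with an `emit` method, such as a Streamz `Stream`. The helper names are hypothetical. The blocking `paho.mqtt.subscribe.callback` helper is run in a background thread so the dashboard is not blocked.

```python
import json
import threading


def decode_reading(payload: bytes) -> dict:
    """Decode one JSON-encoded reading from raw MQTT payload bytes."""
    return json.loads(payload.decode("utf-8"))


def make_on_message(stream):
    """Return a paho-style callback that emits each decoded reading into `stream`."""
    def on_message(client, userdata, message):
        try:
            reading = decode_reading(message.payload)
        except (UnicodeDecodeError, json.JSONDecodeError):
            return  # skip malformed messages instead of stopping the subscriber
        stream.emit(reading)
    return on_message


def start_subscriber(stream, host="127.0.0.1", port=1883, topic="weather/readings"):
    """Subscribe in a daemon thread and forward every reading into `stream`."""
    # Imported here so the helpers above can be exercised without paho-mqtt installed.
    import paho.mqtt.subscribe as subscribe

    thread = threading.Thread(
        target=subscribe.callback,  # blocks while it processes incoming messages
        args=(make_on_message(stream), topic),
        kwargs={"hostname": host, "port": port},
        daemon=True,
    )
    thread.start()
    return thread
```

Because the callback runs in a separate thread, whatever consumes the stream has to tolerate updates arriving from outside the main thread. How that interacts with a running dashboard server depends on the library versions in use.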

Data Stream

Python Streamz allows us to build data pipelines to manage continuous streams of data. Streamz can work with Pandas DataFrames to provide sensible streaming operations on continuous tabular data. HvPlot supports Streamz DataFrame (streaming dataframe) objects and automatically generates streaming plots, which can be shown in interactive dashboards deployed as a Bokeh Server app.
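
The following is a sketch of how these pieces could fit together, not the author's code. It assumes each reading is a dict with the hypothetical fields `timestamp`, `location`, `temperature`, `humidity`, and `wind_speed`. A Streamz `Stream` is wrapped in a streaming `DataFrame` with an example frame that describes the schema, and `.hvplot` is asked for a line plot of temperature. Splitting the lines by location is left out to keep the example short.

```python
def readings_to_columns(readings):
    """Turn a list of reading dicts into (timestamps, column-oriented dict)."""
    columns = {"location": [], "temperature": [], "humidity": [], "wind_speed": []}
    timestamps = []
    for reading in readings:
        timestamps.append(reading["timestamp"])
        for name in columns:
            columns[name].append(reading[name])
    return timestamps, columns


def build_streaming_plot():
    """Create a stream, wrap it as a streaming dataframe, and attach a live plot."""
    import pandas as pd
    import hvplot.streamz  # noqa: F401  (registers .hvplot on streamz dataframes)
    from streamz import Stream
    from streamz.dataframe import DataFrame

    def to_frame(reading):
        # One reading becomes a one-row frame indexed by its timestamp.
        timestamps, columns = readings_to_columns([reading])
        return pd.DataFrame(columns, index=pd.to_datetime(timestamps, unit="s"))

    source = Stream()
    # The example frame only tells streamz the column names and dtypes.
    example = to_frame({"timestamp": 0.0, "location": "", "temperature": 0.0,
                        "humidity": 0.0, "wind_speed": 0.0})
    sdf = DataFrame(source.map(to_frame), example=example)
    plot = sdf.hvplot.line(y="temperature")
    return source, plot
```

After `source, plot = build_streaming_plot()`, each call to `source.emit(reading)` pushes one more row into the plot, so `source` is the object a subscriber callback would emit into.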

Interactive Dashboard

In this visualization dashboard, we plot weather data such as temperature, wind speed, and humidity over time by location in real time. Hvplot.pandas allows us to create interactive dataframes. Panel offers many dashboard templates, which allow us to easily lay out visualizations together. Using the following code snippet we can design the layout of our visualization dashboard.


Make sure the following are imported for this code to run correctly: pandas, hvplot.pandas, hvplot.streamz, streamz.Stream, streamz.dataframe.DataFrame

Similar to how we design websites, here we can define a title, a sidebar for text or images in Markdown format, and the main body of the dashboard. We can arrange the contents in rows and columns. After we define the template, we need to show it.
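
The layout described above can be sketched with Panel as follows. This is an illustration under assumptions rather than the original code: the choice of `FastListTemplate`, the dashboard title, the sidebar text, and the helper names are hypothetical. `plots` is assumed to be a dict that maps a tab title to a plot object.

```python
def sidebar_markdown(locations):
    """Build the Markdown text shown in the dashboard sidebar."""
    lines = ["## Weather stations", ""]
    lines += [f"- {name}" for name in locations]
    return "\n".join(lines)


def build_dashboard(plots, locations):
    """Arrange named plots as tabs inside a Panel template with a sidebar."""
    import panel as pn

    pn.extension()
    # Each (title, plot) pair becomes one tab.
    tabs = pn.Tabs(*plots.items())
    return pn.template.FastListTemplate(
        title="Real-time Weather Dashboard",
        sidebar=[pn.pane.Markdown(sidebar_markdown(locations))],
        main=[pn.Row(tabs)],
    )
```

Calling `.show()` on the returned template opens the dashboard in a browser, while `.servable()` marks it for use with `panel serve`.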

The dashboard we designed in this sample application is shown below.

Figure 2: Real-time visualization dashboard designed with HvPlot


Figure 2 illustrates the analytics dashboard designed in this application to provide a detailed real-time visualization of weather data from multiple locations. The dashboard offers several interactive functions to monitor, analyze, and control the data. We designed it with the following components.
  1. Tabs: Tabs provide us with the capability to switch between the views of different data visualizations. 
    Figure 3: Tabs to switch between different views

  2. Play/Pause Control: As the weather data are visualized in real-time charts (data streaming charts), the charts automatically update themselves every n seconds. The play/pause control lets us pause the real-time charts and resume them as necessary.
    Figure 4: Play/Pause Controls


  3. Weather Data: Real-time visualization of the weather data with color-coded visualizations of individual weather data for each location.
    Figure 5: Visualization of temperature over time by location

  4. Location Names: This component displays the name of the location to which each color-coded visualization of weather data belongs. Moreover, it allows us to visualize data for the selected location by clicking on the location name while fading the data for the other locations.
    Figure 6: Visualizing data for one location (Blue) while other location data are faded.

  5. Controls: The toolbar controls of each chart. Actions include box zoom, wheel zoom, save, and reset.
    Figure 7: Box zoom function to zoom in 

  6. Data point: We can hover the mouse cursor near a specific point in the chart to see the value of that data point.
    Figure 8: Values of specific data points
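
The play/pause control in item 2 could be implemented in several ways, and the post does not show which one was used. One possibility, sketched below with a hypothetical `PausableEmitter` class, is to place a small gate between the MQTT callback and the stream. While paused it holds incoming readings, and on play it flushes them in arrival order before resuming live forwarding.

```python
class PausableEmitter:
    """Forward readings to `emit` while playing; hold them in order while paused."""

    def __init__(self, emit):
        self._emit = emit  # e.g. the emit method of a Streamz Stream
        self._held = []
        self.paused = False

    def send(self, reading):
        if self.paused:
            self._held.append(reading)
        else:
            self._emit(reading)

    def pause(self):
        self.paused = True

    def play(self):
        # Flush everything that arrived while paused, then resume live forwarding.
        self.paused = False
        held, self._held = self._held, []
        for reading in held:
            self._emit(reading)
```

The `pause` and `play` methods could be wired to a toggle widget in the dashboard. The class is not thread-safe as written, so a lock would be needed if readings arrive from a background thread.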


-- Yasasi Abeysinghe (@Yasasi_Abey)
