2023-08-15: Real-time Interactive Dashboard for Data in a Distributed System

A distributed system is a collection of interconnected computers, or nodes, that work together to achieve a common goal. Real-time visualization of data collected from multiple nodes in a distributed system allows us to identify trends and patterns as they emerge. Moreover, interactive visualizations let us explore, analyze, and share insights into data through interactive controls, filtering options, and drill-down capabilities. In a distributed system, a real-time interactive dashboard supports timely decision-making, exploratory data analysis, communication and collaboration, monitoring and alerting, and predictive analytics. In this blog post, I present an approach for developing an interactive visualization dashboard for real-time data in a distributed system using Python.

In this approach, we use MQTT as the messaging protocol between the computers and the dashboard (with an MQTT broker relaying the messages), the HoloViz ecosystem to implement the interactive dashboard, and Streamz to feed data received from the MQTT broker into the dashboard in real time.

What is MQTT?

MQTT, also known as Message Queuing Telemetry Transport, is a lightweight publish/subscribe messaging protocol designed for connections with remote locations. It targets devices with resource constraints and networks with limited bandwidth, high latency, or unreliable connections, such as those found in the Internet of Things (IoT).
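To make the publish/subscribe idea concrete, here is a toy in-process sketch. It is not MQTT and not how a real broker such as Mosquitto is implemented: the ToyBroker class is a hypothetical name, it only matches exact topic strings, and it has no networking, wildcards, or quality-of-service levels. It only illustrates that publishers and subscribers share a topic name and never reference each other directly.

```python
from collections import defaultdict


class ToyBroker:
    """Hypothetical in-memory stand-in for a broker (illustration only)."""

    def __init__(self):
        # Maps a topic name to the callbacks subscribed to it
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, callback):
        self._subscribers[topic].append(callback)

    def publish(self, topic, payload):
        # Deliver the payload to every subscriber of this exact topic
        for callback in self._subscribers.get(topic, []):
            callback(topic, payload)


broker = ToyBroker()
broker.subscribe("weather", lambda topic, payload: print(topic, payload))
broker.publish("weather", '{"Temp": 75.2}')  # delivered to the subscriber
broker.publish("traffic", "ignored")         # no subscriber, so it is dropped
```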

What is HoloViz?

Python offers several libraries for data visualization, such as Matplotlib and Plotly. HoloViz is a collection of open-source Python libraries for visualization, and HvPlot is one of the libraries it maintains. HvPlot can generate plots with Bokeh, Matplotlib, or Plotly from Pandas DataFrames and many other data structures of the PyData ecosystem. It is a high-level API that lets us easily lay out plots, add widgets for more interactive control of plots, and interactively explore data in Python.

What is Streamz?

The Streamz Python library helps us build pipelines to manage continuous streams of data.
 

Integrate MQTT broker and HvPlot to visualize real-time data streams

Figure 1: The architecture of the interactive dashboard integration with the MQTT broker

As illustrated in Figure 1, the architecture of the real-time dashboard integration with the MQTT broker contains the following components.

MQTT Broker

We require a message broker that implements the MQTT protocol. Here, we use an open-source message broker, Eclipse Mosquitto. Mosquitto is lightweight and can be run on devices from low-power single-board computers to full servers.

We can download and install the Eclipse Mosquitto package on various operating systems.

To start the broker:
  • On Windows,
    • Open Command Prompt and go to the location where Mosquitto is installed.
      • e.g., cd c:\Program Files\mosquitto
    • Run the following command to start the broker.
      • net start mosquitto
    • To list the active and listening ports, run the following command.
      • netstat -a
    • Find the IP address and port associated with the MQTT broker.
      • e.g., "127.0.0.1" and 1883
  • On Linux,
    • Run the following command in Terminal to start the MQTT broker.
      • mosquitto
  • On macOS,
    • Run the following command in Terminal to start the MQTT broker.
      • /usr/local/sbin/mosquitto
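As a quick cross-platform alternative to reading the netstat output, the following sketch checks whether something is accepting TCP connections on the broker's address. The function name broker_is_reachable is made up for this example, and the defaults assume the "127.0.0.1" and 1883 values mentioned above. Note that it only confirms that a TCP listener exists; it does not verify that the listener actually speaks MQTT.

```python
import socket


def broker_is_reachable(host="127.0.0.1", port=1883, timeout=2.0):
    """Return True if a TCP connection to (host, port) can be opened."""
    try:
        # create_connection raises OSError on refusal, timeout, or DNS failure
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Calling broker_is_reachable() before starting the publishers or the dashboard gives an early, readable failure instead of a connection error deep inside the MQTT client.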

Publisher Devices

Let's consider a sample application that visualizes real-time weather data collected from multiple devices. Each device in this distributed system is considered a publisher. We implement the publisher using the Paho MQTT Python client library. As mentioned earlier, the MQTT broker is associated with an IP address and a port. The following code snippet can be used in the publisher to connect to the MQTT broker and publish the weather data to a topic.

import paho.mqtt.client as mqtt


class PublisherClient:
    MQTT_HOST = "localhost"
    MQTT_PORT = 1883
    topic = "weather"

    def __init__(self):
        # paho-mqtt 1.x constructor; with paho-mqtt 2.x, pass
        # mqtt.CallbackAPIVersion.VERSION1 as the first argument.
        self.client = mqtt.Client("MQTT")
        self.connected = False

    def on_publish(self, client, userdata, mid):
        # Callback invoked once a message has been sent
        print("data published")

    def on_connect(self, client, userdata, flags, rc):
        if rc == 0:
            print("Connected to broker")
            self.connected = True  # Signal connection
        else:
            print("Connection failed")

    def connect(self):
        # Register the callbacks, connect to the broker, and start the
        # background network loop so that the callbacks are invoked
        self.client.on_connect = self.on_connect
        self.client.on_publish = self.on_publish
        self.client.connect(self.MQTT_HOST, port=self.MQTT_PORT)
        self.client.loop_start()

    def publish_data(self, payload):
        self.client.publish(self.topic, payload)
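The snippet above does not show what the payload passed to publish_data looks like. One plausible choice, sketched here as an assumption, is a JSON object whose keys match the dashboard's column names (Temp, Feels_Like, Wind_Speed, UV, Humidity, Location, Timestamp). The helper name make_payload, the ISO 8601 timestamp format, and the sample values are all invented for illustration.

```python
import json
from datetime import datetime, timezone


def make_payload(location, temp, feels_like, wind_speed, uv, humidity,
                 timestamp=None):
    """Serialize one weather reading as a JSON string (assumed format)."""
    if timestamp is None:
        timestamp = datetime.now(timezone.utc)
    record = {
        "Temp": temp,
        "Feels_Like": feels_like,
        "Wind_Speed": wind_speed,
        "UV": uv,
        "Humidity": humidity,
        "Location": location,
        # ISO 8601 strings survive JSON and are easy to parse on the other side
        "Timestamp": timestamp.isoformat(),
    }
    return json.dumps(record)


# Made-up sample reading; a real device would read these from its sensors
payload = make_payload("Norfolk", 75.2, 77.0, 8.5, 6, 64)
```

The resulting string can be handed directly to publish_data on the publisher.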

Dashboard Client

Similar to the publishers, we implement the dashboard client using the Paho MQTT Python client library. Using the following code snippet, our real-time weather data visualization application can connect to the MQTT broker, subscribe to a topic, receive weather data from publisher devices, and emit the data into the data stream.

import json

import pandas as pd
import paho.mqtt.client as mqtt
import hvplot.streamz
from streamz import Stream

# Define the MQTT broker details
MQTT_HOST = "localhost"
MQTT_PORT = 1883
topic = "weather"
Connected = False
playing = True
data_stream = Stream()


# Define the on_connect function
def on_connect(client, userdata, flags, rc):
    print("Connected with result code " + str(rc))
    # Subscribe to the topic once the connection is established
    client.subscribe(topic)


# Define the callback function to handle incoming messages
def on_message(client, userdata, msg):
    # Assumes each payload is a JSON object holding one weather record
    data = json.loads(msg.payload.decode())
    df = pd.DataFrame([data])
    # Emit the data into the data stream
    data_stream.emit(df)


# Create an MQTT client (paho-mqtt 1.x constructor; with paho-mqtt 2.x,
# pass mqtt.CallbackAPIVersion.VERSION1 as the first argument)
client = mqtt.Client()
# Register the callbacks before connecting to the broker
client.on_connect = on_connect
client.on_message = on_message
client.connect(MQTT_HOST, MQTT_PORT, 60)
# Run the MQTT client loop
client.loop_forever()
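Messages arriving from the network may be malformed, so it can help to validate a payload before emitting it into the stream. The sketch below assumes each payload is a UTF-8 encoded JSON object whose keys match the dashboard's column names; the helper name parse_weather_payload and the choice to return None for unusable messages are illustrative, not part of the original application.

```python
import json

# Column names used by the dashboard's example dataframe
REQUIRED_FIELDS = ("Temp", "Feels_Like", "Wind_Speed", "UV",
                   "Humidity", "Location", "Timestamp")


def parse_weather_payload(payload: bytes):
    """Return the decoded record as a dict, or None if it is unusable."""
    try:
        record = json.loads(payload.decode("utf-8"))
    except (UnicodeDecodeError, json.JSONDecodeError):
        return None  # not valid UTF-8 or not valid JSON
    if not isinstance(record, dict):
        return None  # valid JSON, but not an object
    if any(field not in record for field in REQUIRED_FIELDS):
        return None  # at least one expected column is missing
    return record
```

Inside an on_message callback, the result could be checked for None before building the dataframe, so that a single bad message does not raise an exception in the MQTT loop.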

Data Stream

Python Streamz allows us to build data pipelines to manage continuous streams of data. Streamz can work with Pandas dataframes to provide sensible streaming operations on continuous tabular data. HvPlot supports Streamz dataframe (streaming dataframe) objects and automatically generates streaming plots in interactive dashboards deployed as a Bokeh Server app.

Interactive Dashboard

In this visualization dashboard, we plot different weather data such as temperature, wind speed, and humidity over time by location in real time. Importing hvplot.pandas adds interactive plotting to Pandas dataframes. Panel provides many dashboarding templates, which allow us to easily lay out visualizations together. Using the following code snippet, we can design the layout of our visualization dashboard.

import pandas as pd
import hvplot.pandas
import panel as pn
import hvplot.streamz
from streamz import Stream
from streamz.dataframe import DataFrame as sDataFrame
from datetime import date, datetime

data_stream = Stream()


# Run the server and listen for incoming messages
def start_server():
    pn.extension('vega')
    # Empty example dataframe that tells Streamz which columns to expect
    weather_example = pd.DataFrame(
        {'Temp': [], 'Feels_Like': [], 'Wind_Speed': [], 'UV': [], 'Humidity': [], 'Location': [], 'Timestamp': []},
        columns=['Temp', 'Feels_Like', 'Wind_Speed', 'UV', 'Humidity', 'Location', 'Timestamp'],
        index=[])
    # Define streaming dataframe
    sdf = sDataFrame(data_stream, example=weather_example)
    # Temperature over time, one line per location
    line_plot_temp = sdf[['Temp', 'Location', 'Timestamp']].hvplot(
        x='Timestamp', y='Temp', xlabel='Timestamp', ylabel='Temperature (F)', by='Location',
        width=500, height=255)
    today = str(datetime.now().strftime('%A')) + ", " + str(date.today())
    # Layout using Template
    template = pn.template.FastListTemplate(
        title='Real-time Interactive Dashboard for Data in a Distributed System',
        sidebar=[pn.pane.Markdown("# Weather Dashboard"),
                 pn.pane.Markdown("### Visualization of real-time weather data collected from multiple locations."),
                 pn.pane.PNG('image.png', sizing_mode='scale_both'),  # path to a local image
                 pn.pane.Markdown('<br/><br/><br/>'),
                 pn.pane.Markdown('### Date:'),
                 today],
        main=[pn.Row(
            pn.Column(line_plot_temp)
        )],
        accent_base_color="#88d8b0",
        header_background="#88d8b0",
    )
    template.show()


start_server()

Make sure the following are imported in order to run this code correctly: pandas, hvplot.pandas, hvplot.streamz, panel, streamz.Stream, and streamz.dataframe.DataFrame.

Similar to how we design websites, here we can define the title, a sidebar for text or images in Markdown format, and the main body of the dashboard. We can arrange the contents in rows and columns. After we define the template, we call template.show() to serve the dashboard and open it in a browser.

The dashboard we designed in this sample application is shown below.

Figure 2: Real-time visualization dashboard designed with HvPlot


Figure 2 illustrates the analytics dashboard designed in this application to provide a detailed real-time visualization of weather data from multiple locations. The dashboard offers several interactive features to monitor, analyze, and control the data. We designed it with the following components.
  1. Tabs: Tabs provide us with the capability to switch between the views of different data visualizations. 
    Figure 3: Tabs to switch between different views

  2. Play/Pause Control: As the weather data are visualized in real-time charts (data streaming charts), the charts automatically update themselves after every n seconds. Hence, this play/pause control provides us with the functionality to pause the real-time charts and replay as necessary.
    Figure 4: Play/Pause Controls


  3. Weather Data: Real-time visualization of the weather data with color-coded visualizations of individual weather data for each location.
    Figure 5: Visualization of temperature over time by location

  4. Location Names: This component displays the name of the location to which each color-coded visualization of weather data belongs. Moreover, it allows us to visualize data for the selected location by clicking on the location name while fading the data for the other locations.
    Figure 6: Visualizing data for one location (Blue) while other location data are faded.

  5. Controls: The toolbar that controls each chart. Actions include box zoom, wheel zoom, save, and reset.
    Figure 7: Box zoom function to zoom in 

  6. Data point: We can place the mouse cursor near a specific point in the chart and see the value of the specific data point.
    Figure 8: Values of specific data points
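The play/pause control in the list above is not shown in the code snippets, so the following is only one possible way to implement its logic, not necessarily how this dashboard does it. The PausableBuffer class is a hypothetical helper that forwards records to a sink while playing and holds them back while paused, releasing them in order on resume. In the dashboard, the sink could be data_stream.emit, and pause() and play() could be wired to a button.

```python
class PausableBuffer:
    """Forward records to a sink, holding them back while paused."""

    def __init__(self, sink):
        self._sink = sink      # callable that receives each record
        self._pending = []     # records received while paused
        self.playing = True

    def push(self, record):
        if self.playing:
            self._sink(record)
        else:
            self._pending.append(record)

    def pause(self):
        self.playing = False

    def play(self):
        # Resume and flush everything that arrived while paused, in order
        self.playing = True
        pending, self._pending = self._pending, []
        for record in pending:
            self._sink(record)


shown = []
buffer = PausableBuffer(shown.append)
buffer.push("reading 1")   # forwarded immediately
buffer.pause()
buffer.push("reading 2")   # held back while paused
buffer.play()              # "reading 2" is released now
```

Buffering keeps the data received during a pause; a simpler variant could drop those records instead if only live data matters.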


-- Yasasi Abeysinghe (@Yasasi_Abey)
