About a week after I configured the Event Scheduler in MariaDB as described in the previous article, I have data like the following in the database.
SELECT port_id, `ifOutOctets`, `ifInOctets`, captured_at
FROM ports_hist
WHERE captured_at < '2024-08-10' AND captured_at > '2024-08-02'
AND port_id IN (3,4)
ORDER BY port_id, captured_at;
+---------+--------------+--------------+---------------------+
| port_id | ifOutOctets  | ifInOctets   | captured_at         |
+---------+--------------+--------------+---------------------+
|       3 |  35039656033 | 132556547082 | 2024-08-02 00:31:00 |
|       3 |  35558892925 | 134706292659 | 2024-08-03 00:31:00 |
|       3 |  39006864177 | 141900456510 | 2024-08-04 00:31:00 |
|       3 |  39766996221 | 155890098068 | 2024-08-05 00:31:00 |
|       3 |  40262419929 | 157388859352 | 2024-08-06 00:31:00 |
|       3 |  40613651178 | 158126070440 | 2024-08-07 00:31:00 |
|       3 |  41111053655 | 159948605991 | 2024-08-08 00:31:00 |
|       3 |  41447751875 | 160794683442 | 2024-08-09 00:31:00 |
|       4 | 839009919291 |  25483043737 | 2024-08-02 00:31:00 |
|       4 | 848572779644 |  25702077812 | 2024-08-03 00:31:00 |
...
I have multiple rows for each port_id. The next task is calculating the deltas of these numbers, which apparently requires pivoting. The example above should be transformed as below.
ifOutOctets
+---------+--------------+--------------+-------------+-----
| port_id | 2024-08-02   | 2024-08-03   | 2024-08-04  | ...
+---------+--------------+--------------+-------------+-----
|       3 |  35039656033 |  35558892925 | 39006864177 | ...
|       4 | 839009919291 | 848572779644 | ...         |
...
Excel may be the most popular pivoting tool, but I used Python pandas to make the process easy to rerun regularly.
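As a minimal sketch of the idea (toy rows taken from the sample table above, not the actual script), pandas' DataFrame.pivot does exactly this reshaping:

import pandas as pd

# Toy long-format data in the shape of ports_hist
long_df = pd.DataFrame({
    'port_id':     [3, 3, 4, 4],
    'captured_at': ['2024-08-02', '2024-08-03', '2024-08-02', '2024-08-03'],
    'ifOutOctets': [35039656033, 35558892925, 839009919291, 848572779644],
})

# One row per port, one column per capture date
wide_df = long_df.pivot(index='port_id', columns='captured_at', values='ifOutOctets')
print(wide_df)
# captured_at    2024-08-02    2024-08-03
# port_id
# 3             35039656033   35558892925
# 4            839009919291  848572779644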
The code is below.
###
### LibreNMS daily port traffic calculation
### Copyright 2024 by Tak Yanagida
### Licensed under MIT license
###
### Write configurations (DB access credentials) in .env
### Execute this script by python3 portsstats.py
### This code is mostly intended for use as your own development base.
import numpy as np
import pandas as pd
from sqlalchemy import create_engine
import os
from dotenv import load_dotenv

load_dotenv()
db_url = 'mysql+pymysql://{0}:{1}@{2}/{3}'.format(
    os.environ['MARIAUSER'], os.environ['MARIAPASS'],
    os.environ['MARIAHOST'], os.environ['MARIADB'])
engine = create_engine(db_url, paramstyle='named', pool_recycle=3600, pool_pre_ping=True)
conn = engine.connect()

# Find the eight most recent capture timestamps (one per day).
sql = '''
SELECT captured_at
FROM ports_hist
GROUP BY captured_at
ORDER BY captured_at DESC
LIMIT 8;
'''
recent_capture_timestamp_df = pd.read_sql_query(sql, conn)
from_date_timestamp = recent_capture_timestamp_df['captured_at'].min()

# Fetch the counters captured during those eight days.
sql = '''
SELECT port_id, ifInErrors, ifOutErrors, ifInOctets, ifOutOctets, captured_at
FROM ports_hist
WHERE captured_at >= %(from_date)s
ORDER BY port_id, captured_at;
'''
port_traffic_counter_df = pd.read_sql_query(sql, conn, params={'from_date': from_date_timestamp})
port_traffic_df = port_traffic_counter_df.pivot(columns='captured_at', index='port_id')
port_traffic_df.columns.rename({None: 'counter_name'}, inplace=True)
###
### Pivoting
###
grouped = port_traffic_df.T.groupby(level=0)  # use .T, since groupby(axis='columns') is deprecated
# For each counter, subtract the oldest / 4th / 7th row from the newest
# to get 7-day, 3-day, and 1-day deltas per port.
df = grouped.apply(lambda x: pd.DataFrame({'7-days': x.iloc[7] - x.iloc[0],
                                           '3-days': x.iloc[7] - x.iloc[4],
                                           '1-day':  x.iloc[7] - x.iloc[6]}))
df.columns.rename('duration', inplace=True)
df2 = df.unstack(level=0).swaplevel(axis='columns')
counter_names_order = ['ifInErrors', 'ifOutErrors', 'ifInOctets', 'ifOutOctets']
def sorter(v):
    return counter_names_order.index(v)
weekly_traffic_df = df2.sort_index(axis='columns', level=0, sort_remaining=False,
                                   key=lambda s: list(map(sorter, s)))
###
### Add port information
###
sql = '''
SELECT a.port_id, b.display, a.ifName, a.ifAlias, a.ifSpeed/1000/1000 AS ifSpeed_Mbps
FROM ports a
LEFT JOIN devices b ON a.device_id = b.device_id;
'''
ports_df = pd.read_sql_query(sql, conn, index_col='port_id')

# Flatten the (counter_name, duration) column MultiIndex so the port
# information can be joined as ordinary columns.
df_flat = weekly_traffic_df.copy()
df_flat.columns = ['_'.join(col).strip() for col in weekly_traffic_df.columns.values]
final_result_df = df_flat.join(ports_df)
final_result_df.to_excel('out.xlsx', freeze_panes=(1, 1))
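For reference, the script expects a .env file alongside it with the four variables it reads. The variable names below come from the script; the values are placeholders to replace with your own credentials:

# .env -- placeholder values, substitute your own
MARIAUSER=librenms
MARIAPASS=secret
MARIAHOST=localhost
MARIADB=librenms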
Code explanation
The basic idea is to pivot the whole table as described above. Then, subtract yesterday's value from today's value. Since all the values are counters, this subtraction gives the daily volume. I calculated 1-day, 3-day, and 7-day deltas.
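As a concrete check, here is the 1-day delta for port 3's ifOutOctets, using the two rows from the sample table above:

# 1-day delta for port 3, ifOutOctets, from the sample rows above
octets_aug02 = 35039656033   # counter at 2024-08-02 00:31:00
octets_aug03 = 35558892925   # counter at 2024-08-03 00:31:00
delta = octets_aug03 - octets_aug02
print(delta)  # 519236892 octets (bytes) transferred during that day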
Repeat the same for the other fields: ifInOctets, ifOutErrors, and ifInErrors. Then, add some fields, such as ifAlias from ports, for readability.
The generated table is saved as an Excel file (which can be opened with LibreOffice, too) for browsing, like below. The freeze_panes=(1, 1) argument keeps the header row and the port_id column visible while scrolling.