BTS Module

Core backend logic for lightning data processing in the HLMA application.

This module provides functions for reading, processing, and analyzing lightning datasets from LYLOUT and ENTLN sources. It includes algorithms that group individual lightning events into flashes, as well as utilities for preprocessing and structuring the data for visualization and analysis in the HLMA GUI. A typical end-to-end workflow is sketched after the Notes below.

Functions

zipped_lylout_reader(file, skiprows=55)

Reads a compressed LYLOUT .dat.gz file. Used by open_lylout.

lylout_reader(file, skiprows=55)

Reads a plain-text LYLOUT .dat file. Used by open_lylout.

open_lylout(files)

Reads multiple LYLOUT files and combines them into a single DataFrame.

entln_reader(file, min_date)

Reads a single ENTLN CSV file. Used by open_entln.

open_entln(files, min_date)

Reads multiple ENTLN CSV files and combines them into a single DataFrame.

dot_to_dot(env)

Applies the dot-to-dot flash detection algorithm to lightning data.

mc_caul(env)

Applies the McCaul flash detection algorithm to lightning data.

Notes

  • Reader functions are used by open_lylout and open_entln to process multiple files.

  • The remaining functions are primarily intended for internal use by the HLMA application.

  • The flash detection algorithms are experimental; their results should be interpreted with caution.
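
Example – a minimal sketch of the typical workflow. The file paths below are placeholders, and the State class is a hypothetical stand-in for the HLMA application's own state object; only the env.all and env.stations attributes described on this page are assumed.

    import glob
    import bts

    # Gather LYLOUT files for a case (paths are placeholders).
    files = sorted(glob.glob("data/LYLOUT_*.dat.gz"))

    # open_lylout returns (events DataFrame, station coordinate array).
    events, stations = bts.open_lylout(files)

    # Hypothetical stand-in for the HLMA State object; the real class may differ.
    class State:
        def __init__(self, all_df, station_coords):
            self.all = all_df               # lightning events (env.all)
            self.stations = station_coords  # LMA station coordinates (env.stations)

    env = State(events, stations)

    # Group events into flashes; updates env.all["flash_id"] in place.
    bts.dot_to_dot(env)
    print(env.all["flash_id"].nunique(), "flashes detected")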

bts.dot_to_dot(env)[source]

Apply the dot-to-dot flash detection algorithm to lightning data.

Groups lightning events in space and time to identify flashes, updates the flash_id column in the dataset, and projects data to ECEF coordinates. Computation is parallelized for speed.

Parameters:

env (State) – State object containing lightning data (env.all) and station coordinates (env.stations).

Return type:

None
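
Continuing the workflow sketch above, a quick way to inspect the grouping (assuming env.all carries the flash_id column described here):

    bts.dot_to_dot(env)  # modifies env.all in place; returns None

    # Events that share a flash_id were grouped into the same flash.
    flash_sizes = env.all.groupby("flash_id").size()
    print(flash_sizes.sort_values(ascending=False).head())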

bts.entln_reader(file, min_date)[source]

Read a single ENTLN CSV file and format it to match LYLOUT data. Used by open_entln.

Converts timestamps, renames columns to standard names, computes UTC seconds, and returns a DataFrame with essential columns.

Parameters:
  • file (str) – Path to the ENTLN CSV file.

  • min_date (pd.Timestamp) – Reference datetime for computing UTC seconds.

Returns:

DataFrame with columns: [“datetime”, “lat”, “lon”, “alt”, “peakcurrent”, “numbersensors”, “utc_sec”, “type”]. Returns None if the file cannot be read or processed.

Return type:

pd.DataFrame or None
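
A sketch of reading a single file; the path is a placeholder and min_date is an assumed reference timestamp:

    import pandas as pd
    import bts

    min_date = pd.Timestamp("2023-06-15")  # assumed reference for UTC seconds
    df = bts.entln_reader("data/entln_20230615.csv", min_date)

    if df is None:
        print("file could not be read or processed")
    else:
        print(df[["datetime", "lat", "lon", "peakcurrent", "type"]].head())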

bts.lylout_reader(file, skiprows=55)[source]

Read a plain-text LYLOUT .dat file into a DataFrame. Used by open_lylout.

Parses the file, calculates the number of stations contributing from the mask, extracts the date from the filename, computes absolute datetimes, and initializes a flash_id column.

Parameters:
  • file (str) – Path to the LYLOUT .dat file.

  • skiprows (int, optional) – Number of initial rows to skip (default is 55).

Returns:

DataFrame with columns: [“datetime”, “lat”, “lon”, “alt”, “chi”, “pdb”, “number_stations”, “utc_sec”, “mask”, “flash_id”]. Returns None if the file could not be read.

Return type:

pd.DataFrame or None
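
A sketch of reading one plain-text file; the filename is a placeholder and should follow the usual LYLOUT naming convention, since the date is parsed from it:

    import bts

    df = bts.lylout_reader("data/LYLOUT_230615_200000_0600.dat", skiprows=55)

    if df is None:
        print("file could not be read")
    else:
        # flash_id is initialized here and assigned later by the flash algorithms.
        print(df[["datetime", "lat", "lon", "alt", "number_stations", "flash_id"]].head())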

bts.mc_caul(env)[source]

Apply the McCaul flash detection algorithm to lightning data.

Groups lightning events using distance, time, and azimuth thresholds, updates the flash_id column in the dataset, and projects data to ECEF coordinates. Computation is parallelized for speed.

Parameters:

env (State) – State object containing lightning data (env.all) and station coordinates (env.stations).

Return type:

None
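
The calling convention mirrors dot_to_dot; continuing the workflow sketch above, one algorithm can simply be swapped for the other:

    bts.mc_caul(env)  # same State object as for dot_to_dot; returns None
    print(env.all["flash_id"].nunique(), "flashes found by the McCaul algorithm")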

bts.open_entln(files, min_date)[source]

Read multiple ENTLN CSV files and combine them into a single DataFrame.

Uses entln_reader to process each file, then concatenates the results and computes a ‘seconds’ column relative to the first midnight of the dataset.

Parameters:
  • files (list of str) – List of paths to ENTLN CSV files.

  • min_date (pd.Timestamp) – Minimum datetime reference to filter the data.

Returns:

Combined DataFrame containing all ENTLN data with a computed ‘seconds’ column.

Return type:

pd.DataFrame
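
A sketch combining several files; the paths and the reference date are placeholders:

    import glob
    import pandas as pd
    import bts

    entln_files = sorted(glob.glob("data/entln_*.csv"))
    min_date = pd.Timestamp("2023-06-15")

    entln = bts.open_entln(entln_files, min_date)

    # 'seconds' is measured from the first midnight of the dataset.
    print(entln[["datetime", "seconds", "peakcurrent"]].head())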

bts.open_lylout(files)[source]

Read multiple LYLOUT files (compressed or plain) and combine them into a single DataFrame.

Determines the number of header rows by inspecting the first file, extracts LMA station coordinates, reads all files in parallel, concatenates the results, and computes a ‘seconds’ column relative to the first midnight.

Parameters:

files (list of str) – List of paths to LYLOUT files (.dat or .dat.gz).

Returns:

Tuple containing:
  • pd.DataFrame – All LYLOUT data concatenated.
  • np.ndarray – LMA station coordinates as float32, shape (n_stations, 2).

Return type:

tuple
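
A sketch showing how the returned tuple is unpacked; the paths are placeholders:

    import glob
    import bts

    files = sorted(glob.glob("data/LYLOUT_*.dat*"))  # .dat and .dat.gz both accepted

    events, stations = bts.open_lylout(files)

    print(len(events), "events from", stations.shape[0], "stations")
    print(events[["datetime", "seconds", "lat", "lon", "alt"]].head())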

bts.zipped_lylout_reader(file, skiprows=55)[source]

Read a compressed LYLOUT .dat.gz file into a DataFrame. Used by open_lylout.

Parses the file, calculates the number of stations contributing from the mask, extracts the date from the filename, computes absolute datetimes, and initializes a flash_id column.

Parameters:
  • file (str) – Path to the .dat.gz LYLOUT file.

  • skiprows (int, optional) – Number of initial rows to skip (default is 55).

Returns:

DataFrame with columns: [“datetime”, “lat”, “lon”, “alt”, “chi”, “pdb”, “number_stations”, “utc_sec”, “mask”, “flash_id”]. Returns None if the file could not be read.

Return type:

pd.DataFrame or None
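
Usage matches lylout_reader except that the input is gzip-compressed; the filename is a placeholder:

    import bts

    df = bts.zipped_lylout_reader("data/LYLOUT_230615_200000_0600.dat.gz", skiprows=55)

    if df is not None:
        # Same schema as the output of lylout_reader.
        print(list(df.columns))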