API (Python)

pyDANT.Preprocess

pyDANT.Preprocess.preprocess(user_settings)

Preprocess the data and save the features. Compute the ISI, autocorrelogram, and location of each unit and save the features to the output folder.

Arguments:

user_settings (dict): User settings

Inputs:

waveform_all.npy: The waveform of each unit
session_index.npy: The session index of each unit
channel_locations.npy: The location of each channel
channel_shanks.npy (optional): The shank ID of each channel. If missing, all channels are treated as a single shank.
spike_times/: A folder that contains the spike times of each unit
peth.npy (optional): The peri-event time histogram of each unit

Outputs:

locations.npy: The location of each unit in the 3D space
amplitude.npy: The amplitude of each unit
peak_channels.npy: The peak channel of each unit
unit_shanks.npy: The shank ID of each unit, defined as the shank of its peak channel. If channel_shanks.npy is missing, this is a vector of ones for single-shank data.
auto_corr.npy: The autocorrelogram of each unit
isi.npy: The ISI of each unit
peth.npy: The peri-event time histogram of each unit
waveforms_centered.npy: The centered waveforms of each unit
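
The relationship between peak_channels.npy and unit_shanks.npy described above can be illustrated with a minimal sketch. It is not the pyDANT implementation: the function name unit_shanks_from_peak is hypothetical, and taking the peak channel to be the channel with the largest peak-to-trough amplitude is an assumption.

```python
import numpy as np

def unit_shanks_from_peak(waveform_all, channel_shanks=None):
    """Assign each unit the shank of its peak channel (illustrative sketch)."""
    # Peak-to-trough amplitude per unit and channel: (n_unit, n_channel)
    ptt = waveform_all.max(axis=2) - waveform_all.min(axis=2)
    # Assumption: the peak channel is the one with the largest amplitude
    peak_channels = ptt.argmax(axis=1)
    if channel_shanks is None:
        # No shank file: treat the probe as a single shank labelled 1
        return peak_channels, np.ones(waveform_all.shape[0], dtype=int)
    return peak_channels, np.asarray(channel_shanks)[peak_channels]

# Toy data: 3 units, 4 channels, 30 samples, one large deflection per unit
rng = np.random.default_rng(0)
waveform_all = rng.normal(scale=0.1, size=(3, 4, 30))
waveform_all[0, 1, 10] = -5.0
waveform_all[1, 3, 10] = -5.0
waveform_all[2, 0, 10] = -5.0
channel_shanks = np.array([1, 1, 2, 2])

peak_channels, unit_shanks = unit_shanks_from_peak(waveform_all, channel_shanks)
```

Calling the function without channel_shanks returns a vector of ones, mirroring the single-shank fallback described for unit_shanks.npy.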

pyDANT.Preprocess.spikeInfo2npy(user_settings)

Convert the spikeInfo.mat file from MATLAB to numpy arrays that can be used in pyDANT.

Arguments:

user_settings (dict): User settings

Outputs:

waveform_all.npy: The waveform of each unit
session_index.npy: The session index of each unit
channel_locations.npy: The location of each channel
channel_shanks.npy: The shank ID of each channel, directly from spikeInfo.Kcoords
spike_times/: A folder that contains the spike times of each unit
spike_times/UnitA.npy: The spike times of unit A
peth.npy: The peri-event time histogram of each unit

Notes:

unit_shanks.npy is generated by preprocess(), not by spikeInfo2npy().

pyDANT.ComputeWaveformFeatures

pyDANT.ComputeWaveformFeatures.computeWaveformFeatures(user_settings, waveform_all, motion)

Compute the corrected waveforms based on the motion of the probe. The corrected waveforms on the reference probe are computed using the Kriging interpolation method and saved to the output folder.

Arguments:

user_settings (dict): User settings
waveform_all (numpy.ndarray): The waveforms of all units (n_unit, n_channel, n_sample)
motion (Motion): The motion object containing the linear and constant parameters for correction

Outputs:

waveforms_corrected.npy: The corrected waveforms.

pyDANT.MotionEstimation

pyDANT.MotionEstimation.computeMotion(user_settings)

Compute the motion of the electrode and save the results. The features of each unit are computed and clustered to find the matching units. Motion estimation is then performed to minimize the distance between the matching units.

Arguments:

user_settings (dict): User settings

Outputs:

motion.npy: The motion of the electrode
SimilarityForCorretion.npz (optional): The similarity information used for motion estimation

pyDANT.MotionEstimation.getMotionFeatureSets(user_settings)

Resolve feature sets and iteration cap for motion estimation.

Reads the motionEstimation section of user_settings and determines how many motion-estimation iterations should be attempted. Legacy settings are preserved by default. If repeat_last_feature_set is true, the motion loop can continue beyond the explicitly listed feature sets by reusing the final set until max_iter is reached or stop_early terminates the loop.

Arguments:

user_settings (dict): User settings

Returns:

similarity_names_all (list): Feature sets from motionEstimation.features
n_iter_motion_estimation (int): Maximum number of iterations to attempt
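
The iteration logic described above can be sketched as follows. This is only an illustration under stated assumptions: the helper names resolve_feature_sets and feature_set_for_iteration are hypothetical, the settings keys are assumed to sit directly under user_settings["motionEstimation"], and the handling of stop_early is left out.

```python
def resolve_feature_sets(user_settings):
    """Return the listed feature sets and an iteration cap (illustrative sketch)."""
    me = user_settings["motionEstimation"]
    feature_sets = list(me["features"])
    # Legacy behaviour (assumed): one iteration per listed feature set
    n_iter = len(feature_sets)
    if me.get("repeat_last_feature_set", False):
        # Assumption: max_iter becomes the cap when the last set may be reused
        n_iter = int(me.get("max_iter", n_iter))
    return feature_sets, n_iter

def feature_set_for_iteration(feature_sets, i):
    """Reuse the final feature set once the explicit list is exhausted."""
    return feature_sets[min(i, len(feature_sets) - 1)]

settings = {
    "motionEstimation": {
        "features": [["Waveform"], ["Waveform", "ISI"]],
        "repeat_last_feature_set": True,
        "max_iter": 5,
    }
}
feature_sets, n_iter = resolve_feature_sets(settings)
schedule = [feature_set_for_iteration(feature_sets, i) for i in range(n_iter)]
```

Here schedule holds the feature set used at each of the five iterations; the last four entries all reuse the final listed set.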

pyDANT.MotionEstimation.initializeMotion(user_settings, waveforms_all)

Initialize the motion of the electrode based on waveform shifts.

Arguments:

user_settings (dict): User settings
waveforms_all (np.ndarray): All waveforms

Returns:

Motion: Initialized motion object

pyDANT.MotionEstimation.motionEstimation(user_settings)

Estimate the motion of the electrode and save the results. The features of each unit are computed and clustered to find the matching units. Motion estimation is then performed to minimize the distance between the matching units.

Arguments:

user_settings (dict): User settings

Outputs:

motion.npy: The motion of the electrode
SimilarityForCorretion.npz (optional): The similarity information used for motion estimation

pyDANT.IterativeClustering

pyDANT.IterativeClustering.computeAllSimilarityMatrix(user_settings, waveforms, feature_names)

Compute the similarity matrix of the units based on the similarity metrics.

Arguments:

user_settings (dict): User settings
waveforms (ndarray): The waveforms of the units (n_units, n_channels, n_samples)
feature_names (list): The names of the features to be computed. The options are ‘Waveform’, ‘ISI’, ‘AutoCorr’, and ‘PETH’. If path_to_data/channel_shanks.npy exists, waveform similarity restricts nearest-channel neighborhoods to channels on the same shank. PETH bins marked as NaN are ignored pairwise during PETH similarity calculation.

Outputs:

similarity_matrix_all (ndarray): The similarity matrix of the units (n_units, n_units, n_features)
feature_names_all (list): The names of the features computed.
waveform_similarity_matrix.npy: The waveform similarity matrix of the units (n_units, n_units)
ISI_similarity_matrix.npy: The ISI similarity matrix of the units (n_units, n_units)
AutoCorr_similarity_matrix.npy: The autocorrelogram similarity matrix of the units (n_units, n_units)
PETH_similarity_matrix.npy: The PETH similarity matrix of the units (n_units, n_units)
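
The pairwise handling of NaN bins in the PETH similarity can be illustrated with a small sketch. It assumes the similarity is a Pearson correlation over the bins that are valid in both units; the function name peth_similarity_pairwise_nan is hypothetical and the actual pyDANT metric may differ.

```python
import numpy as np

def peth_similarity_pairwise_nan(peth_a, peth_b):
    """Pearson correlation over bins that are not NaN in either PETH (sketch)."""
    peth_a = np.asarray(peth_a, dtype=float)
    peth_b = np.asarray(peth_b, dtype=float)
    # Keep only bins that are valid for both units of the pair
    valid = ~(np.isnan(peth_a) | np.isnan(peth_b))
    if valid.sum() < 2:
        return np.nan
    a = peth_a[valid] - peth_a[valid].mean()
    b = peth_b[valid] - peth_b[valid].mean()
    denom = np.sqrt((a @ a) * (b @ b))
    return float(a @ b / denom) if denom > 0 else np.nan

# Bins 2 and 3 are dropped because one of the two units has NaN there
r = peth_similarity_pairwise_nan(
    [1.0, 2.0, np.nan, 4.0, 5.0],
    [2.0, 4.0, 100.0, np.nan, 10.0],
)
```

Because the mask is computed per pair, a bin that is NaN for one unit only affects the pairs involving that unit.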

pyDANT.IterativeClustering.computeWaveformSimilarityMatrix(user_settings, waveforms, channel_locations, channel_shanks=None)

Compute waveform similarity using channel neighborhoods.

Arguments:

user_settings (dict): User settings
waveforms (ndarray): The waveforms of the units
channel_locations (ndarray): The location of each channel
channel_shanks (ndarray, optional): The shank ID of each channel. If missing, all channels are treated as a single shank.

Outputs:

waveform_similarity_matrix (ndarray): The waveform similarity matrix of the units

pyDANT.IterativeClustering.finalClustering(user_settings)

Final clustering of the units based on the similarity metrics using HDBSCAN and LDA.

Arguments:

user_settings (dict): User settings

pyDANT.IterativeClustering.getNearbyPairs(max_distance, sessions, locations, motion=None)

Get the pairs of units that are within the max_distance.

Arguments:

max_distance (float): The maximum distance between the units.
sessions (ndarray): The session index of the units.
locations (ndarray): The locations of the units.
motion (ndarray, optional): The motion of the probe.

Outputs:

idx_unit_pairs (ndarray): The pairs of units that are within the max_distance.
session_pairs (ndarray): The session index of the pairs of units.
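
A minimal sketch of a distance-based pair search follows. It assumes a Euclidean distance over all location coordinates, keeps pairs from the same session, and ignores the optional motion argument; the function name nearby_pairs is hypothetical and none of these choices is confirmed by the reference above.

```python
import numpy as np

def nearby_pairs(max_distance, sessions, locations):
    """Return index pairs (i < j) of units closer than max_distance (sketch)."""
    sessions = np.asarray(sessions)
    locations = np.asarray(locations, dtype=float)
    # All unordered pairs of units
    i, j = np.triu_indices(len(sessions), k=1)
    # Assumption: Euclidean distance over every location coordinate
    dist = np.linalg.norm(locations[i] - locations[j], axis=1)
    keep = dist <= max_distance
    idx_unit_pairs = np.column_stack([i[keep], j[keep]])
    session_pairs = np.column_stack([sessions[i[keep]], sessions[j[keep]]])
    return idx_unit_pairs, session_pairs

sessions = np.array([1, 2, 2])
locations = np.array([[0.0, 0.0], [0.0, 10.0], [0.0, 100.0]])
idx_unit_pairs, session_pairs = nearby_pairs(20.0, sessions, locations)
```

With these toy inputs only units 0 and 1 are within 20 units of each other, so a single pair is returned together with its two session indices.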

pyDANT.IterativeClustering.iterativeClustering(user_settings, similarity_names, waveforms, motion=None)

Iterative clustering of the units based on the similarity metrics using HDBSCAN and LDA. The similarity metrics are computed first, and then HDBSCAN and LDA are applied alternately to find the best clustering results. The clustering results are saved to the output folder.

Arguments:

user_settings (dict): User settings
similarity_names (list): The names of the similarity metrics to be computed. The options are ‘Waveform’, ‘ISI’, ‘AutoCorr’, and ‘PETH’.

Outputs:

SimilarityMatrix.npy: The similarity matrix of the units
SimilarityWeights.npy: The weights of the similarity metrics
SimilarityThreshold.npy: The threshold of the similarity metrics from LDA
ClusteringResults.npz: The clustering results of the units
DistanceMatrix.npy: The distance matrix used for HDBSCAN
waveform_similarity_matrix.npy: The waveform similarity matrix of the units
ISI_similarity_matrix.npy: The ISI similarity matrix of the units
AutoCorr_similarity_matrix.npy: The autocorrelogram similarity matrix of the units
PETH_similarity_matrix.npy: The PETH similarity matrix of the units
AllSimilarity.npz (optional): The similarity metrics of all units used for clustering

pyDANT.utils

class pyDANT.utils.Motion(num_sessions=None)

Class to handle motion estimation data. This class allows for saving and loading motion data, as well as retrieving motion values for specific sessions.

Attributes:

LinearScale: scaling factor for linear motion (default: 0.001)
Linear: linear motion parameters for each session (if num_sessions is provided)
Constant: constant motion parameters for each session (if num_sessions is provided)

Methods:

__init__(num_sessions=None): Initializes the Motion object.
save(output_folder): Saves the motion data to a file.
load(output_folder): Loads the motion data from a file.
get_motion(session, depth=None): Retrieves the motion for a specific session, optionally considering depth.

get_motion(session, depth=None)

Get the motion for a specific session.

Arguments:

session (int): The session number.
depth (float, optional): The depth value. If None, only the constant motion is returned.

Returns:

float: The motion value for the specified session and depth.
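
One way to read the Constant, Linear, and LinearScale attributes is as a per-session offset plus a depth-dependent term. The sketch below assumes the formula Constant + LinearScale * Linear * depth and zero-based session indices; both are assumptions, and the class name MotionSketch is hypothetical, so this should not be taken as the behaviour of pyDANT.utils.Motion.

```python
import numpy as np

class MotionSketch:
    """Toy stand-in for a per-session linear motion model (assumed form)."""

    def __init__(self, num_sessions):
        self.LinearScale = 0.001              # default scale quoted in the reference
        self.Linear = np.zeros(num_sessions)
        self.Constant = np.zeros(num_sessions)

    def get_motion(self, session, depth=None):
        if depth is None:
            # Without a depth, only the constant term is returned
            return float(self.Constant[session])
        # Assumption: the linear term scales with depth
        return float(self.Constant[session]
                     + self.LinearScale * self.Linear[session] * depth)

m = MotionSketch(num_sessions=3)
m.Constant[1] = 5.0
m.Linear[1] = 2.0
constant_only = m.get_motion(1)
with_depth = m.get_motion(1, depth=1000.0)
```

In this toy model constant_only is 5 and with_depth adds a further 0.001 * 2 * 1000 to it.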

static load(output_folder)

Load the motion data from a file.

Arguments:

output_folder (str): Path to the folder where the motion data is saved.

Returns:

Motion: An instance of the Motion class with loaded data.

save(output_folder)

Save the motion data to a file.

Arguments:

output_folder (str): Path to the folder where the motion data will be saved.

Returns:

None

pyDANT.utils.computeAutoCorr(spike_times, window, binwidth)

Compute the autocorrelation of spike times.

Refer to the elegant Python implementation from phylib:

https://github.com/cortex-lab/phylib/blob/master/phylib/stats/ccg.py#L34

Arguments:

spike_times: 1D array of spike times
window: time window for autocorrelation (in ms, default: 300 ms)
binwidth: width of the bins for histogram (in ms, default: 1 ms)

Returns:

auto_corr: autocorrelation values
lag: lag values
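
For orientation, an autocorrelogram can be computed as a histogram of the time differences between each spike and its neighbours. The sketch below is a simple, unoptimized version and not the phylib-style implementation referenced above; the function name auto_corr_sketch is hypothetical, and it assumes spike times, window, and binwidth all share the same unit (ms) and that the histogram spans -window to +window.

```python
import numpy as np

def auto_corr_sketch(spike_times, window=300.0, binwidth=1.0):
    """Histogram of pairwise spike-time differences within +/- window (sketch)."""
    spike_times = np.sort(np.asarray(spike_times, dtype=float))
    n_bins = int(round(2 * window / binwidth))
    edges = np.linspace(-window, window, n_bins + 1)
    counts = np.zeros(n_bins, dtype=int)
    for k, t in enumerate(spike_times):
        # Restrict to spikes that can fall inside the window around spike k
        lo = np.searchsorted(spike_times, t - window, side="left")
        hi = np.searchsorted(spike_times, t + window, side="right")
        # Drop spike k itself so the zero-lag self-pair is not counted
        diffs = np.delete(spike_times[lo:hi], k - lo) - t
        counts += np.histogram(diffs, bins=edges)[0]
    lag = edges[:-1] + binwidth / 2  # bin centres
    return counts, lag

auto_corr, lag = auto_corr_sketch([0.0, 12.0, 25.0], window=50.0, binwidth=10.0)
```

Each pair of spikes contributes one positive and one negative lag, so the result is symmetric around zero whenever no difference falls exactly on a bin edge.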

pyDANT.utils.computeKernel2D(xp, yp, sig=20)

Compute the 2D kernel matrix for the given points xp and yp.

Arguments:

xp: 2D array of points (n_samples_xp, 2)
yp: 2D array of points (n_samples_yp, 2)
sig: standard deviation for the Gaussian kernel (default: 20)

Returns:

K: kernel matrix (n_samples_xp, n_samples_yp)
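
A common form for such a kernel is the squared-exponential (Gaussian) function of the pairwise distance. The sketch below assumes K[i, j] = exp(-||xp_i - yp_j||^2 / (2 sig^2)); the exact functional form used by computeKernel2D is not stated above, so treat this as an illustration only. The function name gaussian_kernel_2d is hypothetical.

```python
import numpy as np

def gaussian_kernel_2d(xp, yp, sig=20.0):
    """Gaussian kernel between two sets of 2D points (assumed form)."""
    xp = np.asarray(xp, dtype=float)
    yp = np.asarray(yp, dtype=float)
    # Pairwise squared Euclidean distances, shape (n_samples_xp, n_samples_yp)
    diff = xp[:, None, :] - yp[None, :, :]
    sq_dist = (diff ** 2).sum(axis=2)
    return np.exp(-sq_dist / (2.0 * sig ** 2))

xp = np.array([[0.0, 0.0], [0.0, 20.0], [0.0, 40.0]])
yp = np.array([[0.0, 0.0], [0.0, 20.0]])
K = gaussian_kernel_2d(xp, yp, sig=20.0)
```

The result has one row per point in xp and one column per point in yp, matching the (n_samples_xp, n_samples_yp) shape listed above.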

pyDANT.utils.corrcoef2(x, y)

Compute the Pearson correlation coefficient between two matrices x and y.

Arguments:

x: 2D array of shape (n_samples, n_features_x)
y: 2D array of shape (n_samples, n_features_y)

Returns:

r: Pearson correlation coefficient matrix of shape (n_features_x, n_features_y)
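
The shapes above correspond to correlating every column of x with every column of y. A minimal NumPy sketch of that computation is shown below; the function name corrcoef_columns is hypothetical and the code is not taken from pyDANT.

```python
import numpy as np

def corrcoef_columns(x, y):
    """Pearson correlation between each column of x and each column of y."""
    # Centre each column (this creates new float arrays)
    xc = x - x.mean(axis=0)
    yc = y - y.mean(axis=0)
    # Scale each column to unit length; a constant column would give NaN
    xc /= np.linalg.norm(xc, axis=0)
    yc /= np.linalg.norm(yc, axis=0)
    # Dot products of unit-length centred columns are correlation coefficients
    return xc.T @ yc

rng = np.random.default_rng(1)
x = rng.normal(size=(50, 3))
y = rng.normal(size=(50, 2))
r = corrcoef_columns(x, y)

# Cross-check against the off-diagonal block of np.corrcoef
r_ref = np.corrcoef(x, y, rowvar=False)[:3, 3:]
```

Computing the cross block directly avoids building the full (n_features_x + n_features_y) square matrix that np.corrcoef returns.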

pyDANT.utils.graphEditNumber(matA, matB)

Compute the number of merges in each of two graphs A and B and the number of merges they share.

Arguments:

matA: connectivity matrix of graph A (n_nodes_A, n_nodes_A)
matB: connectivity matrix of graph B (n_nodes_B, n_nodes_B)

Returns:

nSame: number of same merges
nA: number of merges in graph A
nB: number of merges in graph B

pyDANT.utils.spikeLocation(waveforms_mean, channel_locations, n_nearest_channels=20, algorithm='monopolar_triangulation')

Spike location estimation using either center_of_mass or monopolar_triangulation.

monopolar_triangulation: refer to Boussard, Julien, Erdem Varol, Hyun Dong Lee, Nishchal Dethe, and Liam Paninski. “Three-Dimensional Spike Localization and Improved Motion Correction for Neuropixels Recordings.” In Advances in Neural Information Processing Systems, 34:22095–105. Curran Associates, Inc., 2021. https://proceedings.neurips.cc/paper/2021/hash/b950ea26ca12daae142bd74dba4427c8-Abstract.html

See also:

https://spikeinterface.readthedocs.io/en/stable/modules/postprocessing.html#spike-locations
https://github.com/SpikeInterface/spikeinterface/blob/main/src/spikeinterface/postprocessing/localization_tools.py#L334

Arguments:

waveforms_mean: mean waveforms (n_channels, n_samples)
channel_locations: 2D array of channel locations (n_channels, 2)
n_nearest_channels: number of nearest channels to consider for localization, default is 20
algorithm: ‘center_of_mass’ or ‘monopolar_triangulation’, default is ‘monopolar_triangulation’

Returns:

x: x coordinate of the spike location
y: y coordinate of the spike location
z: z coordinate of the spike location
ptt: peak-to-trough value of the spike waveform

pyDANT.utils.waveformEstimation(waveform_mean, location, channel_locations, location_new)

Waveform estimation with Kriging interpolation.

Arguments:

waveform_mean: mean waveform (n_channels, n_samples)
location: original location of the spike (x, y), 1D array of length 2
channel_locations: 2D array of channel locations (n_channels, 2)
location_new: new location of the spike (x, y), 1D array of length 2

Returns:

waveform_out: estimated waveform at the new location (n_samples)
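
As a rough illustration of Kriging-style interpolation, the sketch below treats the waveform as a spatial field sampled at the channel positions and resamples it after a shift, using a Gaussian kernel. The names gaussian_kernel and kriging_shift, the kernel form, the ridge term, and the convention shift = location_new - location are all assumptions; the sketch returns one waveform per channel and is not the pyDANT implementation.

```python
import numpy as np

def gaussian_kernel(a, b, sig=20.0):
    """Assumed Gaussian kernel between two sets of 2D points."""
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / (2.0 * sig ** 2))

def kriging_shift(waveform_mean, channel_locations, shift, sig=20.0, ridge=1e-6):
    """Resample a (n_channels, n_samples) waveform after moving the unit by shift."""
    channel_locations = np.asarray(channel_locations, dtype=float)
    # After the unit moves by `shift`, each channel sees the original field
    # at the position (channel position - shift)
    query = channel_locations - np.asarray(shift, dtype=float)
    k_cc = gaussian_kernel(channel_locations, channel_locations, sig)
    k_qc = gaussian_kernel(query, channel_locations, sig)
    # Small ridge term keeps the linear solve well conditioned
    weights = np.linalg.solve(k_cc + ridge * np.eye(len(k_cc)), waveform_mean)
    return k_qc @ weights

# Single-column probe with 20 um pitch and a smooth amplitude profile
channel_locations = np.column_stack([np.zeros(8), 20.0 * np.arange(8)])
amplitude = np.exp(-((channel_locations[:, 1] - 60.0) ** 2) / (2 * 30.0 ** 2))
template = -np.exp(-np.linspace(-3, 3, 15) ** 2)
waveform_mean = amplitude[:, None] * template[None, :]

unchanged = kriging_shift(waveform_mean, channel_locations, shift=(0.0, 0.0))
moved = kriging_shift(waveform_mean, channel_locations, shift=(0.0, 20.0))
```

A zero shift reproduces the input up to the ridge term, and a shift of exactly one channel pitch moves each channel's waveform to its neighbour, which is a convenient sanity check for this kind of interpolation.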

pyDANT.AutoCuration

pyDANT.AutoCuration.autoCuration(user_settings)

Perform automatic curation of the clustering results based on the user settings. This function loads precomputed features, applies auto-splitting and auto-merging of clusters, and saves the curated results.

Arguments:

user_settings (dict): User settings

Outputs:

ClusterMatrix.npy: The connectivity matrix of clusters after curation.
IdxCluster.npy: The cluster index of each unit after curation. -1 indicates unpaired units.
MatchedPairs.npy: The matched pairs of units after curation.
CurationPairs.npy: The pairs of units that were curated.
CurationTypes.npy: The types of curation applied to each pair.
CurationTypeNames.npy: The names of the curation types.
Output.npz (optional): A dictionary containing other information about the final results.

pyDANT.Runner

pyDANT.Runner.merge_multishank_outputs(user_settings, shank_ids, unit_shanks)

Merge per-shank pyDANT outputs into root-level global output files.

This function preserves single-shank output file names in the root output folder while keeping rows and columns in the original global unit order. Per-shank matrices are written back to the global matrix positions for those units, and cross-shank entries are left as uncomputed zero values. Local unit indices from each Shank<ID> folder are remapped to original global unit indices using original_unit_indices.npy. Positive cluster IDs are offset across shanks, and -1 remains the unmatched-unit label.

Arguments:

user_settings (dict): User settings
shank_ids (ndarray): Shank IDs processed by runDANTMultiShank()
unit_shanks (ndarray): Global unit-level shank IDs

Outputs:

Root output files matching the single-shank pipeline where applicable, including similarity matrices, SimilarityMatrix.npy, SimilarityPairs.npy, DistanceMatrix.npy, ClusterMatrix.npy, IdxCluster.npy, MatchedPairs.npy, Curation*.npy, ClusteringResults.npz, Output.npz, motion*.npy, and waveforms_corrected.npy.
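
The remapping described above can be sketched for one pairwise matrix and one cluster-index vector. This is an illustration under assumptions: the function name merge_shank_results and the in-memory tuple format are hypothetical, cluster IDs are assumed to start at 1, and the offset is assumed to be the running maximum of the IDs already merged.

```python
import numpy as np

def merge_shank_results(n_units, per_shank):
    """Merge per-shank matrices and cluster IDs into global unit order (sketch).

    per_shank: list of (original_unit_indices, local_matrix, local_idx_cluster)
    """
    global_matrix = np.zeros((n_units, n_units))   # cross-shank entries stay 0
    global_idx = np.full(n_units, -1, dtype=int)   # -1 marks unmatched units
    offset = 0
    for original_unit_indices, local_matrix, local_idx in per_shank:
        rows = np.asarray(original_unit_indices)
        local_idx = np.asarray(local_idx)
        # Write the shank's block back to its global rows and columns
        global_matrix[np.ix_(rows, rows)] = local_matrix
        matched = local_idx > 0
        # Offset positive cluster IDs so they stay unique across shanks
        global_idx[rows[matched]] = local_idx[matched] + offset
        if matched.any():
            offset += int(local_idx.max())
    return global_matrix, global_idx

shank_a = ([0, 2, 4], np.full((3, 3), 0.9), [1, 1, -1])
shank_b = ([1, 3], np.full((2, 2), 0.8), [1, 1])
global_matrix, global_idx = merge_shank_results(5, [shank_a, shank_b])
```

In this toy case units 0 and 2 keep cluster 1, units 1 and 3 become cluster 2, and unit 4 stays at -1, while entries between units on different shanks remain zero.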

pyDANT.Runner.runDANT(user_settings)

Run the standard single-shank pyDANT pipeline.

Arguments:

user_settings (dict): User settings

Outputs:

Output.npz and intermediate pipeline files in user_settings["output_folder"]
RunTimeSec.npy: Total runtime in seconds

pyDANT.Runner.runDANTMultiShank(user_settings)

Run pyDANT on multi-shank data by dispatching each shank separately.

The root output folder receives global preprocessing outputs first. Then each shank is processed in output_folder/Shank<ID> with a unit-subset data view while preserving full channel geometry and channel_shanks.npy. Final outputs are merged back to the root folder.

Arguments:

user_settings (dict): User settings. path_to_data must contain channel_shanks.npy.

Outputs:

output_folder/Shank<ID>/: Per-shank pipeline outputs
output_folder/: Global preprocessing outputs plus merged clustering, curation, motion, similarity, and corrected waveform files in original unit order.
output_folder/Output.npz: Merged global output with IdxUnit and IdxShank.
output_folder/RunTimeSec.npy: Total runtime in seconds