Determinantal shot noise Cox processes

Stat

Published On 2022/12

We present a new class of cluster point process models with repulsion between cluster centres, which we call determinantal shot noise Cox processes (DSNCPs). They are the special case of generalized shot noise Cox processes in which the cluster centres form a determinantal point process. We establish various moment results and describe how these can be used to estimate unknown parameters easily in two particularly tractable cases, namely when the offspring density is isotropic Gaussian and the kernel of the determinantal point process of cluster centres is either Gaussian or as in a scaled Ginibre point process. Through a simulation study and the analysis of a real point pattern data set, we see that when modelling clustered point patterns, a much lower intensity of cluster centres may be needed in DSNCP models than in shot noise Cox processes.
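To make the construction in the abstract concrete, here is a minimal Python sketch of a cluster process with repulsive centres and isotropic Gaussian offspring. It is not the authors' method: simulating a determinantal point process is beyond a short example, so a simple hard-core thinning of a Poisson process is swapped in as a stand-in for the repulsive centre process. The function names, the unit-square window, and the parameters `rho`, `min_dist`, `mean_offspring` and `sigma` are all assumptions made for illustration, and edge effects from centres outside the window are ignored.

```python
import math
import random


def poisson(rng, mean):
    """Draw a Poisson count by multiplying uniforms (fine for moderate means)."""
    limit = math.exp(-mean)
    count, prod = 0, rng.random()
    while prod > limit:
        count += 1
        prod *= rng.random()
    return count


def repulsive_centres(rng, rho, min_dist, window=1.0):
    """Stand-in for a determinantal centre process (an assumption, not a DPP).

    Draws a Poisson process of intensity rho on [0, window]^2 and deletes
    every point that has another point closer than min_dist.
    """
    n = poisson(rng, rho * window * window)
    pts = [(rng.uniform(0, window), rng.uniform(0, window)) for _ in range(n)]
    return [p for p in pts
            if all(math.dist(p, q) >= min_dist for q in pts if q is not p)]


def simulate_cluster_process(rng, centres, mean_offspring, sigma, window=1.0):
    """Scatter a Poisson number of offspring around each centre.

    Offspring displacements are isotropic Gaussian with standard deviation
    sigma; offspring falling outside the window are discarded.
    """
    points = []
    for cx, cy in centres:
        for _ in range(poisson(rng, mean_offspring)):
            x, y = rng.gauss(cx, sigma), rng.gauss(cy, sigma)
            if 0 <= x <= window and 0 <= y <= window:
                points.append((x, y))
    return points


if __name__ == "__main__":
    rng = random.Random(2022)
    centres = repulsive_centres(rng, rho=30, min_dist=0.1)
    pattern = simulate_cluster_process(rng, centres, mean_offspring=8, sigma=0.02)
    print(len(centres), "centres,", len(pattern), "points")
```

Passing the centres in as an argument keeps the offspring step independent of how the centres were generated, so a proper determinantal sampler could replace `repulsive_centres` without touching the rest.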

Journal: Stat
Published On: 2022/12
Volume: 11
Issue: 1
Page: e502

Authors

Jesper Møller

Aalborg Universitet

Position: Professor in Statistics
H-Index (all): 46
H-Index (since 2020): 23
I-10 Index (all): 0
I-10 Index (since 2020): 0
Citation (all): 0
Citation (since 2020): 0
Cited By: 0
Research Interests: Mathematical Statistics, Probability Theory

Other Articles from authors

Jesper Møller

Aalborg Universitet

arXiv preprint arXiv:2404.09525

Coupling results and Markovian structures for number representations of continuous random variables

A general setting for nested subdivisions of a bounded real set into intervals defining the digits of a random variable with a probability density function is considered. Under the weak condition that is almost everywhere lower semi-continuous, a coupling between and a non-negative integer-valued random variable is established so that have an interpretation as the "sufficient digits", since the distribution of conditioned on does not depend on . Adding a condition about a Markovian structure of the lengths of the intervals in the nested subdivisions, becomes a Markov chain of a certain order . If then are IID with a known distribution. When and the Markov chain is uniformly geometric ergodic, a coupling is established between and a random time so that the chain after time is stationary and follows a simple known distribution. The results are related to several examples of number representations generated by a dynamical system, including base- expansions, generalized Lüroth series, -expansions, and continued fraction representations. The importance of the results and some suggestions and open problems for future research are discussed.

Jesper Møller

Aalborg Universitet

arXiv preprint arXiv:2404.08387

The asymptotic distribution of the scaled remainder for pseudo golden ratio expansions of a continuous random variable

Let be the base- expansion of a continuous random variable on the unit interval where is the positive solution to for an integer (i.e., is a generalization of the golden mean for which ). We study the asymptotic distribution and convergence rate of the scaled remainder when tends to infinity.

Jesper Møller

Aalborg Universitet

Methodology and Computing in Applied Probability

How many digits are needed?

Let be the digits in the base-q expansion of a random variable X defined on [0, 1) where is an integer. For , we study the probability distribution of the (scaled) remainder : If X has an absolutely continuous CDF then converges in the total variation metric to the Lebesgue measure on the unit interval. Under weak smoothness conditions we establish first a coupling between X and a non-negative integer valued random variable N so that follows and is independent of , and second exponentially fast convergence of and its PDF . We discuss how many digits are needed and show examples of our results.
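As a small illustration of the quantities this abstract refers to, the sketch below computes the first n base-q digits of a number in [0, 1) together with a scaled remainder, taken here to be q^n times the part of x not captured by those digits. The mathematical symbols were lost from the abstract text above, so this definition and the function name `digits_and_remainder` are assumptions, and exact rational arithmetic stands in for a continuous random variable.

```python
from fractions import Fraction


def digits_and_remainder(x, q, n):
    """Return the first n base-q digits of x in [0, 1) and the scaled remainder.

    The scaled remainder is q**n * (x - sum_k d_k * q**-k), which again
    lies in [0, 1).
    """
    remainder = Fraction(x)
    digits = []
    for _ in range(n):
        remainder *= q
        digit = int(remainder)  # floor, since remainder is non-negative
        digits.append(digit)
        remainder -= digit
    return digits, remainder


if __name__ == "__main__":
    # 5/8 is 0.101 in binary: the first two digits are 1, 0 and 1/2 remains.
    print(digits_and_remainder(Fraction(5, 8), 2, 2))
```

Using `Fraction` avoids floating-point rounding, which would otherwise corrupt the digits after a few dozen steps.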

Jesper Møller

Aalborg Universitet

arXiv preprint arXiv:2312.09652

The asymptotic distribution of the remainder in a certain base- expansion

Let be the base- expansion of a continuous random variable on the unit interval where is the golden ratio. We study the asymptotic distribution and convergence rate of the scaled remainder when tends to infinity.

2023/12/15

Jesper Møller

Aalborg Universitet

Proceedings of the London Mathematical Society

Realizability and tameness of fusion systems

A saturated fusion system over a finite $p$‐group $S$ is a category whose objects are the subgroups of $S$ and whose morphisms are injective homomorphisms between the subgroups satisfying certain axioms. A fusion system over $S$ is realized by a finite group $G$ if $S$ is a Sylow $p$‐subgroup of $G$ and morphisms in the category are those induced by conjugation in $G$. One recurrent question in this subject is to find criteria as to whether a given saturated fusion system is realizable or not. One main result in this paper is that a saturated fusion system is realizable if all of its components (in the sense of Aschbacher) are realizable. Another result is that all realizable fusion systems are tame: a finer condition on realizable fusion systems that involves describing automorphisms of a fusion system in terms of those of some group that realizes it. Stated in this way, these results depend on the …

Jesper Møller

Aalborg Universitet

ACM Transactions on Spatial Algorithms and Systems

Stochastic Routing with Arrival Windows

Arriving at a destination within a specific time window is important in many transportation settings. For example, trucks may be penalized for early or late arrivals at compact terminals, and early and late arrivals at general practitioners, dentists, and so on, are also discouraged, in part due to COVID. We propose foundations for routing with arrival-window constraints. In a setting where the travel time of a road segment is modeled by a probability distribution, we define two problems where the aim is to find a route from a source to a destination that optimizes or yields a high probability of arriving within a time window while departing as late as possible. In this setting, a core challenge is to enable comparison between paths that may potentially be part of a result path with the goal of determining whether a path is uninteresting and can be disregarded given the existence of another path. We show that existing solutions …

2023/11/21

Jesper Møller

Aalborg Universitet

Spatial Statistics

Fitting the grain orientation distribution of a polycrystalline material conditioned on a Laguerre tessellation

The description of distributions related to grain microstructure helps physicists to understand the processes in materials and their properties. This paper presents a general statistical methodology for the analysis of crystallographic orientations of grains in a 3D Laguerre tessellation dataset which represents the microstructure of a polycrystalline material. We introduce complex stochastic models which may substitute expensive laboratory experiments: conditional on the Laguerre tessellation, we suggest interaction models for the distribution of cubic crystal lattice orientations, where the interaction is between pairs of orientations for neighbouring grains in the tessellation. We discuss parameter estimation and model comparison methods based on maximum pseudolikelihood as well as graphical procedures for model checking using simulations. Our methodology is applied for analysing a dataset representing a nickel …

Jesper Møller

Aalborg Universitet

Methodology and Computing in Applied Probability

Singular distribution functions for random variables with stationary digits

Let F be the cumulative distribution function (CDF) of the base-q expansion , where is an integer and is a stationary stochastic process with state space . In a previous paper we characterized the absolutely continuous and the discrete components of F. In this paper we study special cases of models, including stationary Markov chains of any order and stationary renewal point processes, where we establish a law of pure types: F is then either a uniform or a singular CDF on [0, 1]. Moreover, we study mixtures of such models. In most cases expressions and plots of F are given.

Jesper Møller

Aalborg Universitet

arXiv preprint arXiv:2212.08402

Cox processes driven by transformed Gaussian processes on linear networks

There is a lack of point process models on linear networks. For an arbitrary linear network, we use isotropic covariance functions with respect to the geodesic metric or the resistance metric to construct new models for isotropic Gaussian processes and hence new models for various Cox processes with isotropic pair correlation functions. In particular we introduce three model classes given by log Gaussian, interrupted, and permanental Cox processes on linear networks, and consider for the first time statistical procedures and applications for parametric families of such models. Moreover, we construct new simulation algorithms for Gaussian processes on linear networks and discuss whether the geodesic metric or the resistance metric should be used for the kind of Cox processes studied in this paper.

2022/12/16

Jesper Møller

Aalborg Universitet

International Statistical Review

Should we condition on the number of points when modelling spatial point patterns?

We discuss the practice of directly or indirectly assuming a model for the number of points when modelling spatial point patterns, even though it is rarely possible to validate such a model in practice because most point pattern data consist of only one pattern. We therefore explore the possibility of conditioning on the number of points instead when fitting and validating spatial point process models. In a simulation study with different popular spatial point process models, we consider model validation using global envelope tests based on functional summary statistics. We find that conditioning on the number of points will, for some functional summary statistics, lead to narrower envelopes and thus stronger tests, and that it can also be useful for correcting some conservativeness in the tests when testing composite hypotheses. However, for other functional summary statistics, it makes little or no difference to condition …

Jesper Møller

Aalborg Universitet

Journal of Applied Probability

Characterization of random variables with stationary digits

Let be an integer, a stochastic process with state space , and F the cumulative distribution function (CDF) of . We show that stationarity of is equivalent to a functional equation obeyed by F, and use this to characterize the characteristic function of X and the structure of F in terms of its Lebesgue decomposition. More precisely, while the absolutely continuous component of F can only be the uniform distribution on the unit interval, its discrete component can only be a countable convex combination of certain explicitly computable CDFs for probability distributions with finite support. We also show that is a Rajchman measure if and only if F is the uniform CDF on [0, 1].

Jesper Møller

Aalborg Universitet

Spatial Statistics

Fitting three-dimensional Laguerre tessellations by hierarchical marked point process models

We present a general statistical methodology for analysing a Laguerre tessellation data set viewed as a realization of a marked point process model. In the first step, for the points, we use a nested sequence of multiscale processes which constitute a flexible parametric class of pairwise interaction point process models. In the second step, for the marks/radii conditioned on the points, we consider various exponential family models where the canonical sufficient statistic is based on tessellation characteristics. For each step, parameter estimation based on maximum pseudolikelihood methods is tractable. For model selection, we consider maximized log pseudolikelihood functions for models of the radii conditioned on the points. Model checking is performed using global envelopes and corresponding tests in both steps and moreover by comparing observed and simulated tessellation characteristics in the second step …

Jesper Møller

Aalborg Universitet

Translational psychiatry

Layer III pyramidal cells in the prefrontal cortex reveal morphological changes in subjects with depression, schizophrenia, and suicide

Brodmann Area 46 (BA46) has long been regarded as a hotspot of disease pathology in individuals with schizophrenia (SCH) and major depressive disorder (MDD). Pyramidal neurons in layer III of BA46 project to other cortical regions and play a fundamental role in corticocortical and thalamocortical circuits. The AutoCUTS-LM pipeline was used to study the 3-dimensional structural morphology and spatial organization of pyramidal cells. Using quantitative light microscopy, we applied stereology to calculate the entire volume of layer III in BA46 and the total number and density of pyramidal cells. Volume tensors estimated by the planar rotator quantified the volume, shape, and nucleus displacement of pyramidal cells. All of these assessments were carried out in four groups of subjects: controls (C, n = 10), SCH (n = 10), MDD (n = 8), and suicide subjects with a history of depression (SU …

Jesper Møller

Aalborg Universitet

Graphs and Combinatorics

Equivariant Euler characteristics of symplectic buildings

We compute the equivariant Euler characteristics of the buildings for the symplectic groups over finite fields.

Jesper Møller

Aalborg Universitet

Journal of Computational and Graphical Statistics

MCMC computations for Bayesian mixture models using repulsive point processes

Repulsive mixture models have recently gained popularity for Bayesian cluster detection. Compared to more traditional mixture models, repulsive mixture models produce a smaller number of well-separated clusters. The most commonly used methods for posterior inference either require fixing the number of components a priori or are based on reversible jump MCMC computation. We present a general framework for mixture models in which the prior of the “cluster centers” is a finite repulsive point process depending on a hyperparameter, specified by a density which may depend on an intractable normalizing constant. By investigating the posterior characterization of this class of mixture models, we derive an MCMC algorithm which avoids the well-known difficulties associated with reversible jump MCMC computation. In particular, we use an ancillary variable method, which eliminates the problem of having intractable …

Jesper Møller

Aalborg Universitet

Scandinavian Journal of Statistics

Approximate Bayesian inference for a spatial point process model exhibiting regularity and random aggregation

In this article, we propose a doubly stochastic spatial point process model with both aggregation and repulsion. This model combines the ideas behind Strauss processes and log Gaussian Cox processes. The likelihood for this model is not expressible in closed form but it is easy to simulate realizations under the model. We therefore explain how to use approximate Bayesian computation (ABC) to carry out statistical inference for this model. We suggest a method for model validation based on posterior predictions and global envelopes. We illustrate the ABC procedure and model validation approach using both simulated point patterns and a real data example.

Jesper Møller

Aalborg Universitet

Journal of Algebraic Combinatorics

Equivariant Euler characteristics of unitary buildings

The (p-primary) equivariant Euler characteristics of the buildings for the general unitary groups over finite fields are determined.

Jesper Møller

Aalborg Universitet

Communications Biology

Cellular 3D-reconstruction and analysis in the human cerebral cortex using automatic serial sections

Techniques involving three-dimensional (3D) tissue structure reconstruction and analysis provide a better understanding of changes in molecules and function. We have developed AutoCUTS-LM, an automated system that brings recent advances in 3D tissue reconstruction and cellular analysis to light microscopy on various tissues, including archived tissue. The workflow in this paper involved advanced tissue sampling methods of the human cerebral cortex, an automated serial section collection system, a digital tissue library, cell detection using a convolutional neural network, 3D cell reconstruction, and advanced analysis. Our results demonstrated the detailed structure of pyramidal cells (number, volume, diameter, sphericity and orientation) and showed that their 3D spatial organization is columnar. The pipeline of these combined techniques provides a detailed analysis of tissues …

Jesper Møller

Aalborg Universitet

AMERICAN MATHEMATICAL SOCIETY

The number of p-elements in finite groups of Lie type of characteristic p

The combinatorics of the poset of p-radical p-subgroups of a finite group is used to count the number of p-elements.

Other articles from Stat journal

Takeuchi Ichiro

Nagoya Institute of Technology

Stat

A confidence machine for sparse high‐order interaction model

In predictive modelling for high‐stake decision‐making, predictors must be not only accurate but also reliable. Conformal prediction (CP) is a promising approach for obtaining the coverage of prediction results with fewer theoretical assumptions. To obtain the prediction set by so‐called full‐CP, we need to refit the predictor for all possible values of prediction results, which is only possible for simple predictors. For complex predictors such as random forests (RFs) or neural networks (NNs), split‐CP is often employed where the data is split into two parts: one part for fitting and another for computing the prediction set. Unfortunately, because of the reduced sample size, split‐CP is inferior to full‐CP both in fitting as well as prediction set computation. In this paper, we develop a full‐CP of sparse high‐order interaction model (SHIM), which is sufficiently flexible as it can take into account high‐order interactions among …

J. Philip Miller

Washington University in St. Louis

Stat

Deep learning models to predict primary open‐angle glaucoma

Glaucoma is a major cause of blindness and vision impairment worldwide, and visual field (VF) tests are essential for monitoring the conversion of glaucoma. While previous studies have primarily focused on using VF data at a single time point for glaucoma prediction, there has been limited exploration of longitudinal trajectories. Additionally, many deep learning techniques treat the time‐to‐glaucoma prediction as a binary classification problem (glaucoma Yes/No), resulting in the misclassification of some censored subjects into the nonglaucoma category and decreased power. To tackle these challenges, we propose and implement several deep‐learning approaches that naturally incorporate temporal and spatial information from longitudinal VF data to predict time‐to‐glaucoma. When evaluated on the Ocular Hypertension Treatment Study (OHTS) dataset, our proposed convolutional neural network (CNN)‐long …

Neil Thorpe

Newcastle University

Stat

Using extreme value theory to evaluate the leading pedestrian interval road safety intervention

Improving road safety is hugely important with the number of deaths on the world's roads remaining unacceptably high; an estimated 1.3 million people die each year as a result of road traffic collisions. Current practice for treating collision hotspots is almost always reactive: once a threshold level of collisions has been exceeded during some pre‐determined observation period, treatment is applied (e.g., road safety cameras). Traffic collisions are rare, so prolonged observation periods are necessary. However, traffic conflicts are more frequent and are a margin of the social cost; hence, traffic conflict before/after studies can be conducted over shorter time periods. We investigate the effect of implementing the leading pedestrian interval treatment at signalised intersections as a safety intervention in a city in North America. Pedestrian‐vehicle traffic conflict data were collected from treatment and control sites during the …

David J Edwards

Virginia Commonwealth University

Stat

Developing partnerships for academic data science consulting and collaboration units

Data science consulting and collaboration units (DSUs) are core infrastructure for research at universities. Activities span data management, study design, data analysis, data visualization, predictive modelling, preparing reports, manuscript writing and advising on statistical methods and may include an experiential or teaching component. Partnerships are needed for a thriving DSU as an active part of the larger university network. Guidance for identifying, developing and managing successful partnerships for DSUs can be summarized in six rules: (1) align with institutional strategic plans, (2) cultivate partnerships that fit your mission, (3) ensure sustainability and prepare for growth, (4) define clear expectations in a partnership agreement, (5) communicate and (6) expect the unexpected. While these rules are not exhaustive, they are derived from experiences in a diverse set of DSUs, which vary by administrative …

Ingrid Van Keilegom

Katholieke Universiteit Leuven

Stat

Estimation of the density for censored and contaminated data

Consider a situation where one is interested in estimating the density of a survival time that is subject to random right censoring and measurement errors. This happens often in practice, like in public health (pregnancy length), medicine (duration of infection), ecology (duration of forest fire), among others. We assume a classical additive measurement error model with Gaussian noise and unknown error variance and a random right censoring scheme. Under this setup, we develop minimal conditions under which the assumed model is identifiable when no auxiliary variables or validation data are available, and we offer a flexible estimation strategy using Laguerre polynomials for the estimation of the error variance and the density of the survival time. The asymptotic normality of the proposed estimators is established, and the numerical performance of the methodology is investigated on both simulated and real data …

Pablo Martínez-Camblor

Dartmouth College

Stat

Comparing the effectiveness of k‐different treatments through the area under the ROC curve

The area under the receiver‐operating characteristic curve (AUC) has become a popular index not only for measuring the overall prediction capacity of a marker but also the strength of the association between continuous and binary variables. In the current considered study, the AUC was used for comparing the association size of four different interventions involving impulsive decision making, studied through an animal model, in which each animal provides several negative (pretreatment) and positive (posttreatment) measures. The problem of the full comparison of the average AUCs arises therefore in a natural way. We construct an analysis of variance (ANOVA) type test for testing the equality of the impact of these treatments measured through the respective AUCs and considering the random‐effect represented by the animal. The use (and development) of a post hoc Tukey's HSD‐type test is also considered …

Sungkyu Jung

Seoul National University

Stat

Highly private large‐sample tests for contingency tables

Differential privacy is a foundational concept for safeguarding sensitive individual information when releasing data or statistical analysis results. In this study, we concentrate on the protection of privacy in the context of goodness‐of‐fit (GOF) and independence tests, utilizing perturbed contingency tables that adhere to Gaussian differential privacy within the high‐privacy regime, where the degrees of privacy protection increase as the sample size increases. We introduce private test procedures for GOF, independence of two variables and the equality of proportions in paired samples, similar to McNemar's test. For each of these hypothesis testing situations, we propose private test statistics based on the $\chi^2$ statistics and establish their asymptotic null distributions. We numerically confirm that Type I error rates of the proposed private test procedures are well controlled and have adequate power for larger …

Lei Liu

Washington University in St. Louis

Stat

Deep learning models to predict primary open‐angle glaucoma

Glaucoma is a major cause of blindness and vision impairment worldwide, and visual field (VF) tests are essential for monitoring the conversion of glaucoma. While previous studies have primarily focused on using VF data at a single time point for glaucoma prediction, there has been limited exploration of longitudinal trajectories. Additionally, many deep learning techniques treat the time‐to‐glaucoma prediction as a binary classification problem (glaucoma Yes/No), resulting in the misclassification of some censored subjects into the nonglaucoma category and decreased power. To tackle these challenges, we propose and implement several deep‐learning approaches that naturally incorporate temporal and spatial information from longitudinal VF data to predict time‐to‐glaucoma. When evaluated on the Ocular Hypertension Treatment Study (OHTS) dataset, our proposed convolutional neural network (CNN)‐long …

Justin Strait

University of Georgia

Stat

Visualisation and outlier detection for probability density function ensembles

Exploratory data analysis (EDA) for functional data—data objects where observations are entire functions—is a difficult problem that has seen significant attention in recent literature. This surge in interest is motivated by the ubiquitous nature of functional data, which are prevalent in applications across fields such as meteorology, biology, medicine and engineering. Empirical probability density functions (PDFs) can be viewed as constrained functional data objects that must integrate to one and be nonnegative. They show up in contexts such as yearly income distributions, zooplankton size structure in oceanography and in connectivity patterns in the brain, among others. While PDF data are certainly common in modern research, little attention has been given to EDA specifically for PDFs. In this paper, we extend several methods for EDA on functional data for PDFs and compare them on simulated data that exhibit …

Yongdao Zhou

Nankai University

Stat

Reducing the statistical error of generative adversarial networks using space‐filling sampling

This paper introduces a novel approach to reducing statistical errors in generative models, with a specific focus on generative adversarial networks (GANs). Inspired by the error analysis of GANs, we find that statistical errors mainly arise from random sampling, leading to significant uncertainties in GANs. To address this issue, we propose a selective sampling mechanism called space‐filling sampling. Our method aims to increase the sampling probability in areas with insufficient data, thereby improving the learning performance of the generator. Theoretical analysis confirms the effectiveness of our approach in reducing statistical errors and accelerating convergence in GANs. This research represents a pioneering effort in targeting the reduction of statistical errors in GANs, and it demonstrates the potential for enhancing the training of other generative models.

Marianne Huebner

Michigan State University

Stat

Developing partnerships for academic data science consulting and collaboration units

Data science consulting and collaboration units (DSUs) are core infrastructure for research at universities. Activities span data management, study design, data analysis, data visualization, predictive modelling, preparing reports, manuscript writing and advising on statistical methods and may include an experiential or teaching component. Partnerships are needed for a thriving DSU as an active part of the larger university network. Guidance for identifying, developing and managing successful partnerships for DSUs can be summarized in six rules: (1) align with institutional strategic plans, (2) cultivate partnerships that fit your mission, (3) ensure sustainability and prepare for growth, (4) define clear expectations in a partnership agreement, (5) communicate and (6) expect the unexpected. While these rules are not exhaustive, they are derived from experiences in a diverse set of DSUs, which vary by administrative …

Matthew Reimherr

Penn State University

stat

On Hypothesis Transfer Learning of Functional Linear Models

We study transfer learning (TL) for functional linear regression (FLR) under the Reproducing Kernel Hilbert Space (RKHS) framework, observing that the TL techniques in existing high-dimensional linear regression are not compatible with truncation-based FLR methods, as functional data are intrinsically infinite-dimensional and generated by smooth underlying processes. We measure the similarity across tasks using RKHS distance, allowing the type of information being transferred to be tied to the properties of the imposed RKHS. Building on the hypothesis offset transfer learning paradigm, two algorithms are proposed: one conducts the transfer when positive sources are known, while the other leverages aggregation techniques to achieve robust transfer without prior information about the sources. We establish lower bounds for this learning problem and show the proposed algorithms enjoy a matching asymptotic upper bound. These analyses provide statistical insights into factors that contribute to the dynamics of the transfer. We also extend the results to functional generalized linear models. The effectiveness of the proposed algorithms is demonstrated on extensive synthetic data as well as a financial data application.

Yaping Wang

East China Normal University

Stat

New bounds and search for maximin distance U‐type designs

Maximin distance designs have attracted increasing attention in computer experiments owing to their appealing space‐filling properties. The quality of these designs is typically evaluated by comparing their separation distance with the associated upper bound. Nevertheless, deriving tight upper bounds for the separation distance of designs remains a challenging problem that has been infrequently addressed in the literature. In this study, we obtain a new upper bound for the separation distance of certain classes of five‐level U‐type designs. We also investigate the characteristics of maximin distance U‐type designs and show the optimality of some existing orthogonal designs. Based on these theoretical results, we develop an efficient algorithm for searching maximin distance U‐type designs. Numerical studies and comparisons are given to show the superior performance of the obtained designs.

Wenjuan Ma

Michigan State University

Stat

What matters to graduate students? Experiences at a statistical consulting center from pre- to post-COVID-19 pandemic

The COVID‐19 pandemic led to unprecedented changes in all levels of society, including the statistical consulting field. This paper focuses on the experiences of graduate student consultants and clients at our statistical consulting center (SCC) that operates all year independent of semesters. During the lockdown period, work continued without interruption and was conducted remotely, but there was a temporary reduction in utilization. Advice on statistical methods, help with data analysis and educational offerings are the main appeals to utilize SCC services. We describe our mentoring approach for graduate student research assistants (RAs) and how pandemic changes affected RAs and clients. Based on experiences during the pandemic, we offer practical suggestions for SCCs' approaches to research support, work characteristics and collaborations to improve the experiences of graduate students, both as …

Andrew Golightly

Newcastle University

Stat

Using extreme value theory to evaluate the leading pedestrian interval road safety intervention

Improving road safety is hugely important with the number of deaths on the world's roads remaining unacceptably high; an estimated 1.3 million people die each year as a result of road traffic collisions. Current practice for treating collision hotspots is almost always reactive: once a threshold level of collisions has been overtopped during some pre‐determined observation period, treatment is applied (e.g., road safety cameras). Traffic collisions are rare, so prolonged observation periods are necessary. However, traffic conflicts are more frequent and are a margin of the social cost; hence, traffic conflict before/after studies can be conducted over shorter time periods. We investigate the effect of implementing the leading pedestrian interval treatment at signalised intersections as a safety intervention in a city in North America. Pedestrian‐vehicle traffic conflict data were collected from treatment and control sites during the …

Camille Hochheimer

University of Virginia

Stat

Reproducible research practices: A tool for effective and efficient leadership in collaborative statistics

With data and code sharing policies more common and version control more widely used in statistics, standards for reproducible research are higher than ever. Reproducible research practices must keep up with the fast pace of research. To do so, we propose combining modern practices of leadership with best practices for reproducible research in collaborative statistics as an effective tool for ensuring quality and accuracy while developing stewardship and autonomy in the people we lead. First, we establish a framework for expectations of reproducible statistical research. Then, we introduce Stephen M.R. Covey's theory of trusting and inspiring leadership. These two are combined as we show how stewardship agreements can be used to make reproducible coding a team norm. We provide an illustrative code example and highlight how this method creates a more collaborative rather than evaluative culture …

Gina-Maria Pomann

Duke University

Stat

Developing partnerships for academic data science consulting and collaboration units

Data science consulting and collaboration units (DSUs) are core infrastructure for research at universities. Activities span data management, study design, data analysis, data visualization, predictive modelling, preparing reports, manuscript writing and advising on statistical methods and may include an experiential or teaching component. Partnerships are needed for a thriving DSU as an active part of the larger university network. Guidance for identifying, developing and managing successful partnerships for DSUs can be summarized in six rules: (1) align with institutional strategic plans, (2) cultivate partnerships that fit your mission, (3) ensure sustainability and prepare for growth, (4) define clear expectations in a partnership agreement, (5) communicate and (6) expect the unexpected. While these rules are not exhaustive, they are derived from experiences in a diverse set of DSUs, which vary by administrative …

Abdul Haq

Quaid-i-Azam University

Stat

An EWMA sign chart for monitoring processes with fixed and variable sample sizes

This study addresses limitations in the nonparametric EWMA sign chart with fixed control limits (FCLs), particularly when facing time‐varying sample sizes. The FCLs‐based EWMA sign chart has a variable conditional false alarm rate (CFAR), especially at the startup of a process or after recovering from an out‐of‐control signal. To overcome these limitations, we propose a nonparametric EWMA sign chart based on dynamic probability control limits. This chart is capable of monitoring the process target with fixed, as well as time‐varying sample sizes. Monte Carlo simulations are used to estimate the CFARs, zero‐state (ZS) and steady‐state (SS) average run‐length profiles of the EWMA sign charts. It turns out that the proposed chart outperforms the existing chart, particularly in detecting shifts during the process startup, while maintaining the desired CFAR levels in both ZS and SS scenarios. A real data example is …

Marianne Huebner

Michigan State University

Stat

What matters to graduate students? Experiences at a statistical consulting center from pre‐ to post‐COVID‐19 pandemic

The COVID‐19 pandemic led to unprecedented changes in all levels of society, including the statistical consulting field. This paper focuses on the experiences of graduate student consultants and clients at our statistical consulting center (SCC) that operates all year independent of semesters. During the lockdown period, work continued without interruption and was conducted remotely, but there was a temporary reduction in utilization. Advice on statistical methods, help with data analysis and educational offerings are the main appeals to utilize SCC services. We describe our mentoring approach for graduate student research assistants (RAs) and how pandemic changes affected RAs and clients. Based on experiences during the pandemic, we offer practical suggestions for SCCs' approaches to research support, work characteristics and collaborations to improve the experiences of graduate students, both as …

Beom Seuk Hwang

Chung-Ang University

Stat

Ordered probit Bayesian additive regression trees for ordinal data

Bayesian additive regression trees (BART) is a nonparametric model that is known for its flexibility and strong statistical foundation. To provide a robust and flexible approach for analysing ordinal data, we extend BART into an ordered probit regression framework (OPBART). Further, we propose a semiparametric setting for OPBART (semi‐OPBART) to model covariates of interest parametrically and confounding variables nonparametrically. We also provide Gibbs sampling procedures to implement the proposed models. In both simulations and real data studies, the proposed models demonstrate superior performance over other competing ordinal models. We also highlight enhanced interpretability of semi‐OPBART in terms of inference through marginal effects.