Here, we demonstrate the basic steps for transforming intensive longitudinal measurement-burst data into the format required by the Dynamic Measurement-Burst Model (DMBM). Because the empirical data used in the manuscript cannot be shared publicly, we simulate a small example dataset.
The goal of this section is to illustrate two key preprocessing steps:
1. Transform burst data from long format into a semi-wide format
2. Construct and adjust the TINTERVAL variable required by Mplus

These steps ensure that:
- within-burst dynamics can be modeled in long format
- between-burst dynamics of process features can be modeled in wide format
- Mplus does not waste computation time interpreting large gaps between bursts
Simulate toy example
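Because the manuscript uses third-party data that we are not permitted to share publicly, we simulate example data below: 6 participants, each with 3 bursts of 5 ESM prompts spread over two days, and two example variables (confidence and depression).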
# ============================================================
# 1) SIMULATE "LONG FORMAT" measurement-burst data (toy example)
#    - 6 participants
#    - 3 bursts (W1/W2/W3)
#    - 5 ESM prompts per burst
#    - 2 example variables: confidence, depression
#    - Each burst spans TWO days
# ============================================================
library(tidyverse)
library(lubridate)

n_id      <- 6
n_waves   <- 3
k_prompts <- 5

ids <- sprintf("P%02d", 1:n_id)

# Start of each burst, in days after the start of the study
wave_start <- tibble(
  W = 1:n_waves,
  wave_gap_days = c(0, 30, 60)  # 30-day gaps between bursts
)

df_long <- expand_grid(UUID = ids, W = 1:n_waves, t = 1:k_prompts) %>%
  left_join(wave_start, by = "W") %>%
  group_by(UUID, W) %>%
  mutate(
    base_time = ymd_hms("2020-01-01 08:00:00", tz = "Europe/Paris") +
      days(wave_gap_days[1]),
    # First 3 prompts on Day 1, remaining on Day 2
    day_offset = if_else(t <= ceiling(k_prompts / 2), 0, 1),
    # Within-day spacing every 3 hours
    hour_offset = 3 * ((t - 1) %% ceiling(k_prompts / 2)),
    # Small offset in minutes, drawn once per person and burst
    Date_Time = base_time + days(day_offset) + hours(hour_offset) +
      minutes(sample(0:20, 1))
  ) %>%
  ungroup() %>%
  mutate(
    confidence = round(rnorm(n(), mean = 2.5 + 0.3 * W, sd = 0.8), 2),
    depression = round(rnorm(n(), mean = 4.0 - 0.2 * W, sd = 0.9), 2)
  ) %>%
  select(UUID, W, t, Date_Time, confidence, depression)

print(df_long, n = 20)
The simulated data illustrate the typical structure of measurement-burst designs:
- measurement occasions (t) nested within bursts (W)
- bursts nested within persons (UUID)
Importantly, bursts are separated by large time gaps (e.g., 30 days), while observations within bursts are closely spaced (e.g., every 3 hours).
This distinction is crucial when constructing the TINTERVAL variable.
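To make the difference between the two kinds of gaps concrete, the following sketch computes the time between consecutive observations for one person. The data frame `toy` and the column `gap_hours` are hypothetical names introduced only for this illustration; the timestamps mimic the simulated design (3-hour spacing within a burst, 30 days between burst starts).

```r
library(dplyr)

# Hypothetical mini data set: one person, two bursts, three prompts each
toy <- data.frame(
  UUID = "P01",
  W = rep(1:2, each = 3),
  Date_Time = as.POSIXct(
    c("2020-01-01 08:00:00", "2020-01-01 11:00:00", "2020-01-01 14:00:00",
      "2020-01-31 08:00:00", "2020-01-31 11:00:00", "2020-01-31 14:00:00"),
    tz = "UTC"
  )
)

# Hours elapsed since the previous observation of the same person
gaps <- toy %>%
  group_by(UUID) %>%
  arrange(Date_Time, .by_group = TRUE) %>%
  mutate(
    gap_hours = as.numeric(difftime(Date_Time, lag(Date_Time), units = "hours"))
  ) %>%
  ungroup()

gaps$gap_hours
```

The first value is missing because the first observation has no predecessor. Within bursts the gap is 3 hours, whereas the single between-burst gap is 714 hours.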
Data Transformation
To prepare the data for the DMBM, we first reshape the dataset into a semi-wide format.
In this format:
- intensive longitudinal observations remain in long format
- burst-level variables appear as separate columns per burst
This allows the model to simultaneously estimate:
- within-burst dynamics
- between-burst dynamics of process features
Step: Convert bursts to columns
Each burst receives its own set of variables (e.g., confidence_W1, confidence_W2, confidence_W3).
# Re-index prompts within each person×wave, ordered by time
df_long <- df_long %>%
  group_by(UUID, W) %>%
  arrange(Date_Time, .by_group = TRUE) %>%
  mutate(t = row_number()) %>%  # 1..k per person×wave
  ungroup()

# Transform to semi-wide format
df_wide <- df_long %>%
  pivot_wider(
    id_cols = c(UUID, t, Date_Time),
    names_from = W,
    # choose the variables that should become wave-specific columns
    values_from = c(confidence, depression),
    names_glue = "{.value}_W{W}"
  )

print(df_wide, n = 20)
# A tibble: 90 × 9
UUID t Date_Time confidence_W1 confidence_W2 confidence_W3
<chr> <int> <dttm> <dbl> <dbl> <dbl>
1 P01 1 2020-01-01 08:03:00 1.88 NA NA
2 P01 2 2020-01-01 11:03:00 2.57 NA NA
3 P01 3 2020-01-01 14:03:00 2.56 NA NA
4 P01 4 2020-01-02 08:03:00 2.47 NA NA
5 P01 5 2020-01-02 11:03:00 3 NA NA
6 P01 1 2020-01-31 08:06:00 NA 2.39 NA
7 P01 2 2020-01-31 11:06:00 NA 3.45 NA
8 P01 3 2020-01-31 14:06:00 NA 2.11 NA
9 P01 4 2020-02-01 08:06:00 NA 2.92 NA
10 P01 5 2020-02-01 11:06:00 NA 3.4 NA
11 P01 1 2020-03-01 08:00:00 NA NA 3.51
12 P01 2 2020-03-01 11:00:00 NA NA 4.04
13 P01 3 2020-03-01 14:00:00 NA NA 3.35
14 P01 4 2020-03-02 08:00:00 NA NA 3.8
15 P01 5 2020-03-02 11:00:00 NA NA 4.27
16 P02 1 2020-01-01 08:01:00 2.25 NA NA
17 P02 2 2020-01-01 11:01:00 1.77 NA NA
18 P02 3 2020-01-01 14:01:00 2.84 NA NA
19 P02 4 2020-01-02 08:01:00 2.61 NA NA
20 P02 5 2020-01-02 11:01:00 2.37 NA NA
# ℹ 70 more rows
# ℹ 3 more variables: depression_W1 <dbl>, depression_W2 <dbl>,
# depression_W3 <dbl>
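The staircase pattern of missing values in this output is expected: each row belongs to exactly one burst, so only that burst's columns are filled. A quick sanity check along these lines can catch reshaping mistakes. The sketch below uses a hypothetical mini data set (`toy_long`, `toy_wide`) and a single variable, and uses `names_prefix` instead of `names_glue` for brevity; it is an illustration, not part of the original pipeline.

```r
library(dplyr)
library(tidyr)

# Hypothetical mini data set: one person, two bursts, two prompts each
toy_long <- data.frame(
  UUID = "P01",
  W = rep(1:2, each = 2),
  t = rep(1:2, times = 2),
  Date_Time = as.POSIXct(
    c("2020-01-01 08:00:00", "2020-01-01 11:00:00",
      "2020-01-31 08:00:00", "2020-01-31 11:00:00"),
    tz = "UTC"
  ),
  confidence = c(2.1, 2.4, 2.9, 3.0)
)

# Same reshaping idea as above, for one variable
toy_wide <- toy_long %>%
  pivot_wider(
    id_cols = c(UUID, t, Date_Time),
    names_from = W,
    values_from = confidence,
    names_prefix = "confidence_W"
  )

# Number of non-missing burst columns per row
n_filled <- rowSums(!is.na(select(toy_wide, starts_with("confidence_W"))))

# Every row should carry a value for exactly one burst,
# and the reshaping should not add or drop rows
stopifnot(all(n_filled == 1), nrow(toy_wide) == nrow(toy_long))
```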
Dynamic structural equation models (DSEM) in Mplus require a continuous time variable that represents, for each participant, the time elapsed since the beginning of the study. This variable is then referenced in the TINTERVAL option in Mplus.
Instead of counting measurement occasions, this variable encodes time since the first observation for each participant.
For example:
- first observation → 0
- second observation, 3 hours after the first → 3
- third observation, 6 hours after the first → 6
This allows Mplus to correctly interpret temporal precedence and to correctly account for unequal spacing between observations.
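The example above can be reproduced with a few lines of R. This is a minimal sketch on hypothetical data; the names `toy` and `time_hours` are illustrative assumptions, and hours are used as the time unit to match the example.

```r
library(dplyr)

# Hypothetical data: two persons who start at different clock times
toy <- data.frame(
  UUID = rep(c("P01", "P02"), each = 3),
  Date_Time = as.POSIXct(
    c("2020-01-01 08:00:00", "2020-01-01 11:00:00", "2020-01-01 14:00:00",
      "2020-01-01 09:30:00", "2020-01-01 12:30:00", "2020-01-01 15:30:00"),
    tz = "UTC"
  )
)

# Hours elapsed since each person's own first observation
toy <- toy %>%
  group_by(UUID) %>%
  arrange(Date_Time, .by_group = TRUE) %>%
  mutate(
    time_hours = as.numeric(difftime(Date_Time, min(Date_Time), units = "hours"))
  ) %>%
  ungroup()

toy$time_hours
```

Both persons receive the values 0, 3, and 6, because time is counted from each person's own first observation and not from a common calendar date.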
Measurement-burst complication
Measurement-burst data introduce a practical complication:
The time gap between bursts can be extremely large.
For example, a 30-day gap corresponds to 720 hours.
If left unchanged, Mplus interprets these large gaps literally and inserts many missing time points internally, which can dramatically slow estimation.
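As a rough back-of-the-envelope illustration, assume that time is discretized on a grid of 3 hours (a hypothetical choice that matches the within-burst spacing of the simulated data). The number of grid points falling inside a single 30-day gap, all of which would be treated as missing, can then be counted as follows.

```r
gap_hours  <- 30 * 24  # 30-day gap between bursts, in hours
grid_width <- 3        # hypothetical grid width, in hours

# Grid points strictly inside the gap (3, 6, ..., 717), none of them observed
n_missing <- gap_hours / grid_width - 1
n_missing
```

Under these assumptions, one gap adds 239 empty time points per participant, compared with only 5 observed prompts per burst.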
Key idea
We only model dynamics within bursts.
Therefore:
- long gaps between bursts do not carry meaningful information
- compressing them does not affect estimation
We therefore:
- keep the temporal ordering within bursts intact
- replace large inter-burst gaps with a small fixed interval (e.g., 24 hours)
This preserves the temporal structure needed for estimation while avoiding unnecessary computational burden.
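One possible way to implement this idea is sketched below. It is not necessarily the procedure used in the manuscript. The data frame `toy`, the columns `gap`, `gap_adj`, and `time_adj`, and the 48-hour threshold for deciding what counts as a between-burst gap are all assumptions made for this illustration.

```r
library(dplyr)

# Hypothetical elapsed time in hours for one person:
# two bursts of five prompts each, starting 30 days (720 hours) apart
toy <- data.frame(
  UUID = "P01",
  time_hours = c(0, 3, 6, 24, 27, 720, 723, 726, 744, 747)
)

max_within_gap <- 48  # assumed threshold: longer gaps count as between-burst
compressed_gap <- 24  # fixed interval that replaces a between-burst gap

toy <- toy %>%
  group_by(UUID) %>%
  arrange(time_hours, .by_group = TRUE) %>%
  mutate(
    # Gap to the previous observation (0 for the first observation)
    gap = time_hours - lag(time_hours, default = first(time_hours)),
    # Shorten only the gaps that exceed the threshold
    gap_adj = if_else(gap > max_within_gap, compressed_gap, gap),
    # Rebuild the time variable from the adjusted gaps
    time_adj = cumsum(gap_adj)
  ) %>%
  ungroup()

toy$time_adj
```

In this sketch the within-burst gaps (3 hours during the day, 18 hours overnight) are left untouched, and only the 693-hour gap between the two bursts is replaced by 24 hours. The threshold must be larger than the longest gap that can occur within a burst; otherwise overnight gaps would be compressed as well.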
This can be done in 3 steps:
Step 1 — Create time interval variable
We first compute elapsed time since each participant’s first observation.