[Deprecated]

This function attempted to replicate the process of dplyr::arrange() |> dplyr::group_by() |> dplyr::slice(). It was deprecated because the same operation can be done much more quickly and flexibly using dsTidyverseClient::ds.arrange() |> dsTidyverseClient::ds.group_by() |> dsTidyverseClient::ds.slice().
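For orientation, the arrange, group and take-one-row pattern referred to above can be sketched locally with dplyr. This is an illustration on an ordinary in-memory data frame, not the server-side implementation; the data frame `measures`, its columns and the band limits are made up for the example.

```r
library(dplyr)

# Hypothetical local data: repeated measurements per subject
measures <- data.frame(
  id = c(1, 1, 1, 2, 2),
  age = c(3, 14, 20, 5, 30),
  weight = c(5.1, 9.8, 11.2, 6.0, 13.5)
)

# Keep measurements with age in (0, 24], then take the earliest per subject
earliest_0_24 <- measures |>
  filter(age > 0, age <= 24) |>
  arrange(id, age) |>
  group_by(id) |>
  slice(1) |>
  ungroup()

print(earliest_0_24)
```

Sorting before grouping is what makes `slice(1)` return the earliest measurement; reversing the sort order on `age` would return the latest instead.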

Usage

dh.makeStrata(
  df = NULL,
  id_var = NULL,
  age_var = NULL,
  var_to_subset = NULL,
  bands = NULL,
  mult_action = NULL,
  mult_vals = NULL,
  keep_vars = NULL,
  new_obj = NULL,
  band_action = NULL,
  conns = NULL,
  checks = TRUE,
  df_name = NULL
)

Arguments

df

Character specifying a server-side data frame.

id_var

Character giving the name of the column within df which uniquely identifies each subject.

age_var

Character specifying age or time variable in df.

var_to_subset

Character specifying variable in df to stratify according to bands.

bands

Numeric vector of alternating lower and upper values specifying the bands in which to derive strata of var_to_subset. The vector must have an even length, equal to twice the number of bands required.
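As a rough illustration of how an alternating lower/upper vector maps onto bands, the base R sketch below splits a hypothetical `bands` vector into one row per band. The values and the helper name `band_table` are assumptions for the example, not part of the package.

```r
# Hypothetical bands: 0-12, 12-24 and 24-48 (e.g. age in months)
bands <- c(0, 12, 12, 24, 24, 48)

# The vector must contain a lower and an upper value for every band
stopifnot(length(bands) %% 2 == 0)

# Odd positions are lower limits, even positions are upper limits
band_table <- data.frame(
  lower = bands[c(TRUE, FALSE)],
  upper = bands[c(FALSE, TRUE)]
)

print(band_table)
```

Three bands therefore require a vector of length six.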

mult_action

Character specifying how to handle cases where a subject has more than one measurement within a specified band. Use "earliest" to take the earliest measurement, "latest" to take the latest measurement and "nearest" to take the measurement nearest to the value(s) specified in mult_vals.

mult_vals

Numeric vector specifying, for each band, the value to which the selected measurement should be closest when a subject has more than one measurement in that band. Required only if mult_action is "nearest". The order and length of the vector should correspond to the order and number of the bands.
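The three mult_action options can be pictured with a small base R sketch for a single subject and a single band. The helper `pick_age()` and the example values are hypothetical and only mimic the selection rule described above; in particular, how the package resolves ties for "nearest" is not stated here, and this sketch simply takes the first closest value.

```r
# Hypothetical ages at which one subject was measured within the band 0-12
ages_in_band <- c(2, 7, 11)

# Illustrative selection rule for each mult_action option
pick_age <- function(ages, mult_action, target = NULL) {
  switch(mult_action,
    earliest = min(ages),
    latest = max(ages),
    # ties are resolved by taking the first closest value (an assumption)
    nearest = ages[which.min(abs(ages - target))],
    stop("unknown mult_action: ", mult_action)
  )
}

picked_earliest <- pick_age(ages_in_band, "earliest")
picked_latest <- pick_age(ages_in_band, "latest")
# target plays the role of the mult_vals entry for this band
picked_nearest <- pick_age(ages_in_band, "nearest", target = 6)
```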

keep_vars

Optionally, a vector of variable names within df to include within each stratum created.

new_obj

Character specifying the name for the created server-side object.

band_action

Character specifying how the values provided in bands are evaluated in creating the strata:

  • "g_l" = greater than the lower value and less than the upper value of each band

  • "ge_le" = greater than or equal to the lower value and less than or equal to the upper value of each band

  • "g_le" = greater than the lower value and less than or equal to the upper value of each band

  • "ge_l" = greater than or equal to the lower value and less than the upper value of each band
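The four band_action options differ only in whether each limit is inclusive. The base R sketch below spells out the comparisons; `in_band()` is a hypothetical helper written for this illustration and is not a function of the package.

```r
# Illustrative membership test for one band under each band_action option
in_band <- function(x, lower, upper, band_action) {
  switch(band_action,
    g_l = x > lower & x < upper,
    ge_le = x >= lower & x <= upper,
    g_le = x > lower & x <= upper,
    ge_l = x >= lower & x < upper,
    stop("unknown band_action: ", band_action)
  )
}

# Values sitting exactly on the limits of the band 0-12 show the difference
ages <- c(0, 6, 12)
res_g_l <- in_band(ages, 0, 12, "g_l")
res_ge_le <- in_band(ages, 0, 12, "ge_le")
res_g_le <- in_band(ages, 0, 12, "g_le")
res_ge_l <- in_band(ages, 0, 12, "ge_l")
```

When adjacent bands share a limit (for example 0-12 and 12-24), "g_le" or "ge_l" places a measurement taken exactly at the shared value in only one band, whereas "ge_le" would place it in both.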

conns

DataSHIELD connections object.

checks

Logical; if TRUE, checks are performed prior to running the function. Default is TRUE.

df_name

Retired argument name. Please use new_obj instead.

Value

Server-side data frame in wide format containing the derived variables. For each band specified, at least two variables are returned:

  • var_to_subset

  • age_var. The suffix .lower_band identifies the band for that variable.

If argument keep_vars is not NULL, then additional variables will be added to the data frame representing these variables within the strata created.