---
title: "Consistency tests in depth"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Consistency tests in depth}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
  %\VignetteDepends{pkgload}
bibliography: references.bib
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
pkgload::load_all()
```

```{r setup}
library(scrutiny)
```

# Introduction

When implementing consistency tests in R, you shouldn't have to start from zero. This vignette goes into detail on scrutiny's support system for writing new consistency testing functions. For a brief presentation, see `vignette("consistency-tests-simple")`.

Following the present vignette will dramatically simplify the implementation of basic and advanced testing routines via function factories. It will enable you to write entire families of functions in a streamlined way: If you are familiar with any one scrutiny-style consistency test, you will immediately be able to make some sense of the other ones. This is true across all levels of consistency testing.

Below is an outline of these levels, and of the vignette, with GRIM as a paradigmatic example. If a valid consistency test is newly implemented with at least the first step, I'll be happy to accept a pull request to scrutiny. This means you'll only have to implement the core test itself, without even reading the vignette any further.

1.  A bare-bones, non-exported (!) function for testing a single set of cases, such as `grim_scalar()`.

2.  A vectorized version of the single-case function, such as `grim()`.

3.  A specialized mapping function that applies the single-case function to a data frame, such as `grim_map()`.

4.  A method for the `audit()` generic that summarizes the results of number 3.

5.  A visualization function that plots the results of number 3, such as `grim_plot()`.

6.  A mapping function that checks if slightly varied input values are consistent with the respective other reported values, such as `grim_map_seq()`.

7.  A mapping function to be used if only the total sample size was reported (in a study with two groups), not the individual group sizes, such as `grim_map_total_n()`.

8.  `audit_seq()` and `audit_total_n()` already work with the output of numbers 6 and 7, respectively. They still have to be specifically documented.

I will use a toy test called SCHLIM as a model to demonstrate the minimal steps needed to implement consistency tests, scrutiny-style. Note that SCHLIM doesn't have any significance beyond standing in for serious consistency tests. Any real implementation might well be more complex than the brief code snippets below. I will also recur to existing functions that implement actual tests, and that the reader may be familiar with.

Please make sure to follow the [tidyverse style guide](https://style.tidyverse.org/) as well as the scrutiny-specific conventions laid out below, wherever applicable. If you'd like to write a new package, work with the free online book [*R Packages*](https://r-pkgs.org/) [@wickham2023].

# 1. Single-case

The first function is the most important one. It contains the core implementation of the test. Although it is not exported itself, all other steps build up on it, and all of them are exported.

This function takes two or more arguments of length 1 that are meant to be tested for consistency with each other. Typically, they will be coercible to numeric. This means they either are numeric themselves or they are strings that can be converted to numbers (see `is_numeric_like()`). The function returns a logical value of length 1: It's `TRUE` if the inputs are mutually consistent, and `FALSE` if they aren't.

```{r}
 schlim_scalar <- function(y, n) {
   y <- as.numeric(y)
   n <- as.numeric(n)
   all(y / 3 > n)
 }

schlim_scalar(y = 30, n = 4)
schlim_scalar(y = 2, n = 7)
```

Other arguments might still be necessary, especially if your function reconstructs rounded numbers. An argument that determines how the function will round numbers should be called `rounding`. The function should then internally call `reround()`. The same goes for "unrounding" (i.e., reconstructing rounding bounds) and `unround()`. See also `vignette("rounding-in-depth")`. A single-case function that performs rounding will also need a helper to count decimal places, which should be `decimal_places_scalar()`.

The function's name is that of the test in lowercase, followed by `_scalar` which refers to the one-case limit. If your function happens to be applicable to multiple value sets already due to R's natural vectorization, leave out `_scalar` and skip the next section. If you're building a package, export the function. (This will rarely be the case because every single argument needs to be vectorized.)

# 2. Vectorized

The easiest way to turn a scalar function into a vectorized (i.e., multiple-case) function is to run `Vectorize()` on it. The name of the resulting function should be the lower-case name of the test itself, which is also the name of the single-case function without `_scalar`:

```{r}
schlim <- Vectorize(schlim_scalar)

schlim(y = 10:15, n = 4)
```

Functions created this way can be useful for quick testing, but they won't be used in the remaining part of the vignette. That's because functions like `schlim()` are not great to build upon --- unlike mapper functions, which will be discussed next.

# 3. Basic mapper

## Introduction

The most important practical use of a consistency test within scrutiny is to apply it to entire data frames at once, as `grim_map()` does. That's also the starting point for every other function below.

Most functions discussed in the remaining part of the vignette deal with data frames. I always use [tibbles](https://r4ds.had.co.nz/tibbles.html), and I strongly recommend the same to you. In fact, the mapper functions introduced here require tibbles. They might not work correctly with non-tibble data frames.

## Creating basic mappers with `function_map()`

The safest and easiest way to create a (basic) mapper is via `function_map()`. A function written this way is also guaranteed to fulfill all of the requirements for mapper functions listed further below. That's a major benefit because the list of requirements is long, and all of the follow-up functions in the remaining vignette assume that the mapper fulfills them.

You will have no such troubles with `function_map()`:

```{r}
schlim_map <- function_map(
  .fun = schlim_scalar,
  .reported = c("y", "n"),
  .name_test = "SCHLIM"
)

# Example data:
df1 <- tibble::tibble(y = 16:25, n = 3:12)

schlim_map(df1)
```

These are the most important arguments:

-   `.fun` is the single-case function from section 1.

-   `.reported` is a string vector naming the reported statistics that `.fun` tests for consistency with each other. These need to be arguments of `.fun`, but `.fun` may have other arguments, as well.

-   `.name_test` simply names the consistency test.

### Context and export

As you can see, `function_map()` is not a helper used inside other functions when creating them with `function()` --- instead, it takes the place of `function()` itself. This makes it a so-called function factory, or more precisely, a [function operator](https://adv-r.hadley.nz/function-operators.html) [@wickham2019]. You already met `base::Vectorize()` in section 2, which is also a function operator, but a more general and straightforward one.

To export a function manufactured this way from your own package, make sure to follow [this purrr FAQ](https://purrr.tidyverse.org/reference/faq-adverbs-export.html). (Incredible as it sounds, scrutiny will then take on the role of purrr.) Your version should look about like this:

```r
schlim_map <- function(...) "dummy"

.onLoad <- function(lib, pkg) {
  schlim_map <<- scrutiny::function_map(
    .fun = schlim_scalar,
    .reported = c("y", "n"),
    .name_test = "SCHLIM"
  )
}
```

### Identifying columns

All such factory-made functions come with a special convenience feature: Their `.reported` values are inserted into the list of the function's parameters. This means you don't need a data frame with the same column names as the `.reported` values. Instead, you can specify the arguments by those names as the names of the actual columns:

```{r}
df2 <- df1
names(df2) <- c("foo", "bar")

df2

schlim_map(df2, y = foo, n = bar)
```

If any columns are neither present in the data frame nor identified via arguments, there will be a precise error:

```{r, error=TRUE}
schlim_map(df2, y = foo)

# With a wrong identification:
schlim_map(df2, n = mike)
```

### Drawbacks

If `function_map()` is so helpful, why would you ever not use it? There are four reasons:

-   Functions produced by `function_map()` don't have any tailor-made checks, messages, or transformations for any specific consistency test. (They do have some more general checks and error messages.)

-   They only have limited capabilities to create columns internally other than `"consistency"`: Values in such columns need to be produced by the basic `*_scalar()` function. (This might replace tailor-made functionality for creating the `reason` column in the output of the handwritten `grimmer_map()`, but it is currently experimental.)

-   They don't support helper columns (see *Terminology* below).

-   Finally, when calling such a manufactured function, any test-specific arguments the user might specify via `…` (the dots) won't trigger RStudio's autocomplete. This is not as dangerous as in some other functions that use the dots because, when calling a function produced by `function_map()`, misspelled argument names always throw an error.

`grim_map()`, `grimmer_map()`, and `debit_map()` were all "handwritten" for flexibility with columns beyond `"consistency"`. For example, the `show_rec` argument in `grim_map()` or the `ratio` column in the function's output would not have been possible with `function_map()`. However, such issues don't affect the `"consistency"` results, and simply going with `function_map()` might often be the better option. If that's what you choose to do, skip right to section 4.

## Writing mappers manually

### Introduction

The remaining part of section 3 explains how to manually write mapper functions like `grim_map()`, `grimmer_map()`, or `debit_map()`. It is quite detailed because it's important to get these things right: Every other function in the rest of this vignette builds up on it. Still, the practical steps are not that complicated, as you can see in the code examples.

### Terminology

It's important to distinguish between *key arguments or columns* and other arguments or columns. The key arguments in a scalar or vectorized consistency-testing function are the values that are tested for consistency with each other, such as `x` and `n` in `grim()`. By extension, key columns are those that contain such values. Every key column has the same name as the respective key argument.

A *helper column* is a column that is not key itself, but still factors into the consistency test. An example is the optional `items` column in `grim_map()`'s input data frame: It transforms the `n` column, which in turn affects the test outcomes. However, helper columns need not work via key columns.

Key and helper columns are *tested* columns because they factor into the test. Any other columns are *non-tested*.

### Requirements

A general system for implementing consistency tests needs some consistency itself. This is especially true for basic mapper functions, because all functions further down the line rely on the mapper's output having some very specific properties.

The level of detail in these requirements might seem pedantic. I still encourage you to follow every step when handwriting a new mapping function. It's easier than it looks at first, and many aspects are indispensable. That is because the interplay between the mapper and the higher-level functions follows a carefully concerted system. If the mapper misses any one ingredient, those other functions may fail.

The (only) requirements for a basic mapping function are:

1.  Its name ends on `_map` instead of `_scalar` but is otherwise the same as the name of the respective `_scalar` function.

2.  Its first argument, `data`, is a tibble (data frame) that contains the key columns of the respective consistency test. Other columns are permitted. The mapper's user never needs to include helper columns and can always replace them by specifying arguments by the same names as those columns. If the user specifies such an argument but the input data frame contains a column by the same name, the function throws an error. No column of the input data frame should be named `"consistency"`.

3.  Its return value is a tibble data frame that contains all of the key input columns. The types of these columns are the same as in the input data frame. They are the first (i.e., leftmost) columns in the output, even if the input isn't ordered this way. If any of these columns are modified within the mapping function, the output should include the modified columns, not the original ones. Examples can be effects of helper columns, but also the change displayed by the `"x"` column in the output of `grim_map(percent = TRUE)`. The output must test `TRUE` with `is_map_df()` and `is_map_basic_df()`, but `FALSE` with the other two `is_map_*()` functions.

4.  Helper columns should be included in the output unless they transform one or more key columns. In this case, representing them in the output would be confusing because their effects have already played out via the transformed key column(s). For every helper column that performs such transformations, the mapper should have a logical argument, `TRUE` by default, that determines whether or not the helper column transforms those key column(s). If `TRUE`, the helper column does but is not included in the output itself. If `FALSE`, the helper column is included but no transformation takes place. The name of this logical argument should start on `merge_`, followed by the name of the helper column in question. An example for all of that is the optional `items` column in `grim_map()`'s input data frame, together with this function's `items` and `merge_items` arguments. To work with helper columns, call `manage_helper_col()` within the mapper.

5.  The output data frame also includes a logical column named `"consistency"`. It contains the results of the consistency test, as determined by the respective `*_scalar()` function. In each row, `"consistency"` is `TRUE` if the values to its left are mutually consistent, and `FALSE` if they aren't. This column is placed immediately to the right of the group of key (and, potentially, helper) columns.

6.  If the underlying single-case function performs rounding or unrounding, it should internally call `reround()` and/or `unround()`, respectively. The output data frame of the mapper function will then inherit an S3 class (see section *S3 classes* below) such as `"scr_rounding_up_or_down"`: It consists of `"scr_rounding_"` followed by the rounding specification, e.g., `"up_or_down"`. The latter should also be the default rounding and unrounding specification. This specification can be supplied by the user via an argument called `rounding`, which is then passed down to the single-case function. If `reround()` is called within the mapper, all of its arguments need to be passed down from the mapper, which itself has all of the same arguments, with the same defaults. The same applies to `unround()`.

7.  The output data frame inherits an S3 class that starts on `"scr_"` (short for scrutiny), followed by the name of the mapper function. For example, the output of `grim_map()` inherits the `"scr_grim_map"` class. The `"scr_"` prefix is necessary for some follow-up computations introduced below, so it should be used even within functions that are not part of scrutiny. Any other classes added to the output data frame should also start on `"scr_"`. None of them should end on `"_map"`.

### Implications

Some implications of these requirements, and of the fact that the design space for mapper functions is not restricted in any other ways:

Anything that factors into the consistency test other than tested columns needs to be conveyed to the mapper function via arguments. An example is the `rounding` argument in `grim_map()`. Mapper functions don't need to allow for helper columns.

The input data frame is not necessarily a tibble, but the output data frame is. The input data frame never contains a column named `"consistency"`, but the output data frame always does.

Key columns may or may not be modified by helper columns and/or by arguments. The number of key columns doesn't change between the input and output data frames.

The output data frame may or may not contain non-tested columns from the input. It may or may not contain non-tested columns created within the mapper function itself. (This can be useful, as with `"ratio"` in `grim_map()`'s output.) Any such non-tested, non-`"consistency"` columns go to the right of `"consistency"`.

If the number of key columns plus the number of helper columns in the output is $k$, the index of `"consistency"` is $k+1$.

Besides the `"scr_*_map"` class, the output data frame may inherit any number of other classes added within the mapper, so long as they start on `"scr_"` but don't end on `"_map"`. It can't inherit the `"grouped_df"` or `"rowwise_df"` classes added by `dplyr::group_by()` and `dplyr::rowwise()`, respectively. If either of these functions is called within the mapper, it needs to be followed by `dplyr::ungroup()` at some point.

```{r eval=FALSE, include=FALSE}
# NOTE: The two diagrams below should have width 585 and a locked aspect ratio.
```

The columns of the input data frame are organized like this:

![](images/consistency_test_columns_input-01.png){width="585"}

By contrast, the columns of the output data frame are organized like this:

![](images/consistency_test_columns_output-01.png){width="585"}

### Practical steps

```{r include=FALSE}
# Just internally, so that the function source code below works:
add_class <- scrutiny:::add_class
```

How to actually write mapper functions? Again, I recommend `function_map()`. The two functions created below are very similar but don't have as many options. Of course, you can manually add code for any options you like so long as the requirements above are still met.

Apply the `*_scalar()` function to the input data frame using `purrr::pmap_lgl()`:

```{r}
schlim_map_alt1 <- function(data, ...) {
  scrutiny::check_mapper_input_colnames(data, c("y", "n"), "SCHLIM")
  tibble::tibble(
    y = data$y,
    n = data$n,
    consistency = purrr::pmap_lgl(data, schlim_scalar, ...)
  ) %>% 
    add_class("scr_schlim_map")  # See section "S3 classes" below
}
```

Alternatively, you might call `dplyr::rowwise()` and directly mutate `"consistency"`. I don't recommend this approach because `dplyr::rowwise()` is quite slow.

```{r}
schlim_map_alt2 <- function(data, ...) {
  scrutiny::check_mapper_input_colnames(data, c("y", "n"), "SCHLIM")
  data %>% 
    dplyr::rowwise() %>% 
    dplyr::mutate(consistency = schlim_scalar(y, n, ...)) %>% 
    dplyr::ungroup() %>% 
    dplyr::relocate(y, n, consistency) %>% 
    add_class("scr_schlim_map")  # See section "S3 classes" below
}
```

The call to `check_mapper_input_colnames()` is not required but adds safety to your function. Also, see `manage_key_colnames()` which grants the user more flexibility in naming key columns.

Both approaches should lead to the same results:

```{r}
schlim_map_alt1(df1)

schlim_map_alt2(df1)
```

### Testing

You should let `function_map()` produce an equivalent function to make sure that it returns the same output as your handwritten one. To compare the two output data frames, don't just eyeball them. Use `waldo::compare()` or, if you already run tests with testthat, `expect_equal()`.

If your handwritten mapper creates new columns beyond `"consistency"`, you'll have to remove them from the output first. Don't use helper columns when testing because `function_map()` can't handle them.

### S3 classes

If you don't know what S3 classes are, don't worry. Just copy and paste the function below, and call it at the end of your mapper function. `x` is the output data frame, and `new_class` is a string vector. `new_class` consists of one or more "classes" that will be added to the existing classes of `x`.

```{r, eval=FALSE}
add_class <- function(x, new_class) {
  class(x) <- c(new_class, class(x))
  x
}
```

You can access the classes that an object carries --- or "inherits" --- by calling `class()`:

```{r}
some_object <- tibble::tibble(x = 5)
some_object <- add_class(some_object, "dummy class")
class(some_object)
```

### Internal helpers

Within scrutiny, many functions that are exported for users internally call helper functions that are not, such as `add_class()`. You might be writing your own function following the design of an exported scrutiny function, but suddenly you can't access an unknown function that you seem to need!

If you'd like to employ such an internal helper for yourself, specify its namespace with three colons, like `scrutiny:::add_class`. However, you should only use this trick to copy and paste the helper's source code into your own source code. (That's why I left out the parentheses --- return the function itself.) Never rely on calling a function with `:::`, because these internals are not actually meant for users. They can easily shift and vanish without notice.

If you develop your own package, see [this blogpost](https://www.tidyverse.org/blog/2022/09/playing-on-the-same-team-as-your-dependecy/) by Thomas Lin Pedersen for more information on using internal code from other packages. In particular, package developers should mind licenses when copying code from scrutiny because scrutiny is GPL-3 licensed.

When directly looking for internal helpers in scrutiny's source code, start at the [utils.R](https://github.com/lhdjung/scrutiny/blob/main/R/utils.R) file. Most helpers can be found there, and every helper in utils.R is documented.

# 4. `audit()` method

## Introduction

`audit()` is an S3 generic for summarizing scrutiny's test result data frames, especially those of mapper functions such as `grim_map()`. It should always return descriptive statistics but nothing else. Every mapper function should have its corresponding `audit()` method.

This is an aspect of object-oriented programming (OOP), but scrutiny's use of OOP is simple even by the low standards of R. Your mapper function's output already inherits a specific class, such as `"scr_grim_map"`, `"scr_grimmer_map"`, or `"scr_debit_map"`. In `schlim_map()`, we added the `"scr_schlim_map"` class in addition to existing classes:

```{r}
df1_tested <- schlim_map(df1)
class(df1_tested)
```

### Basics

Every `audit()` method for consistency test results should be the same insofar as all consistency tests are the same. It should have a single argument named `data`. Its return value should be a tibble with at least these columns:

1.  `incons_cases` counts the inconsistent cases, i.e., the number of rows in the mapper's output where `"consistency"` is `FALSE`.

2.  `all_cases` is the total number of rows in the mapper's output.

3.  `incons_rate` is the ratio of `incons_cases` to `all_cases`.

Apart from these, see for yourself which descriptive statistics your `audit()` method should compute. Means of variables in the `*_map()` function's output and their ratios to each other might be sensible choices.

All existing `audit()` methods for consistency tests return tibbles with a single row only. This makes sense because there is no obvious grouping variable for the input data frame, which would lead to multiple rows in `audit()`'s output. However, there might be good reasons for multiple rows when summarizing the results of other tests, so this is not a requirement.

### Practical steps

Your `audit()` method is simply a function named `audit` plus a dot and your specific class. Call `audit_cols_minimal()` within the method to create a tibble with the three required columns. If you don't use `audit_cols_minimal()`, call `check_audit_special()` in your method.

```{r, error=TRUE}
# The `name_test` argument is only for the alert
# that might be issued by `check_audit_special()`:
audit.scr_schlim_map <- function(data) {
  audit_cols_minimal(data, name_test = "SCHLIM")
}

# This calls our new method:
audit(df1_tested)

# This doesn't work because no method was defined:
audit(iris)
```

You can still add other summary columns to the tibble returned by `audit_cols_minimal()`. Use `dplyr::mutate()` or similar.

## Documentation template

Each `audit()` method should be documented on the same page as its respective mapper function. It should have its own section called *Summaries with `audit()`*. Create it with `write_doc_audit()`:

```{r}
audit_grim    <- audit(grim_map(pigs1))
audit_grimmer <- audit(grimmer_map(pigs5))

write_doc_audit(sample_output = audit_grim,  name_test = "GRIM")

write_doc_audit(sample_output = audit_grimmer, name_test = "GRIMMER")
```

This function prepares a roxygen2 block section. It fills the three standard columns out for you, and it leaves space to describe any other columns there might be. Also, the internal checks of `write_doc_audit()` make sure that you programmed a correct `audit()` method, as represented by the value of the `sample_output` argument.

Copy the output from the console and paste it into the roxygen2 block of your `*_map()` function. To preserve the numbered list structure when indenting roxygen2 comments with `Ctrl`+`Shift`+`/`, leave empty lines between the pasted output and the rest of the block.

# 5. Visualization function

## Introduction

It is hard to give general advice on how to implement visualization functions for the results of consistency tests. As with the `*_scalar()` function, the best way to plot such results greatly depends on the idiosyncratic nature of the consistency test itself. When comparing the looks of `grim_plot()` and `debit_plot()`, it becomes clear that two very different things are going on. (This is mainly because granularity is crucial for GRIM but not for DEBIT.)

## Requirements

Nevertheless, some general requirements do apply to scrutiny-style visualization functions. They are much more like arbitrary conventions than the requirements for mapper functions, which often meet very precise technical needs. Visualization functions, however, are not the basis for any other computations apart from modifications by additional ggplot2 layers.

As a result, the rules below are admittedly somewhat less important. If you violate them, nobody but me will be sad about it.

1.  All visualization functions should be based on ggplot2. They should follow its developers' general advice on [using ggplot2 in packages](https://ggplot2.tidyverse.org/articles/ggplot2-in-packages.html). Visualization functions don't need to implement any newly created layers, such as geoms or themes. Indeed, neither of the two existing visualization functions relies on any new layers.
2.  The visualization function's name should be that of the test itself (in lowercase), followed by `_plot`. Naturally, this doesn't apply to methods for generic functions like `plot()` or `ggplot2::autoplot()`.
3.  Its first argument, `data`, is a data frame that is the result of a call to the respective mapper function, such as `grim_map()` or `debit_map()`. The visualization function makes sure this is true by checking that `data` inherits the special class added within the mapper, such as `"scr_grim_map"` or `"scr_debit_map"`. If `data` fails this check, the function throws an error.
4.  The function should display consistent and inconsistent value sets. The color defaults should be `"royalblue1"` for consistent value sets and `"red"` for inconsistent ones. The user can override these defaults via two arguments named `color_cons` for consistent value sets and `color_incons` for inconsistent ones.
5.  If certain layers are optional rather than essential to the plot, their display can be controlled via logical arguments that start on `show_`. Examples are `show_data` in `grim_plot()` or `show_outer_boxes` in `debit_plot()`. Only arguments of this kind should start on `show_`. They should have defaults (which will usually be `TRUE`, but this is not a requirement).

# 6. Sequence mapper

## Introduction

When reported values are inconsistent, it's never obvious why. Consistency tests provide mathematical certainty in their results, but there is a trade-off: They don't suggest any clear causal story about the summary statistics. (Contrast this with a reconstruction technique such as [SPRITE](https://lukaswallrich.github.io/rsprite2/index.html), which does not aim at mathematical proof but does point towards major issues with the origins of the data.)

One possible reason for inconsistencies lies in small mistakes in computing and/or reporting by the original researchers. Indeed, when @brown2017 reanalyzed some of the data sets behind GRIM inconsistencies, they often found "a straightforward explanation, such as a minor error in the reported sample sizes, or a failure to report the exclusion of a participant" (p. 368).

It may therefore be useful to test the numeric neighborhood of inconsistent reported values. Are there any nearby values that are consistent with the other statistics? If so, how many and where? The problem might then be due to a simple oversight. However, it would be very cumbersome to test each candidate value manually, or even to test sequences that were manually created with functions such as `seq_distance()`.

Fortunately, scrutiny semi-automates this process. `grim_map_seq()`, `grimmer_map_seq()` and `debit_map_seq()` provide an instant assessment of whether or not inconsistent reported values are close to consistent numbers. They also allow the user to specify how many steps away from the reported value are permitted when looking for consistent ones, as well as some other options.

## Practical steps

Although the code that underlies them is fairly complex, these functions themselves were written in a very simple way. Here are the ones for GRIM, GRIMMER, and DEBIT:

```{r, eval=FALSE}
grim_map_seq <- function_map_seq(
  .fun = grim_map,
  .reported = c("x", "n"),
  .name_test = "GRIM",
)

grimmer_map_seq <- function_map_seq(
  .fun = grimmer_map,
  .reported = c("x", "sd", "n"),
  .name_test = "GRIMMER"
)

debit_map_seq <- function_map_seq(
  .fun = debit_map,
  .reported = c("x", "sd", "n"),
  .name_test = "DEBIT",
)
```

Any consistency test that is already implemented in a basic mapper function like `grim_map()`, `grimmer_map()`, and `debit_map()` can receive its own `*_map_seq()` function just as easily using `function_map_seq()`. This is due to scrutiny's streamlined design conventions --- specifically, the requirements for mapper functions laid out in section 3.

Let's write a sequence mapper for SCHLIM:

```{r}
schlim_map_seq <- function_map_seq(
  .fun = schlim_map,
  .reported = c("y", "n"),
  .name_test = "SCHLIM"
)

# Test dispersed sequences:
out_seq <- schlim_map_seq(df1)
out_seq

# Summarize:
audit_seq(out_seq)
```

By default, a `*_map_seq()` function only creates sequences around inconsistent input values. That's because its primary purpose is to shed light on inconsistencies in reported statistics. Override the default with `include_consistent = TRUE`:

```{r}
df1 %>% 
  schlim_map_seq(include_consistent = TRUE) %>% 
  audit_seq()

# Compare with the original values:
df1
```

As with `function_map()`, if you want to export a function produced by `function_map_seq()`, follow [this purrr FAQ](https://purrr.tidyverse.org/reference/faq-adverbs-export.html).

# 7. Total-n mapper

## Introduction

The reporting of summary statistics is often insufficient --- certainly from an error detection point of view. In particular, values such as means and standard deviations are not always accompanied by their respective group sizes, but only by a total sample size.

This presents a problem for consistency tests that rely on reported group sizes, such as GRIM. It requires splitting the reported total into groups and creating multiple plausible scenarios of group sizes that each add up to the total. Although no definitive test results can be gained this way, it does help to see whether reported values are consistent with at least some of the plausible group sizes [@bauer_expression_2021].

## Practical steps

`function_map_total_n()` creates new functions which follow this very scheme by applying a given consistency test to multiple combinations of reported and hypothetical summary statistics. It is the powerhouse behind `grim_map_total_n()`, `grimmer_map_total_n()`, and `debit_map_total_n()`, just as `function_map_seq()` is the powerhouse behind `grim_map_seq()`, `grimmer_map_seq()`, and `debit_map_seq()`. See the case study in `vignette("grim")` , section *Handling unknown group sizes with `grim_map_total_n()`*, for an example of how `grim_map_total_n()` works out in practice.

As with `function_map_seq()`, creating a manufactured `*_total_n()` function is very easy. Just let the function factory do the work for you:

```{r, eval=FALSE}
grim_map_total_n <- function_map_total_n(
  .fun = grim_map,
  .reported = "x",  # don't include `n` here
  .name_test = "GRIM"
)

grimmer_map_total_n <- function_map_total_n(
  .fun = grimmer_map,
  .reported = c("x", "sd"),  # don't include `n` here
  .name_test = "GRIMMER"
)

debit_map_total_n <- function_map_total_n(
  .fun = debit_map,
  .reported = c("x", "sd"),  # don't include `n` here
  .name_test = "DEBIT"
)
```

To drive this point home, let's do the same with SCHLIM:

```{r}
schlim_map_total_n <- function_map_total_n(
  .fun = schlim_map,
  .reported = "y",
  .name_test = "SCHLIM"
)

# Example data:
df_groups_schlim <- tibble::tribble(
  ~y1, ~y2, ~n,
   84,  37,  29,
   61,  55,  26
)

# Test dispersed sequences:
out_total_n <- schlim_map_total_n(df_groups_schlim)
out_total_n

# Summarize:
audit_total_n(out_total_n)
```

The same pattern can be applied to any other basic mapper function that fulfills the requirements from section 3. One of the columns, `n`, will have its values dispersed from half, internally using `disperse_total()`.

See the advice on exporting manufactured functions at the end of section 6.

# 8. Documenting `audit_seq()` and `audit_total_n()`

## Introduction

The output of sequence mappers and total-n mappers is very comprehensive. This makes it somewhat unwieldy and creates a need for summaries. As a first step, the user can always call `audit()` on the tibbles returned by manufactured functions like `grim_map_seq()`. It will go by the `"*_map"` class added within the basic mapper function, such as `grim_map()`, and return the regular output of the respective `audit()` method.

However, scrutiny features two specialized functions for summarizing the results of manufactured `*_seq()` or `*_total_n()` functions: `audit_seq()` and `audit_total_n()`. These two are not generic like `audit()`, but they only accept the output of functions produced by `function_map_seq()` and `function_map_total_n()`, respectively.

You will notice that I have spoken about two existing functions, rather than --- as in the other sections --- a kind of function that you should be writing. Indeed, there is nothing left for you to do about `audit_seq()` and `audit_total_n()` themselves, unless you find a bug in them! What you should do, however, is to document their behavior with regard to the specific test that you have implemented.

## Documentation templates

`audit_seq()` and `audit_total_n()` rely on the uniform design of the manufactured functions, which allows them to compute essentially the same summaries: Their behavior only varies with the names and numbers of the key columns, which in turn follow straightforwardly from the nature of the consistency test. If you're developing a package, you should therefore document the behavior of `audit_seq()` and `audit_total_n()` on the same pages as your manufactured `*_map_seq()` and `*_map_total_n()` functions.

There are specialized helpers for creating the respective documentation sections, `write_doc_audit_seq()` and `write_doc_audit_total_n()`, in analogy to `write_doc_audit()`. Here is how I used the first one for `grim_map_seq()`, `grimmer_map_seq()`, and `debit_map_seq()`; but with the output omitted to save space:

```{r, eval=FALSE}
write_doc_audit_seq(key_args = c("x", "n"), name_test = "GRIM")
write_doc_audit_seq(key_args = c("x", "sd", "n"), name_test = "GRIMMER")
write_doc_audit_seq(key_args = c("x", "sd", "n"), name_test = "DEBIT")
```

`key_args` is a string vector with the names of the respective test's key arguments. (You will see that the function is sensitive to the length of `key_args`, not just to its values.) `name_test` is the short, plain-text name of that consistency test itself.

Copy the output from the console and paste it into the roxygen2 block of your `_map_seq` function. To preserve the bullet-point structure when indenting roxygen2 comments with `Ctrl`+`Shift`+`/`, leave empty lines between the pasted output and the rest of the block.

Likewise, documenting `audit_total_n()` for `grim_map_total_n()`, `grimmer_map_total_n()`, and `debit_map_total_n()`:

```{r, eval=FALSE}
write_doc_audit_total_n(key_args = c("x", "n"), name_test = "GRIM")
write_doc_audit_total_n(key_args = c("x", "sd", "n"), name_test = "GRIMMER")
write_doc_audit_total_n(key_args = c("x", "sd", "n"), name_test = "DEBIT")
```

Why did I develop, and export, such strange functions? Documenting one's package should not be glossed over, and there is value in standardization, as well. `write_doc_audit_seq()` and `write_doc_audit_total_n()` deliver quality documentation with little effort while also establishing firm conventions for it.

# Wrap-up

The key part about consistency tests is compelling mathematical insight into the relationship between summary statistics. All the rest is implementation and application. No software package can generate new consistency tests as of yet, but it can make their implementation and application at scale as easy as possible. That is what scrutiny hopes to do.

This vignette generated five example functions (not counting the `audit()` method), four of them via function factories. Three of these factories are part of scrutiny's infrastructure. Starting with a simple mock test, an entire family of functions sprang up to apply that test in special cases, with a unified API, and at any scale --- all with just a few lines of easy to write code.

Below is their function family tree. Fields with bold margins are function factories. Arrows passing through them indicate that a new function is generated on the basis of an earlier one.

```{r include=FALSE, eval=FALSE}
# Note: The diagram was made on diagrams.net. The bold margins were created as follows: (1) selecting the respective field, (2) clicking on the `View` symbol at the upper left, (3) selecting `Format Panel`, and (4) setting the line thickness from 1 to 3 pt.
```

![](images/scrutiny_mappers_schlim.png)

Here is an overview of scrutiny's function factories:

| Output function type | Section here | Function factory         | GRIM example function | Predicate function    |
|---------------|---------------|---------------|---------------|---------------|
| Basic mapper         | 3            | `function_map()`         | `grim_map()`          | `is_map_basic_df()`   |
| Sequence mapper      | 6            | `function_map_seq()`     | `grim_map_seq()`      | `is_map_seq_df()`     |
| Total-n mapper       | 7            | `function_map_total_n()` | `grim_map_total_n()`  | `is_map_total_n_df()` |

Predicate functions are those that return `TRUE` for the data frames returned by the factory-made functions. The more general `is_map_df()` returns `TRUE` for all those data frames. As explained in section 3, `grim_map()` was written "by hand" rather than produced by `function_map()`, but such a factory-made function would be equivalent except for some additional columns in the output.

# References