FitterJitter()
A jitter function that is fitter (better) than R's base jitter() function.
• Current version: 27 April 2016.
Download FitterJitterRfunc160427.R
What should jittering do (and not do)?
Consider a very common statistical graphic: plotting the values of a numeric variable, Y, versus the levels of one or more factors (categorical variables), or versus another numeric variable, X. Tied or nearly tied values of Y are not a problem when they belong to different groups or to disparate values of X. But when they occur within the same group or at the same X value, they can often be jittered slightly so that each observation becomes a distinct mark in the plot.
Sound jittering separates values minimally and in accord with the data analysis and its presentation. No Y value should be jittered unnecessarily. The amount and direction of each jitter should be deterministic; there is no need to introduce more randomness.
R's standard jitter() function is crudely simplistic in that it adds random noise to every Y value whether or not any jittering is needed.
On the other hand, FitterJitter():
- jitters only Y values that are tied or nearly tied within a group.
- uses a deterministic algorithm, one designed to facilitate seeing how a cluster of jittered values stems from its set of original Y values.
- can jitter both additively (default) and multiplicatively (LogY=TRUE). In short, if Y is analyzed as log(Y) (any base), it should be plotted with log scaling and thus should be jittered accordingly.
- maintains mean structures exactly. The mean of each cluster of jittered values is unchanged from the mean of the Y values of that cluster. This also preserves the group means. If LogY=TRUE, the geometric means of the clusters and groups are preserved.
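The properties listed above can be illustrated with a minimal sketch. This is not FitterJitter's actual algorithm (see the Rfunc file for that); `spread_ties` is a hypothetical helper that handles exact ties only, within a single group, and assumes the jittered values are placed symmetrically around the tied value.

```r
# Hypothetical helper, NOT the FitterJitter code: spread exact ties
# symmetrically around their common value so the cluster mean is unchanged.
# With log_y = TRUE the work is done on the log scale, so the geometric
# mean is preserved instead of the arithmetic mean.
spread_ties <- function(y, gap, log_y = FALSE) {
  z    <- if (log_y) log(y) else y
  step <- if (log_y) log(gap) else gap
  out  <- z
  for (v in unique(z[!is.na(z)])) {
    idx <- which(z == v)          # positions of this tied value (NAs skipped)
    k   <- length(idx)
    if (k > 1) {
      # Offsets centered on zero: -(k-1)/2, ..., (k-1)/2 steps
      out[idx] <- v + (seq_len(k) - (k + 1) / 2) * step
    }
  }
  if (log_y) exp(out) else out
}

y     <- c(3, 5, 5, 5, 8, 8, NA)
y_jit <- spread_ties(y, gap = 0.1)            # untied 3 and the NA are untouched

y_log <- spread_ties(c(10, 10, 10), gap = 1.01, log_y = TRUE)
```

Running the sketch twice gives identical results, and `mean(y_jit, na.rm = TRUE)` equals `mean(y, na.rm = TRUE)` up to floating-point rounding. Base `jitter(y)`, by contrast, would perturb all six non-missing values with random noise.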
Arguments
Y = <numeric object>
LogY = FALSE {=TRUE}
Groups = NA {=groupvar, =list(groupvarA, groupvarB, ...)}
JitGap = "auto" {= "rangeY/50", = <numeric value>}
Factor = 1 {= <numeric value>}
Print = FALSE {= TRUE}
Y = <numeric object>
- The numeric variable to be jittered; may have NA values.
LogY = FALSE {=TRUE}
- If =TRUE, the jittering is multiplicative rather than additive, which preserves the geometric means. See JitGap= and Factor=. Otherwise (default), the jittering is additive and the ordinary (arithmetic) means are preserved.
Groups = NA {=groupvar, =list(groupvarA, groupvarB, ...)}
- The group variable(s), optional. For one group variable, use Groups = groupvar. For more than one, use Groups = list(groupvarA, groupvarB, ...).
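To see why grouping matters, the sketch below flags which observations are tied within their own group cell. It uses base R only and does not call FitterJitter. The variable names (`gA`, `gB`, `cell`, `needs_jitter`) are illustrative assumptions, and only exact ties are detected, whereas FitterJitter also handles near ties.

```r
y  <- c(5, 5, 5, 5, 7)
gA <- c("a", "a", "b", "b", "b")
gB <- c("x", "y", "x", "x", "x")

# Combine several grouping variables into a single cell label
cell <- interaction(gA, gB, drop = TRUE)

# For each observation, count how often its Y value occurs within its own cell
n_tied <- ave(y, cell, y, FUN = length)

# Only observations that share a value with another one in the same cell
# would need any jittering at all
needs_jitter <- n_tied > 1
```

Here all four values of 5 are tied overall, but only the third and fourth observations share a cell, so only those two are flagged.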
JitGap = "auto" {= "rangeY/50", = <numeric value>}
- Gap between adjacent jittered values. By default (JitGap="auto", or "a" or "A"), jitgap is set by an algorithm that gives spacing similar to the default spacing of jitter(..., amount=NULL). Unlike jitter(), FitterJitter considers only the smallest differences within the groups, and the jitter is deterministic rather than random noise. See the details in the code.
- JitGap = "rangeY/50" (or "r" or "R") sets jitgap to 1/50th of the range of all the Y values taken as a single group, but the jittering is still done within groups. This is the non-random alternative to jitter(..., amount=0).
- JitGap = <numeric value>. For LogY=FALSE (default), JitGap= must exceed 0.00; for LogY=TRUE, it must exceed 1.00. For example, let Y.jit[i+1] > Y.jit[i] be two adjacent jittered values arising from a tied Y value in the same group. If LogY=FALSE and jitgap = JitGap*Factor = 0.05, then Y.jit[i+1] = Y.jit[i] + 0.05, an "additive" jitter. If LogY=TRUE and jitgap = JitGap^Factor = 1.01, then Y.jit[i+1] = Y.jit[i]*1.01, a multiplicative jitter.
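The arithmetic in the JitGap options above can be checked with a few lines of base R. This is a sketch only: the centered placement of the jittered values around the tied value is an assumption made for illustration, and the names `y_tied`, `offsets`, and `gap_range` are not part of FitterJitter.

```r
y_tied  <- 4                              # a Y value that occurs three times
n       <- 3
offsets <- seq_len(n) - (n + 1) / 2       # -1, 0, 1 (assumed centered layout)

# Additive jitter (LogY = FALSE) with jitgap = 0.05
jitgap_add <- 0.05
y_add <- y_tied + offsets * jitgap_add
step_add <- diff(y_add)                   # adjacent differences, all 0.05

# Multiplicative jitter (LogY = TRUE) with jitgap = 1.01
jitgap_mult <- 1.01
y_mult <- y_tied * jitgap_mult ^ offsets
step_mult <- y_mult[-1] / y_mult[-n]      # adjacent ratios, all 1.01

# The "rangeY/50" rule for the additive case: 1/50th of the overall range
Y <- c(2, 4, 4, 9, NA)
gap_range <- diff(range(Y, na.rm = TRUE)) / 50
```

In the additive case the adjacent differences are constant; in the multiplicative case the adjacent ratios are constant, which gives even spacing on a log-scaled axis.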
Factor = 1 {= <numeric value>}
- Multiplier for the jitgap value. For LogY=FALSE, jitgap is reset to jitgap = jitgap*Factor. For LogY=TRUE, jitgap is reset to jitgap = jitgap^Factor.
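A short numeric check shows why Factor acts as an exponent when LogY=TRUE: raising the gap to a power scales the spacing on the log scale by that same amount, just as multiplying does on the original scale. The starting gap values below are arbitrary illustrations, not FitterJitter defaults.

```r
Factor <- 2

# LogY = FALSE: the gap is multiplied by Factor
base_gap_add <- 0.05
gap_add <- base_gap_add * Factor             # twice the additive spacing

# LogY = TRUE: the gap is raised to the power Factor
base_gap_mult <- 1.01
gap_mult <- base_gap_mult ^ Factor

# On the log scale, the multiplicative gap has grown by exactly Factor
log_ratio <- log(gap_mult) / log(base_gap_mult)
```

Because the result is a ratio of logarithms, it holds for logarithms of any base.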
Print = FALSE {= TRUE}
- If =TRUE, non-error messages are printed.
Examples
See "RunExamples" block of code in the Rfunc file.