Install uuid Package from CRAN
The uuid package is available on CRAN and provides simple UUID generation for R. Install it once, then load it in any project or script that needs unique identifiers for statistical analysis or data science workflows.
CRAN Installation
install.packages("uuid")
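In scripts that are shared or run repeatedly, it can help to install the package only when it is missing. The following is a minimal sketch using base R's requireNamespace(); it assumes a CRAN mirror is already configured for the session.

```r
# Install uuid only if it is not already available in the library path
if (!requireNamespace("uuid", quietly = TRUE)) {
  install.packages("uuid")
}

library(uuid)

# Report which version ended up being loaded
packageVersion("uuid")
```

This keeps the install step out of the way on machines where the package is already present.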
Generate Random UUIDs with UUIDgenerate()
The UUIDgenerate() function creates random (version 4) UUIDs by default, provided a reliable source of random numbers is available on the system. Pass the n argument to generate several UUIDs in one call, which is convenient for adding unique identifiers to entire datasets or experimental designs.
library(uuid)
# Generate single UUID v4
id <- UUIDgenerate()
print(id)
# [1] "550e8400-e29b-41d4-a716-446655440000"
# Generate multiple UUIDs
ids <- UUIDgenerate(n = 5)
print(ids)
# (example output; the values differ on every call)
# [1] "550e8400-e29b-41d4-a716-446655440000"
# [2] "6ba7b810-9dad-41d1-80b4-00c04fd430c8"
# [3] "7c9e6679-7425-40de-944b-e07fc1f90ae7"
# [4] "886313e1-3b8a-4372-9b90-0c9aee199e5d"
# [5] "8e5d3a04-5b2f-4e3c-a6d9-9f4b7c3e2a1d"
# Store in variable
experiment_id <- UUIDgenerate()
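Because the generated values are plain character strings, their shape can be checked with base R. The sketch below assumes the canonical lowercase 8-4-4-4-12 layout and forces random generation with use.time = FALSE; the variable names are illustrative only.

```r
library(uuid)

# Force random (version 4) generation so the version check is predictable
ids <- UUIDgenerate(use.time = FALSE, n = 1000)

# Canonical textual form: 8-4-4-4-12 lowercase hexadecimal digits
pattern <- "^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$"
well_formed <- all(grepl(pattern, ids))

# Character 15 of the string is the version digit
versions <- unique(substr(ids, 15, 15))

# Collisions are not expected among random UUIDs
no_duplicates <- anyDuplicated(ids) == 0

well_formed
versions
no_duplicates
```

A check like this is cheap to run before the identifiers are written to a file or database.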
UUID Generation Options and Parameters
The uuid package supports different UUID generation methods including random (v4) and time-based (v1) UUIDs. You can also control the output format to get either character strings or raw bytes depending on your application requirements.
Random UUID (Default)
# Default: random UUID (v4)
uuid_v4 <- UUIDgenerate()
# Multiple random UUIDs
uuids <- UUIDgenerate(n = 10)
Random UUIDs; the quality of the randomness depends on the random number source available on your system
Time-based UUID
# Time-based UUID (v1)
uuid_v1 <- UUIDgenerate(use.time = TRUE)
# Multiple time-based UUIDs
time_ids <- UUIDgenerate(n = 5, use.time = TRUE)
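Embeds the creation timestamp and a node identifier. The textual form does not sort chronologically, because the low-order bits of the timestamp come first in a version 1 UUID.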
Uppercase Format
# Uppercase UUID
upper_uuid <- toupper(UUIDgenerate())
# Multiple uppercase UUIDs
toupper(UUIDgenerate(n = 3))
Convert to uppercase for specific requirements
Raw Bytes Output
# Output as raw bytes
raw_uuid <- UUIDgenerate(output = "raw")
# Useful for binary storage
class(raw_uuid) # "raw"
Binary format for efficient storage
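To see how these options differ in practice, the sketch below inspects the version digit of string UUIDs and the size of the raw output. The helper uuid_version() is hypothetical (not part of the package), and the raw-output shapes reflect my reading of the package documentation: 16 bytes for a single UUID, and a raw matrix with one column per UUID when n is greater than 1.

```r
library(uuid)

# Hypothetical helper: the version digit is character 15 of the string form
uuid_version <- function(x) substr(x, 15, 15)

v_random <- uuid_version(UUIDgenerate(use.time = FALSE))  # random UUID
v_time   <- uuid_version(UUIDgenerate(use.time = TRUE))   # time-based UUID

# Raw output: a single UUID is 128 bits, i.e. 16 bytes
raw_one <- UUIDgenerate(output = "raw")
length(raw_one)

# Several raw UUIDs at once; inspect the shape rather than assuming it
raw_many <- UUIDgenerate(n = 3, output = "raw")
dim(raw_many)
```

Checking dim() before indexing avoids relying on a particular layout if the package behaviour differs between versions.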
Add UUID Column to Data Frames
Adding UUIDs to data frames is a common pattern in R for creating unique row identifiers. This is especially useful for tracking observations in experimental data, ensuring data integrity during merges, and creating robust references in reproducible research.
library(uuid)
# Create data frame with UUID column
df <- data.frame(
  id = UUIDgenerate(n = 5),
  name = c("Alice", "Bob", "Charlie", "David", "Eve"),
  score = c(95, 87, 92, 88, 94),
  stringsAsFactors = FALSE
)
print(df)
# (example output; the id values differ on every run)
#                                     id    name score
# 1 550e8400-e29b-41d4-a716-446655440000   Alice    95
# 2 6ba7b810-9dad-41d1-80b4-00c04fd430c8     Bob    87
# 3 7c9e6679-7425-40de-944b-e07fc1f90ae7 Charlie    92
# 4 886313e1-3b8a-4372-9b90-0c9aee199e5d   David    88
# 5 8e5d3a04-5b2f-4e3c-a6d9-9f4b7c3e2a1d     Eve    94
# Add UUID column to existing data frame
existing_df <- data.frame(
  name = c("Frank", "Grace"),
  value = c(100, 200)
)
existing_df$id <- UUIDgenerate(n = nrow(existing_df))
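Once a UUID column exists, it can serve as a join key between tables. The sketch below uses only base R; the samples and notes tables are hypothetical stand-ins, and the duplicate check is an optional safeguard rather than something the package requires.

```r
library(uuid)

samples <- data.frame(
  name = c("Frank", "Grace"),
  value = c(100, 200)
)
samples$id <- UUIDgenerate(n = nrow(samples))

# Guard against accidental duplication, e.g. after row-binding copies
stopifnot(anyDuplicated(samples$id) == 0)

# Hypothetical second table that refers to one of the rows by its id
notes <- data.frame(
  id = samples$id[2],
  note = "re-measured"
)

# Left join on the UUID column; rows without a note get NA
merged <- merge(samples, notes, by = "id", all.x = TRUE)
merged
```

Note that merge() sorts the result by the key, so row order may differ from the original data frame.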
dplyr Integration for UUID Generation
Integrate UUID generation seamlessly with dplyr's mutate() function for modern R workflows. This approach works well with pipes and tidyverse conventions, making it easy to add unique identifiers as part of your data transformation pipeline.
library(uuid)
library(dplyr)
# Add UUID with dplyr
df <- data.frame(
  name = c("Alice", "Bob", "Charlie"),
  score = c(95, 87, 92)
) %>%
  mutate(id = UUIDgenerate(n = n()))
print(df)
# Or, equivalently, using the base R approach
df$id <- UUIDgenerate(n = nrow(df))
# Generate UUID for each row in pipeline
results <- data.frame(
  experiment = c("A", "B", "C"),
  value = c(10, 20, 30)
) %>%
  rowwise() %>%
  mutate(
    run_id = UUIDgenerate(),
    timestamp = Sys.time()
  ) %>%
  ungroup()
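With rowwise(), UUIDgenerate() is called once per row. The sketch below shows two alternatives under the same assumptions as the examples above: a single vectorised call with n = n(), and a grouped mutate() that gives every row in a group the same identifier. The batch column and batch_id name are hypothetical.

```r
library(uuid)
library(dplyr)

# Vectorised: one call produces a distinct UUID for every row
results <- data.frame(
  experiment = c("A", "B", "C"),
  value = c(10, 20, 30)
) %>%
  mutate(
    run_id = UUIDgenerate(n = n()),
    timestamp = Sys.time()  # one timestamp, recycled to all rows
  )

# Grouped: UUIDgenerate() runs once per group and is recycled within it
grouped <- data.frame(
  batch = c("x", "x", "y"),
  value = 1:3
) %>%
  group_by(batch) %>%
  mutate(batch_id = UUIDgenerate()) %>%
  ungroup()
```

The vectorised form avoids the per-row overhead, while the grouped form is useful when several rows belong to the same unit.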
Experimental Design with Unique Identifiers
UUIDs are invaluable in experimental designs for tracking samples, participants, or trials. Each experimental unit gets a globally unique identifier, ensuring data integrity across different stages of analysis and preventing identification conflicts in complex multi-stage experiments.
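library(uuid)
library(dplyr)  # needed for %>%, mutate() and n()
# Create experimental design with UUIDs
create_experiment <- function(n_participants, n_trials) {
  # One UUID per participant, looked up by participant number below
  participant_ids <- UUIDgenerate(n = n_participants)
  expand.grid(
    participant = seq_len(n_participants),
    trial = seq_len(n_trials)
  ) %>%
    mutate(
      observation_id = UUIDgenerate(n = n()),
      # expand.grid() varies participant fastest, so index by participant
      # instead of using rep(..., each = n_trials)
      participant_id = participant_ids[participant],
      condition = sample(c("Control", "Treatment"), n(), replace = TRUE),
      timestamp = Sys.time() + runif(n(), 0, 3600)
    )
}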
# Generate experiment with 10 participants, 5 trials each
experiment <- create_experiment(10, 5)
print(head(experiment))
# Save experiment with UUIDs
write.csv(experiment, "experiment_data.csv", row.names = FALSE)
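After saving, it is worth confirming that the identifiers survive a round trip through CSV. The sketch below builds a small stand-in table with hypothetical values (so it runs on its own), writes it to a temporary file, and checks that observation IDs are unique and that each participant maps to exactly one participant ID.

```r
library(uuid)

# Small stand-in for an experiment table: 2 participants, 3 trials each
experiment <- data.frame(
  participant = rep(1:2, each = 3),
  trial = rep(1:3, times = 2)
)
participant_ids <- UUIDgenerate(n = 2)
experiment$participant_id <- participant_ids[experiment$participant]
experiment$observation_id <- UUIDgenerate(n = nrow(experiment))

# Round trip through a temporary CSV file
path <- tempfile(fileext = ".csv")
write.csv(experiment, path, row.names = FALSE)
restored <- read.csv(path, stringsAsFactors = FALSE)

# Every observation has its own id
obs_unique <- anyDuplicated(restored$observation_id) == 0

# Every participant number maps to exactly one participant id
ids_per_participant <- tapply(
  restored$participant_id,
  restored$participant,
  function(x) length(unique(x))
)
one_id_each <- all(ids_per_participant == 1)

# The ids read back are the same strings that were written
round_trip_ok <- identical(restored$observation_id, experiment$observation_id)
```

The same checks can be pointed at a real output file by replacing the temporary path.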