How to Generate UUID in R

Using the uuid package

Generate UUIDs in R with the uuid package for data science, statistical computing, and research applications. UUIDs work well as unique identifiers in data frames, experimental designs, and reproducible research workflows.

Generate UUID in R

UUIDgenerate()
550e8400-e29b-41d4-a716-446655440000

Install uuid Package from CRAN

The uuid package is available on CRAN and provides simple UUID generation for R. Install it once, then load it with library(uuid) in any project or script that needs unique identifiers.

CRAN Installation

install.packages("uuid")

Generate Random UUIDs with UUIDgenerate()

The UUIDgenerate() function returns random (version 4) UUIDs by default, provided the system offers a reliable source of random numbers; otherwise it falls back to time-based UUIDs. Set the n parameter to generate several UUIDs in one call, which is convenient for adding identifiers to an entire dataset or experimental design.

uuid_example.R
library(uuid)

# Generate single UUID v4
id <- UUIDgenerate()
print(id)  
# [1] "550e8400-e29b-41d4-a716-446655440000"

# Generate multiple UUIDs (the values shown are illustrative; yours will differ)
ids <- UUIDgenerate(n = 5)
print(ids)
# [1] "550e8400-e29b-41d4-a716-446655440000"
# [2] "6ba7b810-9dad-11d1-80b4-00c04fd430c8"
# [3] "7c9e6679-7425-40de-944b-e07fc1f90ae7"
# [4] "886313e1-3b8a-5372-9b90-0c9aee199e5d"
# [5] "8e5d3a04-5b2f-4e3c-a6d9-9f4b7c3e2a1d"

# Store in variable
experiment_id <- UUIDgenerate()
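
Before relying on generated identifiers, it can be worth checking that they have the expected shape and that none repeat. The following is a minimal sketch using only base R on top of the uuid package; the names uuid_pattern, all_valid, and all_unique are illustrative and not part of the package.

```r
library(uuid)

ids <- UUIDgenerate(n = 1000)

# Canonical textual form: 8-4-4-4-12 hexadecimal digits
uuid_pattern <- "^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$"

# TRUE when every element matches the canonical form (case-insensitive)
all_valid <- all(grepl(uuid_pattern, ids, ignore.case = TRUE))

# anyDuplicated() returns 0 when no value repeats
all_unique <- anyDuplicated(ids) == 0
```

Both flags should be TRUE; a FALSE value would point to a problem in how the identifiers were produced or stored.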

UUID Generation Options and Parameters

The uuid package supports different UUID generation methods including random (v4) and time-based (v1) UUIDs. You can also control the output format to get either character strings or raw bytes depending on your application requirements.

Random UUID (Default)

# Default: random UUID (v4)
uuid_v4 <- UUIDgenerate()

# Multiple random UUIDs
uuids <- UUIDgenerate(n = 10)

Random (v4) UUIDs. Pass use.time = FALSE to request random generation explicitly.

Time-based UUID

# Time-based UUID (v1)
uuid_v1 <- UUIDgenerate(use.time = TRUE)

# Multiple time-based
time_ids <- UUIDgenerate(n = 5, use.time = TRUE)

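Embeds a creation timestamp. Note that v1 UUID strings do not sort chronologically as text, because the low-order bits of the timestamp come first.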
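
To confirm which kind of UUID a call produced, one option is to read the version digit, which is the first character of the third group (character 15 of the string). The helper uuid_version below is a hypothetical name introduced for this sketch, not a function from the uuid package.

```r
library(uuid)

# Hypothetical helper: extract the version digit from UUID strings
uuid_version <- function(x) substr(x, 15, 15)

random_id <- UUIDgenerate(use.time = FALSE)  # request a random UUID
time_id   <- UUIDgenerate(use.time = TRUE)   # request a time-based UUID

random_version <- uuid_version(random_id)
time_version   <- uuid_version(time_id)
```

A random UUID should report version "4" and a time-based one version "1".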

Uppercase Format

# Uppercase UUID
upper_uuid <- toupper(UUIDgenerate())

# Multiple uppercase
toupper(UUIDgenerate(n = 3))

Use toupper() when a downstream system expects uppercase hexadecimal digits

Raw Bytes Output

# Output as raw bytes
raw_uuid <- UUIDgenerate(output = "raw")

# Useful for binary storage
class(raw_uuid)  # "raw"

Binary format: 16 raw bytes instead of a 36-character string
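
The raw form can be turned back into the familiar text form with base R alone. This sketch assumes a single UUID (n = 1), for which the raw output is a vector of 16 bytes; the variable names hex and canonical are illustrative.

```r
library(uuid)

raw_uuid <- UUIDgenerate(output = "raw")

# Each raw byte becomes two hexadecimal digits, giving 32 in total
hex <- paste(as.character(raw_uuid), collapse = "")

# Re-insert the hyphens to recover the 8-4-4-4-12 layout
canonical <- sub(
  "^(.{8})(.{4})(.{4})(.{4})(.{12})$",
  "\\1-\\2-\\3-\\4-\\5",
  hex
)
```

Storing the 16 bytes and formatting them only for display keeps the stored identifier at less than half the size of its string form.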

Add UUID Column to Data Frames

Adding UUIDs to data frames is a common pattern in R for creating unique row identifiers. This is especially useful for tracking observations in experimental data, ensuring data integrity during merges, and creating robust references in reproducible research.

dataframe_uuid.R
library(uuid)

# Create data frame with UUID column
df <- data.frame(
  id = UUIDgenerate(n = 5),
  name = c("Alice", "Bob", "Charlie", "David", "Eve"),
  score = c(95, 87, 92, 88, 94),
  stringsAsFactors = FALSE
)

print(df)
#                                     id    name score
# 1 550e8400-e29b-41d4-a716-446655440000   Alice    95
# 2 6ba7b810-9dad-11d1-80b4-00c04fd430c8     Bob    87
# 3 7c9e6679-7425-40de-944b-e07fc1f90ae7 Charlie    92
# 4 886313e1-3b8a-5372-9b90-0c9aee199e5d   David    88
# 5 8e5d3a04-5b2f-4e3c-a6d9-9f4b7c3e2a1d     Eve    94

# Add UUID column to existing data frame
existing_df <- data.frame(
  name = c("Frank", "Grace"),
  value = c(100, 200)
)

existing_df$id <- UUIDgenerate(n = nrow(existing_df))
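
As an illustration of how a UUID column supports merges, the sketch below joins two tables on the identifier instead of on names or row order. The subjects and scores tables are made up for the example.

```r
library(uuid)

subjects <- data.frame(
  id = UUIDgenerate(n = 3),
  name = c("Alice", "Bob", "Charlie"),
  stringsAsFactors = FALSE
)

# Hypothetical second table that refers to the same ids in a different order
scores <- data.frame(
  id = rev(subjects$id),
  score = c(92, 87, 95),
  stringsAsFactors = FALSE
)

# Guard against accidental key collisions before joining
stopifnot(anyDuplicated(subjects$id) == 0)

# Rows are matched by id, so the differing row order does not matter
merged <- merge(subjects, scores, by = "id")
```

Because the join key is the UUID, each score lands on the right person even though the two tables list their rows in opposite order.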

dplyr Integration for UUID Generation

Integrate UUID generation seamlessly with dplyr's mutate() function for modern R workflows. This approach works well with pipes and tidyverse conventions, making it easy to add unique identifiers as part of your data transformation pipeline.

library(uuid)
library(dplyr)

# Add UUID with dplyr
df <- data.frame(
  name = c("Alice", "Bob", "Charlie"),
  score = c(95, 87, 92)
) %>%
  mutate(id = UUIDgenerate(n = n()))

print(df)

# Or using base R approach
df$id <- UUIDgenerate(n = nrow(df))

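# Row-by-row alternative: rowwise() calls UUIDgenerate() once per row.
# The vectorized UUIDgenerate(n = n()) form above gives each row a unique id
# too and is faster on large data.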
results <- data.frame(
  experiment = c("A", "B", "C"),
  value = c(10, 20, 30)
) %>%
  rowwise() %>%
  mutate(
    run_id = UUIDgenerate(),
    timestamp = Sys.time()
  ) %>%
  ungroup()
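
A related pattern is assigning one UUID per group rather than per row. The sketch below assumes standard dplyr behaviour, where a grouped mutate() evaluates its expression once per group and recycles a length-one result across that group's rows; the runs data is invented for illustration.

```r
library(uuid)
library(dplyr)

runs <- data.frame(
  experiment = c("A", "A", "B", "B", "B"),
  value = c(10, 12, 20, 21, 19)
) %>%
  group_by(experiment) %>%
  # One call per group: every row of a group shares the same UUID
  mutate(experiment_id = UUIDgenerate()) %>%
  ungroup()
```

Rows of experiment A share one identifier and rows of experiment B share another.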

Experimental Design with Unique Identifiers

UUIDs are invaluable in experimental designs for tracking samples, participants, or trials. Each experimental unit gets a globally unique identifier, ensuring data integrity across different stages of analysis and preventing identification conflicts in complex multi-stage experiments.

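library(uuid)
library(dplyr)  # provides %>%, mutate(), and n()

# Create experimental design with UUIDs
create_experiment <- function(n_participants, n_trials) {
  # One UUID per participant, looked up by participant number below
  participant_uuids <- UUIDgenerate(n = n_participants)

  expand.grid(
    participant = seq_len(n_participants),
    trial = seq_len(n_trials)
  ) %>%
    mutate(
      observation_id = UUIDgenerate(n = n()),
      # expand.grid() varies participant fastest, so index by participant;
      # rep(..., each = n_trials) would pair ids with the wrong participants
      participant_id = participant_uuids[participant],
      condition = sample(c("Control", "Treatment"), n(), replace = TRUE),
      timestamp = Sys.time() + runif(n(), 0, 3600)
    )
}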

# Generate experiment with 10 participants, 5 trials each
experiment <- create_experiment(10, 5)
print(head(experiment))

# Save experiment with UUIDs
write.csv(experiment, "experiment_data.csv", row.names = FALSE)
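
After saving, it is reasonable to check that the identifiers survive a round trip through CSV unchanged. This is a minimal sketch with a small made-up table and a temporary file; colClasses is set explicitly so the id column is always read back as text.

```r
library(uuid)

design <- data.frame(
  observation_id = UUIDgenerate(n = 4),
  value = c(1.5, 2.5, 3.5, 4.5),
  stringsAsFactors = FALSE
)

# Write to a temporary file so the sketch leaves no files behind
path <- tempfile(fileext = ".csv")
write.csv(design, path, row.names = FALSE)

# Read the ids back as character to avoid any type guessing
restored <- read.csv(path, colClasses = c("character", "numeric"))

ids_preserved <- identical(design$observation_id, restored$observation_id)
unlink(path)
```

If ids_preserved is TRUE, the saved file can serve as a stable reference for later analysis stages.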
