Function to convert a VariantAnnotation CollapsedVCF/ExpandedVCF object to a data.frame.
vcf2df(
  vcf,
  add_sample_names = TRUE,
  add_rowranges = TRUE,
  drop_empty_cols = TRUE,
  unique_cols = TRUE,
  unique_rows = TRUE,
  unlist_cols = TRUE,
  sampled_rows = NULL,
  verbose = TRUE
)
Variant Call Format (VCF) file imported into R as a VariantAnnotation CollapsedVCF/ExpandedVCF object.
Append sample names to column names (e.g. "EZ" --> "EZ_ubm-a-2929").
Include rowRanges from VCF as well.
Drop columns that are filled entirely with: NA, ".", or "".
Only keep uniquely named columns.
Only keep unique rows.
If any columns are lists instead of vectors, unlist them. Required to be TRUE when unique_rows=TRUE.
First N rows to sample when determining whether columns are empty. Set NULL to use the full sumstats_file.
Print messages.
data.frame version of VCF
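The cleaning options described above (drop_empty_cols, unique_cols, unlist_cols, unique_rows) can be illustrated on a small toy table. The following is a minimal base R sketch of what such steps might look like, not the package's actual implementation; the toy data, its column names, and the helper is_empty_col are hypothetical and exist only for illustration.

```r
# Toy stand-in for a converted VCF (hypothetical columns and values).
df <- data.frame(
  SNP = c("rs1", "rs2", "rs2", "rs3"),
  ES = c(0.10, -0.20, -0.20, 0.05),
  FILTER = c(".", ".", ".", "."),   # filled entirely with "."
  QUAL = c(NA, NA, NA, NA),         # filled entirely with NA
  stringsAsFactors = FALSE
)
# A list column with one value per row.
df$ALT <- list("A", "T", "T", "G")

# Like drop_empty_cols: drop columns filled entirely with NA, "." or "".
# (Hypothetical helper; list columns are never treated as empty here.)
is_empty_col <- function(x) {
  if (is.list(x)) return(FALSE)
  all(is.na(x) | x %in% c(".", ""))
}
empty <- vapply(df, is_empty_col, logical(1))
df <- df[, !empty, drop = FALSE]

# Like unique_cols: keep only the first column for each column name.
df <- df[, !duplicated(names(df)), drop = FALSE]

# Like unlist_cols: turn list columns into plain vectors.
# Assumes every list element has length 1.
is_list_col <- vapply(df, is.list, logical(1))
df[is_list_col] <- lapply(df[is_list_col], unlist)

# Like unique_rows: drop duplicated rows (done after unlisting).
df <- unique(df)

print(df)
```

In this sketch the list column is flattened before rows are de-duplicated, which mirrors the requirement that unlist_cols be TRUE when unique_rows=TRUE.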
#### VariantAnnotation ####
# path <- "https://github.com/brentp/vcfanno/raw/master/example/exac.vcf.gz"
path <- system.file("extdata", "ALSvcf.vcf",
                    package = "MungeSumstats")
vcf <- VariantAnnotation::readVcf(file = path)
vcf_df <- MungeSumstats:::vcf2df(vcf = vcf)
#> Converting VCF to data.table.
#> Expanding VCF first, so number of rows may increase.
#> Checking for empty columns.
#> Removing 2 empty columns.
#> Unlisting 4 columns.
#> Dropped 314 duplicate rows.
#> Time difference of 0.1 secs
#> VCF data.table contains: 101 rows x 12 columns.