Fasta.Rd
This class aims to simplify the handling and exploration of FASTA files and provides simple methods for accessing information that can be used to assess bulk contents from a FASTA file - the analysis framework is provided by Rsamtools (alone).
floundeR::FloundeR
-> Fasta
sequencingset
The sequencingset
active binding returns a sequencingset object
that is canonically structured around the passes_filtering
logical
field to allow assessment of sequencing characteristics.
new()
Creates a new Fasta object. This initialisation method performs other sanity checking of the defined file(s) to ensure that it is indeed parseable and creates the required data structures.
Fasta$new(fasta_file)
fasta_file
The source sequencing_summary file.
A new Fasta
object.
canonical_fasta <- flnDr("cluster_cons.fasta.bgz") fasta <- Fasta$new(canonical_fasta)
sequence_chunks()
Split the fasta sequence file explored by the package into sequence chunks for e.g. import into a relational database.
Fasta$sequence_chunks(chunk_size = 1000)
chunk_size
The number of fasta entries that should be contained within a single chunk (default: 1000)
an invisible integer that defines the number of possible chunks; this can for example be iterated over
get_sequence_chunk()
Get a chunk of fasta sequences from a larger monolithic file
Fasta$get_sequence_chunk(id = 1)
id
the chunk (see $sequence_chunks()
) to extract sequence for -
this must be an integer that is > 0 and <= sequence_chunks.
DNAStringSet containing the fasta entries corresponding to the specified sequence chunk.
get_tibble_chunk()
Get a chunk of fasta sequences from a larger monolithic file as a tibble
Fasta$get_tibble_chunk(id)
id
the chunk (see $sequence_chunks()
) to extract sequence for -
this must be an integer that is > 0 and <= sequence_chunks.
tibble
containing the fasta entries corresponding to the
specified sequence chunk.
get_index()
return the Rsamtools FASTA index
Fasta$get_index()
GRanges object describing the fasta elements contained within the sequence file
count()
return the number of sequence elements contained within the sequence file specified
Fasta$count()
integer of fasta entries in file
as_tibble()
Export the imported dataset(s) as a tibble
The Fasta R6 object consumes a fasta format file and creates an object in memory that can be explored, sliced and filtered. This method dumps out the in-memory object for further exploration and development.
Fasta$as_tibble()
A tibble representation of the starting dataset
clone()
The objects of this class are cloneable with this method.
Fasta$clone(deep = FALSE)
deep
Whether to make a deep clone.
## ------------------------------------------------ ## Method `Fasta$new` ## ------------------------------------------------ canonical_fasta <- flnDr("cluster_cons.fasta.bgz") fasta <- Fasta$new(canonical_fasta) #> #> ── creating floundR::fasta with [cluster_cons.fasta.bgz] ─────────────────────── #> ℹ index for [cluster_cons.fasta.bgz] found #> ℹ loading fasta index [cluster_cons.fasta.bgz.idx] #> Error in seqnames(private$fasta_index): could not find function "seqnames"