This class simplifies the handling and exploration of FASTA files. It provides simple methods for accessing information that can be used to assess the bulk contents of a FASTA file; the analysis framework is provided by Rsamtools alone.

Super class

floundeR::FloundeR -> Fasta

Active bindings

sequencingset

The sequencingset active binding returns a sequencingset object that is canonically structured around the passes_filtering logical field to allow assessment of sequencing characteristics.

Methods

Public methods

Inherited methods

Method new()

Creates a new Fasta object. This initialisation method sanity-checks the specified file to ensure that it is parseable, and creates the required data structures.

Usage

Fasta$new(fasta_file)

Arguments

fasta_file

The source FASTA file.

Returns

A new Fasta object.

Examples

canonical_fasta <- flnDr("cluster_cons.fasta.bgz")
fasta <- Fasta$new(canonical_fasta)


Method sequence_chunks()

Split the FASTA sequence file into sequence chunks, e.g. for import into a relational database.

Usage

Fasta$sequence_chunks(chunk_size = 1000)

Arguments

chunk_size

The number of fasta entries that should be contained within a single chunk (default: 1000)

Returns

An invisible integer that defines the number of available chunks; this can, for example, be iterated over.
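As a rough illustration of how the chunk count relates to chunk_size, the base R sketch below assumes that chunks are consecutive blocks of entries and that the final chunk may be shorter than the others. The helpers n_chunks() and chunk_bounds() are hypothetical names used for illustration; they are not part of floundeR and may not match its internal logic.

```r
# Hypothetical helpers: they illustrate the arithmetic only and are not floundeR code.

# Number of chunks needed to cover n_entries at chunk_size entries per chunk.
n_chunks <- function(n_entries, chunk_size = 1000) {
  stopifnot(n_entries >= 0, chunk_size > 0)
  as.integer(ceiling(n_entries / chunk_size))
}

# First and last entry index (1-based) covered by chunk `id`.
chunk_bounds <- function(id, n_entries, chunk_size = 1000) {
  stopifnot(id >= 1, id <= n_chunks(n_entries, chunk_size))
  first <- (id - 1) * chunk_size + 1
  last <- min(id * chunk_size, n_entries)
  c(first = first, last = last)
}

n_chunks(2500)         # 3 chunks
chunk_bounds(3, 2500)  # entries 2001 to 2500
```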


Method get_sequence_chunk()

Get a chunk of fasta sequences from a larger monolithic file

Usage

Fasta$get_sequence_chunk(id = 1)

Arguments

id

The chunk (see $sequence_chunks()) to extract sequences for. This must be an integer that is > 0 and <= the number of chunks returned by $sequence_chunks().

Returns

DNAStringSet containing the fasta entries corresponding to the specified sequence chunk.
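The constraint on id can be checked before a chunk is requested. The sketch below is a hypothetical base R helper, not a floundeR function; it assumes the chunk count has already been obtained from $sequence_chunks().

```r
# Hypothetical helper, not part of floundeR: TRUE when `id` is a single
# whole number in the range 1..n_chunks.
is_valid_chunk_id <- function(id, n_chunks) {
  is.numeric(id) && length(id) == 1 && !is.na(id) &&
    id == round(id) && id >= 1 && id <= n_chunks
}

is_valid_chunk_id(1, 3)    # TRUE
is_valid_chunk_id(4, 3)    # FALSE
is_valid_chunk_id(1.5, 3)  # FALSE
```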


Method get_tibble_chunk()

Get a chunk of fasta sequences from a larger monolithic file as a tibble

Usage

Fasta$get_tibble_chunk(id)

Arguments

id

The chunk (see $sequence_chunks()) to extract sequences for. This must be an integer that is > 0 and <= the number of chunks returned by $sequence_chunks().

Returns

tibble containing the fasta entries corresponding to the specified sequence chunk.
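Because $sequence_chunks() returns the number of chunks, a file can be processed one chunk at a time. The following is a minimal sketch of that pattern; for_each_chunk() is a hypothetical helper, and only the $sequence_chunks() and $get_tibble_chunk() calls are taken from this page.

```r
# Hypothetical helper: visits every chunk of a Fasta object and applies `fun`
# to the tibble returned for that chunk. Only $sequence_chunks() and
# $get_tibble_chunk() come from the documentation above.
for_each_chunk <- function(fasta, fun, chunk_size = 1000) {
  n_chunks <- fasta$sequence_chunks(chunk_size = chunk_size)
  lapply(seq_len(n_chunks), function(id) fun(fasta$get_tibble_chunk(id)))
}
```

With a Fasta object such as the one created in the new() example, for_each_chunk(fasta, nrow) should return a list holding the number of rows in each chunk; fun could equally be a function that writes each tibble to a database table.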


Method get_index()

Return the Rsamtools FASTA index.

Usage

Fasta$get_index()

Returns

GRanges object describing the fasta elements contained within the sequence file


Method count()

Return the number of sequence elements contained within the specified sequence file.

Usage

Fasta$count()

Returns

Integer count of the FASTA entries in the file.


Method as_tibble()

Export the imported dataset(s) as a tibble

The Fasta R6 object consumes a fasta format file and creates an object in memory that can be explored, sliced and filtered. This method dumps out the in-memory object for further exploration and development.

Usage

Fasta$as_tibble()

Returns

A tibble representation of the starting dataset


Method clone()

The objects of this class are cloneable with this method.

Usage

Fasta$clone(deep = FALSE)

Arguments

deep

Whether to make a deep clone.

Examples


## ------------------------------------------------
## Method `Fasta$new`
## ------------------------------------------------

canonical_fasta <- flnDr("cluster_cons.fasta.bgz")
fasta <- Fasta$new(canonical_fasta)
#> 
#> ── creating floundR::fasta with [cluster_cons.fasta.bgz] ───────────────────────
#>  index for [cluster_cons.fasta.bgz] found
#>  loading fasta index [cluster_cons.fasta.bgz.idx]
#> Error in seqnames(private$fasta_index): could not find function "seqnames"