High Throughput Sequencing and the IT architecture (Part 1: Volume dimensioning and filesystems)
Keywords:
High Throughput Sequencing, IT architecture
Abstract
Improvements in DNA sequencing technology have reduced the cost and time of sequencing a new genome. The new generation of High Throughput Sequencing (HTS) devices has provided a large impetus to the life sciences, and genome sequencing is now a necessary first step in many complex research projects, with direct implications for medical sequencing, cancer and pathogen vector genomics, epigenomics and metagenomics.
However, despite falling sequencing costs and timelines, there are other associated costs and difficulties in maintaining a functional data repository for large-scale research projects. The new generation of HTS technologies [1] has introduced a need for data storage whose capacity is well beyond that of the average local data storage facility [2]. In fact, the computing world has coined a new term for this paradigm: data-intensive computing [2a]. Data storage costs are falling; however, a study of the functional specifications of popular HTS equipment, such as Roche's 454 pyrosequencers [3], Illumina's hardware [4] and ABI SOLiD technology [5], suggests that a single high-throughput experiment run creates several terabytes of information. If one takes into account that genome sequencing is often performed repeatedly in order to study genetic variation [6], a suitable data archiving facility needs to scale to several petabytes, which is well beyond the scale of most group, departmental or university computing facilities.
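The scaling argument above, from terabytes per run to petabytes per archive, can be illustrated with a back-of-the-envelope calculation. The sketch below is only illustrative: the function name `archive_capacity_tb` and every input value (terabytes per run, runs per year, retention period, number of stored copies) are hypothetical assumptions, not figures taken from the cited equipment specifications.

```python
def archive_capacity_tb(tb_per_run: float, runs_per_year: int,
                        years: int, copies: int = 1) -> float:
    """Raw capacity in terabytes needed to retain every run for the period."""
    return tb_per_run * runs_per_year * years * copies


# Hypothetical inputs, chosen only to illustrate the order of magnitude:
# 3 TB per run, 200 runs per year, 3 years of retention, 2 stored copies.
capacity_tb = archive_capacity_tb(tb_per_run=3.0, runs_per_year=200,
                                  years=3, copies=2)

# Decimal units are assumed here: 1 PB = 1000 TB.
print(f"{capacity_tb:.0f} TB = {capacity_tb / 1000:.1f} PB")
# prints "3600 TB = 3.6 PB"
```

Under these assumed inputs, a few terabytes per run already adds up to several petabytes once repeated runs, retention and redundancy are taken into account.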
License
Authors who publish with this journal agree to the following terms:
- Authors retain copyright and grant the journal right of first publication with the work simultaneously licensed under a Creative Commons Attribution License that allows others to share the work with an acknowledgement of the work's authorship and initial publication in this journal.
- Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgement of its initial publication in this journal.
- Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work (See The Effect of Open Access).