CORE FACILITIES: BIOINFORMATICS
Our range of services includes analytics, SaaS (Software as a Service), PaaS (Platform as a Service), and IaaS (Infrastructure as a Service). From batch primer design to network assembly and big data analysis, as well as the hosting of in-house genome browsers, we provide all types of bioinformatics support.
Our analytics services start with support on experimental design, helping researchers identify the key technologies and methodologies that satisfy their scientific needs. We encourage every researcher to approach the bioinformatics core facility before generating experimental data.
Data analysis mostly starts with open-source software (e.g. the Tuxedo Suite for RNA-seq analysis). Software is chosen primarily according to the biological questions pursued by the researcher, its record of previous publications, and its user community.
If required, the core facility will develop the tools necessary for successful data analysis (see also SaaS). Depending on the researcher's needs, we will either perform end-to-end analysis or troubleshoot the researcher's own analysis as it develops.
We perform specialised analytics as well as full discovery analytics. Thus, besides helping researchers answer focused, targeted questions (e.g. how are microRNA changes correlated with gene expression changes in protein-coding genes), we also perform long-term analysis in which every possible piece of information is extracted from existing data sets.
We perform our analysis with a developer's mindset, so that the same type of analysis can be reused on any similar data set with greatly reduced turnaround times.
We deliver the results of our analysis on the fly, ensuring that researchers can either follow up or change the direction of the analysis at any point.
Our analysis portfolio grows on a weekly basis, along with researchers' needs. We are neither limited to nor focused on any particular type of technology.
A big part of our work lies in piping outputs into the downstream software required for complete data analysis and in developing tools that expand the use of existing software. This naturally leads to the development of new bioinformatics tools.
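As an illustration of what piping outputs into downstream software can look like, here is a minimal Python sketch. It is not the facility's actual code: the helper `run_pipeline` and the two stand-in "tools" are hypothetical, and the stand-ins are written with the current Python interpreter only so that the example runs anywhere. In practice each command would be a real analysis program.

```python
import subprocess
import sys


def run_pipeline(commands, data):
    """Run commands in order, feeding each one's stdout to the next one's stdin.

    `commands` is a list of argument lists; `data` is the initial input as bytes.
    A non-zero exit status from any step raises subprocess.CalledProcessError.
    """
    for cmd in commands:
        result = subprocess.run(cmd, input=data, capture_output=True, check=True)
        data = result.stdout
    return data


# Hypothetical stand-in for an upstream tool: upper-cases every line it receives.
upstream = [
    sys.executable, "-c",
    "import sys; sys.stdout.write(sys.stdin.read().upper())",
]

# Hypothetical stand-in for a downstream tool: keeps lines starting with 'GENE'.
downstream = [
    sys.executable, "-c",
    "import sys; sys.stdout.write(''.join(l for l in sys.stdin if l.startswith('GENE')))",
]

if __name__ == "__main__":
    output = run_pipeline([upstream, downstream], b"gene1\t10\nother\t5\n")
    print(output.decode())
```

Passing each step's output in memory keeps the sketch short; for large files, streaming between processes or writing intermediate files would be the more realistic choice.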
All code used and developed for requested analyses is updated daily and made available to all researchers involved in the respective project through in-house hosted GitLab servers (see also PaaS).
On-demand software development for routine analyses and/or processes (e.g. high-throughput systems) is provided as long as the machine outputs have known formats.
Wrappers for Galaxy are developed upon request (see also PaaS).
Researchers who require support in developing their own code can find ongoing advice at our facility.
Two terminals are permanently available in our facility to host either software developers and analysts working on defined projects for defined periods of time, or researchers who require support or wish to run their computational work in a bioinformatics environment.
The core facility hosts and manages a variety of platforms, databases, and execution environments. Examples include:
- Galaxy, a code-free programming platform for the development and sharing of biological analysis pipelines
- RStudio Server, an interactive web-based development environment for the popular statistical language R
- GitLab, a platform for versioning and sharing code
- High performance computing cluster, an HPC cluster managed by the popular job scheduler SLURM, where all software is made available through an environment modules system, allowing us to identify and solve problems in a centralised fashion and to provide users with debugged software
- databases (e.g. nucleotides, proteins, motifs, genomes and their respective indexes)
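To make the HPC entry above more concrete, the following Python sketch builds a SLURM batch script that loads software through environment modules before running a command. It is only an illustration under stated assumptions: the function `make_sbatch_script`, the module name `samtools/1.17`, the job name, and the resource values are hypothetical and are not taken from the facility's configuration. The `#SBATCH` options and the `module purge` / `module load` commands are standard SLURM and environment modules usage.

```python
def make_sbatch_script(job_name, command, modules, cpus=4, mem_gb=16,
                       time_limit="02:00:00"):
    """Return the text of a SLURM batch script as a single string.

    The script requests resources through #SBATCH directives, loads the
    requested environment modules, and then runs `command`.
    """
    lines = [
        "#!/bin/bash",
        f"#SBATCH --job-name={job_name}",
        f"#SBATCH --cpus-per-task={cpus}",
        f"#SBATCH --mem={mem_gb}G",
        f"#SBATCH --time={time_limit}",
        "",
        # Start from a clean environment so only the listed modules are active.
        "module purge",
    ]
    lines += [f"module load {name}" for name in modules]
    lines += ["", command, ""]
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical module name and command, for illustration only.
    script = make_sbatch_script(
        job_name="example",
        command="samtools --version",
        modules=["samtools/1.17"],
    )
    print(script)
```

The generated text could be saved to a file and submitted with `sbatch`; which modules and versions actually exist depends on the cluster and can be listed with `module avail`.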
Working closely with the IT department, and strongly supported by our bioinformatics core facility systems administrator, we ensure that no project is limited by computational power or any other type of IT infrastructure.
Our HPC cluster nodes are supported by a distributed parallel file system (BeeGFS) running on an InfiniBand network. Nodes have a minimum of 40 processors and 512 GB of memory.
We make sure that 80% of jobs start at their respective maximum performance within 20 minutes of submission, and 98% of jobs within 3 hours.
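The two targets above can be checked against a list of queue waiting times, as in this minimal Python sketch. The function names and the sample waiting times are hypothetical and purely illustrative; real values would come from the scheduler's accounting records (time between submission and start for each job).

```python
def fraction_started_within(wait_minutes, limit_minutes):
    """Return the fraction of jobs whose wait time is at most `limit_minutes`."""
    if not wait_minutes:
        raise ValueError("at least one job is required")
    started = sum(1 for wait in wait_minutes if wait <= limit_minutes)
    return started / len(wait_minutes)


def meets_targets(wait_minutes):
    """Check both targets: 80% within 20 minutes and 98% within 3 hours."""
    return (
        fraction_started_within(wait_minutes, 20) >= 0.80
        and fraction_started_within(wait_minutes, 180) >= 0.98
    )


if __name__ == "__main__":
    # Made-up waiting times in minutes, for illustration only.
    waits = [1, 2, 5, 10, 15, 18, 19, 20, 45, 170]
    print(fraction_started_within(waits, 20))
    print(fraction_started_within(waits, 180))
    print(meets_targets(waits))
```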