Research Study Database Storage Requirements: What to Consider

UPDATED: This post was updated for 2018 to reflect new information and more examples. Enjoy!

This is the age of data. There’s data on everything; it’s powerful, it’s growing, and it’s even getting its own ‘brain’ through AI. Today, this phenomenon is called big data: any voluminous amount of structured, semi-structured, and unstructured data that has the potential to be mined for information. Big data is often characterized by “the Three Vs”: volume (the amount of data), velocity (the speed at which data is streamed), and variety (the different forms data comes in). With 287,078 FDA-registered clinical research studies taking place in the United States at the time of this article, a staggering quantity of clinical data is being produced.

As a result, vast amounts of structured and unstructured clinical data require vast amounts of storage space. Along with the tremendous volume of data in existence come concerns over how securely it is being stored. This is a notable concern in clinical research, where the compliance and safety of data are paramount.

So, what should research teams take into consideration when evaluating an eClinical solution’s storage capabilities? In this post, we take a look at different elements of database storage and how they apply to research studies.

Storage Capacity

We are so accustomed to seeing gigabyte and terabyte storage sizes in the consumer marketplace that it is easy to forget that most raw clinical data requires nowhere near that much storage space. In Clinical Studio, the developers have designed the database to store and retrieve data efficiently and to remove the risk of data loss. It also makes efficient use of storage space to prevent unnecessary costs for its customers.

For example, a single study in Clinical Studio may consist of 40 study sites gathering data into the same database. If each site has a target enrollment of 50 subjects, that’s a combined enrollment of 2,000 subjects. For a study of that size, there could be 80,000 to a few hundred thousand records (eCRFs) holding the study data (assuming no external PDF or image files have been uploaded as supporting documents). This would be a mere 35 megabytes, or at most a single gigabyte, which your grandmother could store on an old thumb drive. This, of course, is a fairly large example. Most studies are much smaller than that.
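To make that back-of-the-envelope estimate concrete, here is a minimal sketch in Python. The record counts come from the example above (with 300,000 standing in for "a few hundred thousand"). The average bytes per record are purely illustrative assumptions, not measurements from Clinical Studio.

```python
def estimate_storage_mb(record_count: int, avg_bytes_per_record: int) -> float:
    """Estimate raw storage in megabytes (1 MB = 1,000,000 bytes)."""
    return record_count * avg_bytes_per_record / 1_000_000


# Assumed average record sizes: 500 bytes for a lean eCRF record,
# 3,000 bytes for a record with generous metadata. Both are hypothetical.
low_mb = estimate_storage_mb(80_000, 500)
high_mb = estimate_storage_mb(300_000, 3_000)

print(f"Low estimate:  {low_mb:.0f} MB")
print(f"High estimate: {high_mb:.0f} MB")
```

Even with the more generous per-record assumption, the total stays under a single gigabyte, which is in line with the range quoted above.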

Database Structure

A properly thought-out database structure translates directly into data safety and speed. For a web-based data collection system, those are important factors. The typical cloud storage service these days offers high capacity, driven by the demands of media-intensive consumers. However, those services provide very little functionality, as they are designed simply to store and fetch files. It’s a great model for those purposes, but entering and managing data from clinical trials requires a different approach. The data has large amounts of metadata tied to it; in other words, data about the data. Think of queries, audit trails, and the relationships that need to be maintained between records.
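As a rough illustration of what "data about the data" can look like, the sketch below uses Python's built-in sqlite3 module to tie an audit trail and a data query to a single record. The table names, column names, and the `update_value` helper are all hypothetical; this is not Clinical Studio's actual schema, only one simple way such relationships could be modeled.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # enforce record relationships
conn.executescript("""
CREATE TABLE record (
    id         INTEGER PRIMARY KEY,
    subject_id TEXT NOT NULL,
    field      TEXT NOT NULL,
    value      TEXT
);
CREATE TABLE audit_trail (
    id         INTEGER PRIMARY KEY,
    record_id  INTEGER NOT NULL REFERENCES record(id),
    old_value  TEXT,
    new_value  TEXT,
    changed_by TEXT NOT NULL,
    changed_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
);
CREATE TABLE data_query (
    id        INTEGER PRIMARY KEY,
    record_id INTEGER NOT NULL REFERENCES record(id),
    message   TEXT NOT NULL,
    status    TEXT NOT NULL DEFAULT 'open'
);
""")


def update_value(conn, record_id, new_value, user):
    """Change a value and log the change in one transaction."""
    with conn:  # commits on success, rolls back on error
        (old_value,) = conn.execute(
            "SELECT value FROM record WHERE id = ?", (record_id,)
        ).fetchone()
        conn.execute(
            "UPDATE record SET value = ? WHERE id = ?", (new_value, record_id)
        )
        conn.execute(
            "INSERT INTO audit_trail (record_id, old_value, new_value, changed_by) "
            "VALUES (?, ?, ?, ?)",
            (record_id, old_value, new_value, user),
        )


# Usage: enter a value, correct it, and raise a query against it.
with conn:
    cur = conn.execute(
        "INSERT INTO record (subject_id, field, value) VALUES (?, ?, ?)",
        ("S-001", "systolic_bp", "120"),
    )
record_id = cur.lastrowid
update_value(conn, record_id, "125", "coordinator_1")
with conn:
    conn.execute(
        "INSERT INTO data_query (record_id, message) VALUES (?, ?)",
        (record_id, "Please confirm the corrected reading."),
    )

history = conn.execute(
    "SELECT old_value, new_value, changed_by FROM audit_trail WHERE record_id = ?",
    (record_id,),
).fetchall()
print(history)
```

The point of the sketch is that a single entered value carries several related rows with it, which a plain store-and-fetch file service has no way to express.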

In clinical research – or any situation that gathers clinical data, customer data, or company data – a database with a dynamic and highly organized structure is critical. Research study data, for most purposes, consists primarily of simple text and values. This data does not come anywhere close to the size of media files. For most studies taking place today, research databases need less raw capacity and more behind-the-scenes design quality. Clinical Studio has recognized and accomplished that, and has made it customizable and simple enough for all users. In addition, TrialKit for mobile devices makes data entry and navigating the data even faster than modern web browser technology can accomplish.

Security and Compliance

The threat of data breaches spurs the healthcare industry to protect sensitive and confidential patient data as much as possible. With one too many cyber attacks in 2018, organizations are taking extra precautions to secure medical records, and they seek data collection tools that offer strong security measures.

Database security and system compliance go hand-in-hand. Technology vendors should equip research teams with a solution that adheres to rules outlined by regulatory agencies. For instance, it’s critical for a system that houses clinical data to be fully compliant with the FDA’s 21 CFR Part 11 requirements. This regulation, in short, sets the criteria under which electronic records and electronic signatures are considered trustworthy, reliable, and equivalent to paper records; those criteria are met through audit trails, validation, and other system controls.


In Clinical Studio, all data is protected at the highest levels against potential vulnerabilities. Data entering and exiting the servers is encrypted using the healthcare industry-standard 128-bit SSL and 2048-bit RSA public keys. Networks are protected and constantly monitored, both physically and remotely. The industry-leading enterprise data centers that house the dedicated, backed-up Clinical Studio servers follow established standards for physical security and access.

Additionally, the primary data centers (US/EU) are either SOC-certified AWS facilities or dedicated, fully redundant Tier 3+ facilities tied to 4 different power substations, guaranteeing at least 99.982% availability. In the US, the same data center is used for local 911 and area hospitals. The EU AWS system is operated to top-rated enterprise standards; this helps ensure optimal performance, security, and reliability for users in our various regions of the globe.
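To put the 99.982% availability figure in perspective, the short Python sketch below converts it into the maximum downtime it allows per year. The function name is hypothetical, and the calculation assumes a 365-day year.

```python
def max_downtime_hours(availability_pct: float, hours_per_year: int = 365 * 24) -> float:
    """Return the maximum yearly downtime implied by an availability percentage."""
    return (100 - availability_pct) / 100 * hours_per_year


downtime = max_downtime_hours(99.982)
print(f"Maximum downtime: {downtime:.2f} hours per year ({downtime * 60:.0f} minutes)")
```

Under that assumption, the guarantee works out to roughly an hour and a half of downtime over an entire year.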

When clinical research teams are evaluating study sizes and deciding on the value of an EDC system such as Clinical Studio, the biggest question should not be one of storage or reliability of data. Clinical Studio eliminates concerns surrounding safe storage for clinical data, freeing research teams to focus on managing their studies and collecting data in the most effective manner. Questions about how to collect and store data using Clinical Studio? Get in touch with us today.