10 Jun 2019 Blog
Data Preparation on Critical Path for Clinical Data Intelligence

Clinical organizations are under increasing pressure to execute clinical trials faster and with higher quality. Subject data originates from multiple sources: CRFs collect data on patient visits, and implantable devices deliver data via wireless technology. All this data needs to be integrated, cleaned and transformed from raw data to analysis datasets. This data management across multiple sources is on the critical path to successful trial execution and submission.



SDTM data provides a powerful tool for cross-study analysis, and can include various types of external data from labs, ECGs and medical devices. Wearable devices are becoming more popular, and can even be included in patient treatment regimens. Once validated, these data can provide valuable insights into patient data and population health. This ‘big’ data can allow researchers to observe drug reactions in larger populations than those under study, and aligning it with genetic data could even reduce wasted treatment cycles.

In the digital age, our attitude to information is changing. In the traditional model of data capture and supply, an EDC system sat at the center with multiple integrations. That integration point has now moved downstream: rather than being at the very center of the picture, EDC is one source among several. Companies now expect their clinical systems to act as a hub for all of the information relevant to their drug on trial, and are searching for a single source of truth, whatever the data source.



Growing volumes of data, global operations and increasing regulatory scrutiny are encouraging pharmaceutical companies and healthcare providers to develop Clinical Data Warehouses. Data warehouses can be a mine of information in a data-rich business environment, and can greatly enhance data transparency and visibility. The interoperability of systems is increasing along with interchange standards, and real-world data is being collected more widely than ever before. Data warehouses are often used to aggregate data from multiple transactional systems. Such systems may have data structures designed for collection rather than aligned with the reporting standard. Typically, this data is transformed and then loaded into a central data model that has been optimized for analysis, for example, market research or data mining.

It is possible to design a Clinical Data Warehouse that follows the model of a traditional data warehouse, with a single well-defined data model into which all clinical data are loaded. This can create a powerful tool allowing cross-study analysis at many levels. Data is never deleted or removed from the warehouse, and all changes to data over time are recorded. The main features of a reporting standard must be ease of use and quick retrieval. SDTM is a mature, extensible and widely understood reporting standard with clearly specified table relationships and keys. The key relationships can be used to allow users to select clinical data from different reporting domains without an understanding of the relationships between domains. SDTM also allows users to create their own domains to house novel and as yet unpublished data types, so we can maintain the principles above for any data type, allowing powerful cross-domain reports to be created interactively.
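
To make the role of the key relationships concrete, here is a minimal Python sketch of a cross-domain report. It assumes two tiny in-memory tables standing in for the SDTM DM (demographics) and AE (adverse events) domains, joined on the standard STUDYID and USUBJID identifiers. The sample rows and the `join_on_subject` helper are illustrative assumptions, not part of any particular warehouse product.

```python
# Illustrative rows only; real SDTM datasets carry many more variables.
DM = [
    {"STUDYID": "ABC-001", "USUBJID": "ABC-001-0001", "SEX": "F", "ARM": "Drug A"},
    {"STUDYID": "ABC-001", "USUBJID": "ABC-001-0002", "SEX": "M", "ARM": "Placebo"},
]
AE = [
    {"STUDYID": "ABC-001", "USUBJID": "ABC-001-0001", "AESEQ": 1, "AETERM": "Headache"},
    {"STUDYID": "ABC-001", "USUBJID": "ABC-001-0001", "AESEQ": 2, "AETERM": "Nausea"},
]

# Subject-level key shared by both domains.
SUBJECT_KEY = ("STUDYID", "USUBJID")


def join_on_subject(parent_rows, child_rows, key=SUBJECT_KEY):
    """Attach parent (one row per subject) variables to each child row."""
    parents = {tuple(row[k] for k in key): row for row in parent_rows}
    joined = []
    for child in child_rows:
        parent = parents.get(tuple(child[k] for k in key))
        if parent is None:
            continue  # orphan record: no matching subject in the parent domain
        joined.append({**parent, **child})
    return joined


report = join_on_subject(DM, AE)
for row in report:
    print(row["USUBJID"], row["ARM"], row["AETERM"])
```

Because the join is driven entirely by the key definition, a reporting layer built this way could offer the same selection across any pair of domains, including custom ones, without the user needing to know how the tables relate.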



Data may be loaded from the source transactional systems in a number of ways. With EDC, new studies are continually brought online, and their data may be uploaded repeatedly. Most warehouse systems include a number of interfaces to load data. Many also supply APIs to allow external programs to control the warehouse in the same way as an interactive user. A combination of robust metadata, consistent data standards and naming conventions can allow automated creation of template-driven warehouse structures, and dedicated listener programs can automatically detect files and automate data loading.
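
As a rough illustration of how a naming convention lets a listener route files without manual input, the sketch below polls a directory once and turns each new, well-named file into a load job. The convention `<STUDYID>_<DOMAIN>_<YYYYMMDD>.csv`, the two-letter domain code, and the `poll_once` function are all assumptions made for this example. A real warehouse would expose its own loading interfaces.

```python
import re
import tempfile
from pathlib import Path

# Hypothetical naming convention: <STUDYID>_<DOMAIN>_<YYYYMMDD>.csv
FILE_PATTERN = re.compile(
    r"^(?P<study>[A-Z0-9-]+)_(?P<domain>[A-Z]{2})_(?P<date>\d{8})\.csv$"
)


def parse_name(filename):
    """Return study, domain and extract date if the name follows the convention."""
    match = FILE_PATTERN.match(filename)
    return match.groupdict() if match else None


def poll_once(inbox, seen):
    """Detect files not yet processed and return the load jobs they imply."""
    jobs = []
    for path in sorted(Path(inbox).iterdir()):
        if not path.is_file() or path.name in seen:
            continue
        seen.add(path.name)
        meta = parse_name(path.name)
        if meta is None:
            continue  # off-convention file; a real system might quarantine it
        jobs.append({"path": path, **meta})
    return jobs


# Demonstration against a temporary directory.
with tempfile.TemporaryDirectory() as inbox:
    Path(inbox, "ABC-001_AE_20190610.csv").write_text("USUBJID,AETERM\n")
    Path(inbox, "notes.txt").write_text("not a data file\n")
    seen = set()
    first = poll_once(inbox, seen)
    second = poll_once(inbox, seen)  # nothing new on the second pass
```

The study and domain parsed from the file name are what would allow a template-driven structure to be created or selected automatically before the load runs.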

The SDTM table keys enable incremental loading, where only records changed in the source system are updated in the warehouse, saving disk space. We can also use the SDTM keys in our audit processing, and use them to identify deleted records in incrementally loaded data pools. SDTM conversion, data pooling at Therapeutic Area and Compound level, and Medical Dictionary re-coding can be handled automatically in the warehouse in the reporting standard. Use of SDTM facilitates pooling studies at the highest SDTM version available, accommodating studies held in previous versions without destructive changes that would affect the warehouse audit trail.
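
The key-driven incremental load described above can be sketched as follows. This is a minimal illustration under stated assumptions: the AE domain is keyed by STUDYID, USUBJID and AESEQ, the source supplies both the changed records and the full set of keys it currently holds, and deletions are recorded as a flag so that nothing ever leaves the warehouse. The `apply_increment` function and the audit entry layout are hypothetical.

```python
from datetime import datetime, timezone

AE_KEY = ("STUDYID", "USUBJID", "AESEQ")  # assumed natural key for the AE domain


def key_of(record, key=AE_KEY):
    """Build the tuple that identifies a record."""
    return tuple(record[k] for k in key)


def apply_increment(warehouse, changed, source_keys, audit):
    """Upsert changed records and flag records that vanished from the source.

    warehouse   : dict mapping key -> {"data": record, "deleted": bool}
    changed     : records inserted or updated in the source since the last load
    source_keys : set of every key currently present in the source
    audit       : list receiving one entry per change, with the prior value
    """
    now = datetime.now(timezone.utc).isoformat()
    for record in changed:
        k = key_of(record)
        current = warehouse.get(k)
        if current is None:
            action = "insert"
        elif current["deleted"] or current["data"] != record:
            action = "update"
        else:
            continue  # identical to what is already stored
        audit.append({"key": k, "action": action, "at": now,
                      "before": current["data"] if current else None})
        warehouse[k] = {"data": record, "deleted": False}
    for k, row in warehouse.items():
        if not row["deleted"] and k not in source_keys:
            row["deleted"] = True  # soft delete: the row stays in the warehouse
            audit.append({"key": k, "action": "delete", "at": now,
                          "before": row["data"]})


warehouse, audit = {}, []
r1 = {"STUDYID": "ABC-001", "USUBJID": "ABC-001-0001", "AESEQ": 1, "AETERM": "Headache"}
r2 = {"STUDYID": "ABC-001", "USUBJID": "ABC-001-0001", "AESEQ": 2, "AETERM": "Nausea"}
apply_increment(warehouse, [r1, r2], {key_of(r1), key_of(r2)}, audit)

# Second load: record 1 was corrected and record 2 was removed in the source.
r1_fixed = {**r1, "AETERM": "Migraine"}
apply_increment(warehouse, [r1_fixed], {key_of(r1)}, audit)
actions = [entry["action"] for entry in audit]
```

Note that a changes-only feed cannot reveal deletions by itself, which is why this sketch also takes the full list of source keys. Other designs, such as explicit delete markers from the source, would work equally well.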

Uses of a Clinical Data Warehouse include:

  • Ongoing medical review
  • Wearable device data review
  • Data reconciliation
  • Streamlined statistical analysis for submission
  • Modeling of protocol design and trial simulation
  • Responding to regulatory queries
  • Safety monitoring and signal detection
  • Cross-study analysis

Each of these can deliver value to a customer, but each requires consistent data structures, in a format that can be easily understood by the warehouse consumers.



A Clinical Data Warehouse may also be connected to a transactional Safety system. Coupled with the SDTM data warehouse, this allows reconciliation of the two data sources, a crucial task as clinical studies are locked and reported. Automated transformations can account for the different vocabularies in the two systems, and the records can be paired together in a dashboard. The dashboards themselves can be configured to highlight non-matching records, and to allow data entry for tracking comments and accepting insignificant differences. As a serious adverse event must be reported within 24 hours, that event could be reconciled against the clinical data the following day.

Reconciliation involves both the Clinical and Safety groups, but could also be carried out by CRO users responsible for the studies. This enhances collaboration between the sponsor and CRO, and provides an audited, secure, central location to capture comments. Security is paramount in an open system, and the warehouse’s security model is designed to allow CRO users to see only the studies they have been assigned to, hiding other clinical studies from the dashboards and selection prompts.

mHealth data can be integrated automatically using the IoT Cloud service, with patients automatically enrolled into an EDC study. This data can be reconciled with CRF data and automatically loaded to the Business Intelligence layer.
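
The reconciliation and study-level security described above might look like the following Python sketch. It is only an illustration: the safety-side field `SERIOUSNESS`, its values, the vocabulary mapping, and the choice to pair records on subject and event term are all assumptions, and a production match would normally use coded terms and dates as well.

```python
# Hypothetical vocabulary mapping from the safety system to SDTM-style values.
SAFETY_TO_CLINICAL = {"Serious": "Y", "Non-serious": "N"}


def pair_key(row):
    """Pair records on subject and a normalized event term (an assumption)."""
    return (row["USUBJID"], row["AETERM"].strip().upper())


def reconcile(clinical_rows, safety_rows):
    """Pair clinical and safety records, then classify each pair."""
    safety = {pair_key(r): r for r in safety_rows}
    results = []
    for c in clinical_rows:
        s = safety.pop(pair_key(c), None)
        if s is None:
            status = "missing in safety"
        elif SAFETY_TO_CLINICAL.get(s["SERIOUSNESS"]) != c["AESER"]:
            status = "mismatch"
        else:
            status = "match"
        results.append({"STUDYID": c["STUDYID"], "key": pair_key(c), "status": status})
    # Anything left on the safety side had no clinical counterpart.
    for k, s in safety.items():
        results.append({"STUDYID": s["STUDYID"], "key": k, "status": "missing in clinical"})
    return results


def visible_to(results, assigned_studies):
    """Restrict dashboard rows to the studies a user has been assigned to."""
    return [r for r in results if r["STUDYID"] in assigned_studies]


clinical = [
    {"STUDYID": "ABC-001", "USUBJID": "ABC-001-0001", "AETERM": "Headache", "AESER": "N"},
    {"STUDYID": "ABC-002", "USUBJID": "ABC-002-0007", "AETERM": "Syncope", "AESER": "N"},
]
safety = [
    {"STUDYID": "ABC-001", "USUBJID": "ABC-001-0001", "AETERM": "headache",
     "SERIOUSNESS": "Non-serious"},
    {"STUDYID": "ABC-002", "USUBJID": "ABC-002-0007", "AETERM": "Syncope",
     "SERIOUSNESS": "Serious"},
]
results = reconcile(clinical, safety)
cro_view = visible_to(results, {"ABC-002"})  # a CRO user assigned to one study
```

Applying the study filter after reconciliation keeps the matching logic independent of who is looking at the dashboard. In a real system the filter would be enforced by the warehouse security model rather than in reporting code.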


SDTM can be of huge benefit to the users of a Clinical Data Warehouse system, allowing data pooling for storage, audit and reporting. Use of data standards has already transformed clinical research. The next generation of eClinical software should place those standards in front of programmers, inside the tools they use every day, and allow them to automate transformations to and from review and submission models, respond quickly to regulatory inquiries on current and historical data, generate automated definition documents and support a wide range of data visualization tools. Study component reusability and automatic documentation together give clinical organizations greater clarity on what has been done to get from source (e.g. EDC, lab data) to target (e.g. SDTM), turning on the light in the black box.

Ultimately, leveraging standard, reusable objects accelerates study setup and, combined with automation, reduces manual processes and increases traceability.

  • Standards can streamline and enhance data collection
  • End-to-end traceability can only improve review
  • Comprehensive security, an audit trail, and two-way traceability across the discrepancy lifecycle increase regulatory compliance

MaxisIT’s Clinical Development platform integrates a best-in-class data management platform, allowing clinical trial sponsors to automatically load and control data from EDC and various external sources, transform it from the collection standards into SDTM without user input, and provide the SDTM data to dynamic, near real-time analyses that can be compiled into internet-facing dashboards.

About MaxisIT

At MaxisIT, we clearly understand the strategic priorities within clinical R&D, and they resonate with our experience of delivering Patient Data Repository, Clinical Operations Data Repository, Metadata Repository, Statistical Computing Environment, and Clinical Development Analytics via our integrated clinical development platform. The platform delivers timely access to study-specific as well as standardized and aggregated clinical trial operations and patient data, and allows efficient trial oversight via remote monitoring, statistically assessed controls, data quality management, clinical reviews, and statistical computing.

Moreover, it provides capabilities for planned vs. actual trending and optimization, as well as for fraud detection and risk-based monitoring. MaxisIT’s Integrated Technology Platform is a purpose-built solution that helps the pharmaceutical and life sciences industry by “Empowering Business Stakeholders with Integrated Computing, and Self-service Dashboards in the strategically externalized enterprise environment with major focus on the core clinical operations data as well as clinical information assets; which allows improved control over externalized, CROs and partners driven, clinical ecosystem; and enable in-time decision support, continuous monitoring over regulatory compliance, and greater operational efficiency at a measurable rate”.
