23 Jul 2019 Blog
Using R for cross-study analysis

Clinical research is changing rapidly as connected devices, both wearable and implantable, grow in popularity across healthcare, fitness tracking and diet. Pharmaceutical companies sponsoring trials are incorporating these devices into ever more elaborate clinical trials and generating ever larger datasets, while also sifting through social media streams and their own big data sources. At the same time, it is easier than ever to store, manage and query these growing datasets.

The growing range of connected devices across healthcare is driving rapid growth in the volume of data collected in clinical trials. This volume presents new challenges for Clinical Data Scientists and calls for new solutions and tools for cross-study data analysis.

To meet these demands, Clinical Data Scientists are increasingly choosing open source solutions, which let them draw on active communities of experienced developers and statisticians. The R scripting language is increasingly popular in biostatistics and statistical programming; it supports predictive analytics and big data analysis, and offers the potential to leverage Machine Learning and Artificial Intelligence.

Regulators already accept R for statistical analysis, and demand for R skills is growing faster than for competing tools. This blog looks at the use of R in cross-study data analysis using SDTM data, and at how MaxisIT’s SCE can add value to the process.


Today, written words and numbers are everywhere, unending and ever-changing. In this world of infinite variety, visuals are still the best way to tell a story. In fact, visualization is more important than ever: with so much information available, it is getting harder and harder to sift through the clutter and understand what is valuable.

Today, visualizations are the best way to filter out the noise and see the signals.

R is a statistical and visual language used by a growing number of data analysts inside corporations and academia, whether being used to set ad prices, find new drugs more quickly or fine-tune financial models. Companies as diverse as Google, Pfizer, Merck, Bank of America and Shell use R.

It is also free. Open-source software is free for anyone to use and modify, so statisticians, engineers and data scientists can improve the software’s code or write variations for specific tasks. Packages written for R add advanced algorithms, richly colored and textured graphs, and mining techniques to dig deeper into databases. At MaxisIT, we offer an open statistical computing environment that supports R, making it easy for scientists to manipulate their own data during nonclinical drug studies rather than send the information off to a statistician.


Clinical Data Scientists can use pooled data in R from multiple study databases for their visualizations, and train and apply Machine Learning workflows over ever larger datasets. Here is a stepwise process of how R can be used to conduct cross-study analysis.

  1. Connect to standardized SDTM data from multiple studies.
  2. Combine data across multiple studies using MaxisIT’s SCE.
  3. Create complex visualizations in R using MaxisIT’s analytics and reporting solution.
  4. Train predictive analytics algorithms on MaxisIT’s SCE.
  5. Apply term analysis.
  6. Export to SAS V5 XPT.
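
As a rough illustration of the pooling and export steps above, here is a minimal sketch in base R. It is not MaxisIT's implementation: the two small demographics (DM) data frames, their values, and the `pool_domain()` helper are hypothetical stand-ins for data read from each study's conformed SDTM store. The optional export uses `write_xpt()` from the open source `haven` package, and only runs if that package is installed.

```r
# Hypothetical DM (demographics) extracts from two studies. In practice these
# would be read from each study's conformed SDTM data, not typed in by hand.
dm_study1 <- data.frame(
  STUDYID = "STUDY01",
  USUBJID = c("STUDY01-001", "STUDY01-002", "STUDY01-003"),
  AGE     = c(34L, 58L, 45L),
  SEX     = c("F", "M", "F"),
  ARM     = c("Placebo", "Active", "Active"),
  stringsAsFactors = FALSE
)
dm_study2 <- data.frame(
  STUDYID = "STUDY02",
  USUBJID = c("STUDY02-001", "STUDY02-002"),
  AGE     = c(61L, 29L),
  SEX     = c("M", "F"),
  ARM     = c("Active", "Placebo"),
  COUNTRY = c("USA", "DEU"),  # extra variable present in only one study
  stringsAsFactors = FALSE
)

# Hypothetical helper: pool any number of study data frames on the variables
# they all share, so that rbind() sees identical columns in every study.
pool_domain <- function(...) {
  studies <- list(...)
  shared  <- Reduce(intersect, lapply(studies, names))
  do.call(rbind, lapply(studies, function(d) d[, shared, drop = FALSE]))
}

dm_pooled <- pool_domain(dm_study1, dm_study2)

# A simple cross-study summary: mean age by treatment arm over both studies.
age_by_arm <- aggregate(AGE ~ ARM, data = dm_pooled, FUN = mean)
print(age_by_arm)

# Optional export to a SAS V5 transport file, assuming haven is available.
if (requireNamespace("haven", quietly = TRUE)) {
  haven::write_xpt(dm_pooled, file.path(tempdir(), "dm.xpt"),
                   version = 5, name = "DM")
}
```

Pooling on the shared variables is only one possible design choice. It silently drops study-specific columns such as `COUNTRY`, so a real workflow would more likely harmonize variables against the SDTM standard before combining studies.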

In summary, R is an excellent tool for connecting to conformed data, and it allows sponsors to pool data frames (datasets) with only a few keystrokes. Exporting to SAS V5 XPT files can take a little more time, since it may mean adding a bespoke package of our own. MaxisIT offers a Statistical Computing Environment (SCE) that is open and flexibly integrates preferred tools such as SAS and R applications. The SCE is an analytics-agnostic, integrated data repository that provides faster access to clinical data.

Key features of MaxisIT’s SCE

  • Built-in CDISC standards and storage that support the management of metadata, structured data and unstructured regulatory documents
  • Intuitive UI that facilitates efficient use of libraries and automation
  • Ability to streamline via automated workflows, use of standard templates, version control, change impact analyzer, and reusable code management
  • Faster, scalable, & nimble product architecture that is big data ready, and supports global needs with on-demand performance
  • Balances flexibility with controls, security, and transparency, reducing compliance and IP risk
  • Role-based controls – delivering regulatory submission ready reports in the most efficient manner


The FDA’s Statistical Software Clarifying Statement makes clear that the FDA does not require any specific software for statistical analyses, so any suitable software can be used in a regulatory submission. Some data-exchange regulations do require the use of the XPT file format, which is an open standard and is not restricted to SAS. MaxisIT’s SCE is a 21 CFR Part 11 and GxP compliant environment offering full traceability, transparency and auditability.


R use is clearly growing across many industries, and it is one of the key tools for today’s Clinical Data Scientist. Here is why R is the future of Clinical Development:

  • R is embedded in many leading industry solutions.
  • R can power Machine Learning and Artificial Intelligence.
  • The availability of a commercial distribution of R can reassure users even in highly regulated industries.
  • Confirmation from the FDA that it can be used to analyze clinical studies leaves no barriers to R adoption across the clinical trial lifecycle and beyond.

MaxisIT’s SCE supports the statistical language R in a user-friendly manner, empowering Biostatisticians, Clinical Programmers and Data Scientists. It is a completely metadata-driven, scalable Statistical Computing Environment with an Integrated Data Repository that supports regulatory analysis without constraints.


At MaxisIT, we clearly understand the strategic priorities within clinical R&D, and they match our own experience of implementing solutions that improve the Clinical Development Portfolio through an integrated, platform-based approach. This approach delivers timely access to study-specific as well as standardized and aggregated clinical trial operations and patient data. It also allows efficient trial oversight via remote monitoring, statistically assessed controls, data quality management, clinical reviews, and statistical computing. Moreover, it provides capabilities for planned vs. actual trending and optimization, as well as for fraud detection and risk-based monitoring. MaxisIT’s Integrated Technology Platform is a purpose-built solution, which helps the Pharmaceutical & Life sciences industry by “Empowering Business Stakeholders with Integrated Computing, and Self-service Dashboards in the strategically externalized enterprise environment with major focus on the core clinical operations data as well as clinical information assets; which allows improved control over externalized, CROs and partners driven, clinical ecosystem; and enable in-time decision support, continuous monitoring over regulatory compliance, and greater operational efficiency at a measurable rate”.
