Before embarking on your research project,
you should be aware of the spectrum of data sources available to study teams at VCU.
Sources vary with respect to:
The regulatory overhead involved
Whether they are self-service or require additional support
Time to delivery
Scale of data available
Research questions should be formulated so that they can be answered by a data source that aligns with your team’s:
Budget
Timeline
Time to delivery
Technical skill
Reviewing patient records to design a research study or to determine feasibility is considered
“preparatory to research” under the Privacy Rule. To access protected health information (PHI), the use or disclosure must be solely to prepare a research
protocol or for similar purposes preparatory to research. The following options are available
for study feasibility or other activities preparatory to research:
TriNetX
Self-Service
Data available on over 1 million VCU Health patients, or over 150 million global patients
Data is de-identified, significantly reducing time for regulatory approval
Statistical analyses can be done within TriNetX, eliminating the need to download complex data files
Online and VCU training available
The following options are available for research that compares patient groups or populations with the intent to publish the results and/or use them in a grant submission:
TriNetX
Minimal support needed for data to be downloaded
Data available on over 1 million VCU Health patients, or over 150 million global patients
Data is de-identified, significantly reducing time for regulatory approval
Statistical analyses can be done within TriNetX, eliminating the need to download complex data files
Online and VCU training available
Cosmos
Self-Service
Data available from Epic users nationally
Data is de-identified, similar to TriNetX
Honest Broker
Additional time, resources and compliance efforts associated with use
Requires IRB approval
Identified data requires secure data storage (Horizon system)
VCU Health Enterprise Analytics provides data for evaluating current systems of patient care in order to
improve the quality or efficiency of that care.
To comply with the HIPAA Minimum Necessary Data rule, researchers are required to use de-identified or Limited datasets when possible.
De-identified and Limited data sources (e.g., TriNetX, Cosmos, APCD) allow study teams to access large volumes of information with minimal
regulatory overhead and risk to patients.
The following analyses can be performed in TriNetX without any regulatory requirements:
Explore a Cohort (Single Cohort)
Cohort Comparison (Two Cohorts)
Outcomes Analysis (Single Cohort)
Outcomes Comparison (Two Cohorts)
Competing Risks (Single Cohort)
Treatment Pathways (Single Cohort)
Incidence and Prevalence (Single Cohort)
Patient Clustering (Single Cohort)
Burden of Illness (Single Cohort)
Logistic Regression (Single Cohort)
The following research articles were published using data from the TriNetX platform:
Van Tassell B, Talasaz AH, Redlich G, Ziegelaar B, Abbate A. A Real-World Analysis of New-Onset Heart Failure
After Anterior Wall ST-Elevation Acute Myocardial Infarction in the United States. Am J Cardiol. 2024 Jan 15;211:245-250.
doi: 10.1016/j.amjcard.2023.11.037. Epub 2023 Nov 20. PMID: 37981000.
Gao Z, Winhusen TJ, Gorenflo M, Ghitza UE, Nunes E, Saxon AJ, Korthuis T, Brady K, Luo SX, Davis PB, Kaelber DC, Xu R.
Potential effect of antidepressants on remission from cocaine use disorder - A nationwide matched retrospective cohort study.
Drug Alcohol Depend. 2023 Oct 1;251:110958. doi: 10.1016/j.drugalcdep.2023.110958. Epub 2023 Sep 7. PMID: 37703770; PMCID: PMC10556849.
Epic Cosmos is a de-identified collection of electronic health record data from 1,600
hospitals and 37,000 clinics. The following research articles were published using data from Epic Cosmos:
Laspro M, Brydges HT, Verzella AN, Schechter J, Alcon A, Roman AS, Flores RL. Association of Commonly
Prescribed Antepartum Medications and Incidence of Orofacial Clefting. Cleft Palate Craniofac J. 2024
Mar 6:10556656241237679. doi: 10.1177/10556656241237679. Epub ahead of print. PMID: 38449319.
Swaminathan SS, Medeiros FA. Socioeconomic Disparities in Glaucoma Severity at Initial Diagnosis:
A Nationwide Electronic Health Record Cohort Analysis. Am J Ophthalmol. 2024 Jul;263:50-60.
doi: 10.1016/j.ajo.2024.02.022. Epub 2024 Feb 22. Erratum in: Am J Ophthalmol.
2024 Nov;267:308-310. doi: 10.1016/j.ajo.2024.08.001. PMID: 38395325; PMCID: PMC11162936.
Handley SC, Gallagher K, Breden A, Lindgren E, Lo JY, Son M, Murosko D, Dysart K, Lorch SA,
Greenspan J, Culhane JF, Burris HH. Birth Hospital Length of Stay and Rehospitalization During
COVID-19. Pediatrics. 2022 Jan 1;149(1):e2021053498. doi: 10.1542/peds.2021-053498. PMID: 34889449;
PMCID: PMC9645693.
Conducting Research Using Identifiable Data
Researchers must justify the need for identifiable patient data to satisfy the HIPAA Minimum Necessary Data rule.
Using data from identifiable sources (i.e. EHR) invokes additional ethical and regulatory obligations to protect the
privacy of VCU Health patients. Accordingly, the VCU Health Compliance Office requires that all research conducted using
EHR data involve an honest broker who acts on behalf of VCU Health to provision clinical data for research purposes.
By design, honest brokers are not part of the study team and have no stake in the research findings or outcomes.
The primary purpose of the honest broker is to limit the study team’s access to only the minimum necessary identifiable
information to complete the project. Note that while many investigators may have access to identifiable patient information
via their clinical role, this access is only for the purpose of providing patient care and cannot be used to extract information for research.
Pathways to Use Identifiable Data for Research:
Authorization from subjects (consent)
Documented Institutional Review Board (IRB) or Privacy Board approval of an alteration or waiver of authorization
Information is provided as Limited (or De-identified) Data Set with a signed Data Use Agreement
Honest Broker Functions
Mitigate bias by outsourcing data collection to a disinterested third party
Enforce reproducible cohort identification
Minimize manual effort by extracting data elements programmatically
Comply with VCUHS Policy COMP-014 for most projects utilizing retrospective EHR data
Track disclosures of patient data for research purposes as required by law
Available Options
TriNetX
Effort: Self-Service
Regulatory: None
Wait Time: None
Data Scale: 130M+ patients
Recommended for: Student projects, unfunded research, grant-funded research
TriNetX is a self-service web tool with a point-and-click interface for identifying cohorts of
patients using criteria defined by attributes from their medical records (e.g., demographics, labs, diagnoses, procedures, visits).
It is well suited to experimenting with inclusion/exclusion criteria and assessing study feasibility. Once you’ve identified a cohort (or several) of interest,
you can use the tools built into the TriNetX platform to:
Summarize the demographics, comorbidity incidence, or other characteristics of your cohort
Conduct statistical tests comparing outcomes between two cohorts
Perform a logistic regression
Visualize the timeline of patient events
None of this requires IRB approval or funding.
You can start conducting publishable research within the TriNetX platform as soon as you have an account, for free,
without any regulatory obligations. You can request a TriNetX account using this form.
TriNetX Dataset Download
Effort: Minimal Support Needed
Regulatory: Not Human Subjects Research (NHSR)
Wait Time: ~1-2 weeks
Data Scale: 130M+ patients
Recommended for: Teams with biostatistical support, grant-funded research
If you need to perform an analysis more complex than the TriNetX analytics tools can handle,
you can download a full extract of the de-identified EHR data for your cohort(s).
These downloads require a Not Human Subjects Research (NHSR) determination from the VCU IRB and must be approved by
both TriNetX and the Wright Center. The extract is formatted as ~17 separate .csv files corresponding to
various domains of clinical data (e.g., demographics, labs, diagnoses). These files can be quite large, and your
study team will need a skilled data practitioner to use them effectively. You can submit a request to download a TriNetX dataset below.
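As a rough illustration of what working with a multi-file extract can involve, the sketch below loads every .csv file in a folder and indexes its rows by patient so that domains can be combined. It is a minimal sketch under stated assumptions, not the TriNetX file specification: the `patient_id` column name, the example file names, and the helper function names are all hypothetical, so check them against the data dictionary delivered with your extract.

```python
import csv
from collections import defaultdict
from pathlib import Path


def load_domain(path):
    """Read one domain file (assumed: one .csv per clinical domain) into a list of row dicts."""
    with open(path, newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))


def group_by_patient(rows, id_column="patient_id"):
    """Index rows by patient identifier. 'patient_id' is a hypothetical column name."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row[id_column]].append(row)
    return grouped


def load_extract(folder, id_column="patient_id"):
    """Load every .csv in the folder, keyed by file name without extension.

    Example keys such as 'demographics' or 'diagnosis' are illustrative only.
    """
    return {
        path.stem: group_by_patient(load_domain(path), id_column)
        for path in sorted(Path(folder).glob("*.csv"))
    }
```

With the extract loaded this way, `load_extract("my_extract")["diagnosis"]["p1"]` would return all diagnosis rows for the hypothetical patient `p1`. This approach reads each file fully into memory, so for very large files a chunked reader or a database import is likely a better fit.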
Cosmos
Effort: Self-Service
Regulatory: None (pending verification)
Wait Time: None
Data Scale: 270M+ patients
Recommended for: Student projects, unfunded research, grant-funded research
Cosmos is an Epic tool in which you can use SlicerDicer to query a Limited dataset of
patient information from a large number of institutions using Epic. You can also limit your analysis to only the VCU population.
This is another self-service web tool that allows you to identify cohorts of patients and then analyze trends and outcomes.
There are more data elements available in Cosmos than in TriNetX, but the statistical toolkit is more limited and there is no option to
access patient-level information, only aggregated values. Contact VCUHS Clinical Research (clinical.research@vcuhealth.org) to request a
Cosmos account or for more information about training.
All Payer Claims Database (APCD)
Effort: Additional Support Needed
Regulatory: Not Human Subjects Research (NHSR)
Wait Time: Weeks-months
Data Scale: ~10M patients
Recommended for: Grant-funded research, pilot data for grant applications
The APCD is a Limited dataset of medical and pharmacy claims, as well as demographic information, for Virginia residents
from 2016 to 2022. Claims data is an excellent source for projects interested in healthcare costs or needing to track patients’ care across settings beyond VCU.
Data is limited to information relevant to billable charges (e.g., procedures, diagnoses, visits, prescriptions) and cannot be linked to other data sources.
You can submit a request for APCD data below.
Electronic Health Records (Epic/Cerner)
Effort: Additional Support Needed
Regulatory: IRB Approval
Wait Time: Weeks-months
Data Scale: ~1M patients
Recommended for: Grant-funded research, pilot data for grant applications
Data collected for research using Electronic Health Records (EHR) must involve an honest broker recognized by the VCU
Health Compliance Office. The Wright Center and Massey Comprehensive Cancer Center honest broker teams maintain an Enterprise Data Warehouse containing reliable data
from billing records since the early 2000s, which can be provisioned to study teams for research purposes. Additionally, a much more robust set of clinical information
is available for patient encounters since the transition from Cerner to Epic on 12/4/2021. For projects that need to manually extract unstructured data from patient charts
(e.g., from notes or imaging), honest brokers can provide MRNs to investigators for further review. You can submit a request for EHR data below.
Still have questions?
Request a Consultation Below
Virginia Commonwealth University
Informatics Request Form
The form streamlines requests made of the informatics teams at both the VCU Massey Cancer
Center and the VCU Wright Center for Clinical and Translational
Research. Use the form for both cancer and non-cancer related requests.
Review the definition of terms to better assist with submitting your request.