Researchers working with moderately sensitive data can now perform their analyses using their Faculty Computing Allowance (FCA) or their Condo computing resources in UC Berkeley's high-performance computing cluster, Savio. The campus Information Security and Policy (ISP) team has approved Savio to host data that the campus classifies as "P2/P3" (including, for example, many genomics datasets from the National Institutes of Health).
In addition, the Berkeley Research Computing program anticipates that our Analytics Environments on Demand (AEoD) service will also be approved for P2/P3 data very soon, allowing researchers to use P2/P3 data with analytic software packages (such as ArcGIS, Stata, SPSS, and RStudio) on this scalable, web-based Windows platform.
Why this is important
More and more Berkeley researchers are working with sensitive data that are subject to complex protection requirements. We have received many requests from researchers who work with P2/P3 data and wish to use Savio for their analyses. Until now, that has not been possible. As a result, researchers often had to build and manage local, nonstandard, often vulnerable clusters and enlist a rotating cohort of grad students to administer them, which put their research and the campus at risk and diverted time and effort from researchers’ core focus.
Two researchers with whom we have been collaborating to define the new P2/P3 support are Prof. Peter Sudmant (Integrative Biology) and Prof. Priya Moorjani (Molecular and Cell Biology); they described the need like this:
“...there are no shared secure computing resources at UC Berkeley. This puts UC Berkeley researchers at a significant disadvantage compared to their peer institutions to perform cutting edge research, compete for grants, and to recruit top talent in one of the most rapidly growing and diversifying areas of research. Analysis of these types of datasets is a high priority funding area for both the NIH and international consortia and a leading topic of research across numerous disciplines including biology, statistics, public health and computer science.”
They went on to note serious negative impacts of this situation:
“1) UC Berkeley researchers are receiving negative reviews on their grants specifically citing the computational shortcomings of the institution.
2) New faculty are facing substantial financial and logistical challenges in starting their research programs due to the lack of adequate infrastructure.
3) Recruiting new faculty is facing hurdles as these new hires become aware of the lack of available resources.”
Partnering to make this happen
The launch of P2/P3 support in Savio and AEoD is the result of a significant effort and an effective collaboration between Research IT/Berkeley Research Computing (BRC) and the Information Security and Policy (ISP) team. The BRC team (particularly our partners from Lawrence Berkeley National Lab) made a tremendous effort to get the new security measures in place, and benefited from the ISP team’s work to learn about the Savio infrastructure, to educate BRC staff about the security measures needed, and to help prepare necessary documentation. This hard work and collaboration enabled the launch of P2/P3 support for research projects on Savio two weeks ahead of an already aggressive schedule.
Support for P2/P3 data in Savio and AEoD is an important step towards broader support for secure and sensitive data, as part of our Secure Research Data and Compute (SRDC) initiative.
If you have P2/P3 data that you would like to use in Savio, please use this new form to request a Savio P2/P3 Data Project.
For other questions about the new support, or any of the other Research IT services, please contact us at research-it@berkeley.edu.