An international agricultural research group has created a set of guidelines for organizations using agricultural data in research projects.
CGIAR, the Consultative Group for International Agricultural Research, is a consortium of 15 international research centers. It has developed a Platform for Big Data in Agriculture that advocates for open data use in agricultural research. According to CGIAR, open data will ultimately confer significant benefits to society, including accelerated scientific advancement, economic growth, resource efficiency, strengthened public support for research funding, and increased public trust in research. But open data requires responsible data management, which led CGIAR to create Ten Responsible Data Guidelines for Managing Privacy and Personally Identifiable Information in the Research Project Data Lifecycle.
CGIAR’s principles focus on handling personally identifiable information, or PII (name, address, phone number, etc.), when conducting agricultural research. “The safest approach is to strip the data of PII, that is, to de-identify the data in order to anonymize it so individuals are no longer identifiable. However, while anonymization maximizes privacy and minimizes risk to research participants, it can also compromise the analytic potential and scientific utility of the data, as well as benefits to the participant.” The ten principles are designed to help researchers resolve this tension between confidentiality and open data. Here are the ten principles, in no particular order:
PREPARE, PLAN & COMPLY: Develop and implement a robust data management plan for handling PII from collection through the life cycle of the research project.
USE A SCALE: Weigh scientific interest against the consequences of disclosure and the risk of harm to the participant or their community.
MINIMIZE PII: Minimize collection and use of PII only to the extent necessary to achieve the purpose for which it was obtained.
DE-IDENTIFY DATA: Anonymization should be the default, with pseudonymization used when anonymization is not possible.
BE TRANSPARENT: Obtain informed consent with full disclosure of the scientific purpose for how PII will be used.
BE CONFIDENTIAL: Create internal procedures to ensure appropriate IT and security features are in place to protect confidentiality.
PUBLIC VS PRIVATE: Public-use datasets containing PII are the exception.
ARCHIVE OR DELETE PII: Keep PII for the minimum possible time and destroy when no longer necessary to advance the project’s interests.
REVIEW REGULARLY: Periodically review the compliance landscape and seek expert support when needed.
BE ETHICAL: At all times, ensure that the benefits of the project outweigh the risk to participants.
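For readers who handle the data directly, here is a rough sketch of what the MINIMIZE PII and DE-IDENTIFY DATA principles might look like in practice. It is only an illustration, not anything CGIAR prescribes: the field names, the `PII_FIELDS` set, and the `anonymize` and `pseudonymize` helpers are all hypothetical, and a keyed hash (HMAC-SHA256) is just one common way to build a pseudonym.

```python
import hashlib
import hmac

# Hypothetical list of direct identifiers; a real project would define its own.
PII_FIELDS = {"name", "address", "phone"}


def anonymize(record: dict) -> dict:
    """Return a copy of the record with the direct identifiers dropped."""
    return {k: v for k, v in record.items() if k not in PII_FIELDS}


def pseudonymize(record: dict, secret_key: bytes, id_field: str = "name") -> dict:
    """Drop direct identifiers but keep a stable, keyed pseudonym.

    The same participant and key always give the same pseudonym, so records
    can still be linked across seasons without storing the name itself.
    """
    digest = hmac.new(
        secret_key, str(record[id_field]).encode("utf-8"), hashlib.sha256
    ).hexdigest()
    cleaned = anonymize(record)
    cleaned["participant_id"] = digest[:16]  # shortened for readability
    return cleaned


if __name__ == "__main__":
    # Made-up example record, for illustration only.
    survey_row = {
        "name": "Jane Doe",
        "address": "123 County Road",
        "phone": "555-0100",
        "crop": "corn",
        "yield_bu_per_acre": 180,
    }
    print(anonymize(survey_row))
    print(pseudonymize(survey_row, secret_key=b"keep-this-key-offline"))
```

Two caveats. The secret key is what lets someone reconnect a pseudonym to a person, so it needs the same protection as the PII itself. And dropping names and phone numbers does not by itself guarantee anonymity if the remaining fields, taken together, could still point to a particular farm.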
These principles will certainly be useful to CGIAR research centers. I also think they will be helpful to US, Canadian, and other universities using ag data for research projects that include personally identifiable information. CGIAR makes clear these principles are aspirational only, and not exhaustive. But in an unregulated field, guideposts are helpful.
Likewise, these principles may be useful to companies developing private industry ag data platforms, but I would caution any company against viewing them as exhaustive. Many of farmers' concerns about ag data come not just from sharing “Personally Identifiable Information” but also from ag data that may not be readily identifiable, such as yield information, planting data, or herd genetics. (Many companies make this mistake, focusing only on PII and ignoring other concerns.) To determine the issues related to collecting, storing, and sharing ag data specifically, companies should visit the Ag Data Transparent website or check more blog posts here.