Introduction to Data Management

Managing your data can be an effective strategy for ensuring that it will be usable, preserved, maintained, and accessible throughout the life cycle of a research project and for future generations of scientific research.

Moreover, federal funding agencies are now requiring data management plans as part of grant proposal packages.

What is Data?

According to the U.S. Office of Management and Budget, research data is defined as follows:

(i) Research data is defined as the recorded factual material commonly accepted in the scientific community as necessary to validate research findings, but not any of the following: preliminary analyses, drafts of scientific papers, plans for future research, peer reviews, or communications with colleagues. This "recorded" material excludes physical objects (e.g., laboratory samples). Research data also do not include:

(A) Trade secrets, commercial information, materials necessary to be held confidential by a researcher until they are published, or similar information which is protected under law; and

(B) Personnel and medical information and similar information the disclosure of which would constitute a clearly unwarranted invasion of personal privacy, such as information that could be used to identify a particular person in a research study.

Data Types

There are several different types of data:

  • Observational: Data captured in real time, usually irreplaceable
    • Examples: Sensor data, telemetry, survey data, sample data, neuroimages
  • Experimental: Data from lab equipment; often reproducible, but can be expensive to reproduce
    • Examples: Gene sequences, chromatograms, toroid magnetic field data
  • Simulation: Data generated from test models, where the model and metadata (inputs) are more important than the output data
    • Examples: Climate models, economic models
  • Derived or Compiled: Data that is reproducible (but very expensive to reproduce)
    • Examples: Text and data mining, compiled databases, 3D models, data gathered from public documents
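
For simulation data, where the inputs matter more than the outputs, one way to keep the model inputs with a run is to record them in a small metadata file. The following is a minimal Python sketch, not a prescribed method: the function name, the field names, and the example model and parameters are all hypothetical.

```python
import json
from datetime import datetime, timezone


def build_run_metadata(model_name, parameters, created=None):
    """Return a dictionary describing the inputs of one simulation run.

    All field names here are illustrative assumptions, not a standard.
    """
    if created is None:
        created = datetime.now(timezone.utc)
    return {
        "model": model_name,
        "created": created.isoformat(),
        "parameters": dict(parameters),
    }


def to_json(metadata):
    """Serialize the metadata as human-readable JSON with stable key order."""
    return json.dumps(metadata, indent=2, sort_keys=True)


if __name__ == "__main__":
    # Hypothetical model name and inputs, for illustration only.
    meta = build_run_metadata("example_climate_model", {"grid_km": 50, "years": 10})
    print(to_json(meta))
```

Storing a record like this next to the output files lets someone rerun the model later, even if the output data itself is not kept.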

Data and Storage File Formats

Storage formats can include, but are not limited to:

  • Text: e.g. ASCII, Word, PDF
  • Numerical: e.g. ASCII, SPSS, Stata, Excel, Access, MySQL
  • Multimedia: e.g. JPEG, TIFF, MPEG, QuickTime
  • Models: e.g. 3D, statistical
  • Software: e.g. Java, C
  • Discipline-specific: e.g. FITS in astronomy, CIF in chemistry
  • Instrument-specific: e.g. Olympus Confocal Microscope Data Format
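
When preparing a Data Management Plan, it can help to take stock of which formats a project actually contains. The sketch below groups file names into the categories listed above by their extension. The extension-to-category mapping is an illustrative assumption and far from complete, and the function names are hypothetical.

```python
from collections import Counter
from pathlib import PurePath

# Hypothetical, incomplete mapping from file extension to format category.
FORMAT_CATEGORIES = {
    ".txt": "Text",
    ".pdf": "Text",
    ".docx": "Text",
    ".csv": "Numerical",
    ".sav": "Numerical",   # SPSS data file
    ".dta": "Numerical",   # Stata data file
    ".xlsx": "Numerical",
    ".jpg": "Multimedia",
    ".jpeg": "Multimedia",
    ".tiff": "Multimedia",
    ".mpeg": "Multimedia",
    ".mov": "Multimedia",  # QuickTime
    ".java": "Software",
    ".c": "Software",
    ".fits": "Discipline-specific",
    ".cif": "Discipline-specific",
}


def categorize(filename):
    """Return the format category for a file name, or "Unknown"."""
    suffix = PurePath(filename).suffix.lower()
    return FORMAT_CATEGORIES.get(suffix, "Unknown")


def summarize(filenames):
    """Count how many of the given file names fall into each category."""
    return Counter(categorize(name) for name in filenames)


if __name__ == "__main__":
    # Example file names are made up for illustration.
    print(summarize(["survey.sav", "notes.txt", "image.TIFF", "scan.fits"]))
```

Files that land in "Unknown" are a useful prompt to check whether an instrument-specific or proprietary format needs to be documented in the plan.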

Funding Agency Requirements

The sharing of research findings has always been critical to the development of science. Whether through society publications, corporate publishing, or open access forums, scientific progress is dependent upon the sharing of data collected through:

  • observation (e.g. sensor or survey data),
  • experimentation (e.g. gene sequences, chromatograms),
  • simulation (e.g. climate or economic models), and
  • derivation/compilation (e.g. text and data mining, 3D models).

To foster scientific progress and promote the expansion and diversity of user communications, the National Institutes of Health and the National Science Foundation have issued data sharing mandates for projects funded by their agencies. Researchers who receive funds from these agencies are therefore subject to the requirements of these mandates.

Example Data Management Plans

The following Data Management Plan examples are available for a variety of research domains. Use them as a guide to writing your own:

Data Management Tools

Preparing your data for effective management throughout the data life cycle and adhering to agency mandates does not have to be a time-consuming task. The following tools will help you create a well-organized Data Management Plan:

  1. DMPTool provides guidance and resources for your Data Management Plan. The DMPTool is the product of a joint effort by several major American research institutions. The goal is to provide a "flexible, online tool to help researchers create data management plans." With the DMPTool, researchers can:
    • Create ready-to-use agency- and directorate-specific data management plans
    • Comply with data management plan requirements
    • Acquire easy-to-follow guidance on creating plans.
  2. The Digital Curation Centre Data Management and Sharing Plan Website comes from the DCC, the "leading hub of expertise in curating digital research" in the United Kingdom. Much like the DMPTool, the DCC's DMP tool is a flexible web-based tool that allows researchers to create personalized data management plans, and it includes many of the same features as DMPTool.
  3. The ICPSR Guidelines for Effective Data Management Plans provides researchers with the following:
    • A defined list of the elements comprising a Data Management Plan
    • An explanation as to why each element is important for a DMP
    • A recommended list of elements to be included in a Data Management Plan
  4. The DataONE Data Management Plan Outline is a quick reference guide for outlining a Data Management Plan, and provides a generic example of how each section of the plan may look once completed.

Data Sharing

Your data is a source of potential value to researchers and society at large, and sharing it helps put that value to use. To maximize the value of your data, it is important to tell others how they can use it while protecting your rights as the creator.

The most effective way to tell others how they can use your data is to apply a license to it.

If you want to find a license that is right for your data or learn more about licensing data in general, please refer to the following resources:

Further Reading

Refer to our Data Management Plans page for a roundup of links and resources for DMP.