Glossary
There are a variety of terms that you’ll come across throughout the NF Data Portal. Refer back to this page if you’re ever unsure about what something means, or the difference between terms.
Bug
Term used to describe an error occurring in a computer program or hardware.
Controlled Access Data
Some data on the portal is considered Controlled Access and requires you to request access by reading and electronically agreeing to data-specific terms. Read more about controlled access data here.
Controlled Value
A pre-formatted value that must be used as defined.
Data Model
A model that organizes data elements and describes their relationships to one another, usually in a graph-based form such as a flowchart diagram. The structure of the data is dictated by the data model. In other words, a data model structures how information is organized and related in a particular context. At Sage Bionetworks, the data model typically refers to CSV or JSON-LD files used by SCHEMATIC (Schema Engine for Manifest Ingress and Curation).
Dataset
Datasets bundle multiple similar data files together into one large bulk file that can be downloaded at once. NF datasets range in size from just a few files to hundreds. Some datasets have also been processed or harmonized to improve their usefulness.
Explore all datasets on the portal here.
Digital Object Identifier (DOI)
A string of alphanumeric characters that provides a persistent digital link to an entity such as a journal article, abstract, or website.
File
The thousands of files on the NF Data Portal come in different types. They may be individual data files, such as raw sequencing runs (e.g., FASTQ files), or they may be report files, which are often required by funders to track data generation progress.
Explore all files on the portal here.
File annotations
File annotations are a set of controlled-vocabulary terms associated with data files that describe properties of the data so that it can be queried. Also known as metadata, these annotations are essentially extra information about the data that lets you search and filter it properly.
Governance
Due to the open-access nature of the platform, Synapse operates under comprehensive governance policies that define the rights and responsibilities of Synapse users. This includes our standard operating procedures (SOPs), privacy policy, code of conduct, community standards, and more.
Grant
A grant is represented by a contract number and/or digital object identifier assigned to a project.
Individual ID
An individual ID is the identifier for a specific individual (human subject or single animal).
Initiative
In our context, an initiative describes a group of projects that were funded under the same grant mechanism.
Explore all initiatives on the portal here.
Key data
Particular datasets selected from a statement of work (SOW) that fulfill one or more of the following criteria:
(1) The dataset contains more than 20 samples, or is generated from patient samples (N > 5).
(2) The dataset contains data generated using high-throughput methods that output raw data in a widely used, systematic format.
(3) The dataset contains omics data derived from unbiased techniques.
(4) The dataset is considered to be validation data for a new method.
(5) The dataset is considered to be of interest to the funding partner.
Metadata
Metadata is additional, standardized information included alongside the data to give it context—data about the data, if you will. Metadata is what allows data in the portal to be searchable, discoverable, accessible, re-usable, and understandable to others, including those who were not involved in the data generation process.
Metadata can be descriptive (e.g., the name of the file), administrative (e.g., provenance information), or research-based (e.g., information about the sampling and handling of data).
Metadata dictionary
A resource for contributors dedicated to metadata and annotations. Browse the dictionary here.
Metadata validation
The act of checking metadata for correct values and formatting.
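As a rough illustration of what such a check might involve, the sketch below tests one metadata record for missing attributes and for values outside a controlled list. The attribute names, the allowed values, and the function name are all hypothetical and chosen only for illustration; the portal's actual rules come from its data model and metadata dictionary, and this is not the portal's own validation code.

```python
# Hypothetical controlled values, for illustration only.
# Real controlled values are defined by the portal's data model.
CONTROLLED_VALUES = {
    "assay": {"rnaSeq", "wholeGenomeSeq"},
    "species": {"Homo sapiens", "Mus musculus"},
}

# Hypothetical set of attributes that must be filled in.
REQUIRED_KEYS = {"assay", "species", "individualID"}


def validate_metadata(record):
    """Return a list of problems found in one metadata record (a dict)."""
    problems = []

    # Formatting check: every required attribute must be present and non-blank.
    for key in sorted(REQUIRED_KEYS):
        value = record.get(key)
        if value is None or not str(value).strip():
            problems.append(f"missing required attribute: {key}")

    # Value check: controlled attributes must use a value exactly as defined.
    for key in sorted(CONTROLLED_VALUES):
        value = record.get(key)
        if value is not None and value not in CONTROLLED_VALUES[key]:
            problems.append(f"{key}: {value!r} is not a controlled value")

    return problems


good = {"assay": "rnaSeq", "species": "Homo sapiens", "individualID": "P001"}
bad = {"assay": "RNA-seq", "species": "Homo sapiens"}

print(validate_metadata(good))
print(validate_metadata(bad))
```

In this sketch, a value such as "RNA-seq" is rejected even though its meaning is clear, because a controlled value must be used exactly as defined.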
NF Data Curator
A person responsible for managing data by annotating, validating, checking for errors, etc.
NF-OSI
Acronym which stands for Neurofibromatosis (NF) Open Science Initiative. This is an initiative to support open science within the neurofibromatosis and schwannomatosis research community.
projectLIVE
A dashboard found here that facilitates tracking of data uploads, publications, etc. contributed by funders/data contributors.
Publication
Publications are collected by the NF Data Portal data curation team and represent work arising from funded studies that generate data on the portal. If you have produced a publication that uses data available on the NF Data Portal, or that was funded by an NF-OSI partner, and you do not see it on the portal, please let us know at nf-osi@sagebionetworks.org.
Explore all publications on the portal here.
Raw Data
Raw data is the initial, unmodified information collected directly from sources, before any processing or analysis. For instance, in biological imaging, raw data is often in OME-TIFF format (.ome.tiff), which preserves the details and metadata from microscopy. In genomics, it typically appears as .fastq files.
Schema
A concept that overlaps with the data model. A metadata schema adds further rules and standardization to a data model, governing how metadata is managed through constraints such as whether an attribute is optional and which values are valid for it.
SOP
A standard operating procedure (SOP) is a document which serves as a step-by-step guide to accomplish a particular task in a consistent manner.
Specimen ID
A specimen ID is the identifier for a sample from a specific individual – for example, a brain sample from a specific region or a blood sample.
Study
A study is the primary unit of data organization in the portal. Essentially, each study represents an individual research project with specific objectives and focus (one project can operate multiple studies). A study can represent data generated from a specific human cohort, data from experiments on a model system, cross-consortium data processing and analysis efforts, or data associated with a specific publication.
In our context, a study is typically associated with a grant. So, the terms study and grant are often used interchangeably. However, some studies span multiple grants, or are led by program partners that are not grant-funded.
On the NF Data Portal, a study bundles multiple pieces of project information together, including a study title, summary, lead investigator, access requirements, acknowledgement statements, data files, datasets, metadata files, tools, publications, and related studies. Not all of these components will necessarily be present, particularly if the study is currently active.
Explore all studies on the portal here.
Synapse
An online software platform developed by Sage Bionetworks that allows users to upload, store, analyze, and track data in a private space.
Template
A manifest template is a template, usually an Excel spreadsheet, that outlines the specific metadata attributes to be filled in for a given data type. Each column of the template corresponds to one metadata attribute to be collected for the corresponding data. In other words, it describes a set of key/value pairs that can be assigned to one or more data files of the same data type.
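To make the key/value idea concrete, here is a minimal sketch that reads a filled-in manifest, saved as CSV, and turns each row into the key/value pairs for one file. The column names, file names, and identifiers are hypothetical and chosen only for illustration; actual templates define their own attributes for each data type.

```python
import csv
import io

# Hypothetical filled-in manifest, exported as CSV.
# Columns are metadata attributes; each row describes one data file.
MANIFEST_CSV = """Filename,assay,specimenID,individualID
sample_A.fastq,rnaSeq,S001,P001
sample_B.fastq,rnaSeq,S002,P001
"""


def manifest_to_annotations(manifest_text):
    """Map each file name to the key/value pairs given in its manifest row."""
    annotations = {}
    for row in csv.DictReader(io.StringIO(manifest_text)):
        filename = row.pop("Filename")  # the remaining columns are the annotations
        annotations[filename] = dict(row)
    return annotations


annotations = manifest_to_annotations(MANIFEST_CSV)
for filename, pairs in annotations.items():
    print(filename, pairs)
```

In this made-up example, the two files carry different specimen IDs but the same individual ID, which is how two samples taken from the same individual would be recorded.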