Submission Instructions

This page provides instructions on how to submit data to SeaBASS as well as information about the SeaBASS File Format. If you are new to this process, please scroll down the page to the "How to Submit" section and review the steps involved. More details can be found in the other sections on this page and beneath the other topics under "Contribute Data" found in the main menu of the SeaBASS website.
 
The SeaBASS data format and structure were designed with the following in mind: To account for the continuous growth of the bio-optical data set and the wide variety of supported data types, the NASA Ocean Biology Processing Group felt it essential to develop efficient data ingestion and storage techniques. While this requires a specific data file format, the data protocols were designed to be as straightforward and effortless as possible on the part of the contributor, while still offering a useful format for internal efforts. The system was intended to meet the following conditions: simple data format, easily expandable and flexible enough to accommodate large data sets; global portability across multiple computer platforms; and web-accessible data holdings with sufficient security to limit access to authorized users.




Data Submission Policy

All data collected under the auspices of the NASA Ocean Biology and Biogeochemistry (OB&B) program are to be submitted to SeaBASS within 1 year of the date of collection. For further details, please review the SeaBASS Access Policy.

How to Submit (Overview Steps)

New submitters should read this section for an overview of how to submit data to SeaBASS. Much of the process involves organizing your data using the SeaBASS file format. SeaBASS files are text files beginning with a block of metadata information followed by a data matrix. Some metadata headers and keywords are mandatory, and others are optional or conditional. All data columns must be labeled using standardized combinations of field names and units. In addition to data files, your submission must include calibration files and external documentation that describes data collection and processing. Some submissions also include associated metadata and files such as plankton imagery and raw files.

NASA-funded investigators are responsible for converting their data into the SeaBASS file format. The SeaBASS team might be able to assist in reformatting voluntary data submissions, especially for in-demand measurements needed for satellite validation, but it will require discussion on a case-by-case basis.
 
The list below outlines the steps involved in submission for someone new to SeaBASS. These steps include brief explanations, with more information found elsewhere on this page and under Contribute Data in the main menu. Feel free to email the SeaBASS team if you have questions or concerns at any point in this process.
  1. Learn about the SeaBASS data format and other submission requirements listed further below on this page.
    1. For a quick idea of what the SeaBASS file format looks like, go to the Metadata Headers page which begins with a relatively simple example.
    2. Consult the list of measurement types on the Data Submission Special Requirements page to see if your measurement types are there. If so, review any relevant examples and info about which fields and metadata to use.
    3. Consult the list of fields and units for each of the data parameters. If your data parameter is not listed in the table, please reach out to the SeaBASS team. 
  2. Contact SeaBASS staff via email to plan out your submission before you begin formatting data. SeaBASS staff will respond to your inquiry, typically within 1-2 business days. For routine submissions, you will then be able to proceed with the steps below. For more complex scenarios and data types that are new to SeaBASS, our data managers will require more information and discussion of how to best organize and accommodate your data and documentation. When you email the SeaBASS Staff to introduce yourself, include:
    1. Your first and last name, and what institution you are affiliated with.
    2. Please indicate if your project was NASA-funded, or if your submission is voluntary.
    3. If you are not the PI or person who secured funding for the project within your laboratory or organization, please indicate that person's name and how you are connected.
    4. Indicate the project name and deployment or cruise names. In SeaBASS, data are categorized under a new or existing experiment and cruise name (see Lists in the main menu), which should be coordinated if there were co-investigators.
    5. Briefly explain the measurement types of the data you wish to submit.
  3. Register a SeaBASS submitter account by following the instructions further down this page. The process requires generating SSH keys and emailing the public key to SeaBASS staff. 
  4. Create a draft SeaBASS file.
    1. Use the automated File Checker (FCHECK) to scan your file for format-compatibility problems. More info. Run FCHECK, fix any problems, and repeat as necessary.
    2. A suggestion is to start with just one file for practice before creating and checking all your other files.
    3. Pick thoughtful file names for your data files. Example: EXPERIMENT_CRUISE_DATATYPE_YYYYMMDDHHMM_RELEASE#.sb. More details about file naming are in the File Naming Guidelines section below.
    4. Avoid using spaces and most special characters. Full file naming guidelines are found further down this page.
  5. Prepare and gather all relevant external documentation and calibration information (e.g., cruise reports, methods descriptions, calibration files, and/or fill out any SeaBASS-provided templates in the mandatory special submission requirements). Generally, these sorts of files may be provided in any format (e.g., text, PDF, DOC, etc.) Tips:
    1. Make sure the names listed in the metadata /documents and /calibration_files exactly match the corresponding file names.
    2. Do not use FCHECK to scan documentation. FCHECK is only needed for data files.
    3. Consult the Data Submission Special Requirements page and, if your data type is there, ensure you follow the guidelines and include the required documentation and checklists.
    4. If you have additional data or metadata files that should be linked to the submission refer to the associated metadata section below for additional information. 
  6. Use your account to upload your data submission, including all documentation. If your submission includes multiple types of measurements, it is generally requested that you submit them all simultaneously as a complete package, not piecemeal.
    1. After you upload your submission, you will receive an automated email receipt within 24 hours (sent to the submitter's account plus addresses in the "contacts" metadata). Please contact us if you don't receive this confirmation within 24 hours.
    2. If your receipt email contains a section about MISSING documents and calibration files: these files were named in your metadata headers but could not be found by our automated system within your data_submission folder. The most common reasons are: 1) you forgot to upload the files, or 2) the spelling or capitalization of the documentation files does not exactly match. Please log back in and upload the relevant docs and calibration files. However, you may disregard this message if you are resubmitting or supplementing a previous submission and referencing documentation archived for the cruise (also see the “resubmission” instructions on this page). 

    3. Next, the SeaBASS team will prepare to archive your data. This involves manually reviewing your submission for completeness and performing additional QC checks. You will be emailed if revisions or more information are needed.
    4. The SeaBASS team will add a Digital Object Identifier (DOI) to your data files corresponding to the "experiment".  If the experiment is new to SeaBASS, the DOI will be reserved. New DOIs are typically formally registered on a quarterly basis. The SeaBASS team also adds the special date "received" header.
    5. You will be emailed when the data archiving is complete.
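
The MISSING-files warning described in step 6 can be anticipated before uploading. The following is a minimal sketch, not part of any SeaBASS tool: the function names `read_header_list` and `find_missing` are hypothetical, and it assumes that metadata headers appear as lines of the form `/name=value1,value2` as in the examples on this page.

```python
def read_header_list(sb_text, header):
    """Return the comma-separated values of a header such as /documents."""
    prefix = "/" + header + "="
    for line in sb_text.splitlines():
        line = line.strip()
        if line.startswith(prefix):
            return [name.strip() for name in line[len(prefix):].split(",") if name.strip()]
    return []


def find_missing(sb_text, available_names):
    """List names referenced in /documents or /calibration_files that are not available.

    The comparison is exact, so a difference in spelling or capitalization
    counts as missing.
    """
    referenced = (read_header_list(sb_text, "documents")
                  + read_header_list(sb_text, "calibration_files"))
    return [name for name in referenced if name not in available_names]


# Hypothetical header lines and a hypothetical set of files staged for upload.
example = """/documents=Experiment_cruise_AC-s_protocol.pdf,Experiment_cruise_AC-s_checklist.txt
/calibration_files=LuZ111_Calibration_Certificate.pdf
"""
uploaded = {"Experiment_cruise_AC-s_protocol.pdf", "experiment_cruise_ac-s_checklist.txt"}
missing = find_missing(example, uploaded)
print(missing)
# prints ['Experiment_cruise_AC-s_checklist.txt', 'LuZ111_Calibration_Certificate.pdf']
```

In this example the checklist is reported as missing only because its capitalization differs, which is one of the two common causes named above.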

Request a SeaBASS Experiment and DOI

SeaBASS data are organized and archived under “experiments”, which are the same as “collections” in NASA OB.DAAC. Each experiment is assigned a DOI; therefore, the experiment name and DOI are permanent and cannot be changed once they are assigned. A request for a new experiment and DOI can take up to a month to complete, so please plan accordingly.

The Experiment name is usually an acronym associated with a specific funding source or grant or, in some cases, a long-term research project. If multiple principal investigators (PIs) are involved, we recommend that the PIs agree on a name to be used across repositories. If you have a new project and plan on submitting data, contact the SeaBASS Team. You will be asked to provide the following information:
  1. Experiment name: This is a short name (often an acronym) for the project. It will be part of the DOI, for example, 10.5067/SeaBASS/EXPERIMENT/DATA001
  2. Experiment long name: Brief explanation of the experiment and definition of acronyms
  3. Experiment description: Please provide a paragraph with information about your project. This will be part of your landing page and is an opportunity to showcase your project. Feel free to acknowledge funding, PI's, and include links to other data repositories.
  4. Experiment URL: If the experiment has a webpage please send that information to us
  5. Cruises: Help SeaBASS generate one or more cruise names for your Experiment. Our data is cataloged by cruises, which can be small vessels, research vessels, field campaigns, monitoring stations, and/or we can subdivide the data into time frame deployments for sampling-based projects. If the projects had research vessels, please use the cruise id. More names can be added in the future. Multiple cruises should use a consistent naming pattern, if applicable.
  6. PI’s: Principal Investigators will be added to the landing page as they submit data. However, we ask for this information ahead to know more about the potential data submitters.

How to Resubmit (i.e. Update a Previous Submission)

SeaBASS encourages resubmissions to ensure that the best quality versions of data are available online. In the event that you need to update some or all of your data or documentation for a past submission, please follow these steps:

  1. Email the SeaBASS team ahead of time to discuss what you want to resubmit. We will evaluate the best way to update your files on the back end of SeaBASS and reply to you about how to proceed.
    1. Briefly explain the reason for the resubmission. If possible, also provide a succinct description (one line or less) to explain the issue or what changed in the newer version of your data files. The SeaBASS team will use the short description to update a special metadata field (i.e., to preserve release/version history, the previous “/received” info will be kept as a comment with the short explanation appended.)
    2. Explain the scope of what files or documents need to be replaced or removed relative to the number or types of files that are currently archived.
    3. Upon request, we will try to accommodate reusing some or all of your existing documents or calibration files, if they do not need to change.
  2. Update your data files (and any relevant documents) in preparation to resubmit them. Here are a few special reminders for updating SeaBASS files and metadata:

    1. File names: Remember to update your file names and the corresponding “/data_file_name” metadata header, if relevant. It is recommended that SeaBASS file names end in a release number (i.e., version number), and you should increment this integer to indicate resubmissions. For example, change myfile_R1.sb to myfile_R2.sb.
    2. /documents, /calibration_files, /associated_archives (if applicable), and /associated_files (if applicable): If these file names are different than the previous version, remember to change the metadata headers.
    3. /data_status: may need to be updated. For example, if your files were initially labeled "preliminary", change them to "final" if further changes are unlikely.
    4. optionally, provide a verbose description of the updates: The SeaBASS team will add a short description in the comment section of your data files. If you wish to go into more detail about the updates, either write your own lengthier comments in the headers or within your updated documentation.
  3. FCHECK the files and fix any issues.
  4. Upload your resubmission to the SeaBASS SFTP using your SeaBASS submitter account and following the normal submission process instructions. 
  5. Email the SeaBASS team shortly after you transmit your resubmission to let us know you sent it, and to remind us of the scope and reason for your resubmission.
  6. Automated email receipt: If your resubmission contains at least one data file, SeaBASS's automated email receipt system will trigger and send you a message within 24 hours. It won't trigger if your resubmission consisted of only non-SeaBASS files such as documentation or calibration files. If you requested SeaBASS reuse existing documents, you may disregard any warnings about missing documents in the automated email receipt.
  7. The SeaBASS team will review the resubmission and email you when it is online. The old versions of files will be moved offline and replaced.
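
The release-number increment recommended in step 2 can be scripted when many files are resubmitted at once. This is a small sketch under the assumption that file names follow the recommended `_R#.sb` ending; `bump_release` is a hypothetical helper, not a SeaBASS utility.

```python
import re


def bump_release(file_name):
    """Increment the trailing _R# release number of a SeaBASS file name."""
    match = re.fullmatch(r"(.+_R)(\d+)(\.sb)", file_name)
    if match is None:
        raise ValueError(f"no _R# release suffix found in {file_name!r}")
    stem, release, suffix = match.groups()
    return f"{stem}{int(release) + 1}{suffix}"


print(bump_release("myfile_R1.sb"))  # prints myfile_R2.sb
```

Renaming a file this way does not touch its contents, so the “/data_file_name” header inside the file still needs to be updated to match.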

Data Format

The SeaBASS file format is a NASA Earth Science Data format approved by NASA ESDIS (Earth Science Data and Information Systems).

 

Please consider the following while preparing data sets:

[Figure: seabass_file_format.png]

 

 

Other requirements within the data matrix:

 

• Intermediate data: submit intermediate products that were calculated as part of another reported value. An important example is submitting absorbance (i.e. optical density) measurements in spectrophotometer files along with the calculated absorption coefficients.

 

• Replicates and uncertainty: Submit information about uncertainty and/or replicates whenever applicable, typically as columns of standard deviation (e.g., <field>_sd, like "chl_sd"), and bincount to indicate the number of averaged measurements (use <field>_bincount if multiple bincount columns are required.) Other forms of uncertainty reporting are accepted for cases where they are more appropriate than the standard deviation. Contact SeaBASS staff if you have other questions about preserving raw data or uncertainty.

 

Note: some measurement protocols for measurements of discrete water samples call for samples to be calculated from multiple scans or filters (such as for extracted Chl or QFT). SeaBASS reporting convention calls for such samples to be reported as a single row of data, along with the standard deviation, and bincount. Replicate measurements can and should be reported separately, but are defined and assumed to have been created from their own set of multiple averaged values.

 

• Level of data processing: Generally speaking, data should be calibrated, depth-adjusted (i.e. adjust depths based on any differences between sensors and the pressure transducer on the package), unbinned*, with QAQC applied (i.e. bad data filtered out). Data submissions should be accompanied by the relevant calibration files and a description of the processing and analysis that went into producing output (see documentation requirements below).

 

*We recognize there might be situations where binning or other differences are necessary or appropriate. Please contact the SeaBASS administrator with questions.
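
To illustrate the reporting convention for replicates and uncertainty described above, the sketch below collapses repeated readings of one discrete sample into a single row with a mean, a standard deviation, and a bincount. It is only an illustration: `summarize_sample` is a hypothetical name, the readings are invented, and the use of the sample standard deviation (n - 1 denominator) is an assumption; state whichever estimator you use in your documentation.

```python
import statistics


def summarize_sample(values):
    """Collapse repeated readings of one sample into (mean, sd, bincount).

    statistics.stdev computes the sample standard deviation and raises
    StatisticsError when fewer than two readings are supplied.
    """
    return statistics.mean(values), statistics.stdev(values), len(values)


# Invented readings from three scans of the same sample.
readings = [1.0, 2.0, 3.0]
chl, chl_sd, bincount = summarize_sample(readings)
print(f"{chl:.3f},{chl_sd:.3f},{bincount}")  # prints 2.000,1.000,3
```

The three values would populate columns such as chl, chl_sd, and bincount in one data row.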

File Naming Guidelines

  1. File names must not contain spaces or special characters except for hyphens, underscores, and periods.
  2. SeaBASS file names must end in ".sb" suffix. 
  3. File names (before the suffix) are recommended to end in _R#, where # is the release number starting with 1 in most cases (e.g., myfile_R1.sb).
    1. If preliminary files are submitted (i.e., /data_status=preliminary such that it is likely they will be revised in the future) then it is recommended their name includes "_R0" to indicate their tentative status
  4. File names must be unique within a submission, and ideally should be completely unique in SeaBASS. It is strongly recommended they are formed using descriptive patterns incorporating information or abbreviations of the measurement type, cruise name, date, depth or other information. Using a file naming pattern like <EXPERIMENT>-<CRUISE>-<DATATYPE>_<YYYYMMDDHHMM>_<RELEASE#>.sb has the benefits of generating unique file names that sort nicely within a directory, and also allows users to quickly understand their general contents at a glance. As hypothetical examples:

               naames-naames_3-hplc_2017091512_R0.sb (example of a preliminary file, noted by "_R0")

               naames-naames_3-hplc_2017091512_R1.sb (example of final release version, i.e., R1)

               naames-naames_3-hplc_2017091512_R2.sb (example of release #2 of a data file, i.e., it has been revised once after being submitted to SeaBASS)

               tara-azores_laurient_acs-apcp_inline_201204300101_R1.sb (another example of the first release/version of a data file)
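
The naming guidelines above can be checked locally before running FCHECK. The sketch below is not a replacement for FCHECK; `check_file_name` is a hypothetical helper, and it assumes that "no special characters" means only ASCII letters, digits, hyphens, underscores, and periods are permitted.

```python
import re

# Assumption: only ASCII letters and digits plus hyphen, underscore, and period.
ALLOWED = re.compile(r"[A-Za-z0-9._-]+")
RELEASE = re.compile(r"_R\d+\.sb$")


def check_file_name(name):
    """Return a list of problems found in a SeaBASS data file name."""
    problems = []
    if not ALLOWED.fullmatch(name):
        problems.append("contains spaces or special characters")
    if not name.endswith(".sb"):
        problems.append('does not end in ".sb"')
    elif not RELEASE.search(name):
        problems.append("recommended _R# release suffix is missing")
    return problems


print(check_file_name("naames-naames_3-hplc_2017091512_R1.sb"))  # prints []
print(check_file_name("my file.sb"))
# prints ['contains spaces or special characters', 'recommended _R# release suffix is missing']
```

The first two checks correspond to requirements, while the release suffix is only a recommendation, so a caller may prefer to treat the last message as a warning.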

Supporting Documentation and Ancillary Files

Supporting documentation is crucial for the preservation and accessibility of data. All data submissions must be accompanied by a readme file or a protocol document that describes where and how the data were collected, the instruments and methods used, and any post-processing or curation of the data. External supporting documentation (including cruise and instrument reports or logs) and calibration files must be included in the submission. Certain types of SeaBASS data submissions have special requirements; refer to the "Mandatory Special Submission Requirements" section below for measurement-specific instructions. The names of supporting documentation and calibration files must exactly match how they are listed in the appropriate headers in each data file. Names should not include spaces or special punctuation (only hyphens, underscores, and periods are allowed, as described in the File Naming Guidelines.) All supporting documentation and metadata should be listed in the following headers: documents, calibration_files, and associated_archives. Each of them is discussed further in the following subsections.

Documents

Use this header to include readme files, protocols, cruise reports, and the required checklist if applicable (see mandatory special submission requirements). In the submission, these named files will be stored in a documents directory.
 
Example:
/documents=Experiment_cruise_AC-s_protocol.pdf,Experiment_cruise_AC-s_checklist.txt

Calibration Files

Use this header to include all calibration and device files. In the submission, these named files will be stored within a documents directory.
 
Example:
/calibration_files=LuZ111_Calibration_Certificate.pdf,LuZ222_Calibration_Certificate.pdf

Associated Ancillary Files

In addition to documents and calibration files, some data submissions may require additional data to be submitted and linked to in the files. Some examples of associated metadata include raw imagery from instruments such as IFCB and UVP, raw sig files from flow cytometry measurements, sky photos and level-2 HDF files from above-water radiometry, FASTQ files from metagenomic data, among others.

SeaBASS uses its "associated" workflow for these circumstances that involve large data volumes and/or optional files. Associated metadata is compressed and stored in tar bundles (TGZ files), each less than 5 GB in size. The TGZ files will be listed under the header “associated_archives” and the “associated_archive_types” will indicate the type of associated file (e.g., raw, planktonic imagery, benthic). The header or field “associated_files” can be used to make the link between the SeaBASS files and the associated metadata.

Associated files are stored along with the dataset and can be downloaded when ordering the data (File Search users are given the choice to opt-in.) However, they are not searchable on the SeaBASS File Search by themselves. For certain data types, the inclusion of associated files may be mandatory. For example, IFCB datasets should include planktonic imagery as an associated archive. Unless specified in the mandatory special submission requirements, the data submitter should consult with the SeaBASS data manager before using the associated metadata headers and fields.  
 
When and how to use “associated_archives” header:
 

The associated_archives header is mandatory for all submissions with associated metadata or files. This header provides the file name(s) of external bundles of files. It is typically used as part of the process of storing unprocessed files or imagery used for the results presented in the SeaBASS (.sb) file, such as planktonic imagery for IFCB, sig files for FCM, and sky photos for AOP measurements. 

 

The associated files should be compressed into a tar bundle (.tgz). Each TGZ file should not exceed 5 GB in size; however, each submission can provide multiple TGZ files. Multiple files can be listed in the same way as documents: separated by commas, with no spaces in the file names or in the list. The file name of the associated archive should include the experiment name, cruise name, and data type, and must end with “_associated.tgz”.

 

/associated_archives=Experiment_cruise_IFCB_202210-202212_associated.tgz,Experiment_cruise_IFCB_202310-202312_associated.tgz 

 

The associated_archive_types header is mandatory for all submissions with associated metadata or files and should accompany the associated_archives header. This header should describe each of the associated archive TGZ files.

 

Some of the valid entries for this header are: 

  • DNA-FASTQ 
  • DNA-FASTA (for files relating to DNA analysis) 
  • benthic 
  • planktonic (for accompanying scientific imagery files) 
  • raw (minimally processed, higher resolution data versions) 
  • unbinned (Processed or QA/QC CTD data not binned) 
  • metadata (Additional metadata information) 

Please contact the SeaBASS staff if your data do not fit into one of these types. 

/associated_archive_types=planktonic,planktonic 

 

When and how to use “associated_files” header or field:

 

The “associated_files” header or field can be used to link to specific files within the compressed tar bundle (TGZ; the associated_archives bundle). If one link is relevant for the entire SeaBASS file, use associated_files as a header. If the links differ by data row within the data matrix, use associated_files as a field. Please refer to the examples below for the format when using it as a header or a field. Note that the examples below only illustrate the format; the data are not real.

 
Example using the associated_files header:
/associated_archives=Experiment_cruise_IFCB_202210-202212_associated.tgz
/associated_archive_types=planktonic_imagery
/associated_files=Experiment_cruise_IFCB_image1.png,Experiment_cruise_IFCB_image2.png
/associated_file_types=planktonic_imagery,planktonic_imagery
 
Example using the associated_files field:
/associated_archives=Experiment_cruise_CTD_raw_associated.tgz
/associated_archive_types=raw
/fields=station,lat,lon,date,time,depth,wt,associated_files,associated_file_types
/units=none,degrees,degrees,yyyymmdd,hh:mm:ss,m,degreesC,none,none
station_40,31.98,-88.67,20210910,03:34:45,30,27.5,CTD_file40a.txt|CTD_file20b.txt,raw|raw
 
Note that in the case of multiple associated files per row, use the pipe “|” to separate the files and file types.
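
For readers of such files, the pipe convention in the field example above can be unpacked as follows. This is a sketch that assumes comma-delimited data rows, as in the example; `associated_links` is a hypothetical name.

```python
fields = "station,lat,lon,date,time,depth,wt,associated_files,associated_file_types".split(",")
row = "station_40,31.98,-88.67,20210910,03:34:45,30,27.5,CTD_file40a.txt|CTD_file20b.txt,raw|raw"


def associated_links(fields, row):
    """Pair each associated file named in a data row with its file type."""
    record = dict(zip(fields, row.split(",")))
    files = record["associated_files"].split("|")
    types = record["associated_file_types"].split("|")
    if len(files) != len(types):
        raise ValueError("associated_files and associated_file_types differ in length")
    return list(zip(files, types))


print(associated_links(fields, row))
# prints [('CTD_file40a.txt', 'raw'), ('CTD_file20b.txt', 'raw')]
```

The length check reflects the expectation that every listed file has a corresponding type entry in the same position.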
 
How to submit associated ancillary files:
 
The submitter can compress the files using any method, such as TGZ, tar.gz, or ZIP. The files can also be uploaded individually; however, make sure they are within a folder that clearly denotes that they are associated files. The SeaBASS data manager will create the final TGZ file.
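
For submitters who choose to pre-compress their associated files, one possible approach uses Python's standard tarfile module. This is an optional sketch, not a required workflow: `bundle_associated` is a hypothetical helper, and the size limit is interpreted here as 5 x 10^9 bytes, which is an assumption.

```python
import tarfile
from pathlib import Path

MAX_BYTES = 5 * 1000**3  # assumption: the 5 GB limit read as 5 x 10^9 bytes


def bundle_associated(source_dir, archive_path):
    """Pack every file under source_dir into a gzip-compressed tar bundle.

    Write the archive outside source_dir so it is not added to itself.
    Returns the size of the finished archive in bytes.
    """
    archive_path = Path(archive_path)
    if not archive_path.name.endswith("_associated.tgz"):
        raise ValueError('archive name must end with "_associated.tgz"')
    with tarfile.open(archive_path, "w:gz") as bundle:
        for path in sorted(Path(source_dir).rglob("*")):
            if path.is_file():
                # Store paths relative to source_dir inside the archive.
                bundle.add(path, arcname=path.relative_to(source_dir).as_posix())
    size = archive_path.stat().st_size
    if size >= MAX_BYTES:
        raise ValueError(f"{archive_path.name} is {size} bytes; split it into smaller bundles")
    return size


# Example call with hypothetical paths:
# bundle_associated("ifcb_images", "Experiment_cruise_IFCB_202210-202212_associated.tgz")
```

Because the SeaBASS data manager creates the final TGZ file, a bundle made this way is a convenience for transfer rather than the archived product.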

Mandatory Special Submission Requirements

When preparing a submission, check for your measurement types in the list on the Data Submission Special Requirements page. Certain types of SeaBASS data submissions have special requirements. For example, some data files need conditionally required metadata headers and some submissions require extra "checklists" as part of external documents. More measurement types are being added to that list as oceanographic community-based protocols evolve.
 

On that page, you will see if the submission of your measurement type includes

  1. Any required extra documents. These checklists are designed to standardize and preserve critical methods and analysis details that are needed for intercomparison and reprocessing, and to make it easier for data users to assess the data quality and to consider the data for satellite validation or inclusion in algorithm development datasets.
  2. Any special notes that highlight any necessary measurement-specific metadata (e.g., conditionally required headers), fields, or formatting.
  3. Any example submission information containing example data files and documentation bundles to use as a reference. 

Format Checking and Submission

  • Reminder: It is required you scan your data files for any format problems using FCHECK before uploading your submission to SeaBASS.
  • Data submissions (including data files, calibration files, and documentation) are uploaded to SeaBASS via SFTP (SSH File Transfer Protocol).
  • Contact the SeaBASS team if you need a submitter account. For instructions on this one-time process see the Setting up SFTP Access documentation.
  • When ready to connect, use any SFTP client of your choice. Inputs include your username and the SSH key pair you linked to your account:
  1. Use SFTP software of your choice to connect to the following link using your assigned SeaBASS SFTP username:
    yourusername@samoa.gsfc.nasa.gov:/yourusername
    (Substitute your personal username where that link says "yourusername". Your client will also need to be pointed to where you saved your SSH keys.)
  2. Upon connecting, you will find two directories on the SeaBASS SFTP server: data_submission and FCHECK.

    •    The data_submission directory is where submissions must be uploaded (including data and documentation.)
      • Create subdirectories to organize data by project (i.e. experiment or cruise).
      • Per project grouping, please create a folder named "documents" to contain all supporting documentation.
      • (If applicable) Per project grouping, please create a folder named "associated" if your submissions involve any special associated files referenced in the metadata of your SeaBASS data files.
    • The FCHECK directory may be used to scan a batch of SeaBASS files for formatting problems. For details see the FCHECK documentation.
      •    Note: files uploaded here don't count as a submission. Submissions must be placed in data_submission/
  3. Within 24 hours after submitting files, an automated receipt will be emailed to the contact listed in the files' metadata header. If you do not receive a receipt, please contact the SeaBASS Administrator.
  4. SeaBASS administrators will collect the files and evaluate the data set, contacting the submitters with any questions about the data or documentation.
  5. Once the data are archived, SeaBASS administrators will update the new data page and contact the data submitters with a final confirmation.
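
The directory layout described in step 2 can be planned before connecting. The sketch below only computes remote paths and performs no transfer; `plan_upload` is a hypothetical helper, and placing data files directly inside the project folder is an assumption, since only the "documents" and "associated" folder names are specified above.

```python
from pathlib import PurePosixPath


def plan_upload(project, data_files, documents, associated=()):
    """Return the remote path for each file in a submission.

    documents should include calibration files, which are stored alongside
    the other supporting documentation.
    """
    root = PurePosixPath("data_submission") / project
    plan = [str(root / name) for name in data_files]
    plan += [str(root / "documents" / name) for name in documents]
    plan += [str(root / "associated" / name) for name in associated]
    return plan


# Hypothetical project and file names.
for remote_path in plan_upload(
    "naames",
    ["naames-naames_3-hplc_2017091512_R1.sb"],
    ["cruise_report.pdf"],
):
    print(remote_path)
# prints data_submission/naames/naames-naames_3-hplc_2017091512_R1.sb
# prints data_submission/naames/documents/cruise_report.pdf
```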

Setting up SFTP Access

You must register a SeaBASS SFTP account to submit data. This is a one-time step, and you will be able to use this account for all future submissions. You may follow these same instructions if you already have a submitter account but need to upgrade or add a new SSH key. To register, email the SeaBASS team and include the following:

  1. Attach your public RSA SSH key
    1.    Do attach the file name ending in ".pub" (it commonly has a name like "id_rsa.pub").
    2.    Don't email us its partnering private key (it commonly has the same name without .pub, e.g., "id_rsa").
    3.    If you want to submit data from multiple computers, you are allowed to register multiple public keys and request that they all be linked to your SeaBASS SFTP account.
  2. Your first and last name
  3. Your affiliation/institution name
  4. Your email address that will be linked to your SFTP account
  5. If you are not the person who already contacted the SeaBASS team to describe your project and upcoming data submissions, please include a reminder of your connection (e.g., you work in the same lab, etc.)

Generating an SSH key

Your SFTP account will be secured using an SSH key. If you have never used SSH keys before, you will need to create one by running a command as explained below. An SSH key consists of two unique strings which will be saved in small plain text files. One file is public (usually its name ends in ".pub"), and the other is private (never share this one). Creating your key is a one-time process if you keep its two parts (files) stored safely on your computer. 

 

Note that SSH keys come in different types. Previously, SeaBASS accepted several types of SSH keys. Since May 2023, SeaBASS only accepts 4096-bit RSA SSH keys. All previously submitted SSH keys must be updated to comply with the new NASA SSH key requirements. To update your SSH key, please follow the instructions below and email the SeaBASS team your new 4096-bit RSA public key along with your SeaBASS submitter username (usually the first letter of your first name followed by your last name; for example, John Smith's username would be jsmith).

 

Please note that SSH keys are computer specific, therefore an SSH key must be generated for each computer used to upload data to the SeaBASS SFTP.  

 

 

How to create an SSH key from the terminal (Mac, Linux, and newer versions of Windows):

 

Run ssh-keygen using the terminal or command prompt:

ssh-keygen -t rsa -b 4096 
  1. You might be prompted to customize the file name or to create a passphrase. You may press Enter to accept the default values (i.e., no passphrase).
  2. By default, the key will be created in your home directory under the ".ssh" directory (note that the dot in the .ssh folder name might cause it to be hidden, depending on your system settings)
    1. Example default location for the RSA public key on Linux (on Mac, the home directory is typically /Users/USERNAME instead):
      /home/USERNAME/.ssh/id_rsa.pub
    2. Example default location on Windows:
      C:\Users\USERNAME\.ssh\id_rsa.pub
  3. Keep your keys safe: Try not to delete or overwrite your keys in the future. Also, never share or email the private key.
  4. Return to the Setting up SFTP Access instructions to learn what other information to email to the SeaBASS Team. 
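
Before emailing a public key, it can be worth confirming that it really is a 4096-bit RSA key. The sketch below decodes the OpenSSH public key line (key type, base64 data, optional comment) using only the Python standard library; the function names are hypothetical and the script is not a SeaBASS tool.

```python
import base64
import struct


def read_fields(blob):
    """Split an SSH wire-format blob into its length-prefixed fields."""
    fields, offset = [], 0
    while offset < len(blob):
        (length,) = struct.unpack(">I", blob[offset:offset + 4])
        offset += 4
        fields.append(blob[offset:offset + length])
        offset += length
    return fields


def rsa_key_bits(public_key_line):
    """Return the modulus size in bits of an OpenSSH RSA public key line."""
    key_type, encoded = public_key_line.split()[:2]
    if key_type != "ssh-rsa":
        raise ValueError(f"expected an ssh-rsa key, found {key_type}")
    # An ssh-rsa blob holds three fields: key type name, public exponent, modulus.
    name, exponent, modulus = read_fields(base64.b64decode(encoded))
    return int.from_bytes(modulus, "big").bit_length()


# Example usage with the default key location:
# from pathlib import Path
# print(rsa_key_bits((Path.home() / ".ssh" / "id_rsa.pub").read_text()))
```

The same information is reported by running ssh-keygen -l -f on the public key file, which prints the key size followed by its fingerprint.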

How to generate SSH keys on older versions of Windows or other systems without ssh-keygen:

 

Older versions of Windows are unable to generate SSH keys from the command line but can accomplish similar results using 3rd party software. Some suggestions and reminders are provided below, although it is beyond the scope of this guide to provide detailed instructions for every possible configuration. Suggestions:

 

  • This page offers one example of how to generate an SSH key on Windows via an open-source software solution.
  • Remember to specify the type; the SSH key must be a 4096-bit RSA key.
  • If the software doesn't automatically save the key as a text file, then use a text editor to save the public and private keys manually
    • Copy the entire public RSA SSH key string into a text file called "id_rsa.pub"
    • If needed, also manually save the private key as "id_rsa"
  • Save or move both those files to a folder on your computer where you can refer to them later and they won't be accidentally deleted. Your SFTP software will need them whenever it is time to connect.
Last edited by Chris Proctor on 2023-09-14
Created by anonymous on 2012-05-23