Submission Instructions

This page provides instructions on how to submit data to SeaBASS as well as information about the SeaBASS File Format. If you are new to this process, please scroll down the page to the "How to Submit" section and review the steps involved. More details can be found in the other sections on this page and beneath the other topics under "Contribute Data" found in the main menu of the SeaBASS website.
 
The SeaBASS data format and structure were designed with the following in mind: To account for the continuous growth of the bio-optical data set and the wide variety of supported data types, the NASA Ocean Biology Processing Group felt it essential to develop efficient data ingestion and storage techniques. While this requires a specific data file format, the data protocols were designed to be as straightforward and effortless as possible on the part of the contributor, while still offering a useful format for internal efforts. The system was intended to meet the following conditions: simple data format, easily expandable and flexible enough to accommodate large data sets; global portability across multiple computer platforms; and web accessible data holdings with sufficient security to limit access to authorized users.


Data Submission Policy

All data collected under the auspices of the NASA Ocean Biology and Biogeochemistry (OB&B) program are to be submitted to SeaBASS within 1 year of the date of collection. For further details, please review the SeaBASS Access Policy.

 

How to Submit (Overview Steps)

New submitters should read this section for an overview of how to submit data to SeaBASS. The majority of the process involves organizing your data using the SeaBASS file format. SeaBASS files are text files beginning with a block of metadata information followed by a data matrix. Some metadata headers and keywords are mandatory, and others are optional or conditional. All data columns must be labeled using standardized combinations of field names and units. In addition to data files, your submission must be accompanied by external documentation that describes collection and processing in more detail, plus any relevant calibration files and details.

NASA-funded investigators are responsible for converting their data into the SeaBASS file format. The SeaBASS team might be able to assist in reformatting voluntary data submissions, especially for in-demand measurements needed for satellite validation, but it will require discussion on a case-by-case basis.
 
The numbered list below outlines the steps involved in submission for someone new to SeaBASS. These steps include brief explanations, with more information found elsewhere on this page and under Contribute Data in the main menu. Feel free to email the SeaBASS team if you have questions or concerns at any point in this process.
  1. Learn about the SeaBASS data format and other submission requirements listed further below on this page.
    1. For a quick idea of what the SeaBASS file format looks like, go to the Metadata Headers page which begins with a relatively simple example.
    2. Consult the list of measurement types on the Data Submission Special Requirements page to see if your measurement types are there. If so, review any relevant examples and info about which fields and metadata to use.
  2. Contact SeaBASS staff via email to plan out your submission before you begin formatting data. SeaBASS staff will respond to your inquiry, typically within 1-2 business days. For routine submissions, you will then be able to proceed with the steps below. For more complex scenarios and data types that are new to SeaBASS, our data managers will require more information and discussion of how to best organize and accommodate your data and documentation. When you email the SeaBASS Staff to introduce yourself, include:
    1. Your first and last name, and what institution you are affiliated with.
    2. Please indicate if your project was NASA-funded, or if your submission is voluntary.
    3. If you are not the PI or person who secured funding for the project within your laboratory or organization, please indicate that person's name and how you are connected.
    4. Indicate the project name and deployment or cruise names. In SeaBASS data are categorized under a new or existing experiment and cruise name (see Lists in the main menu), which should be coordinated if there were co-investigators.
    5. Briefly explain the measurement types of the data you wish to submit.
  3. This is a good time to register a SeaBASS submitter account by following the instructions further down this page. The process requires generating SSH keys and emailing the public key to SeaBASS staff. You can postpone this step, but note that it is required for several of the methods of running FCHECK.
  4. Create a draft SeaBASS file.
    1. Use the automated File Checker (FCHECK) to scan your file for format-compatibility problems. Run FCHECK, fix any problems, and repeat as necessary.
    2. A suggestion is to start with just one file for practice before creating and checking all your other files.
    3. Pick thoughtful file names for your data files. Avoid using spaces and most special characters. Full guidelines are found further down this page.
  5. Prepare and gather all relevant external documentation and calibration information (e.g., cruise reports, methods descriptions, calibration files, and/or fill out any SeaBASS-provided templates). Generally these sorts of files may be provided in any format (e.g., text, PDF, DOC, etc.) Tips:
    1. Make sure the names listed in the metadata /documents and /calibration_files exactly match the corresponding file names.
    2. Do not use FCHECK to scan documentation. FCHECK is only needed for data files.
    3. Again, consult the Data Submission Special Requirements page and, if your data type is there, ensure you follow the guidelines and include required documentation and checklists.
  6. If you haven't already, register a SeaBASS submitter account by following the instructions further down this page.
  7. Use your account to upload your data submission, including all documentation. If your submission includes multiple types of measurements, it is generally requested that you submit them all simultaneously as a complete package, not piecemeal.
    1. After you upload your submission, you will receive an automated email receipt within 24 hours (sent to the submitter's account plus addresses in the "contacts" metadata). Please contact us if you don't receive this confirmation within 24 hours.
    2. If your receipt email contains a section about MISSING documents and calibration files: These files were named in your metadata headers but could not be found by our automated system anywhere within your data_submission folder. The most common reasons are: 1) you forgot to upload the files, or 2) the spelling or capitalization of the documentation files does not exactly match. Please log back in and upload the relevant docs and cals. The one exception to this is if you are supplementing a previous submission and referencing documentation that is already archived for the particular cruise you are adding data files to (also see the “resubmission” instructions on this page). In that case, you may disregard the MISSING message, unless the SeaBASS administrators contact you for clarification.

    3. Next, the SeaBASS team will prepare to archive your data. This involves manually reviewing your submission for completeness and performing additional QC checks. You will be emailed if revisions or more information are needed.
    4. The SeaBASS team will add a Digital Object Identifier (DOI) to your data files corresponding to the "experiment".  If the experiment is new to SeaBASS, the DOI will be reserved. New DOIs are typically formally registered on a quarterly basis. The SeaBASS team also adds the special date "received" header.
    5. You will be emailed when the data archiving is complete.
              
 

How to Resubmit (i.e. Update a Previous Submission)

SeaBASS encourages resubmissions to ensure that the best quality versions of data are available online. In the event that you need to update some or all of your data or documentation for a past submission, please follow these steps:

  1. Email the SeaBASS team ahead of time to discuss what you want to resubmit. We will evaluate the best way to update your files on the back end of SeaBASS and reply to you about how to proceed. We might need different things from you if only a small number of files are affected on a 1:1 basis versus a large or complex resubmission.
    1. Briefly explain the reason for the resubmission. If possible, also provide a succinct description (one line or less) to explain the issue or what changed in the newer version of your data files. The SeaBASS team will use the short description to update a special metadata field (i.e., to preserve release/version history, the previous “/received” info will be kept as a comment with the short explanation appended.)
    2. Explain the scope of what files or documents need to be replaced or removed relative to the number or types of files that are currently archived.
    3. Upon request we will try to accommodate reusing some or all of your existing documents or calibration files, if they do not need to change.
  2. Update your data files (and any relevant documents) in preparation to resubmit them. Here are a few special reminders for updating SeaBASS files and metadata:

    1. File names: Remember to update your file names and the corresponding “/data_file_name” metadata header, if relevant. It is recommended that SeaBASS file names end in a release number (i.e., version number), and you should increment this integer to indicate resubmissions. For example, change myfile_R1.sb to myfile_R2.sb.
    2. /documents and /calibration_files: If these file names are different than the previous version, remember to change the metadata headers.
    3. /data_status: may need to be updated. For example, if your files were initially labeled "preliminary", change them to "final" if further changes are unlikely.
    4. optionally, provide a verbose description of the updates: The SeaBASS team will add the very short description you provided in 1.a as a special comment in your data files. If you wish to go into more detail about the updates, either write your own lengthier comments in the headers or within your updated documentation.
  3. Transmit your resubmission using your SeaBASS submitter account, following the normal submission process instructions. Before sending, remember to scan your updated files with FCHECK and fix any issues. Please wait until after a SeaBASS staff member has acknowledged your resubmission (see #1 above) before proceeding with this step.
  4. Email the whole SeaBASS team shortly after you transmit your resubmission to let us know you sent it, and to remind us of the scope and reason of your resubmission. If possible, send this email as a reply-all to any existing email conversation (i.e., from #1 above) to retain such information.
  5. Automated email receipt: If your resubmission contains at least one data file, SeaBASS's automated email receipt system will trigger and send you a message within 24 hours. It won't trigger if your resubmission consisted of only non-SeaBASS files such as documentation or calibration files. If you requested SeaBASS reuse existing documents, you may disregard any warnings about missing documents in the automated email receipt.
  6. The SeaBASS team will review the resubmission and email you when it is online. The old versions of files will be moved offline and replaced.
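
The release-number convention from step 2 (changing myfile_R1.sb to myfile_R2.sb) can be automated when many files are being resubmitted. The following is a minimal Python sketch, not a SeaBASS tool; the function name bump_release is hypothetical, and it assumes the file name already ends in _R# immediately before the .sb suffix.

```python
import re

# Matches a trailing release tag such as "_R1" just before the ".sb" suffix.
_RELEASE_RE = re.compile(r"^(?P<stem>.+)_R(?P<release>\d+)\.sb$")


def bump_release(file_name: str) -> str:
    """Return file_name with its trailing _R# release number incremented by one."""
    match = _RELEASE_RE.match(file_name)
    if match is None:
        raise ValueError(f"no _R# release tag before .sb in {file_name!r}")
    next_release = int(match.group("release")) + 1
    return f"{match.group('stem')}_R{next_release}.sb"
```

Remember that the "/data_file_name" metadata header inside each file, if present, must be updated to match the new name; this sketch only computes the name.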

 

 

Data Format

The SeaBASS file format is a NASA Earth Science Data format approved by NASA ESDIS (Earth Science Data and Information Systems).

 

Please consider the following while preparing data sets:

  • SeaBASS data files are flat, two-dimensional ASCII text files
  • Information is organized into two sections: metadata headers and data
    • The metadata headers section is first, and consists of keywords and values
    • The data section follows, and consists of columns of data values
  • The following graphic presents a simplified overview:

seabass_file_format.png

 

  • Data are presented as a matrix of values, much like a spreadsheet.
  • One delimiter separates columns of data. Comma is recommended. Space or tab are accepted.
    •    Do not mix delimiters. Use only the delimiter spelled out in the relevant metadata header (e.g., /delimiter=comma)
  • Data are preceded by a series of predefined metadata headers.
  • The headers provide descriptive information about the data file, e.g., cruise name, date, and cloud cover.
  • Some headers are always mandatory, and others are optional. Some are required only for specific types of measurements.
  • All SeaBASS field names and units (e.g., CHL, AOT) have been standardized.
    • Tip: there is a search bar on the Fields page
  • SeaBASS field names and units are not case sensitive.
  • Headers may be arranged in any order provided that the first and last are /begin_header and /end_header.
  • Most headers are required. Use a value of NA (not applicable) if information is unavailable.
  • Within the data matrix, use a numeric value such as -9999, for missing or screened data (not NA or NaN.) Define this value in the metadata headers, e.g., /missing=-9999
    •    If applicable, distinct values should be defined for measurements that were outside detection limits, e.g.,  /below_detection_limit=-8888 or /above_detection_limit=-7777
  • List latitude in decimal degrees, with coordinates north of the equator positive and south negative.
  • List longitude in decimal degrees, with coordinates east of the Prime Meridian positive and west negative.
  • List times in GMT (UTC).
    • Acceptable combinations of time to be reported in the data matrix of the file are: date/time, year/month/day/hour/minute/second, year/month/day/time, or date/hour/minute/second.
    • Year/sdy/hour/minute/second and year/sdy/time are also supported but not encouraged.
    • If precision to the nearest second was not measured, please report seconds as top of the minute (00).
  • Header entries for date, time, and location are the extreme values for the file (e.g., farthest north, the date and time of the first measurement, etc.) These extremes are identical in cases when a constant value applies to all data.
  • Only the time and location headers require bracketed ([]) units. No other headers should include brackets.
  • Headers should not include any white space. Separate words with an underscore. The only exception is for comment lines (beginning with !) which are allowed to contain spaces.
  • When selecting experiment & cruise names for your metadata headers, consider if your submission is part of an existing project or time series. If so, try to use consistent names. Otherwise, you are welcome to pick or suggest a new unique name (25 characters or fewer). The experiment name is especially important for grouping data sets because it becomes part of the assigned Digital Object Identifier (DOI).
 
  • All data measurements must include the following metadata information: date, time, latitude, longitude, depth. However, a column may be omitted if its value was constant for every row of data. If omitted, it is implied that the value from the corresponding metadata header (which is always required) applies to all rows of data.
  • For example, if the columns "lat" and "lon" are not present in /fields, then it is implied there is only one latitude value reported in the header (/north_latitude and /south_latitude are the same) as well as for longitude (/east_longitude, /west_longitude are the same). The constant value for each applies to every row of data
  • If the values were not constant, then columns must be included.
  • For example, if lat and lon are in /fields, then the /north_latitude header contains the northern-most latitude measurement from the data file (and the same logic applies for /south_latitude, /east_longitude, and /west_longitude)
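
Two of the rules above lend themselves to a short illustration: deriving the location headers from the extreme values in the data, and writing the missing-value code in place of absent measurements. The following Python sketch is illustrative only. The function names are hypothetical, the bracketed unit string [DEG] is an assumption (confirm it on the Metadata Headers page), and taking the maximum and minimum longitude assumes the data do not cross the 180° meridian.

```python
import math
from typing import List, Optional

MISSING = -9999  # must agree with the /missing header of the file


def location_headers(lats: List[float], lons: List[float]) -> List[str]:
    """Build the four location headers from the extreme coordinates in the data.

    North and east are positive, so the northernmost and easternmost points
    are the maxima. With a single station the paired headers come out equal.
    """
    return [
        f"/north_latitude={max(lats)}[DEG]",
        f"/south_latitude={min(lats)}[DEG]",
        f"/east_longitude={max(lons)}[DEG]",
        f"/west_longitude={min(lons)}[DEG]",
    ]


def format_row(values: List[Optional[float]]) -> str:
    """Join one row with commas, replacing None and NaN with the missing value."""
    cells = []
    for value in values:
        if value is None or math.isnan(value):
            cells.append(str(MISSING))
        else:
            cells.append(str(value))
    return ",".join(cells)
```

A comma is used as the delimiter here because it is the recommended choice; the file would then declare /delimiter=comma.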

 

Other requirements within the data matrix:

° Intermediate data: submit intermediate products that were calculated as part of another reported value. An important example is submitting absorbance (i.e. optical density) measurements in spectrophotometer files along with the calculated absorption coefficients.

 

° Replicates and uncertainty: Submit information about uncertainty and/or replicates whenever applicable, typically as columns of standard deviation (e.g., <field>_sd, like "chl_sd"), and bincount to indicate the number of averaged measurements (use <field>_bincount if multiple bincount columns are required.) Other forms of uncertainty reporting are accepted for cases where they are more appropriate than standard deviation. Contact SeaBASS staff if you have other questions about preserving raw data or uncertainty.

 

Note: some measurement protocols for measurements of discrete water samples call for samples to be calculated from multiple scans or filters (such as for extracted Chl or QFT). SeaBASS reporting convention calls for such samples to be reported as a single row of data, along with the standard deviation, and bincount. Replicate measurements can and should be reported separately, but are defined and assumed to have been created from their own set of multiple averaged values.

 

° Level of data processing: Generally speaking, data should be calibrated, depth-adjusted (i.e. adjust depths based on any differences between sensors and the pressure transducer on the package), unbinned*, with QAQC applied (i.e. bad data filtered out). Data submissions should be accompanied by the relevant calibration files and a description of the processing and analysis that went into producing output (see documentation requirements below).

 

*We recognize there might be situations where binning or other differences are necessary or appropriate. Please contact the SeaBASS administrator with questions.
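
As an illustration of the replicate and uncertainty convention described above, the following Python sketch collapses several scans or filters of one sample into the values for a single row: the mean, the standard deviation (for a column such as chl_sd), and the bincount. It is only a sketch under stated assumptions: the function name is hypothetical, the sample (n-1) standard deviation is an assumption rather than a SeaBASS requirement, and writing the missing value when only one measurement exists is likewise an assumption.

```python
import statistics
from typing import List, Tuple

MISSING = -9999  # assumed to match the /missing header of the file


def summarize_replicates(values: List[float]) -> Tuple[float, float, int]:
    """Return (mean, standard deviation, bincount) for one reported row."""
    if not values:
        raise ValueError("at least one measurement is required")
    bincount = len(values)
    mean = statistics.fmean(values)
    # Sample standard deviation needs two or more values; otherwise report missing.
    sd = statistics.stdev(values) if bincount > 1 else float(MISSING)
    return mean, sd, bincount
```

State in your documentation which form of standard deviation you actually used.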

 

File Naming Guidelines

  1. File names must not contain spaces or special characters except for hyphens, underscores, and periods.
  2. SeaBASS file names must end in ".sb" suffix. 
  3. File names (before the suffix) are recommended to end in _R#, where # is the release number starting with 1 in most cases (e.g., myfile_R1.sb).
    1. If preliminary files are submitted (i.e., /data_status=preliminary such that it is likely they will be revised in the future) then it is recommended their name includes "_R0" to indicate their tentative status
  4. File names must be unique within a submission, and ideally should be completely unique in SeaBASS. It is strongly recommended they are formed using descriptive patterns incorporating information or abbreviations of the measurement type, cruise name, date, depth or other information. Using a file naming pattern like <EXPERIMENT>-<CRUISE>-<DATATYPE>_<YYYYMMDDHHMM>_<RELEASE#>.sb has the benefits of generating unique file names that sort nicely within a directory, and also allows users to quickly understand their general contents at a glance. As hypothetical examples:

               naames-naames_3-hplc_2017091512_R0.sb (example of a preliminary file, noted by "_R0")

               naames-naames_3-hplc_2017091512_R1.sb (example of final release version, i.e., R1)

               naames-naames_3-hplc_2017091512_R2.sb (example of release #2 of a data file, i.e., it has been revised once after being submitted to SeaBASS)

               tara-azores_laurient_acs-apcp_inline_201204300101_R1.sb (another example of the first release/version of a data file)
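
The naming guidelines above can be checked locally before running FCHECK. The following Python sketch is not part of FCHECK; the function name and messages are hypothetical, and it assumes that "no special characters" means only ASCII letters, digits, hyphens, underscores, and periods are allowed.

```python
import re
from typing import List

# Letters, digits, hyphens, underscores, and periods only (assumed reading of guideline 1).
_ALLOWED_CHARS_RE = re.compile(r"^[A-Za-z0-9._-]+$")
# Recommended release tag directly before the suffix, e.g. "_R1.sb".
_RELEASE_SUFFIX_RE = re.compile(r"_R\d+\.sb$")


def check_file_name(name: str) -> List[str]:
    """Return a list of guideline problems found in name (empty if none)."""
    problems = []
    if not _ALLOWED_CHARS_RE.match(name):
        problems.append("contains spaces or special characters")
    if not name.endswith(".sb"):
        problems.append('does not end in ".sb"')
    elif not _RELEASE_SUFFIX_RE.search(name):
        problems.append("recommended _R# release tag is missing")
    return problems
```

The first two checks correspond to requirements; the release tag is only a recommendation, so treat that message as a warning.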

 

Documentation

The following requirements must be met:

  • External supporting documentation (including cruise and instrument reports or logs) and calibration files must be included in the submission.
  • List them via the metadata headers /documents=, /calibration_files=, and optionally /associated_files=. Multiple file names may be listed; separate them by commas, without spaces.
  • The names of supporting documentation and calibration files must match how they are listed in the appropriate headers in each data file. Names should not include spaces or special punctuation (only underscores and hyphens are allowed.)
  • Refer to the "Data Submission Special Requirements" page described below for measurement-specific instructions.
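
Because mismatched names are a common cause of MISSING warnings in the email receipt, it can help to compare the names listed in your headers against the files you are about to upload. The following Python sketch is a rough local check, not the automated system SeaBASS uses; the function names are hypothetical, it reads only the /documents= and /calibration_files= headers, and it assumes NA entries should be skipped.

```python
from pathlib import Path
from typing import List

_LISTING_HEADERS = ("/documents=", "/calibration_files=")


def listed_support_files(header_lines: List[str]) -> List[str]:
    """Collect the comma-separated file names from the listing headers."""
    names = []
    for line in header_lines:
        line = line.strip()
        for prefix in _LISTING_HEADERS:
            if line.startswith(prefix):
                value = line[len(prefix):]
                names.extend(n for n in value.split(",") if n and n != "NA")
    return names


def missing_support_files(header_lines: List[str], submission_dir: Path) -> List[str]:
    """Return listed names with no exact (case-sensitive) match under submission_dir."""
    present = {p.name for p in submission_dir.rglob("*") if p.is_file()}
    return [n for n in listed_support_files(header_lines) if n not in present]
```

The search is recursive, so documents placed in a subdirectory of the submission folder are still found.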

 

Mandatory Special Submission Requirements

When preparing a submission, check for your measurement types in the list on the Data Submission Special Requirements page. Certain types of SeaBASS data submissions have special requirements. For example, some data files need conditionally required metadata headers, and some submissions require extra "checklists" as part of external documents. More measurement types are being added to that list as oceanographic community-based protocols evolve.
 

On that page you will see if the submission of your measurement type includes

 

  1. Any required extra documents. These checklists are designed to standardize and preserve critical methods and analysis details that are needed for intercomparison and reprocessing, and to make it easier for data users to assess the data quality and to consider the data for satellite validation or inclusion in algorithm development datasets.
  2. Any special notes that highlight any necessary measurement-specific metadata (e.g., conditionally required headers), fields, or formatting.
  3. Any example submission information containing example data files and documentation bundles to use as a reference.
 

Format Checking and Submission

  • The OBPG maintains feedback software, FCHECK, to evaluate the format of data files.
  • All data files should be tested using FCHECK prior to submission to SeaBASS.
  • Data files, calibration files, and documentation are submitted to SeaBASS via SFTP (SSH File Transfer Protocol).
  • The contributor should upload files using an SFTP client of choice. NASA does not endorse any particular SFTP client.
  • Note: a username and a SSH key pair are required to access the SFTP site. Contact the SeaBASS Administrator to establish access. For instructions on this process see the Setting up SFTP Access documentation.
  1. Once an SFTP account is established you will be assigned a username. Use SFTP software of your choice to connect to the following link:
    yourusername@samoa.gsfc.nasa.gov:/yourusername
    (Substitute your personal username where that link says "yourusername". No password is explicitly needed beyond your SSH keys.)
  2. Upon connecting, you will find two directories on the SeaBASS SFTP server: FCHECK and data_submission.
    • The FCHECK directory is a location where a batch of SeaBASS files may be bulk checked via FCHECK. For details see the FCHECK documentation.
    • The data_submission directory is to be used to upload and submit data and required documentation to the SeaBASS archive. Data MUST be placed in this directory to be considered as a submission to SeaBASS.
      • Subdirectories should be created within the data_submission folder to organize data by project (i.e., experiment or cruise). An additional layer of subdirectories should also be used within the project directories to contain all supporting documentation.
  3. Within 24 hours after submitting files, an automated receipt will be emailed to the contact listed in the files' metadata header. If you do not receive a receipt, please contact the SeaBASS Administrator.
  4. The SeaBASS Administrator will collect the files and evaluate the data set, contacting the submitters with any questions about the data or documentation.
  5. Once the data are archived, the SeaBASS Administrator will update the new data page and contact the data submitters with a final confirmation.
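
If you use a command-line SFTP client, the directory layout from step 2 can be prepared as a batch of commands. The following Python sketch only builds the command text; the function name is hypothetical, the subdirectory name "documents" is an assumed choice (any clear name should do), and it assumes the local files sit in your current working directory and follow the naming guidelines (no spaces).

```python
from typing import List


def sftp_batch_commands(
    project: str, data_files: List[str], support_files: List[str]
) -> List[str]:
    """Build sftp batch commands that upload one project into data_submission."""
    project_dir = f"data_submission/{project}"
    docs_dir = f"{project_dir}/documents"  # hypothetical subdirectory name
    # A leading "-" tells sftp batch mode to continue if the directory already exists.
    commands = [f"-mkdir {project_dir}", f"-mkdir {docs_dir}"]
    commands += [f"put {name} {project_dir}/{name}" for name in data_files]
    commands += [f"put {name} {docs_dir}/{name}" for name in support_files]
    return commands
```

With the OpenSSH client, the resulting lines could be saved to a text file and run with sftp -b <file> yourusername@samoa.gsfc.nasa.gov; other SFTP clients have their own equivalents.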

Setting up SFTP Access

  • An individual account must be registered to submit data using the SeaBASS SFTP server.
  • In order to gain access to the SFTP server, we require a copy of your public SSH key, generated with key type ED25519 (file name id_ed25519.pub) or key type ECDSA (file name id_ecdsa.pub). We prefer the ED25519 key type if your system has a modern version of OpenSSH.
  • Please email 1) your public ED25519 or ECDSA SSH key, 2) your first and last name, 3) your affiliation/institution name, and 4) your email address that will be linked to your SFTP account to the SeaBASS Administrator. Then, an individual SFTP account will be created and access for data submission and bulk FCHECK requests will be granted.
    • Note: If you need to submit data from different computers, it is acceptable to send additional public keys and request that they are all linked to your SeaBASS SFTP account.
  • Instructions on generating ED25519 or ECDSA SSH keys may be found here for Windows, Mac, Linux or BSD platforms.

Generating a SSH key

For Mac, Linux or BSD:

There are two options to generate a SSH key to be used for SFTP access to submit data to SeaBASS.
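
As one illustration (an assumption on our part, and not necessarily one of the options originally listed here), systems with OpenSSH installed can create an ED25519 key pair with the ssh-keygen tool. The Python sketch below wraps that call; the function names are hypothetical, and ssh-keygen will prompt interactively for an optional passphrase when generate_key is run.

```python
import subprocess
from pathlib import Path
from typing import List


def keygen_command(private_key_path: Path, comment: str) -> List[str]:
    """Build the ssh-keygen argument list for an ED25519 key pair."""
    return ["ssh-keygen", "-t", "ed25519", "-f", str(private_key_path), "-C", comment]


def generate_key(private_key_path: Path, comment: str) -> Path:
    """Run ssh-keygen and return the path of the public key it writes."""
    subprocess.run(keygen_command(private_key_path, comment), check=True)
    # ssh-keygen writes the public key next to the private key with ".pub" appended.
    return private_key_path.with_name(private_key_path.name + ".pub")
```

Only the resulting .pub file (for example id_ed25519.pub) should be emailed to the SeaBASS Administrator; keep the private key to yourself.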

For Windows:

There are many options to generate a public SSH key to be used for SFTP access to submit data to SeaBASS via Windows.

  • The SSH key must be of type ED25519.
  • This page offers one example of how to generate a SSH key on Windows via an open-source software solution.
  • Be sure to save and note the location of the private SSH key and do not share the private key with anyone.
  • Copy the entire public ED25519 SSH key string into a text file called "id_ed25519.pub" and email your public key, your first and last name, your affiliation (institution and PI names), and your email address that will be linked to your SFTP account to the SeaBASS Administrator.
    • Note: this is the email address that MUST be used to request a batch check of SeaBASS files via FCHECK.
Last edited by Chris Proctor on 2021-06-14
Created by Jason Lefler on 2012-05-23