Develop the Oracle of DSCOVR - Experimental Data Repository

Example Resource for the challenge "Develop the Oracle of DSCOVR"

The data files in this repository contain measurements from the DSCOVR PlasMAG instrument suite, recorded in the solar wind near the Sun-Earth L1 Lagrange point between 2016 and the present day.

Each file corresponds to one year of measurements. The measurements themselves have been condensed and decimated to a cadence of one measurement set per minute.

DSCOVR PlasMAG 2016 Data

DSCOVR PlasMAG 2017 Data

DSCOVR PlasMAG 2018 Data

DSCOVR PlasMAG 2019 Data

DSCOVR PlasMAG 2020 Data

DSCOVR PlasMAG 2021 Data

DSCOVR PlasMAG 2022 Data

DSCOVR PlasMAG 2023 Data

These data are provided in a human-readable text format, with one measurement set (one minute of data) per line. The values that make up each set are separated by commas.

What is in a measurement set?

Each line begins (column 0) with a date and time in Coordinated Universal Time (UTC), formatted like YYYY-MM-DD hh:mm:ss. So, for example, October 7, 2023 at 9:00 a.m. UTC would be written 2023-10-07 09:00:00. Note that UTC is not always the local time in Greenwich, England, which runs one hour ahead of UTC during British Summer Time.
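
As a minimal sketch of reading this timestamp format with the Python standard library, the snippet below parses the example string and marks it as UTC. The helper name `parse_timestamp` is hypothetical and not part of this repository.

```python
from datetime import datetime, timezone

def parse_timestamp(text):
    """Parse a 'YYYY-MM-DD hh:mm:ss' string and attach the UTC time zone."""
    naive = datetime.strptime(text, "%Y-%m-%d %H:%M:%S")
    return naive.replace(tzinfo=timezone.utc)

stamp = parse_timestamp("2023-10-07 09:00:00")
print(stamp.isoformat())  # 2023-10-07T09:00:00+00:00
```

Attaching the time zone explicitly avoids accidental interpretation as local time when the values are later compared with other data sets.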

The next three values in the line (columns 1-3) are the components of the magnetic field vector measured at this time. They are expressed in nanotesla (nT) in the Geocentric Solar Ecliptic (GSE) reference frame.
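
The total field strength is the Euclidean norm of the three components, which does not depend on the order in which they are listed. A minimal sketch follows; the helper name `field_magnitude` is hypothetical and the sample values are made up for illustration, not taken from the data files.

```python
import math

def field_magnitude(b1, b2, b3):
    """Return the magnitude (nT) of a field vector given its three components (nT)."""
    return math.sqrt(b1 * b1 + b2 * b2 + b3 * b3)

# Made-up component values in nT, chosen so the result is easy to check by hand.
sample_components = (3.0, -4.0, 12.0)
magnitude = field_magnitude(*sample_components)
print(magnitude)  # 13.0
```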

The last fifty values (columns 4-53) represent a "raw" measurement spectrum from the Faraday cup plasma detector. Each value corresponds to the flux, or flow strength, of the solar wind in a particular range of energies (or flow speeds). These numbers are not calibrated or converted; they are dimensionless numbers as encoded by the instrument computer.
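
To make the column layout concrete, the sketch below splits one line into its timestamp, magnetic field, and spectrum parts using only the Python standard library. The function name `split_measurement_set` is an illustrative assumption, and the sample line is synthetic rather than real DSCOVR data.

```python
EXPECTED_FIELDS = 54  # 1 timestamp + 3 field components + 50 spectrum values

def split_measurement_set(line):
    """Split one comma-separated line into (timestamp, field, spectrum)."""
    fields = line.strip().split(",")
    if len(fields) != EXPECTED_FIELDS:
        raise ValueError(f"expected {EXPECTED_FIELDS} fields, got {len(fields)}")
    timestamp = fields[0]                          # column 0
    field_gse = [float(v) for v in fields[1:4]]    # columns 1-3
    spectrum = [float(v) for v in fields[4:54]]    # columns 4-53
    return timestamp, field_gse, spectrum

# Synthetic line: made-up field values and a spectrum that simply counts 0..49.
example_line = ",".join(
    ["2023-10-07 09:00:00", "1.5", "-2.0", "0.25"] + [str(n) for n in range(50)]
)
timestamp, field_gse, spectrum = split_measurement_set(example_line)
print(timestamp, field_gse, len(spectrum))
```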

Why are some of the numbers zero?

The PlasMAG detectors do not take data all of the time, and the Faraday cup does not make measurements over its full range every minute. Whenever and wherever no data are available, the field is filled in with an integer 0. We recommend converting these to "NaN" in your computing environment after you load the data.
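
If you are not using a library that handles missing values for you, the conversion can be sketched in plain Python as below. The helper name `zeros_to_nan` is hypothetical, and the input values are made up. Note that this treats every exact zero as missing, as the recommendation above implies.

```python
import math

def zeros_to_nan(values):
    """Return a copy of values with every exact zero replaced by NaN."""
    return [math.nan if v == 0 else v for v in values]

# Made-up values: zeros mark positions where no data were available.
cleaned = zeros_to_nan([0, 12.5, 0, 3])
print(cleaned)  # [nan, 12.5, nan, 3]
```

Using NaN rather than 0 keeps the missing entries from pulling down averages or sums computed later.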

How should I load these data?

These data have been formatted to be as accessible as possible, no matter what tools you prefer to use. One particularly simple and likely popular option, however, is to use Python with a pandas DataFrame, like this:

import pandas

# Column 0 is parsed as a UTC timestamp; zero fill values are read as NaN.
# The files have no header row, so columns are numbered 0-53.
data = pandas.read_csv("dsc_fc_summed_spectra_2016_v01.csv",
                       delimiter=',', parse_dates=[0],
                       na_values='0', header=None)
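
Once the data are loaded, it can help to give the columns names and to separate the spectrum from the magnetic field. The sketch below does this on two synthetic lines held in memory, so it runs without a data file. The column names (`time`, `bx_gse`, `fc_00`, and so on) are hypothetical labels chosen for illustration, the component order is an assumption, and the values are made up.

```python
import io
import pandas

# Hypothetical column names; the data files themselves have no header row.
COLUMN_NAMES = ["time", "bx_gse", "by_gse", "bz_gse"] + [
    f"fc_{i:02d}" for i in range(50)
]

# Two synthetic lines standing in for a real file (values are made up).
row_a = ["2016-01-01 00:00:00", "1.5", "-2.0", "0.25"] + ["0"] * 10 + ["7"] * 40
row_b = ["2016-01-01 00:01:00", "0", "0", "0"] + ["0"] * 50
synthetic_csv = "\n".join(",".join(row) for row in (row_a, row_b))

data = pandas.read_csv(
    io.StringIO(synthetic_csv),
    header=None,
    names=COLUMN_NAMES,
    parse_dates=[0],   # column 0 holds the UTC timestamp
    na_values="0",     # zero fill values become NaN
)
data = data.set_index("time")

# Label-based slices are inclusive, so this selects all 50 spectrum columns.
spectra = data.loc[:, "fc_00":"fc_49"]

# Count how many spectrum bins actually hold data in each one-minute set.
valid_bins = spectra.notna().sum(axis=1)
print(valid_bins.tolist())  # [40, 0]
```

Counting the valid bins per row is one simple way to find minutes in which the Faraday cup did not cover its full range.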