
RAWDAR: Raw Data Repository

May 14th, 2018 RAWDAR

Objectives & Status

RAWDAR has the following related but independent objectives:
✓ Storage of the missions’ historic raw data in a central repository.
✓ Systematic storage of the missions’ future raw data in this central repository.

RAWDAR Data stored by Mission Category:
• RSVTSC-0: Concluded Missions - Data not available
• RSVTSC-1: Concluded Missions - Data fully fetched
• RSVTSC-2: On-going Missions - Data stored externally
• RSVTSC-3: On-going Missions - Manual retrieval
• RSVTSC-4: On-going Missions - Automated retrieval
• RSVTSC-5: Concluded Missions - Data stored externally
as presented in the following diagram.

[Diagram: Missions & Categories. Concluded missions are grouped under RSVTSC-0 (Data not available), RSVTSC-1 (Data fully fetched) and RSVTSC-5 (Data stored externally); on-going missions are grouped under RSVTSC-2 (Data stored externally), RSVTSC-3 (Manual retrieval) and RSVTSC-4 (Automated retrieval).]
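The category scheme above can be captured as a small lookup table. The sketch below is only an illustration: the dictionary and function names are hypothetical, and the mission-to-category assignments are taken from the per-mission slides of this presentation.

```python
# Category code -> (mission status, data availability), as listed on the slide.
CATEGORIES = {
    "RSVTSC-0": ("Concluded", "Data not available"),
    "RSVTSC-1": ("Concluded", "Data fully fetched"),
    "RSVTSC-2": ("On-going", "Data stored externally"),
    "RSVTSC-3": ("On-going", "Manual retrieval"),
    "RSVTSC-4": ("On-going", "Automated retrieval"),
    "RSVTSC-5": ("Concluded", "Data stored externally"),
}

# Mission -> category code, following the per-mission slides.
MISSION_CATEGORY = {
    "GIOTTO": "RSVTSC-0", "HUYGENS": "RSVTSC-0", "ULYSSES": "RSVTSC-0",
    "EXOSAT": "RSVTSC-1", "HERSCHEL": "RSVTSC-1", "HIPPARCOS": "RSVTSC-1",
    "IUE": "RSVTSC-1", "PLANCK": "RSVTSC-1", "ROSETTA": "RSVTSC-1",
    "SMART-1": "RSVTSC-1", "VEX": "RSVTSC-1",
    "BEPICOLOMBO": "RSVTSC-2", "GAIA": "RSVTSC-2", "HUBBLE": "RSVTSC-2",
    "MEX": "RSVTSC-3",
    "CLUSTER": "RSVTSC-4", "INTEGRAL": "RSVTSC-4", "SOHO": "RSVTSC-4",
    "XMM-NEWTON": "RSVTSC-4",
    "ISO": "RSVTSC-5", "LISA PATHFINDER": "RSVTSC-5",
}


def missions_in(category: str) -> list[str]:
    """Return the missions assigned to a category, sorted by name."""
    return sorted(m for m, c in MISSION_CATEGORY.items() if c == category)


def describe(mission: str) -> str:
    """Return a one-line summary of a mission's category."""
    code = MISSION_CATEGORY[mission]
    status, data = CATEGORIES[code]
    return f"{mission}: {status} mission, {data} ({code})"
```

For example, `missions_in("RSVTSC-3")` returns only MEX, the single mission that relies on manual retrieval.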

Missions – High-Level View

Missions’ Sizes – Pie Charts & Graphs

Directory Structure

The /data mount point contains the RADACER directory which is the repository root directory.

For each mission there is a directory that contains all the available raw data of the mission.

Missions & Categories

Concluded missions – RSVTSC-0: Data not available
• GIOTTO
• HUYGENS
• ULYSSES

GIOTTO

• Category:
  – RSVTSC-0: Concluded mission – Data not available
  – Old mission whose raw data are not available at ESAC, and there is no possibility of finding them.
• General Mission Information:
  – http://sci.esa.int/giotto/
• Data:
  – Spacecraft data (DID, EPA, GRE, HMC, IMS, JPA, MAG, NMS, OPE, PIA, RPA) in PDS format
  – https://www.cosmos.esa.int/web/psa/giotto#

HUYGENS

• Category:
  – RSVTSC-0: Concluded mission – Data not available
  – Data (data products only) and relevant information are available exclusively through the URL below.
• General Mission Information:
  – http://sci.esa.int/cassini-huygens/
  – https://www.cosmos.esa.int/web/psa/huygens#

ULYSSES

• Category:
  – RSVTSC-0: Concluded mission – Data not available
  – Very old mission whose raw data are not available at ESAC, and there is no possibility of finding them.
• General Mission Information:
  – http://sci.esa.int/ulysses/
• Data:
  – http://ufa.esac.esa.int/ufa/

Missions & Categories

Concluded missions – RSVTSC-5: Data stored externally
• ISO
• LISA PATHFINDER

ISO

• Category:
  – RSVTSC-5: Concluded mission – Data stored externally
  – Old mission for which all the raw data are already available from the ISO Data Archive (IDA) at the URL below.
• General Mission Information:
  – http://sci.esa.int/iso/
• Data:
  – https://www.cosmos.esa.int/web/iso/access-the-archive
  – ISO DATA PRODUCTS (https://www.cosmos.esa.int/web/iso/guide-to-iso-data-products)
    • Raw Data Products: These are essentially unpacked telemetry in which no data reduction has taken place.
    • Basic Science Data Products: Data have been processed further to an intermediate level.
    • Fully Auto-processed Science Data Products: These data include a set of coherent, instrument-independent measurements of images or spectra designed to get as close as possible by automatic means to what could be produced by an astronomer using an interactive analysis system.

LISA PATHFINDER

• Category:
  – RSVTSC-5: Concluded mission – Data stored externally
  – Mission in post-operations phase.
• General Mission Information:
  – https://www.cosmos.esa.int/web/lisa-pathfinder
• Data:
  – Science Archive: http://archives.esac.esa.int/lpfsa
  – Products
    • Raw Data Products: unpacked telemetry. They are to be ingested in the archive under the DDS placeholder soon.
    • Basic Science Data Products: processed data. In the current version of LPFSA, Analysis Objects and Service Documents fall under this category. Eventually Analysis Packages will be ingested as well.
    • Fully Auto-processed Science Data Products: in a future release some products of this type will be part of the final LPFSA.

Missions & Categories

Concluded missions – RSVTSC-1: Data fully fetched
• EXOSAT
• HERSCHEL
• HIPPARCOS
• IUE
• PLANCK
• ROSETTA
• SMART-1
• VEX

EXOSAT

• Category:
  – RSVTSC-1: Concluded mission – Data fully fetched
• General Mission Information:
  – http://sci.esa.int/exosat/
• Data:
  – Estimated Total Size: 38.57 GB
  – Available Locally: 38.57 GB / 3 Directories / 6,686 Files
  – Data location: /data/RADACER/missions/EXOSAT/DATA/
  – Data structure / content:
    • Data are currently stored in files named by their corresponding FOT identifier (e.g. E0004.fot.tar.gz, E0007.fot.tar.gz, etc.)
    • These data files were downloaded from the EXOSAT web site and are tar-archived and gzip-compressed.
• Data Source:
  – Data originally stored on 1600 BPI 12" magnetic tapes (Final Observation Tapes, or FOTs).
  – Also available from the ESA-ESTEC cosmos web site at: http://www.cosmos.esa.int/web/exosat/fots-1983

HERSCHEL

• Category:
  – RSVTSC-1: Concluded mission – Data fully fetched
• General Mission Information:
  – http://sci.esa.int/herschel/
• Data:
  – Estimated Total Size: 1.67 TB
  – Available Locally: 1.67 TB / 27 Directories / 97,641 Files
  – Data location: /data/RADACER/missions/HERSCHEL/DATA
  – Data structure / content:
    • The telemetry is organized as one directory per APID
    • Within the APID directories, there are files:
      – each covering an interval of 16384 seconds
      – named in the form 0016_106222057422848.TLM, where 0016 is the APID and 106222057422848 is the onboard CUC time of the start of the period
• Data Source:
  – Copied from an external HDD at ESAC.
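The HERSCHEL file naming convention described above can be decoded with a short helper. This is a minimal sketch, not the tooling used by the project: the function and type names are hypothetical, the APID is kept as a string because the slide does not say how it is encoded, and the CUC value is returned as a raw integer without interpreting its units.

```python
import re
from typing import NamedTuple

# Interval covered by each telemetry file, as stated on the slide.
FILE_SPAN_SECONDS = 16384

# <4-digit APID>_<onboard CUC start time>.TLM
_TLM_NAME = re.compile(r"^(?P<apid>\d{4})_(?P<cuc>\d+)\.TLM$")


class TlmFile(NamedTuple):
    apid: str        # APID exactly as written in the file name
    cuc_start: int   # raw onboard CUC time of the start of the period


def parse_tlm_name(name: str) -> TlmFile:
    """Split a HERSCHEL telemetry file name into APID and CUC start time."""
    match = _TLM_NAME.match(name)
    if match is None:
        raise ValueError(f"not a HERSCHEL telemetry file name: {name!r}")
    return TlmFile(match["apid"], int(match["cuc"]))
```

For the example on the slide, `parse_tlm_name("0016_106222057422848.TLM")` yields the APID `"0016"` and the CUC start time `106222057422848`.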

HIPPARCOS

• Category:
  – RSVTSC-1: Concluded mission – Data fully fetched
• General Mission Information:
  – http://sci.esa.int/hipparcos/
• Data:
  – Estimated Total Size: 1.28 TB
  – Available Locally: 1.28 TB / 1,044 Directories / 32,194 Files
  – Data location: /data/RADACER/missions/HIPPARCOS/DATA/
  – Data structure / content:
    • At the first level, one folder for each USB stick, named set_1, set_2, …, set_8
    • Within these, there are files named in the form HA0284.tap, HA0285.tap, etc.
• Data source:
  – Copied from 8 HDDs of 250 GB each

IUE

• Category:
  – RSVTSC-1: Concluded mission – Data fully fetched
• General Mission Information:
  – http://sci.esa.int/iue/
• Data:
  – Estimated Total Size: 213.94 GB
  – Available Locally: 213.94 GB / 133 Directories / 694,001 Files
  – Data location: /data/RADACER/missions/IUE/DATA/
  – Data structure / content:
    • Raw telemetry is organized in directories under camera name and image sequence number
    • A separate directory exists for each camera (i.e., lwp, lwr, swp, swr)
    • Below each of these directories is a separate directory for each 1000 image numbers
    • These directories are named according to the starting image number in each group (e.g., 1000, 2000, 3000, etc.)
• Data source:
  – Copied via ftp://archive.stsci.edu/pub/iue/data/
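The IUE layout described above (camera directory, then groups of 1000 image numbers) could be resolved along the following lines. This is a sketch under stated assumptions: the function name is hypothetical, the group directory is assumed to be the image number rounded down to the nearest thousand with no zero padding, and the slide does not say how image numbers below 1000 are stored.

```python
# Camera directories named on the slide.
CAMERAS = ("lwp", "lwr", "swp", "swr")


def iue_group_dir(camera: str, image_number: int) -> str:
    """Return the relative directory expected to hold a given IUE image.

    Assumption: the group directory is the image number rounded down to
    the nearest thousand (1000, 2000, 3000, ...), written without padding.
    """
    camera = camera.lower()
    if camera not in CAMERAS:
        raise ValueError(f"unknown IUE camera: {camera!r}")
    if image_number < 0:
        raise ValueError("image number must be non-negative")
    group_start = (image_number // 1000) * 1000
    return f"{camera}/{group_start}"
```

Under these assumptions, image 1234 of the swp camera would be looked up under `swp/1000`.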

PLANCK

• Category:
  – RSVTSC-1: Concluded mission – Data fully fetched
• General Mission Information:
  – http://sci.esa.int/planck/
• Data:
  – Estimated Total Size: 2.52 TB
  – Available Locally: 2.52 TB / 1,237 Directories / 20,568 Files
  – Data location: /data/RADACER/missions/PLANCK/DATA/
  – Data structure / content:
    • The raw packet telemetry data are organized in directories first by data stream, then by SPID
    • The FARC area contains a number of additional files such as:
      – discarded packets in raw format
      – telecommand history files as provided by the DDS system
      – complete checkout of the FARC
• Data source:
  – Copied from 2 HDDs at ESAC.

ROSETTA

• Category:
  – RSVTSC-1: Concluded mission – Data fully fetched
• General Mission Information:
  – http://sci.esa.int/rosetta/
• Data:
  – Estimated Total Size: 425.71 GB
  – Available Locally: 425.71 GB / 42 Directories / 219,025 Files
  – Directory location: /data/RADACER/missions/ROSETTA/DATA/
  – Data structure / content:
    • The data are distributed as follows (Instrument: Science data size / Housekeeping data size):
      - ALICE: 1.6 GB / 137 MB
      - CAM1: 12 GB / 259 MB
      - CAM2: 4.8 MB / 204 KB
      - COSIMA: 1.1 GB / 81 MB
      - GIADA: 175 MB / 319 MB
      (continued)

ROSETTA

  – Data structure / content (continued):
      - MIDAS: 294 MB / 992 MB
      - MIRO: 7.4 GB / 436 MB
      - OSIRIS: 22 GB / 192 MB
      - ROSINA: 4.5 GB / 243 MB
      - RPC: 7.2 GB / 316 MB
      - VIRTIS: 23 GB / 377 MB
    • Other:
      - Command History Files: 148 MB
      - Event Files: 117 MB
• Data source:
  – Raw telemetry received from MOC is stored on a Linux server at ESAC.
  – The raw data were copied to RADACER through rsync.

SMART-1

• Category:
  – RSVTSC-1: Concluded mission – Data fully fetched
• General Mission Information:
  – http://sci.esa.int/smart-1/
• Data:
  – Estimated Total Size: 170.75 GB
  – Available Locally: 170.75 GB / 1,286 Directories / 187,805 Files
  – Directory location: /data/RADACER/missions/SMART-1/DATA/
  – Data structure / content:
    • The telemetry for the SMART-1 mission is structured first by year and subsequently by the day of the year, e.g. 2003269 corresponds to the 269th day of 2003
• Data source:
  – Copied from an external drive at ESAC.
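The year and day-of-year naming used for SMART-1 can be converted to and from calendar dates with the standard library. The helper names below are hypothetical; the sketch only illustrates the 2003269 convention quoted above.

```python
from datetime import date, datetime


def parse_year_doy(name: str) -> date:
    """Convert a YYYYDDD name (e.g. '2003269') into a calendar date."""
    if len(name) != 7 or not name.isdigit():
        raise ValueError(f"expected 7 digits (YYYYDDD), got {name!r}")
    # %Y is the 4-digit year, %j the day of the year (001-366).
    return datetime.strptime(name, "%Y%j").date()


def format_year_doy(day: date) -> str:
    """Convert a calendar date back into the YYYYDDD form."""
    return day.strftime("%Y%j")
```

With these helpers, `parse_year_doy("2003269")` gives 26 September 2003, and `format_year_doy` maps that date back to `"2003269"`.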

VEX (VENUS EXPRESS)

• Category:
  – RSVTSC-1: Concluded mission – Data fully fetched
• General Mission Information:
  – http://sci.esa.int/venus-express/
• Data:
  – Estimated Total Size: 1.09 TB
  – Available Locally: 1.09 TB / 25,534 Directories / 73,856 Files
  – Directory location: /data/RADACER/missions/VEX/DATA/
  – Data structure / content:
    ├── dds_archive_07052015
    │   ├── hfa
    │   ├── lts1
    │   ├── lts2
    │   ├── …
    │   └── lts14
    ├── faulty_files
    └── log_and_validation

VEX (VENUS EXPRESS)

• Data source:
  – Telemetry data DVDs sent to ESAC were copied to HDDs, from which they were copied into RADACER.

Missions & Categories

On-going missions – RSVTSC-2: Data stored externally
• BEPICOLOMBO
• GAIA
• HUBBLE

BEPICOLOMBO

• Category:
  – RSVTSC-2: On-going mission – Data stored externally
• General Mission Information:
  – http://sci.esa.int/bepicolombo/
• Data:
  – The mission is due to launch in October 2018, hence details about its data are not available yet.

GAIA

• Category:
  – RSVTSC-2: On-going mission – Data stored externally
• General Mission Information:
  – http://sci.esa.int/gaia/
• Data:
  – http://sci.esa.int/gaia/58275-data-release-1/
    • All the data are available from the ESA Gaia Archive: http://archives.esac.esa.int/gaia
    • and from the main partner data centres:
      – Centre de Données astronomiques de Strasbourg (CDS): http://cds.unistra.fr/gaia
      – ASI Science Data Center (ASDC): http://gaiaportal.asdc.asi.it
      – Astronomisches Rechen-Institut (ARI): http://gaia.ari.uni-heidelberg.de
      – Leibniz-Institut für Astrophysik Potsdam (AIP): http://gaia.aip.de

GAIA

• Data Source:
  – MOC pushes data via the GFTS installation at MOC to the SOC.
  – The destination directory structure is polled by ESAC for new files.
  – The operational data structure at ESAC is as follows:
    /gaia/data
    |-- moc
    |   |-- hktm
    |   |-- ifms
    |   |-- ifms2
    |   |-- odb
    |   |-- orb
    |   `-- scitm
    |-- sent
    `-- soc
        `-- pos
  – The main subdirectories of note under the 'moc' directory are:
    • 'hktm' (HK data supplied by MOC) and
    • 'scitm' (science telemetry).
    Both are in the RAPID file format.
  – MOC writes to the 'moc' directory, while SOC passes data for MOC to collect via the 'soc' directory.
  – In the 'soc/pos' directory ESAC adds Payload Operations Requests (PORs), which are XML files, and additional data of use to MOC.

HUBBLE

• Category:
  – RSVTSC-2: On-going mission – Data stored externally
• General Mission Information:
  – The Hubble Space Telescope (HST) is a NASA/ESA mission.
  – Spacecraft and science operations are being performed by NASA, with participation of ESA, at the STScI in Baltimore, USA.
  – More details:
    • http://sci.esa.int/hubble/
    • http://www.stsci.edu/portal/
• Data:
  – The raw data are used internally to produce the HST science data, which are put in the European HST Archive, available at:
    • http://www.cosmos.esa.int/web/hst/home

HUBBLE

• Data Source:
  – ESA hosts (internally) HST raw data in the ESA HST Cache Archive.
  – They are located on the hstops machine in the directory structure mounted under /hubble/data/SEEDS.
  – The directory structure is organized by splitting the filenames.
  – The overall size of the raw data was reported as 6 TB (June 2015).
  – The raw data are visible from all HST servers and the grid.
  – No external access is provided to the raw data.

Missions & Categories

On-going missions – RSVTSC-3: Manual retrieval
• MEX

MEX (MARS EXPRESS)

• Category:
  – RSVTSC-3: On-going mission – Manual retrieval
• General Mission Information:
  – http://sci.esa.int/mars-express/
• Data:
  – Estimated Total Size by end of mission (Dec 2018): 1.9 TB
  – Available Locally: 1.4
  – Data location: /data/RADACER/missions/MEX/DATA/
  – Data structure / content:
    ├── bin
    ├── CTSTATUS.FCS
    ├── log_hfa_20170517.log
    ├── log_lts1_20170517.log
    ├── log_lts2_20170517.log
    ├── …
    ├── log_lts5_20170517.log
    ├── mex_hfa_20170517
    ├── mex_lts1_20170517
    ├── mex_lts2_20170517
    ├── …
    └── mex_lts5_20170517

MEX (MARS EXPRESS)

• Data source:
  – The raw data were obtained from MEX's operational EDDS instance at the MOC (ESOC).
  – An export to a hard drive was requested from ESOC in order to load the raw data into RADACER.

Missions & Categories

On-going missions – RSVTSC-4: Automated retrieval
• CLUSTER
• INTEGRAL
• SOHO
• XMM-NEWTON

CLUSTER

• Category:
  – RSVTSC-4: On-going mission – Automated retrieval
• General Mission Information:
  – http://sci.esa.int/cluster/
• Data:
  – Estimated Total Size by end of mission (Dec 2018): 2.7 TB
  – Available Locally: 2.51 TB / 1 Directory / 12,271 Files
  – Data location: /data/RADACER/missions/CLUSTER/DATA/
  – Data structure / content:
    • The mission data are stored as zip files in a single directory.
    • There is one zip file per Volume ID, which is named YYMMDD_N_TV where:
      – YYMMDD is the date (two-digit year),
      – N is the number of the volume within that day,
      – T is the total number of volumes on that day and
      – V is the version identifier, starting at A and increasing in alphabetical order.
• Data source:
  – Raw data are handled by the Cluster Active Archive (CAA).
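The YYMMDD_N_TV naming scheme described above lends itself to a small parser. The sketch below is illustrative only: the names are hypothetical, the example file name in the usage note is invented, the version is assumed to be a single upper-case letter, and two-digit years are interpreted with Python's default `%y` rule (00 to 68 map to 2000 to 2068).

```python
import re
from datetime import date, datetime
from typing import NamedTuple

# YYMMDD _ N _ T V, e.g. volume N of T for that day, version letter V.
_VOLUME_NAME = re.compile(
    r"^(?P<day>\d{6})_(?P<number>\d+)_(?P<total>\d+)(?P<version>[A-Z])$"
)


class Volume(NamedTuple):
    day: date      # date of the volume
    number: int    # N: number of the volume within that day
    total: int     # T: total number of volumes on that day
    version: str   # V: version identifier (A, B, C, ...)


def parse_volume_name(filename: str) -> Volume:
    """Parse a CLUSTER volume name, with or without a .zip extension."""
    stem = filename[:-4] if filename.lower().endswith(".zip") else filename
    match = _VOLUME_NAME.match(stem)
    if match is None:
        raise ValueError(f"not a YYMMDD_N_TV volume name: {filename!r}")
    day = datetime.strptime(match["day"], "%y%m%d").date()
    return Volume(day, int(match["number"]), int(match["total"]), match["version"])
```

As an invented example, `parse_volume_name("180514_1_2A.zip")` would describe volume 1 of 2 for 14 May 2018, version A. Since versions increase alphabetically, comparing the `version` field is enough to pick the most recent copy of a volume.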

CLUSTER


INTEGRAL

• Category:
  – RSVTSC-4: On-going mission – Automated retrieval
• General Mission Information:
  – http://sci.esa.int/integral/
• Data:
  – Estimated Total Size by end of mission (Dec 2018): 4.7 TB
  – Available Locally: 4.23 TB / 141 Directories / 19,425 Files
  – Data location: /data/RADACER/missions/INTEGRAL/DATA/
  – Data structure / content:
    • Flat directory structure
    • Timestamp in file name
    • 2 types of files per date point
      – CTRL (ASCII file)
      – TM (Binary file)

INTEGRAL

• Data source:
  – Raw telemetry data since early 2009 are available on a local disk of the INTEGRAL SOC.
  – The data are transferred directly from ESOC in a semi-automated process, in batches organized by INTEGRAL revolutions (3 days duration until early 2015, 2.75 days nowadays).
  – Older data are available on external HDDs.
  – New raw data are copied into RADACER via rsync.

INTEGRAL

Several steps were needed before automatic population was deployed:
• Manual copy for old data.
• Completeness checks on the data (find missing years).
• Registration of the ssh user for rsync.
• 2 steps of ssh tunneling.
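The rsync-over-ssh transfer mentioned in the steps above might be assembled as follows. This is a hedged sketch and not the deployed script: the function name, user and host names are hypothetical, and the two ssh hops are assumed to be reachable through a single jump host using OpenSSH's `-J` option. The command is only built, not executed, and defaults to rsync's dry-run mode.

```python
import shlex


def build_rsync_command(source, destination, jump_host=None, dry_run=True):
    """Build (but do not run) an rsync command line as a list of arguments.

    source/destination use the usual rsync syntax, e.g. 'user@host:/path/'.
    jump_host, if given, is passed to ssh as a ProxyJump host (ssh -J).
    """
    command = ["rsync", "--archive", "--verbose"]
    if dry_run:
        # Report what would be transferred without copying anything.
        command.append("--dry-run")
    if jump_host:
        # -e selects the remote shell; ssh -J tunnels through the jump host.
        command += ["-e", f"ssh -J {shlex.quote(jump_host)}"]
    command += [source, destination]
    return command
```

The resulting list can be handed to `subprocess.run(command, check=True)` once the hosts and paths have been replaced with real values and the dry run looks correct.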


SOHO

• Category:
  – RSVTSC-4: On-going mission – Automated retrieval
• General Mission Information:
  – http://sci.esa.int/soho/
• Data:
  – Estimated Total Size by end of mission (Dec 2018): 5.5 TB
  – Available Locally: 1.31 TB / 185,426 Directories / 202,959 Files
  – Data location: /data/RADACER/missions/SOHO/DATA/
  – Data structure / content:
    • A flat list of directories with the naming pattern SO_ALL_LZ_NNNN, where NNNN is a number with 4 digits
    • Directories SO_ALL_LZ_2000.01 up to SO_ALL_LZ_2085 contain three sub-directories and a voldesc.sfd file
    • SO_ALL_LZ_2086 up to SO_ALL_LZ_3681 contain only the data directory and the voldesc.sfd file
    • ASCII data files end in .CDF, .SFD, .TXT, etc.
    • Binary data files end in .DAT

SOHO

• Data source:
  – Mount point: /ssa_telemetry (netappd3.n1data.lan:/soho_ssa/telemetry)
  – Made available for read-only access on radacerbe.n1data.lan
• Limitations applying to SOHO data fetching:
  – There is no transfer from MOC/NASA to SOC/ESAC.
  – The transfer works MOC/NASA -> SOC/Goddard (ESA+NASA) -> ESAC (where there is no SOC, only archive).
  – TM is not part of the SOHO Science Archive at ESAC.

SOHO


XMM-NEWTON

• Category:
  – RSVTSC-4: On-going mission – Automated retrieval
• General Mission Information:
  – http://sci.esa.int/xmm-newton/
• Data:
  – Estimated Total Size by end of mission (Dec 2018): 1.8 TB
  – Available Locally: 1.62 TB / 18 Directories / 12,323 Files
  – Data location: /data/RADACER/missions/XMM-NEWTON/DATA/
  – Data structure / content:
    • Flat directory structure
    • Binary DATA (.DAT files)
    • Timestamp in filenames, e.g. X2KA_2015_07_01T00_00_00Z__.DAT
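The timestamp embedded in XMM-NEWTON file names, as in the example above, can be extracted with a regular expression. This is a minimal sketch with a hypothetical function name; it assumes the trailing `Z` denotes UTC and makes no attempt to interpret the rest of the name (such as the `X2KA` prefix).

```python
import re
from datetime import datetime, timezone

# Matches YYYY_MM_DDTHH_MM_SSZ anywhere in the file name.
_STAMP = re.compile(r"(\d{4})_(\d{2})_(\d{2})T(\d{2})_(\d{2})_(\d{2})Z")


def timestamp_from_name(name: str) -> datetime:
    """Return the timestamp embedded in an XMM-NEWTON .DAT file name."""
    match = _STAMP.search(name)
    if match is None:
        raise ValueError(f"no timestamp found in {name!r}")
    year, month, day, hour, minute, second = map(int, match.groups())
    # Assumption: the 'Z' suffix means the time is expressed in UTC.
    return datetime(year, month, day, hour, minute, second, tzinfo=timezone.utc)
```

Sorting file names by this timestamp gives the chronological order of the telemetry files, which is convenient for spotting gaps in a flat directory.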

XMM-NEWTON

• Data source:
  – SOC receives Real Time TM from MOC, which is processed in Real Time and is not archived.
  – 2-3 days after the event (Real Time), the SOC receives the TM via ftp in the form of files that cover 12 hours of TM.
  – All raw data since the beginning of the mission are stored on an external HDD at ESAC.

XMM-NEWTON


High-Level Architecture

• Front End VM
  – Apache Web server (forwarding to Radoverview)
• Application VM
  – Radoverview (developed on Java EE/REST/JavaScript)
  – Running on Apache Tomcat
  – Using PostgreSQL DB
• Back End VM
  – Raw Data Repository
  – Data Retrieval
    • Automatic retrieval (scripts running daily)
    • Manual retrieval (arranging copies remotely / receiving external HDDs)

Conclusion

• Activity has been finished for completed missions
• All done for most ongoing missions
• Resolution is pending for the following issues:
  – MEX data provided by MOC on an external drive need to be ingested in RAWDAR. A new sync is expected to be carried out at the end of the RAWDAR project.
  – It is not deemed possible to enable automatic retrieval of MEX raw data from MOC to RADACER because:
    • No access to the MEX DDS is to be provided by the MEX MOC
    • No staging area is available on the SOC