Difference between revisions of "Make data accessible by Mimi Tzeng"

From Geoscience Paper of the Future
{{#set:
Expertise=Open_science|
Expertise=Geosciences|
Owner=Mimi_Tzeng|
Progress=100|
StartDate=2015-02-21|
TargetDate=2015-03-06|
Type=Low}}

Latest revision as of 19:58, 14 July 2015


Details on how to do this task: Make data accessible

So far I've signed up to FigShare and obtained explicit permission from the PI to upload the data to it. I am now waiting for the PI to reinstall Matlab so I can rerun the processing.

As I recall, we are supposed to also make available all of the original raw data and intermediate files. There are many of these; should I also include a README.txt in the ultimate zip file that explains what all of these are?

Answer from telecon: include just the intermediate files that might be useful to someone else, such as the *.mat files. No need to include every single raw and intermediate file for this task.

The question then becomes: which intermediate files should be included? I think I'll probably omit most of the pre-processing files and start with the ones that go into the Perl script. Then I'll also skip a lot of the intermediate files that come out of Matlab and just go with the combined figure PDFs, especially for the ADCP, where there are a lot of them.

Also, new plan: I'm going to use Zenodo instead of FigShare because it's run by CERN. The hosting organization does matter for lending weight to legitimacy; CERN is a well-known, well-established science research institution, and FigShare seems to be a random startup...


Files to include:

  1. From MOOR: the initial data files after preliminary processing through the proprietary software that came with the sensors, before the Perl script
  2. From MOOR: timestamps.txt
  3. From MOOR: the Matlab data file that contains all the variables, generated by MOORprocess_all.m
  4. From MOOR: everything generated by MOORprocess_all.m (after PDFs have been concatenated)
  5. From ADCP: the Matlab data file exported from WinADCP
  6. From ADCP: endpoints.txt
  7. From ADCP: the Matlab data file generated by MoorADCP.m
  8. From ADCP: everything generated by MoorADCP.m (after PDFs have been concatenated)
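
The list above, together with the earlier README.txt question, could be handled by a small packaging script. What follows is only a minimal sketch in Python, not the procedure actually used for the Zenodo upload: the function name `build_archive`, the archive layout, and the file descriptions are all assumptions made for illustration. Only `timestamps.txt` and `endpoints.txt` are names taken from the list.

```python
import zipfile
from pathlib import Path


def build_archive(files, archive_path):
    """Zip the given files together with a generated README.txt.

    `files` maps a path on disk to a one-line description of that file.
    (Hypothetical helper; the name and layout are illustrative only.)
    Returns the README text that was written into the archive.
    """
    # Build the README: one "name: description" line per included file.
    lines = ["Contents of this archive", ""]
    for path, description in files.items():
        lines.append(f"{Path(path).name}: {description}")
    readme = "\n".join(lines) + "\n"

    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr("README.txt", readme)
        for path in files:
            # Store each file under its base name, dropping the directory part.
            zf.write(path, arcname=Path(path).name)
    return readme


if __name__ == "__main__":
    # Example call; the directory names and descriptions are assumptions.
    build_archive(
        {
            "MOOR/timestamps.txt": "timestamps used by the MOOR processing",
            "ADCP/endpoints.txt": "endpoints used by the ADCP processing",
        },
        "data_package.zip",
    )
```

Because this sketch stores every file under its base name, two files with the same name in the MOOR and ADCP folders would collide; keeping a folder prefix in `arcname` would avoid that.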

7/1/2015 update: The data are all accessible on Zenodo now. Minor issue: four of the text files are not formatted correctly, and I'll need to track down in the code why they aren't being output the way they're supposed to be. I don't know whether I'm going to get to this, or whether I will just put a note somewhere mentioning it.