Difference between revisions of "Make data accessible by Mimi Tzeng"
{{#set:
Expertise=Open_science|
Expertise=Geosciences|
Owner=Mimi_Tzeng|
Progress=100|
StartDate=2015-02-21|
TargetDate=2015-03-06|
Type=Low}}
Latest revision as of 19:58, 14 July 2015
Details on how to do this task: Make data accessible
So far I've signed up to FigShare and obtained explicit permission from the PI to upload the data to it. I am now waiting for the PI to reinstall Matlab so I can rerun the processing.
As I recall, we are also supposed to make available all of the original raw data and intermediate files. There are many of these; should I also include a README.txt in the final zip file that explains what all of these are?
Answer from telecon: include just the intermediate files that might be useful to someone else, such as the *.mat files. No need to include every single raw and intermediate file for this task.
The question then becomes: which intermediate files should be included? I think I'll probably omit most of the pre-processing files and start with the ones that go into the Perl script. Then I'll also skip a lot of the intermediate files that come out of Matlab and just go with the combined figure PDFs, especially for the ADCP, where there are a lot of them.
Also, new plan: I'm going to use Zenodo instead of FigShare because it's run by CERN. The hosting organization does matter for legitimacy; CERN is a well-known, well-established scientific research institution, and FigShare seems to be a random startup...
Files to include:
- From MOOR: the initial data files after preliminary processing through the proprietary software that came with the sensors, before the Perl script
- From MOOR: timestamps.txt
- From MOOR: the Matlab data file that contains all the variables, generated by MOORprocess_all.m
- From MOOR: everything generated by MOORprocess_all.m (after PDFs have been concatenated)
- From ADCP: the Matlab data file exported from WinADCP
- From ADCP: endpoints.txt
- From ADCP: the Matlab data file generated by MoorADCP.m
- From ADCP: everything generated by MoorADCP.m (after PDFs have been concatenated)
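One way to bundle a file list like this with an explanatory README.txt is sketched below in Python. This is only an illustration under assumptions, not the procedure actually used for the Zenodo upload: the function name `build_archive`, the flat archive layout, and the README wording are all hypothetical, and the one-line descriptions would have to be written by hand for each file.

```python
import zipfile
from pathlib import Path


def build_archive(files, archive_path):
    """Write the given files plus a generated README.txt into a zip archive.

    `files` maps a path on disk to a one-line, hand-written description
    (hypothetical convention for this sketch). Returns the README text
    that was stored in the archive.
    """
    lines = ["Contents of this archive", ""]
    for path, description in files.items():
        lines.append(f"{Path(path).name}: {description}")
    readme = "\n".join(lines) + "\n"

    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as zf:
        # The README goes in first so it is the first entry a reader sees.
        zf.writestr("README.txt", readme)
        for path in files:
            # Each file is stored under its bare name, so the archive is flat.
            # Two input files with the same name would collide in this sketch.
            zf.write(path, arcname=Path(path).name)
    return readme
```

For example, `build_archive({"timestamps.txt": "description goes here"}, "moor_data.zip")` would produce a zip containing README.txt and timestamps.txt; the archive name here is a placeholder.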
7/1/2015 update: The data are all accessible on Zenodo now. Minor issue: four of the text files are not formatted correctly, and I'll need to track down in the code why they aren't being output the way they're supposed to be. I don't know whether I'll get to this, or whether I'll just put a note somewhere mentioning it.