2 Data Retrieval
2.1 Obtaining data summaries
Data in the CMAP database exists in tables, and can be accessed by specific table names, and variable names therein. For instance, sea surface temperature is from the table "tblSST_AVHRR_OI_NRT"
, and the variable name is "sst"
.
Some simpler functions for accessing data are useful for obtaining summaries of the data prior to downloading the data itself.
- Name of variables (
get_columns()
) - Type of variables, i.e., quantitative, qualitative, time, (
get_var_catalog()
) - Space-time range information (
get_var_coverage
), - Numerical variable range summary (
get_var_stat()
).
These are used in the various following functions, which are designed to summarize data tables by extracting these table attributes. This is useful for learning about the tables without downloading them prior to analysis. Here are some key examples (the piping using %>%
to kable()
is just for formatting purposes):
get_head("tblArgoMerge_REP") %>% kable()
float_id | cycle | time | lat | lon | depth | position_qc | direction | data_mode | data_centre | argo_merge_cdnc | argo_merge_cdnc_qc | argo_merge_cdnc_adj | argo_merge_cdnc_adj_qc | argo_merge_cdnc_adj_err | argo_merge_pressure | argo_merge_pressure_qc | argo_merge_pressure_adj | argo_merge_pressure_adj_qc | argo_merge_pressure_adj_err | argo_merge_salinity | argo_merge_salinity_qc | argo_merge_salinity_adj | argo_merge_salinity_adj_qc | argo_merge_salinity_adj_err | argo_merge_temperature | argo_merge_temperature_qc | argo_merge_temperature_adj | argo_merge_temperature_adj_qc | argo_merge_temperature_adj_err | argo_merge_O2 | argo_merge_O2_qc | argo_merge_O2_adj | argo_merge_O2_adj_qc | argo_merge_O2_adj_err | argo_merge_bbp | argo_merge_bbp_qc | argo_merge_bbp_adj | argo_merge_bbp_adj_qc | argo_merge_bbp_adj_err | argo_merge_bbp470 | argo_merge_bbp470_qc | argo_merge_bbp470_adj | argo_merge_bbp470_adj_qc | argo_merge_bbp470_adj_err | argo_merge_bbp532 | argo_merge_bbp532_qc | argo_merge_bbp532_adj | argo_merge_bbp532_adj_qc | argo_merge_bbp532_adj_err | argo_merge_bbp700 | argo_merge_bbp700_qc | argo_merge_bbp700_adj | argo_merge_bbp700_adj_qc | argo_merge_bbp700_adj_err | argo_merge_turbidity | argo_merge_turbidity_qc | argo_merge_turbidity_adj | argo_merge_turbidity_adj_qc | argo_merge_turbidity_adj_err | argo_merge_cp | argo_merge_cp_qc | argo_merge_cp_adj | argo_merge_cp_adj_qc | argo_merge_cp_adj_err | argo_merge_cp660 | argo_merge_cp660_qc | argo_merge_cp660_adj | argo_merge_cp660_adj_qc | argo_merge_cp660_adj_err | argo_merge_chl | argo_merge_chl_qc | argo_merge_chl_adj | argo_merge_chl_adj_qc | argo_merge_chl_adj_err | argo_merge_cdom | argo_merge_cdom_qc | argo_merge_cdom_adj | argo_merge_cdom_adj_qc | argo_merge_cdom_adj_err | argo_merge_NO3 | argo_merge_NO3_qc | argo_merge_NO3_adj | argo_merge_NO3_adj_qc | argo_merge_NO3_adj_err | argo_merge_bisulfide | argo_merge_bisulfide_qc | argo_merge_bisulfide_adj | argo_merge_bisulfide_adj_qc | argo_merge_bisulfide_adj_err | argo_merge_ph | argo_merge_ph_qc | argo_merge_ph_adj | argo_merge_ph_adj_qc | argo_merge_ph_adj_err | argo_merge_down_irr | argo_merge_down_irr_qc | argo_merge_down_irr_adj | argo_merge_down_irr_adj_qc | argo_merge_down_irr_adj_err | argo_merge_down_irr380 | argo_merge_down_irr380_qc | argo_merge_down_irr380_adj | argo_merge_down_irr380_adj_qc | argo_merge_down_irr380_adj_err | argo_merge_down_irr412 | argo_merge_down_irr412_qc | argo_merge_down_irr412_adj | argo_merge_down_irr412_adj_qc | argo_merge_down_irr412_adj_err | argo_merge_down_irr443 | argo_merge_down_irr443_qc | argo_merge_down_irr443_adj | argo_merge_down_irr443_adj_qc | argo_merge_down_irr443_adj_err | argo_merge_down_irr490 | argo_merge_down_irr490_qc | argo_merge_down_irr490_adj | argo_merge_down_irr490_adj_qc | argo_merge_down_irr490_adj_err | argo_merge_down_irr555 | argo_merge_down_irr555_qc | argo_merge_down_irr555_adj | argo_merge_down_irr555_adj_qc | argo_merge_down_irr555_adj_err | argo_merge_up_irr | argo_merge_up_irr_qc | argo_merge_up_irr_adj | argo_merge_up_irr_adj_qc | argo_merge_up_irr_adj_err | argo_merge_up_irr412 | argo_merge_up_irr412_qc | argo_merge_up_irr412_adj | argo_merge_up_irr412_adj_qc | argo_merge_up_irr412_adj_err | argo_merge_up_irr443 | argo_merge_up_irr443_qc | argo_merge_up_irr443_adj | argo_merge_up_irr443_adj_qc | argo_merge_up_irr443_adj_err | argo_merge_up_irr490 | argo_merge_up_irr490_qc | argo_merge_up_irr490_adj | argo_merge_up_irr490_adj_qc | argo_merge_up_irr490_adj_err | argo_merge_up_irr555 | argo_merge_up_irr555_qc | argo_merge_up_irr555_adj | argo_merge_up_irr555_adj_qc | argo_merge_up_irr555_adj_err | argo_merge_down_par | argo_merge_down_par_qc | argo_merge_down_par_adj | argo_merge_down_par_adj_qc | argo_merge_down_par_adj_err | year | month | week | dayofyear |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
2900114 | 189 | 2011-05-11 | -59.138 | 95.245 | 1040.0 | 1 | A | D | AO | NA | NA | NA | NA | NA | 1040.0 | 1 | 1040.4 | 1 | 2.4 | 34.739 | 1 | 34.739 | 1 | 0.01 | 1.544 | 1 | 1.544 | 1 | 0.002 | 184.1 | 0 | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | 2011 | 5 | 20 | 131 |
2900114 | 189 | 2011-05-11 | -59.138 | 95.245 | 1042.0 | 1 | A | D | AO | NA | NA | NA | NA | NA | 1042.0 | 1 | 1042.4 | 1 | 2.4 | 34.739 | 1 | 34.739 | 1 | 0.01 | 1.542 | 1 | 1.542 | 1 | 0.002 | 184.2 | 0 | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | 2011 | 5 | 20 | 131 |
2900114 | 189 | 2011-05-11 | -59.138 | 95.245 | 1044.0 | 1 | A | D | AO | NA | NA | NA | NA | NA | 1044.0 | 1 | 1044.4 | 1 | 2.4 | 34.739 | 1 | 34.739 | 1 | 0.01 | 1.541 | 1 | 1.541 | 1 | 0.002 | 184.3 | 0 | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | 2011 | 5 | 20 | 131 |
2900114 | 189 | 2011-05-11 | -59.138 | 95.245 | 1046.1 | 1 | A | D | AO | NA | NA | NA | NA | NA | 1046.1 | 1 | 1046.5 | 1 | 2.4 | 34.739 | 1 | 34.739 | 1 | 0.01 | 1.542 | 1 | 1.542 | 1 | 0.002 | 184.3 | 0 | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | 2011 | 5 | 20 | 131 |
2900114 | 189 | 2011-05-11 | -59.138 | 95.245 | 1048.0 | 1 | A | D | AO | NA | NA | NA | NA | NA | 1048.0 | 1 | 1048.4 | 1 | 2.4 | 34.739 | 1 | 34.739 | 1 | 0.01 | 1.542 | 1 | 1.542 | 1 | 0.002 | 184.3 | 0 | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | 2011 | 5 | 20 | 131 |
get_var_coverage("tblAltimetry_REP", "sla") %>% kable()
Time_Min | Time_Max | Lat_Min | Lat_Max | Lon_Min | Lon_Max | Depth_Min | Depth_Max |
---|---|---|---|---|---|---|---|
1993-01-01 | 2018-06-10 | -89.875 | 89.875 | -179.875 | 179.875 | NA | NA |
get_var_resolution("tblAltimetry_REP", "sla") %>% kable()
Temporal_Resolution | Spatial_Resolution |
---|---|
Daily | 1/4° X 1/4° |
get_var_unit("tblAltimetry_REP", "sla") %>% kable()
Unit |
---|
m |
get_columns('tblSST_AVHRR_OI_NRT') %>% kable()
Columns |
---|
lat |
lon |
time |
sst |
year |
month |
week |
dayofyear |
2.2 Cruises
Out of the many data tables in CMAP, some are data collected and processed from cruises. Here are some examples of summarizing this data.
cruisename = "KOK1606"
get_cruise_by_name(cruisename) %>% kable()
ID | Nickname | Name | Ship_Name | Start_Time | End_Time | Lat_Min | Lat_Max | Lon_Min | Lon_Max | Chief_Name |
---|---|---|---|---|---|---|---|---|---|---|
589 | Gradients_1 | KOK1606 | R/V Kaimikai O Kanaloa | 2016-04-20 00:04:37 | 2016-05-04 02:33:45 | 21.4542 | 37.8864 | -158.3355 | -157.858 | Virginia Armbrust |
get_cruise_bounds(cruisename) %>% kable()
ID | dt1 | dt2 | lat1 | lat2 | lon1 | lon2 |
---|---|---|---|---|---|---|
589 | 2016-04-20 00:04:37 | 2016-05-04 02:33:45 | 21.4542 | 37.8864 | -158.3355 | -157.858 |
get_head('tblAMT13_Chisholm') %>% kable()
time | lat | lon | depth | Pro_abundance | Pro_total_ecotype_abundance | Pro_eMED4_abundance | Pro_eMIT9312_abundance | Pro_eMIT9211_abundance | Pro_eNATL2A_abundance | Pro_eSS120_abundance | Pro_eMIT9313_abundance | Syn_abundance | density_AMT13 | temp_C_AMT13 | csal_ppt_AMT13 | Light_Quanta_m2_sec_AMT13 | chlA_AMT13 | DCM_AMT13 | MLD_05C_AMT13 | MLD_25C_AMT13 | MLD_03Density_AMT13 | MLD_125Density_AMT13 | TotalDepth_AMT13 | Pro_eMED4_abundance_quality | Pro_eMIT9312_abundance_quality | Pro_eMIT9211_abundance_quality | Pro_eNATL2A_abundance_quality | Pro_eSS120_abundance_quality | Pro_eMIT9313_abundance_quality | Pro_total_ecotype_abundance_quality |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
2003-09-16 09:55:00 | 47.983 | -11.539 | 9.280 | 0.9206000 | 96901.054 | 95878.590 | 96.139927 | 1.000000 | 923.3245 | 1.0000000 | 1.0000000 | 0.2825333 | NA | 17.8958 | 35.6425 | 45.652811 | 0.0761 | NA | NA | NA | NA | NA | NA | NA | NA | 0 | NA | 0 | 0 | 0 |
2003-09-16 09:55:00 | 47.983 | -11.539 | 43.017 | 0.2690000 | 21001.014 | 19852.153 | 24.226982 | 1.000000 | 1121.9211 | 0.7138027 | 1.0000000 | 0.4385333 | NA | 16.3019 | 35.6470 | 9.754238 | 0.1072 | NA | NA | NA | NA | NA | NA | NA | NA | 0 | NA | NA | 0 | 0 |
2003-09-16 09:55:00 | 47.983 | -11.539 | 53.629 | 0.1236000 | 4502.291 | 3847.637 | 6.132032 | 1.000000 | 645.5221 | 1.0000000 | 1.0000000 | 0.2341333 | NA | 15.2485 | 35.5911 | 4.237555 | 0.5024 | NA | NA | NA | NA | NA | NA | NA | NA | 0 | NA | 0 | 0 | 0 |
2003-09-16 09:55:00 | 47.983 | -11.539 | 62.327 | 0.0954667 | 2468.064 | 2120.974 | 4.158969 | 3.010477 | 338.6858 | 0.2588225 | 0.9763928 | 0.1315333 | NA | 14.6576 | 35.5787 | 2.620675 | 0.2852 | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA |
2003-09-16 09:55:00 | 47.983 | -11.539 | 78.587 | 0.0576667 | 2315.213 | 1420.991 | 2.133862 | 1.000000 | 889.0881 | 1.0000000 | 1.0000000 | 0.0554000 | NA | 13.8446 | 35.5778 | 1.134332 | 0.1979 | NA | NA | NA | NA | NA | NA | NA | NA | 0 | NA | 0 | 0 | 0 |
get_columns('tblAMT13_Chisholm') %>% head(10) %>% kable()
Columns |
---|
time |
lat |
lon |
depth |
Pro_abundance |
Pro_total_ecotype_abundance |
Pro_eMED4_abundance |
Pro_eMIT9312_abundance |
Pro_eMIT9211_abundance |
Pro_eNATL2A_abundance |
As you can see, each table contains many data "rows", which are indexed by "keys" of the four values:
(Time, Latitude, Longitude, Depth)
Most functions in this package will involve using these data key values, or user-specified ranges of these keys, for certain operations. The most basic operation, explained next, is to download a dataset from a time and space range of interest.
2.3 Download data
When retrieving all data from CMAP in a "rectangle" of space and time, specifies the following:
- Name of table (
table
) - Names of variables in that table (
variable
) - Upper and lower limits of the variables (
dt1
,dt2
,lon1
,lon2
,lat1
,lat2
,depth1
,depth2
)
These are used in the space_time()
function, which downloads the data as a data frame.
get_spacetime(tableName = 'tblArgoMerge_REP',
varName = 'argo_merge_salinity_adj',
dt1='2015-05-01',
dt2='2015-05-30',
lat1=28,
lat2=38,
lon1=-71,
lon2=-50,
depth1=0,
depth2=100) %>% head(10) %>% kable()
time | lat | lon | depth | argo_merge_salinity_adj |
---|---|---|---|---|
2015-05-02 | 32.348 | -63.45 | 4.3 | 36.71301 |
2015-05-02 | 32.348 | -63.45 | 6.0 | 36.71301 |
2015-05-02 | 32.348 | -63.45 | 7.6 | 36.71099 |
2015-05-02 | 32.348 | -63.45 | 8.0 | 36.71101 |
2015-05-02 | 32.348 | -63.45 | 10.0 | 36.70300 |
2015-05-02 | 32.348 | -63.45 | 11.7 | 36.69700 |
2015-05-02 | 32.348 | -63.45 | 12.0 | 36.69701 |
2015-05-02 | 32.348 | -63.45 | 14.0 | 36.69500 |
2015-05-02 | 32.348 | -63.45 | 16.0 | 36.69401 |
2015-05-02 | 32.348 | -63.45 | 16.7 | 36.69402 |
2.4 Colocalize data
First of all, what does it mean to colocalize? Imagine a ship moving along a trajectory, whose location in space and time is recorded in latitude/longitude/time. Then, we'd like to obtain data like sea surface temperature or salinity, as if we were recording it along the ship's trajectory. Colocalization simply means retrieving such data manually from the database, by querying from the vicinity (a rectangle) of each lat/lon/time triplet of interest, and summarizing it (e.g. taking the average).
The size of the vicinity is define by the user-specified "slack" in longitude, latitude and time. For instance, one could ask for 1 degrees of slack in longitude and latitude, and 6 hours of slack in time.
The main function is along_track
, which colocalizes data for one or multiple tables and variables at a time. The available cruises can be found by running the function cruises()
. The available data tables and variable names are listed in the catalog (get_catalog()
, or from https://cmap.readthedocs.io/en/latest/catalog/catalog.html). Here is a short example:
## cruise = 'Diel'
cruise = 'MGL1704'
targetTables = c('tblSeaFlow', 'tblPisces_NRT')
targetVars = c('abundance_synecho', 'NO3')
## targetVars = c('synecho_abundance', 'NO3')
depth1 = 0
depth2 = 5
temporalTolerance = c(0, 10)
latTolerance = c(0, 5)##0.25)
lonTolerance = c(0, 5)##0.25)
depthTolerance = c(5, 5)
dat = along_track(cruise[1],
targetTables[1],
targetVars[1],
depth1[1],
depth2[1],
temporalTolerance[1],
latTolerance[1],
lonTolerance[1],
depthTolerance[1])
##
|
| | 0%
head(dat) %>% kable()
time | lat | lon | abundance_synecho | abundance_synecho_std |
---|---|---|---|---|
2017-05-26 22:34:53 | 21.3164 | -157.8767 | NA | NA |
2017-05-26 22:37:53 | 21.3164 | -157.8767 | NA | NA |
2017-05-26 22:40:53 | 21.3164 | -157.8767 | NA | NA |
2017-05-26 22:43:54 | 21.3164 | -157.8767 | NA | NA |
2017-05-26 22:46:54 | 21.3164 | -157.8767 | NA | NA |
2017-05-26 22:49:54 | 21.3164 | -157.8767 | NA | NA |
2.5 Issue SQL query
(For advanced users) Simons CMAP datasets are hosted in a SQL database and the CMAP4R R package provides the user with a number of pre-developed methods to extract and retrieve subsets of the data. The rest of this documentation is dedicated to explore and explain these methods. In addition to the pre-developed functions described above, you may issue custom SQL query statement and returns the results in form of a data frame.
dat = exec_manualquery("SELECT [time], lat, lon, depth, Fe FROM tblPisces_NRT
WHERE
[time] BETWEEN '2017-06-03' AND '2017-06-03' AND
lat BETWEEN 10 AND 55 AND
lon BETWEEN -180 AND 100 AND
depth BETWEEN 0 AND 0.5
ORDER BY [time], lat, lon, depth")
dat %>% head(10) %>% kable()
time | lat | lon | depth | Fe |
---|---|---|---|---|
2017-06-03 | 10 | -180.0 | 0.494025 | 1.5e-05 |
2017-06-03 | 10 | -179.5 | 0.494025 | 1.5e-05 |
2017-06-03 | 10 | -179.0 | 0.494025 | 1.5e-05 |
2017-06-03 | 10 | -178.5 | 0.494025 | 1.5e-05 |
2017-06-03 | 10 | -178.0 | 0.494025 | 1.5e-05 |
2017-06-03 | 10 | -177.5 | 0.494025 | 1.5e-05 |
2017-06-03 | 10 | -177.0 | 0.494025 | 1.5e-05 |
2017-06-03 | 10 | -176.5 | 0.494025 | 1.5e-05 |
2017-06-03 | 10 | -176.0 | 0.494025 | 1.5e-05 |
2017-06-03 | 10 | -175.5 | 0.494025 | 1.5e-05 |