Make sure the package is installed. See installation instructions here.

Basic Query

The PSID (public water system ID number) of a water system is the only mandatory argument to get_data(). Use the PSID of a system to get all the water quality data for that system.

For example, Alameda County Water District is PSID 0110001:

alameda_all <- get_data(psid = "0110001")
glimpse(alameda_all)
## Rows: 22,107
## Columns: 55
## $ PWSID                                             <chr> "110001", "110001",…
## $ SAMP_DATE                                         <dttm> 2015-07-20, 2015-0…
## $ SAMP_TIME                                         <int> 910, 910, 910, 910,…
## $ LAB_NUM                                           <int> 5401, 5401, 5401, 5…
## $ ANADATE                                           <chr> "7/28/2015 0:00:00"…
## $ INDATE                                            <chr> "8/12/2015 0:00:00"…
## $ METHOD                                            <chr> "", "", "", "", "",…
## $ INBY                                              <chr> "U", "U", "U", "U",…
## $ SPECIAL                                           <chr> "S", "S", "S", "S",…
## $ STORE_NUM                                         <chr> "00081", "00086", "…
## $ XMOD                                              <chr> "<", "<", "", "", "…
## $ FINDING                                           <dbl> 5.00, 1.00, 864.00,…
## $ CHEMICAL                                          <chr> "COLOR", "ODOR THRE…
## $ AKA1                                              <chr> "COLOR, APPARENT (U…
## $ AKA2                                              <chr> NA, NA, NA, NA, NA,…
## $ CLS                                               <chr> "T", "T", "T", "T",…
## $ RPT_CDE                                           <chr> "GP", "GP", "GP", "…
## $ RPT_UNIT                                          <chr> "UNITS", "TON", "US…
## $ MCL                                               <dbl> 15, 3, 1600, 0, 0, …
## $ NL                                                <dbl> 0, 0, 0, 0, 0, 0, 0…
## $ TRIGGER_AM                                        <dbl> 15.0, 3.0, 900.0, 0…
## $ DLR                                               <dbl> 0.0, 1.0, 0.0, 0.0,…
## $ PHG                                               <int> 0, 0, 0, 0, 0, 0, 0…
## $ RPHL                                              <int> 0, 0, 0, 0, 0, 0, 0…
## $ CHEM_SORT                                         <chr> "COLOR", "ODOR THRE…
## $ `Water System Name`                               <chr> "ALAMEDA COUNTY WAT…
## $ `Principal County Served`                         <chr> "ALAMEDA", "ALAMEDA…
## $ `Federal Water System Type -CODE`                 <chr> "C", "C", "C", "C",…
## $ `Federal Water System Type`                       <chr> "Community", "Commu…
## $ `State Water System Type -CODE`                   <chr> "C", "C", "C", "C",…
## $ `State Water System Type`                         <chr> "Community", "Commu…
## $ `Water System Status -CODE`                       <chr> "A", "A", "A", "A",…
## $ `Water System Status`                             <chr> "ACTIVE", "ACTIVE",…
## $ `Owner Type -CODE`                                <chr> "L", "L", "L", "L",…
## $ `Owner Type`                                      <chr> "Local", "Local", "…
## $ `Primary Water Source Type -CODE`                 <chr> "SW", "SW", "SW", "…
## $ `Primary Water Source Type`                       <chr> "Surface Water", "S…
## $ `Residential Population`                          <int> 340000, 340000, 340…
## $ `Non Transient Population`                        <lgl> NA, NA, NA, NA, NA,…
## $ `Transient Population`                            <lgl> NA, NA, NA, NA, NA,…
## $ `Total Population`                                <int> 340000, 340000, 340…
## $ `Number of Service Connections Agricultural`      <lgl> NA, NA, NA, NA, NA,…
## $ `Number of COMBINED Service Connections (CB)`     <int> 80871, 80871, 80871…
## $ `Number of Commercial (CM) Service Connections`   <lgl> NA, NA, NA, NA, NA,…
## $ `Numer of Institutional Service Conections`       <lgl> NA, NA, NA, NA, NA,…
## $ `Number of Residential Service Connections`       <lgl> NA, NA, NA, NA, NA,…
## $ `Total Number of Service Connections`             <int> 80871, 80871, 80871…
## $ `Fee Code`                                        <chr> "C1", "C1", "C1", "…
## $ `Fee Code Description`                            <chr> "Large Water System…
## $ `Date of Sanitary Survey visit (SNSV Visit Date)` <chr> "11/30/2017", "11/3…
## $ CITY                                              <chr> "FREMONT", "FREMONT…
## $ `Treatment Plant Class-CODE`                      <chr> "T5", "T5", "T5", "…
## $ `Treatment Plant Class`                           <chr> "Treatment Operator…
## $ `Distribution System Class-CODE`                  <chr> "D5", "D5", "D5", "…
## $ `Distribution System Class`                       <chr> "Distribution Opera…

Helper Functions

sdwisard includes three helper functions:

Supply a county to get_water_system() to return PSIDs and water system names within that county.

get_water_system(county = "Alameda")
## # A tibble: 24 x 3
##    psid    water_system_name                    county 
##    <chr>   <chr>                                <chr>  
##  1 0103040 NORRIS CANYON PROPERTY OWNERS ASSN   ALAMEDA
##  2 0103041 TRAILER HAVEN MOBILE HOME PARK       ALAMEDA
##  3 0105002 RIVERS END MARINA                    ALAMEDA
##  4 0105003 CEMEX/ELIOT PLANT                    ALAMEDA
##  5 0105008 CASTLEWOOD DOMESTIC WATER SYSTEM     ALAMEDA
##  6 0105009 MOUNTAIN HOUSE SCHOOL                ALAMEDA
##  7 0105010 EBRPD - DEL VALLE REGIONAL PARK      ALAMEDA
##  8 0105012 EBRPD - SUNOL REGIONAL WILDERNESS    ALAMEDA
##  9 0105013 EBRPD - REDWOOD SPRING REGIONAL PARK ALAMEDA
## 10 0105016 MOUNTAIN HOUSE BAR                   ALAMEDA
## # … with 14 more rows

Supply a PSID to get_analyte_summary() to view a summary of a water system’s available analyte data.

get_analyte_summary(psid = "0110001") 
## # A tibble: 209 x 6
##    psid    storet analyte                     start_date end_date       n
##    <chr>   <chr>  <chr>                       <date>     <date>     <int>
##  1 0110001 00010  SOURCE TEMPERATURE C        2019-02-21 2016-07-27     4
##  2 0110001 00081  COLOR                       2018-01-16 2018-07-09   211
##  3 0110001 00086  ODOR THRESHOLD @ 60 C       2018-01-16 2018-07-09   210
##  4 0110001 00095  SPECIFIC CONDUCTANCE        2018-01-16 2019-08-08   218
##  5 0110001 00400  PH, FIELD                   2018-01-16 2018-07-09   214
##  6 0110001 00403  PH, LABORATORY              2019-11-05 2019-08-08     8
##  7 0110001 00410  ALKALINITY (TOTAL) AS CACO3 2018-01-16 2019-08-08   222
##  8 0110001 00440  BICARBONATE ALKALINITY      2018-01-16 2019-08-08   222
##  9 0110001 00445  CARBONATE ALKALINITY        2018-01-16 2019-08-08   222
## 10 0110001 00612  AMMONIA (NH3-N)             2020-05-11 2020-05-11     1
## # … with 199 more rows

Use the function get_storet_id to search for storet IDs by common names of an analytes.

nitrate_ids <- get_storet_id("nitrate")
nitrate_ids
## # A tibble: 3 x 2
##   storet analyte                 
##   <chr>  <chr>                   
## 1 00618  NITRATE (AS N)          
## 2 71850  NITRATE (AS NO3)        
## 3 A-029  NITRATE + NITRITE (AS N)

Provide the storet ID to query for a specific analyte. The following query return all nitrate data (storet ID = A-029) for Alameda county Water District.

alameda_nitrate <- get_data(psid = "0110001", storet = "A-029")

We can plot these results with ggplot as follows:

alameda_nitrate %>% 
  mutate(date = as.Date(SAMP_DATE)) %>% 
  ggplot(aes(date, FINDING)) +
  geom_point() +
  geom_smooth(se = FALSE) +
  labs(title = "NITRATE + NITRITE (AS N) at Alameda County Water District",
       y = "(MG/L)") +
  theme_minimal()

To query a specific date range provide a start_date and/or end_date argument(s) in YYYY-MM-DD format.

alameda_09_2018 <- get_data(psid = "0110001", start_date = "2018-09-01", end_date = "2018-09-30")

Built-in package objects

Full tables of metadata are also available:

water_systems
## # A tibble: 7,443 x 3
##    psid    water_system_name                    county 
##    <chr>   <chr>                                <chr>  
##  1 0103040 NORRIS CANYON PROPERTY OWNERS ASSN   ALAMEDA
##  2 0103041 TRAILER HAVEN MOBILE HOME PARK       ALAMEDA
##  3 0105002 RIVERS END MARINA                    ALAMEDA
##  4 0105003 CEMEX/ELIOT PLANT                    ALAMEDA
##  5 0105008 CASTLEWOOD DOMESTIC WATER SYSTEM     ALAMEDA
##  6 0105009 MOUNTAIN HOUSE SCHOOL                ALAMEDA
##  7 0105010 EBRPD - DEL VALLE REGIONAL PARK      ALAMEDA
##  8 0105012 EBRPD - SUNOL REGIONAL WILDERNESS    ALAMEDA
##  9 0105013 EBRPD - REDWOOD SPRING REGIONAL PARK ALAMEDA
## 10 0105016 MOUNTAIN HOUSE BAR                   ALAMEDA
## # … with 7,433 more rows
psid_analyte
## # A tibble: 562,977 x 6
##    psid    storet analyte                     start_date end_date       n
##    <chr>   <chr>  <chr>                       <date>     <date>     <int>
##  1 0103039 00081  COLOR                       2017-11-22 2017-11-22     1
##  2 0103039 00086  ODOR THRESHOLD @ 60 C       2017-11-22 2017-11-22     1
##  3 0103039 00095  SPECIFIC CONDUCTANCE        2017-11-22 2017-11-22     1
##  4 0103039 00403  PH, LABORATORY              2017-11-22 2017-11-22     1
##  5 0103039 00618  NITRATE (AS N)              2017-11-22 2017-11-22     1
##  6 0103040 00081  COLOR                       2020-05-19 2020-05-19     1
##  7 0103040 00086  ODOR THRESHOLD @ 60 C       2020-05-19 2020-05-19     1
##  8 0103040 00095  SPECIFIC CONDUCTANCE        2020-05-19 2020-05-19     1
##  9 0103040 00403  PH, LABORATORY              2020-05-19 2020-05-19     1
## 10 0103040 00410  ALKALINITY (TOTAL) AS CACO3 2020-05-19 2020-05-19     1
## # … with 562,967 more rows
analytes
## # A tibble: 441 x 2
##    storet analyte                    
##    <chr>  <chr>                      
##  1 00081  COLOR                      
##  2 00086  ODOR THRESHOLD @ 60 C      
##  3 00095  SPECIFIC CONDUCTANCE       
##  4 00403  PH, LABORATORY             
##  5 00618  NITRATE (AS N)             
##  6 71850  NITRATE (AS NO3)           
##  7 01032  CHROMIUM, HEXAVALENT       
##  8 77288  DICHLOROACETIC ACID (DCAA) 
##  9 82721  DIBROMOACETIC ACID (DBAA)  
## 10 82723  TRICHLOROACETIC ACID (TCAA)
## # … with 431 more rows