---
title: "Introduction"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{introduction}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  fig.height=5, fig.width=8, 
  message=FALSE, warning=FALSE,
  collapse = TRUE,
  comment = "#>"
)
```


The **USgas** package provides an overview of demand for natural gas in the US in a time-series format. That includes the following dataset:
* `usgas` - The monthly consumption of natural gas in the US/state level by end-use since 1973 for US level and 1989 for state level. It includes the following end-use categories:
   - Commercial Consumption
   - Delivered to Consumers
   - Electric Power Consumption
   - Industrial Consumption
   - Lease and Plant Fuel Consumption
   - Pipeline Fuel Consumption
   - Residential Consumption
   - Vehicle Fuel Consumption

The package also includes the following datasets, from previous release:

* `us_total` - The US annual natural gas consumption by state-level between 1997 and 2019, and aggregate level between 1949 and 2019
* `us_monthly` - The monthly demand for natural gas in the US between 2001 and 2020
* `us_residential` - The US monthly natural gas residential consumption by state and aggregate level between 1989 and 2020

The `us_total`, `us_monthly`, and `us_residential` can be derived out of the `usgas` dataset. Therefore, those datasets in the process of deprication and will be removed in the next release to CRAN.

Data source: The US Energy Information Administration [API](https://www.eia.gov/)

## The usgas dataset

The `usgas` dataset provides a 313 time series focusing on the consumption of natural gas by end use in the US (aggregated and state level). It includes the following fields:

- `date` - A Date, the month and year of the observation (the day set by default to 1st of the month)}
- `process` - The process type description
- `state` -  The US state name
- `state_abb` - The US state abbreviation
- `y` - A numeric, the monthly natural gas residential consumption in a million cubic feet

```{r}
library(USgas)

data("usgas")

str(usgas)

head(usgas)
```

The dataset includes the state level and the US aggregate level (labeled under `U.S.`):

```{r}
unique(usgas$state)
```

In the example below, we will subset and plot the consumption by end-use in the US:

```{r }
us_agg <- usgas[which(usgas$state == "U.S."),]

head(us_agg)
```

Let's now use **plotly** to plot those series:

```{r}
library(plotly)

plot_ly(data = us_agg,
        x = ~ date,
        y = ~ y,
        color = ~ process,
        type = "scatter",
        mode = "line") |> 
  layout(title = "US Monthly Consumption of Natural Gas by End Use",
         yaxis = list(title = "MMCF"),
         xaxis = list(title = "Source: EIA Website"),
         legend = list(x = 0,
                       y = 0.95))
```

Similarly, we can subset a couple of states and visualize them. For example, let’s visualize the residential consumption in the New England states. We will start by subsetting the corresponding states in New England, transform the data.frame to wide format, and reorder by date:

```{r}
ne <- c("Connecticut", "Maine", "Massachusetts",
        "New Hampshire", "Rhode Island", "Vermont")
ne_gas <-  usgas[which(usgas$state %in% ne),]

head(ne_gas)

```


Next, let's use the `process` column to extract the residential consumption and plot it:

```{r}
ne_gas[which(ne_gas$process == "Residential Consumption"),] |>
  plot_ly(x = ~ date,
          y = ~ y,
          color = ~ state,
          type = "scatter",
          mode = "line") |> 
  layout(title = "Monthly Residential Consumption of Natural Gas in New England",
         yaxis = list(title = "MMCF"),
         xaxis = list(title = "Source: EIA Website"),
         legend = list(x = 0,
                       y = 1))
```