ACT Basics#
Overview#
Welcome to the ARM/ASR Open Science Workshop Tutorial on the Atmospheric data Community Toolkit (ACT). In this tutorial, you will learn some of the basic features of ACT, with a focus on how you can use it to make better use of ARM’s data quality information. This includes the embedded quality control (QC) information found in many ARM NetCDF files, as well as Data Quality Reports (DQRs), which can be accessed through a webservice.
ACT is built around the xarray data object, which you can learn more about in the xarray tutorial on Friday at 3 pm Eastern or through information from Project Pythia.
Installation#
If you don’t have ACT installed already, you can install it with either pip or conda using one of the commands below. Additional information on installation can be found in the ACT User Guide.
pip install act-atmos
conda install -c conda-forge act-atmos
Some features of ACT are only available if you have certain optional dependencies installed. For example, Skew-T plots of radiosonde data require MetPy. Additional optional dependencies are listed in ACT’s documentation.
If you already have ACT installed, it is important to make sure you are running the latest version. On the ADC JupyterHub, you can bring up a terminal and enter
pip install act-atmos --user --upgrade
After that, in the JupyterHub, select ‘Kernel’ from the menu and then ‘Restart’.
Imports#
First, we are going to import the Python libraries that we need, which are just act and matplotlib.
import act
import matplotlib.pyplot as plt
# Check the ACT version to ensure it's 1.1.6
act.__version__
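If you would rather test the version in code than read it by eye, a minimal sketch is shown below. It assumes that version 1.1.6 or anything newer is acceptable, and the helper names version_tuple and is_recent_enough are made up for this example; they are not part of ACT.

```python
import re

# Assumption: the tutorial targets ACT 1.1.6, and newer releases are also fine
MINIMUM_VERSION = (1, 1, 6)


def version_tuple(version):
    """Turn a version string such as '1.1.6' into a tuple of ints."""
    parts = []
    for piece in version.split('.'):
        match = re.match(r'\d+', piece)
        if match is None:
            # Stop at the first non-numeric piece, e.g. 'dev3'
            break
        parts.append(int(match.group()))
    return tuple(parts)


def is_recent_enough(version, minimum=MINIMUM_VERSION):
    """Return True if the version string is at or above the minimum."""
    return version_tuple(version) >= minimum


print(is_recent_enough('1.1.6'))  # True
print(is_recent_enough('1.0.9'))  # False
```

In a notebook you could pass act.__version__ to is_recent_enough instead of the literal strings used here.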
Download ARM Data#
Next we are going to download the data we are going to use for this session using the ARM Live Data webservice.
Since you have an ARM user account, you should be able to use this webservice as well. All you need to do is log in to get your token. In case you are not able to get that information, we have a username and token set up just for training sessions.
# Set your own username and token if you have it
username = 'YourUserName'
token = 'YourToken'
# ACT module for downloading data from the ARM web service
results = act.discovery.download_arm_data(username, token, 'sgpmfrsr7nchE11.b1', '2021-03-29', '2021-03-29')
print(results)
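The datastream name passed to the download call packs several pieces of information into one string. The sketch below splits a name into its parts, assuming the usual ARM layout of a three-letter site, an instrument class, a facility designator, and a data level. The pattern and the function parse_datastream are illustrative only and are not an ACT utility.

```python
import re

# Assumed layout: <site><instrument class><facility>.<data level>
DATASTREAM_PATTERN = re.compile(
    r'^(?P<site>[a-z]{3})'
    r'(?P<instrument>[a-z0-9]+)'
    r'(?P<facility>[A-Z]\d+)'
    r'\.(?P<level>[a-z0-9]{2})$'
)


def parse_datastream(name):
    """Split a datastream name into site, instrument, facility, and level."""
    match = DATASTREAM_PATTERN.match(name)
    if match is None:
        raise ValueError(f'Unrecognized datastream name: {name!r}')
    return match.groupdict()


print(parse_datastream('sgpmfrsr7nchE11.b1'))
# {'site': 'sgp', 'instrument': 'mfrsr7nch', 'facility': 'E11', 'level': 'b1'}
```

Names that do not follow this assumed layout raise a ValueError rather than returning a partial result.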
Reading in a NetCDF File#
Congratulations, you just downloaded a file using nothing but a few lines of code! Next up is to read the file into an xarray object using the ACT reader. We can then use Jupyter to print out an interactive listing of everything in the object.
obj = act.io.arm.read_arm_netcdf(results)
obj
Clean up the object to CF Standards#
In order to utilize all the ACT QC modules, we need to clean up the object to follow Climate and Forecast (CF) standards.
obj.clean.cleanup()
obj
First Visualization#
Let’s plot up some data to see what we’re working with. For this example, we’ll use diffuse_hemisp_narrowband_filter4.
variable = 'diffuse_hemisp_narrowband_filter4'
# Create a plotting display object with 2 plots
display = act.plotting.TimeSeriesDisplay(obj, figsize=(15, 10), subplot_shape=(2,))
# Plot up the diffuse variable in the first plot
display.plot(variable, subplot_index=(0,))
# Plot up a day/night background
display.day_night_background(subplot_index=(0,))
# Plot up the QC variable in the second plot
display.qc_flag_block_plot(variable, subplot_index=(1,))
plt.show()
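The QC panel in the plot above is drawn from a bit-packed QC variable, where each test is assumed to occupy one bit, so test 1 has the value 1, test 2 the value 2, test 3 the value 4, and so on. As a rough sketch of how a single QC integer can be unpacked into test numbers (the function tripped_tests is made up for this example and is not an ACT call):

```python
def tripped_tests(qc_value):
    """Return the 1-based test numbers whose bits are set in a QC value.

    Assumes qc_value is a non-negative integer.
    """
    tests = []
    test_number = 1
    while qc_value:
        if qc_value & 1:
            tests.append(test_number)
        qc_value >>= 1
        test_number += 1
    return tests


for value in (0, 1, 2, 6, 9):
    print(value, tripped_tests(value))
# 0 []
# 1 [1]
# 2 [2]
# 6 [2, 3]
# 9 [1, 4]
```

A value of 0 means no test was tripped, which is why clean data shows up with no flags in the QC panel.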
Filter Data#
Let’s try and filter some of those outliers out based on the embedded QC in the files.
# Now let's remove some of these outliers
obj.qcfilter.datafilter(variable, rm_tests=[2, 3], del_qc_var=False)
# And plot the data again
# Create a plotting display object with 2 plots
display = act.plotting.TimeSeriesDisplay(obj, figsize=(15, 10), subplot_shape=(2,))
# Plot up the diffuse variable in the first plot
display.plot(variable, subplot_index=(0,))
# Plot up a day/night background
display.day_night_background(subplot_index=(0,))
# Plot up the QC variable in the second plot
display.qc_flag_block_plot(variable, subplot_index=(1,))
plt.show()
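Conceptually, filtering on rm_tests=[2, 3] means "drop every sample where test 2 or test 3 was tripped". The sketch below shows that idea on plain Python lists with made-up values. It assumes bit-packed QC values and replaces flagged samples with NaN; mask_failed is a hypothetical helper and is not how ACT implements datafilter internally.

```python
import math


def mask_failed(values, qc_values, rm_tests):
    """Replace values with NaN wherever any of the listed tests was tripped."""
    mask = 0
    for test in rm_tests:
        # Test n is assumed to live in bit n-1 of the QC value
        mask |= 1 << (test - 1)
    return [math.nan if qc & mask else value
            for value, qc in zip(values, qc_values)]


# Made-up data and QC values for illustration only
values = [0.10, 0.95, 0.12, 1.30]
qc_values = [0, 2, 1, 4]

print(mask_failed(values, qc_values, rm_tests=[2, 3]))
# [0.1, nan, 0.12, nan]
```

Note that the third sample is kept even though test 1 was tripped, because test 1 is not in the list of tests to remove.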
Query the DQR Webservice#
Since the embedded QC is not removing all the outliers, let’s check whether there are any Data Quality Reports (DQRs) using ARM’s DQR webservice. The great thing is that ACT has code for working with this webservice.
In this example, we can see that there’s a DQR for a shadowband misalignment, and we can find out more by looking at the actual DQR.
# Query the ARM DQR Webservice
obj = act.qc.arm.add_dqr_to_qc(obj, variable=variable)
#And plot again!
# Create a plotting display object with 2 plots
display = act.plotting.TimeSeriesDisplay(obj, figsize=(15, 10), subplot_shape=(2,))
# Plot up the diffuse variable in the first plot
display.plot(variable, subplot_index=(0,))
# Plot up a day/night background
display.day_night_background(subplot_index=(0,))
# Plot up the QC variable in the second plot
display.qc_flag_block_plot(variable, subplot_index=(1,))
plt.show()
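A DQR covers a time period, so adding it to the QC amounts to flagging every sample that falls inside that period. The sketch below illustrates the idea with a hypothetical DQR record and made-up hourly samples. The test number 4, the dictionary layout, and the function flag_dqr_period are all assumptions for this example and do not reflect the webservice response or ACT's internals.

```python
from datetime import datetime, timedelta


def flag_dqr_period(times, qc_values, start, end, test_number):
    """Set the bit for test_number on every sample inside [start, end]."""
    bit = 1 << (test_number - 1)
    return [(qc | bit) if start <= t <= end else qc
            for t, qc in zip(times, qc_values)]


# Hypothetical hourly samples and a hypothetical DQR period
times = [datetime(2021, 3, 29, 12) + timedelta(hours=h) for h in range(6)]
qc_values = [0] * len(times)
dqr = {'start': datetime(2021, 3, 29, 14), 'end': datetime(2021, 3, 29, 16)}

flagged = flag_dqr_period(times, qc_values, dqr['start'], dqr['end'], test_number=4)
print(flagged)  # [0, 0, 8, 8, 8, 0]
```

Because the new flag is combined with a bitwise OR, any flags already set by the embedded QC are preserved.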
Add QC Tests#
ACT has a number of additional QC tests that can be applied to the data. For this next example, let’s apply a new maximum test and bring that upper limit down a bit. We are also going to filter the data based on this new test and plot up the results.
# Add a new maximum test
obj.qcfilter.add_greater_test(variable, 0.4, test_meaning='New maximum tests limit')
# Filter that test out
obj.qcfilter.datafilter(variable, rm_tests=[5], del_qc_var=False)
#And plot again!
# Create a plotting display object with 2 plots
display = act.plotting.TimeSeriesDisplay(obj, figsize=(15, 10), subplot_shape=(2,))
# Plot up the diffuse variable in the first plot
display.plot(variable, subplot_index=(0,))
# Plot up a day/night background
display.day_night_background(subplot_index=(0,))
# Plot up the QC variable in the second plot
display.qc_flag_block_plot(variable, subplot_index=(1,))
plt.show()
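To make the idea of a maximum test concrete, here is a small sketch that flags every value above a limit by setting a new bit in a bit-packed QC value. The data are made up, the choice of test number 5 simply mirrors the rm_tests=[5] call above, and flag_greater_than is a hypothetical stand-in rather than ACT's own implementation.

```python
def flag_greater_than(values, qc_values, limit, test_number):
    """Set the bit for test_number wherever the value exceeds the limit."""
    bit = 1 << (test_number - 1)
    return [(qc | bit) if value > limit else qc
            for value, qc in zip(values, qc_values)]


# Made-up data; two samples already carry flags from earlier tests
values = [0.10, 0.45, 0.39, 0.80]
qc_values = [0, 0, 2, 8]

print(flag_greater_than(values, qc_values, limit=0.4, test_number=5))
# [0, 16, 2, 24]
```

The last sample ends up with the value 24 because it keeps its earlier flag (8) and gains the new one (16).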
Instrument Specific QC Tests#
ACT has a growing library of instrument-specific tests, such as the fast Fourier transform (FFT) test to detect shading, which was adapted from Alexandrov et al. (2007). The adaptation is that the test is applied using a moving-window approach. Note: check out the webpage for an example of how we include references to the papers behind the code.
Let’s apply it and see how it compares with the DQR!
# Apply test
obj = act.qc.fft_shading_test(obj, variable=variable)
# Create a plotting display object with 2 plots
display = act.plotting.TimeSeriesDisplay(obj, figsize=(15, 10), subplot_shape=(2,))
# Plot up the diffuse variable in the first plot
display.plot(variable, subplot_index=(0,))
# Plot up a day/night background
display.day_night_background(subplot_index=(0,))
# Plot up the QC variable in the second plot
display.qc_flag_block_plot(variable, subplot_index=(1,))
plt.show()
Conclusion#
In this tutorial, we have shown you how to download data from the ARM Live Data webservice, visualize the data and QC information, query the DQR webservice, filter data based on the QC, and add new QC tests to the dataset. After all this work, you can easily save the xarray object to a NetCDF file using obj.to_netcdf('filename.nc'), and all that data will be saved and usable in Python and ACT.
Please check out the ACT GitHub repository for the latest and greatest information, including our documentation, which has examples that can be downloaded in Python or Jupyter Notebook formats.
Second ACT!#
But wait, there’s more to ACT that we can explore together or that you can try on your own! These examples are more condensed than the ones above, but they should still give you the insight you need to run them and build on them yourself.
We are going to need some additional libraries to help out though!
Imports#
import numpy as np
Wind Roses#
# Let's download a month of surface meteorological data from the SGP central facility!
results = act.discovery.download_arm_data(username, token, 'sgpmetE13.b1', '2021-05-01', '2021-05-31')
# Read that data into an object (this will concatenate it all for you)
obj = act.io.arm.read_arm_netcdf(results)
# Now we can plot up a wind rose of that entire month's worth of data
windrose = act.plotting.WindRoseDisplay(obj, figsize=(10,8))
windrose.plot('wdir_vec_mean', 'wspd_vec_mean', spd_bins=np.linspace(0, 10, 5))
windrose.axes[0].legend()
plt.show()
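A wind rose is essentially a two-dimensional histogram of wind direction and wind speed. The sketch below shows that counting step on a handful of made-up observations, using 16 compass sectors and the same speed edges that np.linspace(0, 10, 5) produces (0, 2.5, 5, 7.5, and 10). The sector count and the helper functions are assumptions for this example, not a description of how WindRoseDisplay bins data.

```python
import bisect
from collections import Counter

SECTORS = ['N', 'NNE', 'NE', 'ENE', 'E', 'ESE', 'SE', 'SSE',
           'S', 'SSW', 'SW', 'WSW', 'W', 'WNW', 'NW', 'NNW']

# Same edges as np.linspace(0, 10, 5): 0, 2.5, 5, 7.5, 10
SPEED_EDGES = [i * 10 / 4 for i in range(5)]


def sector(direction_deg):
    """Map a wind direction in degrees to one of 16 compass sectors."""
    width = 360.0 / len(SECTORS)
    index = int(((direction_deg % 360.0) + width / 2) // width) % len(SECTORS)
    return SECTORS[index]


def speed_bin(speed):
    """Return the index of the speed bin whose lower edge is at or below speed."""
    return max(bisect.bisect_right(SPEED_EDGES, speed) - 1, 0)


# Made-up (direction in degrees, speed in m/s) observations
observations = [(350, 3.0), (10, 4.0), (90, 1.0), (200, 8.0), (185, 6.0)]
counts = Counter((sector(d), speed_bin(s)) for d, s in observations)
print(counts[('N', 1)])  # 2
```

Directions of 350 and 10 degrees both land in the northern sector because each sector is centered on its compass direction.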
Present Weather Detector Codes#
With the MET system at the main site, there’s also a present weather detector (PWD) deployed. The PWD reports the present weather as WMO codes, which can be easily decoded using a utility in ACT. With this information, you can make fancy plots like the DQ Office plots for the PWD.
# Let's just use one of the files from the previous example
obj = act.io.arm.read_arm_netcdf(results[21])
# Pass it to the function to decode it along with the variable name
obj = act.utils.inst_utils.decode_present_weather(obj, variable='pwd_pw_code_inst')
# We're going to print out the first 10 decoded values that weren't 0
# This shows the utility of also being able to use the built-in xarray
# features like where!
print(list(obj['pwd_pw_code_inst_decoded'].where(obj.pwd_pw_code_inst.compute() > 0, drop=True).values[0:10]))
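Decoding boils down to a table lookup from numeric code to description. The sketch below mimics the filter-then-decode step above on a plain list. The table is a small illustrative subset with abbreviated labels, assumed to follow WMO code table 4680; the authoritative meanings are in that table and in ACT's own lookup, and decode is a hypothetical helper.

```python
# Illustrative subset only, with abbreviated labels (assumed WMO table 4680)
CODE_LABELS = {
    0: 'No significant weather',
    10: 'Haze',
    30: 'Fog',
    61: 'Rain, slight',
    62: 'Rain, moderate',
    63: 'Rain, heavy',
    71: 'Snow, slight',
}


def decode(code):
    """Look up a present weather code, falling back to a generic label."""
    return CODE_LABELS.get(code, f'Unknown code {code}')


# Made-up sequence of reported codes
codes = [0, 0, 61, 62, 0, 10, 99]

# Keep only the non-zero codes, then decode them
decoded_nonzero = [decode(code) for code in codes if code > 0]
print(decoded_nonzero)
# ['Rain, slight', 'Rain, moderate', 'Haze', 'Unknown code 99']
```

The fallback label keeps the sketch from failing on codes that are missing from this shortened table.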
Doppler Lidar Wind Retrievals#
This section shows how you can process Doppler lidar PPI scans to produce wind profiles based on Newsom et al. (2016).
# We're going to use some test data that already exists within ACT
obj = act.io.arm.read_arm_netcdf(act.tests.sample_files.EXAMPLE_DLPPI_MULTI)
# Returns the wind retrieval information in a new object by default
# Note that the default snr_threshold of 0.008 was too high for the first profile
# Reducing it to 0.002 makes it show up but the quality of the data is likely suspect.
wind_obj = act.retrievals.compute_winds_from_ppi(obj, snr_threshold=0.002)
# Plot it up
display = act.plotting.TimeSeriesDisplay(wind_obj)
display.plot_barbs_from_spd_dir('wind_direction', 'wind_speed', invert_y_axis=False)
#Update the x-limits to make sure both wind profiles are shown
display.axes[0].set_xlim([np.datetime64('2019-10-15T11:45'), np.datetime64('2019-10-15T12:30')])
plt.show()
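Wind barbs are drawn from the eastward (u) and northward (v) wind components, so speed and direction have to be converted first. The sketch below uses the standard meteorological convention in which direction is where the wind blows from, measured clockwise from north. The function wind_components is written for this example and is not the ACT routine.

```python
import math


def wind_components(speed, direction_deg):
    """Convert wind speed and meteorological direction to (u, v) components."""
    radians = math.radians(direction_deg)
    u = -speed * math.sin(radians)  # eastward component
    v = -speed * math.cos(radians)  # northward component
    return u, v


# A 10 m/s wind from the north blows toward the south
u, v = wind_components(10.0, 0.0)
print(round(u, 6), round(v, 6))

# A 10 m/s wind from the west blows toward the east
u, v = wind_components(10.0, 270.0)
print(round(u, 6), round(v, 6))
```

The minus signs come from the "blowing from" convention: a northerly wind has a negative northward component.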
Radiosonde Plotting and More!#
This section takes you through how to plot up a Skew-T plot along with a geographic plot of the radiosonde track on a map. Additionally, we will run the data through a retrieval to calculate the PBL height using the Liu and Liang method.
# MetPy is needed for the Skew-T plot
import metpy
# Read in sample radiosonde data and plot up a Skew-T
obj = act.io.arm.read_arm_netcdf(act.tests.EXAMPLE_SONDE1)
skewt = act.plotting.SkewTDisplay(obj, figsize=(10, 8))
skewt.plot_from_u_and_v('u_wind', 'v_wind', 'pres', 'tdry', 'dp')
plt.show()
# Now let's plot up the radiosonde path on a map!
display = act.plotting.GeographicPlotDisplay(obj)
display.geoplot(data_field='pres', title='Radiosonde Path')
plt.show()
# We need to update the units on temperature before running the retrieval
obj.utils.change_units(variables='tdry', desired_unit='degree_Celsius')
obj = act.retrievals.calculate_pbl_liu_liang(obj)
print('Regime = ', obj['pblht_regime_liu_liang'].values, '\nPBL Height = ', int(obj['pblht_liu_liang'].values))
Mimic ARM Data Files#
ARM’s NetCDF files are based around what we call a data object definition, or DOD. These DODs essentially define the structure of the file and are what you see in the NetCDF file as the header. We can use this information to create an xarray object, filled with missing values, that can be populated with data and then written out to a NetCDF file that looks exactly like an ARM file.
The user is able to set the size of the datasets ahead of time by passing in the dimension sizes, as shown below with {'time': 1440}.
This could greatly streamline and improve the usability of PI-submitted datasets.
Note that this does take some time for datastreams like the MET that have a lot of versions.
obj = act.io.arm.create_ds_from_arm_dod('met.b1', {'time': 1440}, scalar_fill_dim='time')
# Create some random data and assign it to the variable in the object like normal
obj['temp_mean'].values = np.random.rand(1440)
obj